
Data Science Project from Scratch - Part 2 (Data Collection)

This is part two of the new Data Science Project from Scratch series. In this video I go through how to set up a GitHub repo and collect data for your own data science project.

GitHub repo for this project: github.com/PlayingNumbers/ds_salary_proj
How to set up a data science environment:    • How to Set Up Your Data Science Environmen...  
Chrome Driver Link: chromedriver.chromium.org/

Data collection can be a tedious and frustrating process, but you don't necessarily have to start from scratch. Search GitHub first to see if someone has already built a web scraper for the website you are looking at. Also check whether the website offers a public API.
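
As a rough sketch of that first check, the snippet below queries GitHub's public repository search API for existing scrapers. The function names, the search phrase, and the User-Agent string are illustrative assumptions and do not come from the video.

```python
import json
import urllib.parse
import urllib.request

GITHUB_SEARCH_URL = "https://api.github.com/search/repositories"


def build_search_url(query, per_page=5):
    """Build a GitHub repository search URL, sorted by star count."""
    params = {"q": query, "sort": "stars", "order": "desc", "per_page": per_page}
    return GITHUB_SEARCH_URL + "?" + urllib.parse.urlencode(params)


def find_existing_scrapers(query, per_page=5):
    """Return (full_name, html_url, stars) tuples for matching repositories.

    This makes a live, unauthenticated request, so it is subject to
    GitHub's rate limits for anonymous clients.
    """
    request = urllib.request.Request(
        build_search_url(query, per_page),
        headers={
            "Accept": "application/vnd.github+json",
            "User-Agent": "scraper-search-sketch",  # hypothetical client name
        },
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        payload = json.load(response)
    return [
        (item["full_name"], item["html_url"], item["stargazers_count"])
        for item in payload.get("items", [])
    ]


# Example call (needs network access):
# for name, url, stars in find_existing_scrapers("glassdoor scraper selenium"):
#     print(stars, name, url)
```

Sorting by stars is only one way to surface maintained projects; it is still worth reading the repo's license and checking when it was last updated before building on it.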

For this project, I found someone who had written a Glassdoor web scraper, and I was able to adapt it for our purposes. The code and the article that I used are linked here:
Code: github.com/arapfaik/scraping-glassdoor-selenium
Article: towardsdatascience.com/selenium-tutorial-scraping-…

This web scraper was written in Python with the Selenium package.
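
For readers who have not used Selenium before, here is a minimal sketch of the general pattern: open a page with ChromeDriver, wait for elements, read their text, and save rows to a CSV. This is not the linked scraper's code. The function names, the CSS selector argument, and the CSV layout are assumptions made for illustration, and the calls assume Selenium 4.

```python
import csv


def scrape_elements_text(url, css_selector, driver_path=None, wait_seconds=10):
    """Open `url` in headless Chrome and return the text of matching elements."""
    # Selenium is imported here so the CSV helper below works without it installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")
    # driver_path points at the downloaded ChromeDriver binary; recent Selenium
    # releases can locate a driver on their own when it is omitted.
    service = Service(executable_path=driver_path) if driver_path else Service()
    driver = webdriver.Chrome(service=service, options=options)
    try:
        driver.get(url)
        # Wait until at least one matching element is present in the DOM.
        elements = WebDriverWait(driver, wait_seconds).until(
            EC.presence_of_all_elements_located((By.CSS_SELECTOR, css_selector))
        )
        return [element.text.strip() for element in elements]
    finally:
        # Always close the browser, even if the wait times out.
        driver.quit()


def save_rows(rows, path, fieldnames):
    """Write a list of dicts to a CSV file with a header row."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


# Example call (placeholder URL and selector, not real Glassdoor values):
# titles = scrape_elements_text("https://example.com/jobs", "h2.job-title")
# save_rows([{"title": t} for t in titles], "jobs.csv", ["title"])
```

Real job sites change their markup often, so expect to adjust selectors and waits while debugging, and check the site's terms of service before scraping.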

This is an iterative process, so in this video you will see how I go about debugging my code and making it work.

Stay tuned for part 3 where I go through and clean up the data that we collected to make it usable for our EDA and model building.

After we scrape the data, I save the code to GitHub.

Project from scratch playlist:    • Data Science Project from Scratch - Part 1...  
My other project playlist:    • Data Science Projects  

#DataScience #KenJee #DataScienceProject

⭕ Subscribe: youtube.com/c/kenjee1?sub_confirmation=1
🎙 Listen to My Podcast: youtube.com/c/KensNearestNeighborsPodcast
🕸 Check out My Website - kennethjee.com/
✍️Sign up for My Newsletter - www.kennethjee.com/newsletter
📚 Books and Products I use - www.amazon.com/shop/kenjee (affiliate link)

Partners & Affiliates
🌟 365 Data Science - Courses (57% Annual Discount): 365datascience.pxf.io/P0jbBY
🌟 Interview Query - www.interviewquery.com/?ref=kenjee

MORE DATA SCIENCE CONTENT HERE:
🐤My Twitter - twitter.com/KenJee_DS
👔 LinkedIn - www.linkedin.com/in/kenjee/
📈 Kaggle - www.kaggle.com/kenjee
📑 Medium Articles - medium.com/@kenneth.b.jee
💻 Github - github.com/PlayingNumbers
🏀 My Sports Blog - www.playingnumbers.com/

Check These Videos Out Next!
My Leaderboard Project:    • I Built the FIRST EVER YouTube Subscriber ...  
66 Days of Data:    • What is the #66DaysOfData?  
How I Would Learn Data Science in 2021:    • How I Would Learn Data Science in 2021 (Wh...  

My Playlists
Data Science Beginners:    • Data Science Beginners  
Project From Scratch:    • Data Science Project from Scratch - Part 1...  
Kaggle Projects:    • Kaggle Projects  
