This is part two of the new Data Science Project from Scratch series. In this video, I go through how to set up a GitHub repo and collect data for your own data science project.
GitHub repo for this project: github.com/PlayingNumbers/ds_salary_proj
How to set up a data science environment: • How to Set Up Your Data Science Environmen...
ChromeDriver link: chromedriver.chromium.org/
Data collection can be a tedious and frustrating process, but you don't necessarily have to start from scratch. Search GitHub to see if someone has already built a web scraper for the website you are looking at, and check whether the website has an open API.
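As a rough illustration of that first step, here is a minimal sketch that searches GitHub for existing scrapers through GitHub's public repository search endpoint. The function names (build_search_url, search_repos) and the example query are hypothetical and only for illustration. It uses just the Python standard library, and unauthenticated requests to this endpoint are rate limited.

```python
import json
import urllib.parse
import urllib.request

# GitHub's REST endpoint for searching repositories.
GITHUB_SEARCH_ENDPOINT = "https://api.github.com/search/repositories"


def build_search_url(query, per_page=5):
    """Build a repository search URL, sorted by stars (most starred first)."""
    params = {"q": query, "sort": "stars", "order": "desc", "per_page": per_page}
    return GITHUB_SEARCH_ENDPOINT + "?" + urllib.parse.urlencode(params)


def search_repos(query, per_page=5):
    """Return (full_name, html_url) pairs for repositories matching the query.

    This makes a live network request, so it needs internet access.
    """
    request = urllib.request.Request(
        build_search_url(query, per_page),
        headers={"Accept": "application/vnd.github+json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        payload = json.load(response)
    return [(item["full_name"], item["html_url"]) for item in payload["items"]]
```

Calling search_repos("glassdoor scraper selenium") should return a short list of candidate repositories to review before you write anything yourself.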
For this project, I found someone who had written a Glassdoor web scraper, and I was able to adapt it for our purposes. The code and the article that I used are linked here:
Code: github.com/arapfaik/scraping-glassdoor-selenium
Article: towardsdatascience.com/selenium-tutorial-scraping-…
This web scraper was written in Python with the Selenium package.
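For readers who have not used Selenium before, here is a minimal sketch of the general pattern, not the linked scraper itself. It assumes Selenium 4 and a ChromeDriver binary downloaded from the link above. The names scrape_elements and write_rows, the CSS selector argument, and saving the results as CSV are all assumptions made for illustration.

```python
import csv


def scrape_elements(url, css_selector, driver_path):
    """Open a page in Chrome and return the text of every matching element.

    driver_path is the local path to the ChromeDriver executable.
    """
    # Imported inside the function so the helper below still works
    # on a machine where Selenium is not installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome(service=Service(driver_path))
    try:
        driver.get(url)
        elements = driver.find_elements(By.CSS_SELECTOR, css_selector)
        return [element.text for element in elements]
    finally:
        # Always close the browser, even if the page fails to load.
        driver.quit()


def write_rows(rows, fieldnames, handle):
    """Write a list of dicts to an open file handle as CSV (assumed format)."""
    writer = csv.DictWriter(handle, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
```

A real scraper usually also needs waits, pop-up handling, and pagination, which is where most of the debugging time goes.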
This is an iterative process, so in this video you will see how I go about debugging my code and making it work. After we scrape the data, I save the code to GitHub.
Stay tuned for part 3, where I clean up the data that we collected to make it usable for our EDA and model building.
Project from scratch playlist: • Data Science Project from Scratch - Part 1...
My other project playlist: • Data Science Projects
#DataScience #KenJee #DataScienceProject
⭕ Subscribe: youtube.com/c/kenjee1?sub_confirmation=1
🎙 Listen to My Podcast: youtube.com/c/KensNearestNeighborsPodcast
🕸 Check out My Website - kennethjee.com/
✍️Sign up for My Newsletter - www.kennethjee.com/newsletter
📚 Books and Products I use - www.amazon.com/shop/kenjee (affiliate link)
Partners & Affiliates
🌟 365 Data Science - Courses (57% Annual Discount): 365datascience.pxf.io/P0jbBY
🌟 Interview Query - www.interviewquery.com/?ref=kenjee
MORE DATA SCIENCE CONTENT HERE:
🐤My Twitter - twitter.com/KenJee_DS
👔 LinkedIn - www.linkedin.com/in/kenjee/
📈 Kaggle - www.kaggle.com/kenjee
📑 Medium Articles - medium.com/@kenneth.b.jee
💻 GitHub - github.com/PlayingNumbers
🏀 My Sports Blog - www.playingnumbers.com/
Check These Videos Out Next!
My Leaderboard Project: • I Built the FIRST EVER YouTube Subscriber ...
66 Days of Data: • What is the #66DaysOfData?
How I Would Learn Data Science in 2021: • How I Would Learn Data Science in 2021 (Wh...
My Playlists
Data Science Beginners: • Data Science Beginners
Project From Scratch: • Data Science Project from Scratch - Part 1...
Kaggle Projects: • Kaggle Projects