https://github.com/PlayingNumbers/Ken_Portfolio
Example data science portfolio
- Host: GitHub
- URL: https://github.com/PlayingNumbers/Ken_Portfolio
- Owner: PlayingNumbers
- Created: 2020-05-08T15:39:05.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2023-05-21T10:16:56.000Z (over 2 years ago)
- Last Synced: 2024-12-04T00:32:59.000Z (12 months ago)
- Size: 37.1 KB
- Stars: 50
- Watchers: 5
- Forks: 108
- Open Issues: 2
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- jimsghstars - PlayingNumbers/Ken_Portfolio - Example data science portfolio (Others)
README
# Ken_Portfolio
Example data science portfolio
# [Project 1: Data Science Salary Estimator](https://github.com/PlayingNumbers/ds_salary_proj)
* Created a tool that estimates data science salaries (MAE ~ $11K) to help data scientists negotiate their income when they get a job.
* Scraped over 1000 job descriptions from Glassdoor using Python and Selenium.
* Engineered features from the text of each job description to quantify the value companies put on Python, Excel, AWS, and Spark.
* Optimized Linear, Lasso, and Random Forest Regressors using GridSearchCV to reach the best model.
* Built a client-facing API using Flask.
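
As a rough illustration of the feature engineering and the MAE metric mentioned above, here is a minimal sketch using only the Python standard library. The skill list comes from the bullet above, but the function names, the matching rule, and the example data are assumptions made for illustration. They are not taken from the project code.

```python
import re

# Skills named in the project description; the tuple itself is a hypothetical choice.
SKILLS = ("python", "excel", "aws", "spark")


def skill_flags(description: str) -> dict[str, int]:
    """Return a 0/1 flag per skill, matching whole words case-insensitively."""
    words = set(re.findall(r"[a-z0-9+#]+", description.lower()))
    return {skill: int(skill in words) for skill in SKILLS}


def mean_absolute_error(actual, predicted):
    """Average absolute gap between actual and predicted salaries."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up job description and salaries (in $K), for illustration only.
example = "Seeking an analyst with Python and AWS experience; Excel a plus."
print(skill_flags(example))
print(mean_absolute_error([100, 120, 90], [110, 105, 95]))
```

Each flag becomes one numeric column that a regressor can use alongside the other features. An MAE of about 11 on salaries expressed in $K would correspond to the ~$11K figure quoted above.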

# [Project 2: Ball Image Classifier](https://github.com/PlayingNumbers/ball_image_classifier)
For this example project, I built a classifier that identifies balls from different sports. This could be useful for someone who is new to the sports of a certain country: they could take a picture of a ball, and an app could serve them information about the history and rules of the game. This is the underlying model for building something with those capabilities.
I was able to get the model to predict the sport of the ball with 94% accuracy after minimal tuning. In most cases this would meet the needs of an end user of the app. To get these results I used transfer learning with a pretrained ResNet34 CNN. This saved training time and produced solid results.
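
The transfer-learning setup described above can be sketched roughly as follows. This assumes PyTorch and torchvision (0.13 or later); the README does not say which library the project used, so treat the library choice, the function name `build_ball_classifier`, and the decision to freeze the whole backbone as assumptions.

```python
def build_ball_classifier(num_classes: int, pretrained: bool = True):
    """ResNet34 with a frozen backbone and a new classification head."""
    # Imported here so the sketch can be loaded without PyTorch installed.
    from torch import nn
    from torchvision import models

    # Load ImageNet weights unless the caller asks for random initialization.
    weights = models.ResNet34_Weights.DEFAULT if pretrained else None
    model = models.resnet34(weights=weights)

    # Freeze the feature extractor so its weights are not updated.
    for param in model.parameters():
        param.requires_grad = False

    # Replace the final fully connected layer; the new layer is trainable by default.
    model.fc = nn.Linear(model.fc.in_features, num_classes)
    return model
```

With this setup, only the parameters of `model.fc` would be passed to the optimizer, which is where the training-time savings come from. The value of `num_classes` would be the number of ball types in the dataset.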
