https://github.com/lkethridge/integrated_project_2

Integrated Project 2 from TripleTen
anomaly-detection cross-validation data-analytics data-cleaning-and-preprocessing data-science feature-engineering gold-recovery machine-learning metal-purification model-evaluation pandas portfolio-project python scikit-learn smape supervised-learning


# Integrated_Project_2
## *This was an integrated skills project for TripleTen. 👩🏽‍💻*
This project developed a machine learning solution for predicting gold recovery at the rougher and final stages of ore processing, using datasets with more than 80 parameters. A multi-output Random Forest regression model gave the most accurate predictions during training, and Linear Regression proved a viable, less computationally intensive alternative. Both models underperformed constant benchmarks on the test set, but they still demonstrate the potential for data-driven optimization of industrial processes.
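
The comparison described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes sMAPE as the metric (the repository's topic tags list `smape`), uses synthetic stand-in data instead of the gold recovery datasets, and uses scikit-learn's `DummyRegressor` as a hypothetical constant benchmark. The `smape` helper, the fold count, and the tree count are all assumptions.

```python
import numpy as np
from sklearn.dummy import DummyRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import make_scorer
from sklearn.model_selection import KFold, cross_val_score
from sklearn.multioutput import MultiOutputRegressor


def smape(y_true, y_pred):
    """Symmetric mean absolute percentage error (percent), averaged over all targets."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    denom = (np.abs(y_true) + np.abs(y_pred)) / 2
    # Where both values are zero, count the error as zero instead of dividing by zero.
    ratio = np.divide(np.abs(y_true - y_pred), denom,
                      out=np.zeros_like(denom), where=denom != 0)
    return float(ratio.mean() * 100)


# Synthetic stand-in data: 200 rows, 5 features, 2 targets
# (playing the role of rougher and final recovery).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = np.column_stack([
    80 + 5 * X[:, 0] + rng.normal(scale=1.0, size=200),
    65 + 4 * X[:, 1] + rng.normal(scale=1.0, size=200),
])

# Lower sMAPE is better, so the scorer negates it internally.
smape_scorer = make_scorer(smape, greater_is_better=False)
cv = KFold(n_splits=5, shuffle=True, random_state=0)

model = MultiOutputRegressor(RandomForestRegressor(n_estimators=50, random_state=0))
baseline = DummyRegressor(strategy="median")  # always predicts the training median

model_smape = -cross_val_score(model, X, y, scoring=smape_scorer, cv=cv).mean()
baseline_smape = -cross_val_score(baseline, X, y, scoring=smape_scorer, cv=cv).mean()

print(f"random forest sMAPE:     {model_smape:.2f}%")
print(f"constant baseline sMAPE: {baseline_smape:.2f}%")
```

Because the data here is synthetic, the printed scores say nothing about the project's real results; the sketch only shows how a multi-output model and a constant benchmark can be scored with the same cross-validation setup.
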
## Skills Highlighted
๐Ÿ Python
๐Ÿ‘ฉ๐Ÿฝโ€๐Ÿ’ป Data Science
๐Ÿค– Machine Learning
๐Ÿงช Scikit Learn
โŒ Cross Validation
๐Ÿผ pandas
๐Ÿ“Š Data Analytics
๐Ÿ‘€ Supervised Learning
โš™๏ธ Feature Engineering
๐Ÿ’ฏ Model Evaluation
๐Ÿ•ต๐Ÿฝโ€โ™€๏ธ Anomaly Detection
๐Ÿงผ Data Cleaning and Preprocessing
## Installation & Usage
* This project uses pandas, numpy, matplotlib.pyplot, seaborn, and shuffle, plus these scikit-learn tools: RandomForestRegressor, MultiOutputRegressor, LinearRegression, mean_squared_error, mean_absolute_error, make_scorer, StandardScaler, SimpleImputer, cross_val_score, KFold, and RandomizedSearchCV. It requires Python 3.9.6. There is one additional file, containing the full, unsplit test set, that I was unable to upload due to upload limitations.
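
As a rough illustration of how several of the listed tools can fit together, the sketch below imputes missing values, scales features, fits a linear model, and tunes a random forest with a randomized search. It is not taken from the project: the data is synthetic, `make_pipeline` is an extra helper not named in the list above, `shuffle` is assumed to be `sklearn.utils.shuffle`, and the hyperparameter ranges are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import KFold, RandomizedSearchCV
from sklearn.pipeline import make_pipeline  # assumption: not in the project's list
from sklearn.preprocessing import StandardScaler
from sklearn.utils import shuffle  # assumption: the source of `shuffle`

# Synthetic data with roughly 5% of feature values knocked out.
rng = np.random.default_rng(42)
X = rng.normal(size=(300, 4))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=300)
X[rng.random(X.shape) < 0.05] = np.nan

# Shuffle rows, then hold out the last 60 as a test set.
X, y = shuffle(X, y, random_state=42)
X_train, X_test = X[:240], X[240:]
y_train, y_test = y[:240], y[240:]

# Linear model: median imputation, standardization, then regression.
linear = make_pipeline(SimpleImputer(strategy="median"),
                       StandardScaler(),
                       LinearRegression())
linear.fit(X_train, y_train)
linear_mae = mean_absolute_error(y_test, linear.predict(X_test))

# Random forest: sample a few hyperparameter combinations with cross-validation.
search = RandomizedSearchCV(
    make_pipeline(SimpleImputer(strategy="median"),
                  RandomForestRegressor(random_state=42)),
    param_distributions={
        "randomforestregressor__n_estimators": [20, 40, 60],
        "randomforestregressor__max_depth": [3, 5, None],
    },
    n_iter=4,
    cv=KFold(n_splits=3, shuffle=True, random_state=42),
    scoring="neg_mean_absolute_error",
    random_state=42,
)
search.fit(X_train, y_train)
forest_mae = mean_absolute_error(y_test, search.predict(X_test))

print(f"linear regression MAE: {linear_mae:.3f}")
print(f"random forest MAE:     {forest_mae:.3f}")
print("best forest parameters:", search.best_params_)
```

Keeping the imputer and scaler inside a pipeline means they are refit on each training fold, so no statistics leak from validation rows into preprocessing.
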