https://github.com/alirezasaharkhiz9/final-project-cs50p
CS50P Final Project: Automated Machine Learning
- Host: GitHub
- URL: https://github.com/alirezasaharkhiz9/final-project-cs50p
- Owner: alirezasaharkhiz9
- Created: 2024-09-12T17:32:26.000Z (10 months ago)
- Default Branch: main
- Last Pushed: 2024-10-23T08:14:34.000Z (8 months ago)
- Last Synced: 2025-02-02T23:09:05.198Z (5 months ago)
- Topics: automated-machine-learning, machine-learning, python
- Language: Python
- Homepage:
- Size: 836 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Automated Machine Learning with LazyClassifier
Welcome to Automated Machine Learning with LazyClassifier, a powerful Python project designed to automate the process of selecting, training, and evaluating machine learning models, all with just a few lines of code!
This project simplifies the typical data science workflow by automatically testing and ranking several machine learning models, choosing the best one for your dataset, and providing detailed evaluation metrics. Forget about manual model selection and let the machine do the work for you!

## Video Demo:
## Project Overview
In this project, we leverage the LazyPredict library, specifically the LazyClassifier, to automate the process of:
- Preprocessing your dataset (scaling and splitting).
- Testing and Ranking Models from a wide variety of classifiers.
- Training the best model found.
- Predicting and Evaluating the model's performance with detailed metrics like accuracy and classification reports.

Whether you're a data science enthusiast, a machine learning practitioner, or a curious developer, this project will show you how quickly you can build a performant machine learning pipeline with minimal effort.

## Key Features
- **Automated Model Selection**: Automatically test multiple classifiers and choose the best one based on performance.
- **No Manual Tuning Required**: LazyClassifier fits each model with its default hyperparameters, so you get a first ranking without any manual tuning.
- **Detailed Evaluation**: Get classification reports and accuracy metrics to assess the model's performance.
- **Preprocessing Pipeline**: Standardize your dataset with built-in preprocessing.

## Project Structure
```
.
├── project.py         # Main file that implements the automated machine learning pipeline
├── test_project.py    # Unit tests for the key functions in project.py
├── requirements.txt   # List of required libraries (LazyPredict, scikit-learn, etc.)
└── README.md          # This documentation file!
```

## How It Works
1. **Preprocessing the Dataset**\
The `preprocess` function automatically scales your data and splits it into training and testing sets, so that features share a comparable scale and the model is evaluated on data it has not seen.
2. **Model Selection with LazyClassifier**\
The LazyClassifier quickly evaluates multiple classifiers from scikit-learn, ranks them based on performance, and identifies the best model for your data.
3. **Model Training**\
Once the best model is selected, the `train_model` function trains it on your dataset, preparing it for making predictions.
4. **Making Predictions & Evaluating Performance**\
With the trained model, we use the `predict` function to classify the test set. The performance is then evaluated with the `myAccuracy` function, which provides a detailed classification report and accuracy score.

## Example Workflow
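In Python, the four steps described under How It Works might look roughly like the sketch below. The function names `preprocess`, `train_model`, `predict`, and `myAccuracy` come from this README, but their signatures, the hypothetical `select_model` and `main` helpers, the 80/20 stratified split, the fixed `random_state`, and the choice to rank by the `Accuracy` column are all assumptions for illustration, not the actual contents of `project.py`.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler


def preprocess(X, y, test_size=0.2, random_state=42):
    """Split into train/test sets, then scale using training statistics only."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=random_state, stratify=y
    )
    scaler = StandardScaler().fit(X_train)
    return scaler.transform(X_train), scaler.transform(X_test), y_train, y_test


def select_model(X_train, X_test, y_train, y_test):
    """Hypothetical helper: rank classifiers and return the top-ranked one."""
    # Imported lazily so the other helpers work without LazyPredict installed.
    from lazypredict.Supervised import LazyClassifier

    clf = LazyClassifier(verbose=0, ignore_warnings=True)
    scores, _ = clf.fit(X_train, X_test, y_train, y_test)
    best_name = scores["Accuracy"].idxmax()  # assumption: rank by accuracy
    models = clf.provide_models(X_train, X_test, y_train, y_test)
    return models[best_name]


def train_model(model, X_train, y_train):
    """Fit the chosen model on the training data and return it."""
    return model.fit(X_train, y_train)


def predict(model, X_test):
    """Predict class labels for the held-out test set."""
    return model.predict(X_test)


def myAccuracy(y_test, y_pred):
    """Return the accuracy score and a text classification report."""
    return accuracy_score(y_test, y_pred), classification_report(y_test, y_pred)


def main():
    """Hypothetical entry point that chains the steps on the Iris dataset."""
    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = preprocess(X, y)
    best = select_model(X_train, X_test, y_train, y_test)
    model = train_model(best, X_train, y_train)
    accuracy, report = myAccuracy(y_test, predict(model, X_test))
    print(f"Accuracy: {accuracy:.4f}")
    print(report)
```

Calling `main()` runs the whole chain. Fitting the scaler on the training split only is a design choice in this sketch that keeps test-set statistics out of training.
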
``` bash
# Step 1: Clone the repository and navigate to the project folder
git clone https://github.com/alirezasaharkhiz9/final-project-cs50p.git
cd final-project-cs50p

# Step 2: Install required dependencies
pip install -r requirements.txt

# Step 3: Run the project
python project.py

# Step 4: Run tests (optional, for developers)
pytest test_project.py
```

## Demo
Here's how the project works on the famous Iris dataset:
- The LazyClassifier tests 10+ classifiers in just seconds!
- The top-performing model is automatically selected.
- The model is trained on the training data.
- Finally, it predicts the target for the test data, and the results are evaluated with accuracy and classification reports.

**Output Example:**
```
LazyClassifier: Evaluating Classifiers...
              Accuracy  Balanced Accuracy  ROC AUC  F1 Score  Time Taken
RandomForest    0.9667             0.9649   0.9993    0.9655      0.0283
XGBoost         0.9500             0.9474   0.9987    0.9473      0.0521
AdaBoost        0.9333             0.9283   0.9967    0.9282      0.0391
...
Selected Model: RandomForestClassifier
```

**Classification Report:**
```
                 precision    recall  f1-score   support

    Iris-setosa       1.00      1.00      1.00        10
Iris-versicolor       0.92      1.00      0.96        11
 Iris-virginica       1.00      0.92      0.96         9

       accuracy                           0.97        30
      macro avg       0.97      0.97      0.97        30
   weighted avg       0.97      0.97      0.97        30
```

## Unit Tests
Unit tests are provided in `test_project.py` to ensure each core function works as expected. These tests include:
- Testing the data preprocessing.
- Ensuring that the best model is selected correctly.
- Verifying that model training and prediction work without errors.

## Want to Learn More?
Interested in expanding the project? Here are a few ideas:
- **Expand the Dataset**: Try using a different dataset from sklearn or your own.
- **Advanced Testing**: Add more complex unit tests or performance evaluations.
- **Model Tuning**: Implement hyperparameter tuning for the selected models.

## Requirements
To run the project, you'll need:
- Python 3.x
- Required libraries listed in `requirements.txt` (including LazyPredict and scikit-learn).

You can install the dependencies using:
``` bash
pip install -r requirements.txt
```

## Contributions
Feel free to open issues or submit pull requests if you have ideas for improving this project. Let's make this project even better, together!
E-mail: [as.alirezasaharkhiz@gmail.com](mailto:as.alirezasaharkhiz@gmail.com)
Telegram: @alirezasaharkhiz