https://github.com/alirezasaharkhiz9/final-project-cs50p

CS50P Final Project: Automated Machine Learning
automated-machine-learning machine-learning python

Host: GitHub
URL: https://github.com/alirezasaharkhiz9/final-project-cs50p
Owner: alirezasaharkhiz9
Created: 2024-09-12T17:32:26.000Z (10 months ago)
Default Branch: main
Last Pushed: 2024-10-23T08:14:34.000Z (8 months ago)
Last Synced: 2025-02-02T23:09:05.198Z (5 months ago)
Topics: automated-machine-learning, machine-learning, python
Language: Python
Homepage:
Size: 836 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

# 🚀 Automated Machine Learning with LazyClassifier 🌟

Welcome to Automated Machine Learning with LazyClassifier, a Python project that automates selecting, training, and evaluating machine learning models in just a few lines of code! 🎉

This project simplifies the typical data science workflow by automatically testing and ranking several machine learning models, choosing the best one for your dataset, and reporting detailed evaluation metrics. Forget about manual model selection and let the machine do the work for you!

![cs50p](./CS50P.png)

## Video Demo:

## 💡 Project Overview

In this project, we leverage the LazyPredict library, specifically the LazyClassifier, to automate the process of:

- Preprocessing your dataset (scaling and splitting).
- Testing and Ranking Models from a wide variety of classifiers.
- Training the best model found.
- Predicting and Evaluating the model's performance with detailed metrics like accuracy and classification reports.

Whether you're a data science enthusiast, a machine learning practitioner, or a curious developer, this project will show you how quickly you can build a performant machine learning pipeline with minimal effort.

## 🛠️ Key Features

- 🔍 **Automated Model Selection**: Automatically test multiple classifiers and choose the best one based on performance.
- 🧠 **No Manual Tuning Required**: LazyClassifier fits each model with its default hyperparameters, so you get a first ranking without any manual tuning.
- 📊 **Detailed Evaluation**: Get classification reports and accuracy metrics to assess the model’s performance.
- 🚀 **Preprocessing Pipeline**: Standardize your dataset with built-in preprocessing.

## 📂 Project Structure

```
.
├── project.py # Main file that implements the automated machine learning pipeline
├── test_project.py # Unit tests for the key functions in project.py
├── requirements.txt # List of required libraries (LazyPredict, scikit-learn, etc.)
└── README.md # This documentation file!
```

## 🔧 How It Works

1. **Preprocessing the Dataset**\
The `preprocess` function scales your features and splits the data into training and testing sets, so every model is trained and evaluated on consistently prepared data.

2. **Model Selection with LazyClassifier**\
The LazyClassifier quickly evaluates multiple classifiers from scikit-learn, ranks them based on performance, and identifies the best model for your data.

3. **Model Training**\
Once the best model is selected, the `train_model` function trains it on your dataset, preparing it for making predictions.

4. **Making Predictions & Evaluating Performance**\
With the trained model, we use the `predict` function to classify the test set. The performance is then evaluated with the `myAccuracy` function, which provides a detailed classification report and accuracy score.
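
The steps above can be sketched roughly as follows. This is a minimal illustration, not the code from `project.py`: the function names come from this README, but their signatures, the `test_size` and `random_state` defaults, and the use of `RandomForestClassifier` as the "selected" model are assumptions made for the example. The model-selection step is left out here and the estimator is simply passed in.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler


def preprocess(X, y, test_size=0.2, random_state=42):
    """Split first, then scale, so the scaler never sees the test rows."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=random_state, stratify=y
    )
    scaler = StandardScaler().fit(X_train)
    return scaler.transform(X_train), scaler.transform(X_test), y_train, y_test


def train_model(model, X_train, y_train):
    """Fit the selected estimator on the training split and return it."""
    return model.fit(X_train, y_train)


def predict(model, X_test):
    """Predict class labels for the test split."""
    return model.predict(X_test)


def myAccuracy(y_test, y_pred):
    """Return the accuracy score and the text classification report."""
    return accuracy_score(y_test, y_pred), classification_report(y_test, y_pred)


# Assumed stand-in for the model picked in step 2.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = preprocess(X, y)
model = train_model(RandomForestClassifier(random_state=0), X_train, y_train)
y_pred = predict(model, X_test)
accuracy, report = myAccuracy(y_test, y_pred)
print(f"accuracy: {accuracy:.4f}")
print(report)
```

Fitting the scaler on the training split only is a design choice in this sketch: it keeps test-set statistics out of training, which is the usual way to avoid leakage.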

## 🎬 Example Workflow

``` bash
# Step 1: Clone the repository and navigate to the project folder
git clone https://github.com/alirezasaharkhiz9/final-project-cs50p.git
cd final-project-cs50p

# Step 2: Install required dependencies
pip install -r requirements.txt

# Step 3: Run the project
python project.py

# Step 4: Run tests (optional, for developers)
pytest test_project.py
```

## 🚀 Demo

Here's how the project works on the famous Iris dataset:

- The LazyClassifier tests 10+ classifiers in just seconds! ⚡
- The top-performing model is automatically selected.
- The model is trained on the training data.
- Finally, it predicts the target for the test data, and the results are evaluated with accuracy and classification reports.

**Output Example:**

```
LazyClassifier: Evaluating Classifiers...
              Accuracy  Balanced Accuracy  ROC AUC  F1 Score  Time Taken
RandomForest    0.9667             0.9649   0.9993    0.9655      0.0283
XGBoost         0.9500             0.9474   0.9987    0.9473      0.0521
AdaBoost        0.9333             0.9283   0.9967    0.9282      0.0391
...
Selected Model: RandomForestClassifier
```

**Classification Report:**

```
                 precision    recall  f1-score   support

    Iris-setosa       1.00      1.00      1.00        10
Iris-versicolor       0.92      1.00      0.96        11
 Iris-virginica       1.00      0.92      0.96         9

       accuracy                           0.97        30
      macro avg       0.97      0.97      0.97        30
   weighted avg       0.97      0.97      0.97        30
```
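
As a rough sketch of the ranking step in this demo: `LazyClassifier.fit` returns a pandas DataFrame of scores indexed by model name, and the best model can be read from it. The helper names `rank_with_lazypredict` and `best_model_name` are hypothetical and not taken from `project.py`, and choosing by the `Accuracy` column is an assumption about how the project selects its winner. The small table is rebuilt by hand from the three rows in the output example above.

```python
import pandas as pd


def rank_with_lazypredict(X_train, X_test, y_train, y_test):
    """Fit the classifiers LazyPredict supports and return its ranking table."""
    # Imported here so the helper below also works without lazypredict installed.
    from lazypredict.Supervised import LazyClassifier

    clf = LazyClassifier(verbose=0, ignore_warnings=True, custom_metric=None)
    models, _predictions = clf.fit(X_train, X_test, y_train, y_test)
    return models  # DataFrame indexed by model name


def best_model_name(models: pd.DataFrame, metric: str = "Accuracy") -> str:
    """Return the index label of the row with the highest value of `metric`."""
    return models[metric].idxmax()


# The three rows from the output example, rebuilt by hand for illustration.
ranking = pd.DataFrame(
    {
        "Accuracy": [0.9667, 0.9500, 0.9333],
        "Balanced Accuracy": [0.9649, 0.9474, 0.9283],
    },
    index=["RandomForest", "XGBoost", "AdaBoost"],
)
print(best_model_name(ranking))  # prints "RandomForest"
```

With LazyPredict installed, `best_model_name(rank_with_lazypredict(X_train, X_test, y_train, y_test))` would apply the same selection to a real ranking.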

## 🧪 Unit Tests

Unit tests are provided in `test_project.py` to ensure each core function works as expected. These tests include:

- Testing the data preprocessing.
- Ensuring that the best model is selected correctly.
- Verifying that model training and prediction work without errors.
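
For illustration, tests for the preprocessing step might look something like the sketch below. The `preprocess` defined here is a stand-in with an assumed signature so that the example is self-contained. In the real `test_project.py` the function would be imported from `project.py`, and its actual behavior may differ.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler


def preprocess(X, y, test_size=0.2, random_state=42):
    # Stand-in for the function in project.py; the real signature may differ.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=random_state
    )
    scaler = StandardScaler().fit(X_train)
    return scaler.transform(X_train), scaler.transform(X_test), y_train, y_test


def test_preprocess_split_sizes():
    # Iris has 150 rows and 4 features, so a 20% test split holds 30 rows.
    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = preprocess(X, y, test_size=0.2)
    assert X_train.shape == (120, 4)
    assert X_test.shape == (30, 4)
    assert len(y_train) == 120 and len(y_test) == 30


def test_preprocess_scales_training_features():
    # After standardization each training feature has mean 0 and std 1.
    X, y = load_iris(return_X_y=True)
    X_train, _, _, _ = preprocess(X, y)
    assert np.allclose(X_train.mean(axis=0), 0.0, atol=1e-9)
    assert np.allclose(X_train.std(axis=0), 1.0, atol=1e-9)
```

Functions named `test_*` are collected automatically when running `pytest test_project.py`.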

## 🧠 Want to Learn More?

Interested in expanding the project? Here are a few ideas:

- 🏋️‍♂️ **Expand the Dataset**: Try using a different dataset from sklearn or your own.
- 🧪 **Advanced Testing**: Add more complex unit tests or performance evaluations.
- 🛠 **Model Tuning**: Implement hyperparameter tuning for the selected models.
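
As one possible starting point for the model tuning idea, here is a small grid search with scikit-learn's `GridSearchCV`. The choice of `RandomForestClassifier`, the parameter grid, and the split settings are all assumptions made for this sketch and are not part of the project.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# A deliberately tiny, illustrative grid: 4 combinations x 3 folds = 12 fits.
param_grid = {"n_estimators": [50, 100], "max_depth": [None, 3]}
search = GridSearchCV(
    RandomForestClassifier(random_state=0), param_grid, cv=3, scoring="accuracy"
)
search.fit(X_train, y_train)

print("best params:", search.best_params_)
print("cross-validated accuracy:", round(search.best_score_, 4))
# score() uses the best estimator, refit on the whole training split.
print("held-out accuracy:", round(search.score(X_test, y_test), 4))
```

Tuning only on the training split and reporting the held-out score separately keeps the final accuracy estimate independent of the search.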

## 💻 Requirements

To run the project, you'll need:

- Python 3.x
- Required libraries listed in requirements.txt (including LazyPredict and scikit-learn).

You can install the dependencies using:

``` bash
pip install -r requirements.txt
```

## 🤝 Contributions

Feel free to open issues or submit pull requests if you have ideas for improving this project. Let's make this project even better, together!

E-mail: [as.alirezasaharkhiz@gmail.com](mailto:as.alirezasaharkhiz@gmail.com)

Telegram: @alirezasaharkhiz
