https://github.com/tansexe/ad-lab

Basics of Data Analysis & ML

data-visualization eda ml


Host: GitHub
URL: https://github.com/tansexe/ad-lab
Owner: tansexe
License: MIT
Created: 2025-01-14T06:01:13.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2025-03-25T14:36:02.000Z (over 1 year ago)
Last Synced: 2025-03-25T15:41:42.584Z (over 1 year ago)
Topics: data-visualization, eda, ml
Language: Jupyter Notebook
Homepage:
Size: 5.05 MB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

README

# Applications Development Lab

This project is part of the **Applications Development Lab** course in the 6th semester. The project explores data analysis, correlation identification, application of machine learning models, and the creation of an end-to-end machine learning pipeline.

## Project Overview

The main objective of this project is to analyze a dataset, explore different correlations, try various machine learning models, and develop a machine learning pipeline to streamline the entire process.

### Key Highlights:
- **Data Analysis**: Performed exploratory data analysis (EDA) to understand the dataset and its underlying patterns.
- **Correlation Analysis**: Investigated correlations between various features of the dataset and visualized the findings.
- **Machine Learning Models**: Tried several machine learning models to predict target variables and evaluated their performance.
- **ML Pipeline**: Developed a machine learning pipeline to automate the process from data preprocessing to model evaluation.
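The correlation step can be sketched roughly as follows. The README does not name the dataset or its columns, so the toy data and the column names (`hours_studied`, `score`, `noise`) are hypothetical stand-ins; this is an illustration of the general approach, not the notebook's actual code.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the project's dataset.
rng = np.random.default_rng(0)
hours = rng.uniform(0, 10, size=200)
df = pd.DataFrame({
    "hours_studied": hours,
    "score": 5 * hours + rng.normal(0, 2, size=200),  # strongly tied to hours
    "noise": rng.normal(0, 1, size=200),              # unrelated feature
})

# Pairwise Pearson correlations between all (numeric) columns.
corr = df.corr()

# Rank the other features by absolute correlation with an assumed target.
target = "score"
ranked = corr[target].drop(target).abs().sort_values(ascending=False)
print(ranked)
```

A matrix like `corr` is what would typically be passed to a heatmap (for example `seaborn.heatmap(corr, annot=True)`) to visualize the findings.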

## Installation

To run this project locally, you need Python 3.x and the following libraries:

- Pandas
- NumPy
- Matplotlib
- Seaborn
- Scikit-learn

You can install the required libraries using pip:

```bash
pip install -r requirements.txt
```

## Usage

1. **Data Preprocessing**:
   - The dataset is loaded and cleaned.
   - Missing values are handled, and categorical variables are encoded.

2. **Exploratory Data Analysis (EDA)**:
   - Visualizations are created to explore the data and understand the relationships between features.

3. **Modeling**:
   - Various machine learning models, such as Linear Regression, Decision Trees, and Random Forest, are tested.

4. **Machine Learning Pipeline**:
   - A pipeline is created to automate data preprocessing, model training, and evaluation.
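The steps above could be wired together along these lines with scikit-learn's `Pipeline` and `ColumnTransformer`. This is a minimal sketch under stated assumptions: the toy dataset, the column names (`age`, `city`), the choice of a classification target, and the use of `RandomForestClassifier` are all hypothetical, since the README does not specify the data or the final model.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy dataset: one numeric and one categorical feature.
rng = np.random.default_rng(42)
n = 300
age = rng.integers(18, 70, size=n).astype(float)
city = rng.choice(["A", "B", "C"], size=n)
y = (age > 40).astype(int)           # assumed binary target
age[rng.random(n) < 0.1] = np.nan    # inject missing values to be imputed
X = pd.DataFrame({"age": age, "city": city})

# Preprocessing: impute numeric gaps, one-hot encode categoricals.
preprocess = ColumnTransformer([
    ("num", SimpleImputer(strategy="median"), ["age"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["city"]),
])

# Full pipeline: preprocessing followed by the model.
pipeline = Pipeline([
    ("preprocess", preprocess),
    ("model", RandomForestClassifier(n_estimators=50, random_state=0)),
])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)
pipeline.fit(X_train, y_train)
accuracy = pipeline.score(X_test, y_test)  # mean accuracy on held-out data
print(f"Test accuracy: {accuracy:.2f}")
```

Keeping preprocessing inside the pipeline means the imputer and encoder are fitted on the training split only, which avoids leaking information from the test split. Swapping the `"model"` step for another estimator lets the same preprocessing be reused across the models being compared.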

## Evaluation

Model performance is evaluated with metrics suited to the task, such as accuracy, precision, recall, and F1-score for classification.
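To make these metrics concrete, here is a small worked example that computes them from confusion-matrix counts in plain Python. The label lists are made up for illustration; in practice `sklearn.metrics` provides `accuracy_score`, `precision_score`, `recall_score`, and `f1_score` for the same quantities.

```python
# Made-up binary labels for illustration (1 = positive class).
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 1]

pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

accuracy = (tp + tn) / len(pairs)
precision = tp / (tp + fp) if (tp + fp) else 0.0
recall = tp / (tp + fn) if (tp + fn) else 0.0
f1 = (
    2 * precision * recall / (precision + recall)
    if (precision + recall)
    else 0.0
)

print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
# prints: accuracy=0.625 precision=0.600 recall=0.750 f1=0.667
```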

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## Acknowledgments

- Inspired by the concepts covered in the Applications Development Lab curriculum.
- Special thanks to the course instructors for their support and guidance.
