https://github.com/synthstellar/data-preprocessing-with-python
A data preprocessing repository focused on cleaning, transforming, and preparing datasets for machine learning tasks. It includes functions for handling missing values, scaling, encoding, and feature engineering for improved model performance.
data data-cleaning feature-engineering machine-learning numpy pandas preprocessing python scikit-learn
- Host: GitHub
- URL: https://github.com/synthstellar/data-preprocessing-with-python
- Owner: SynthStellar
- Created: 2025-01-26T15:32:49.000Z (8 days ago)
- Default Branch: main
- Last Pushed: 2025-01-27T05:23:55.000Z (8 days ago)
- Last Synced: 2025-01-27T06:24:23.155Z (8 days ago)
- Topics: data, data-cleaning, feature-engineering, machine-learning, numpy, pandas, preprocessing, python, scikit-learn
- Language: Python
- Homepage:
- Size: 5.86 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Data Preprocessing with Python
### Overview
This repository includes beginner-friendly examples and exercises for data preprocessing using Python.
The focus is on understanding and applying basic techniques to prepare data for analysis or machine learning models.

### Features
- Handling missing values
- Encoding categorical variables
- Normalization and standardization of data
- Feature scaling techniques
- Exploratory data analysis (EDA)
- 5 hands-on coding exercises

### Datasets
- Dataset: [Data.csv](https://drive.google.com/file/d/1O1t4QuIQkREoVFlohkGQyIWDCSBGmgTI/view?usp=sharing)

### Prerequisites
- Python 3.x installed
- Required libraries: `pandas`, `numpy`, `matplotlib`, `scikit-learn`

### Skills Demonstrated
- Python basics for data manipulation
- Data preparation techniques for AI/ML
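
As a rough illustration of the techniques listed under Features, the following is a minimal sketch that combines missing-value handling, categorical encoding, and standardization using `pandas` and `scikit-learn`. It is not taken from the repository's exercises: the toy DataFrame and its column names (`country`, `age`, `salary`) are assumptions made for the example and may not match the contents of `Data.csv`.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data; column names are assumptions, not taken from Data.csv.
df = pd.DataFrame(
    {
        "country": ["France", "Spain", "Germany", "Spain"],
        "age": [44.0, 27.0, np.nan, 38.0],
        "salary": [72000.0, 48000.0, 54000.0, np.nan],
    }
)

numeric_cols = ["age", "salary"]
categorical_cols = ["country"]

# Numeric columns: fill missing values with the column mean, then standardize
# to zero mean and unit variance.
numeric_steps = Pipeline(
    steps=[
        ("impute", SimpleImputer(strategy="mean")),
        ("scale", StandardScaler()),
    ]
)

# Categorical columns: one-hot encode each category into its own 0/1 column.
preprocessor = ColumnTransformer(
    transformers=[
        ("num", numeric_steps, numeric_cols),
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
    ]
)

X = preprocessor.fit_transform(df)
# ColumnTransformer may return a sparse matrix; convert to a dense array if so.
X = X.toarray() if hasattr(X, "toarray") else np.asarray(X)
print(X.shape)
```

With this toy data the result has four rows and five columns: two scaled numeric columns followed by three one-hot columns. Mean imputation is only one possible strategy; `SimpleImputer` also accepts `"median"` and `"most_frequent"`, which may suit skewed or categorical data better.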