https://github.com/ismielabir/pycsvdatacleaner
A lightweight Python package to clean CSV files
- Host: GitHub
- URL: https://github.com/ismielabir/pycsvdatacleaner
- Owner: IsmielAbir
- License: MIT
- Created: 2025-04-21T07:55:56.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-04-21T08:05:03.000Z (about 1 year ago)
- Last Synced: 2026-04-25T03:59:48.413Z (2 months ago)
- Topics: csv, data-preprocessing, machine-learning, python
- Language: Python
- Homepage: https://pypi.org/project/PyCSVDataCleaner/
- Size: 7.81 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: readme.md
- License: License
README
# PyCSVDataCleaner
**PyCSVDataCleaner** is a simple Python package designed to clean CSV files. It helps you preprocess your data by:
- Removing duplicate rows
- Removing rows with missing values
- Removing constant columns
The package is easy to use and works with CSV files containing any kind of data. It is ideal for automating the data cleaning process during your machine learning or data analysis workflow.
---
## Features
- **Remove Duplicate Rows**: Automatically removes duplicate rows from the dataset.
- **Remove Rows with Missing Values**: Cleans your dataset by eliminating rows with empty cells.
- **Remove Constant Columns**: Removes columns that contain constant values across all rows.
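The three steps above can be sketched with only the standard library. This is an illustration of the technique, not the package's actual implementation, which is not shown here. The helper name `clean_rows`, the sample data, and the order of the steps are assumptions, and the sketch assumes every row has the same number of cells as the header.

```python
import csv
import io


def clean_rows(header, rows):
    """Hypothetical helper: apply the three cleaning steps to parsed CSV rows."""
    # Step 1: drop duplicate rows, keeping the first occurrence of each.
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            unique.append(row)

    # Step 2: drop rows that contain at least one empty (or whitespace-only) cell.
    complete = [row for row in unique if all(cell.strip() for cell in row)]

    # Step 3: keep only columns that hold more than one distinct value.
    keep = [
        i for i in range(len(header))
        if len({row[i] for row in complete}) > 1
    ]
    new_header = [header[i] for i in keep]
    new_rows = [[row[i] for i in keep] for row in complete]
    return new_header, new_rows


# Made-up sample: one duplicate row, one row with a missing city,
# and a "source" column that never changes.
sample = "id,city,source\n1,Paris,web\n1,Paris,web\n2,,web\n3,Rome,web\n"
reader = csv.reader(io.StringIO(sample))
header = next(reader)
rows = list(reader)

new_header, new_rows = clean_rows(header, rows)
print(new_header)
print(new_rows)
```

In this sketch the constant-column check runs after the row filters, so a column that only becomes constant once incomplete rows are gone is also dropped. Whether the package behaves the same way is not stated in this README.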
## Installation
You can install **PyCSVDataCleaner** via pip:
```bash
pip install PyCSVDataCleaner
```
## Usage
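```python
from PyCSVDataCleaner import PyCSVDataCleaner

# Path of the CSV file to clean, and the path for the cleaned output
input_file = 'path_to_your_input_file.csv'
output_file = 'path_to_your_output_file.csv'

# Run the cleaning steps; a summary is printed to the terminal
PyCSVDataCleaner(input_file, output_file)
```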
## Example Output
When running the script, you'll get output in the terminal indicating how many rows and columns were removed or cleaned:
```text
Cleaning file: file_name.csv
--- Initial Data Info ---
Rows (excluding header): 129971
Columns: 14
Removed 0 duplicate rows.
Removed 107584 rows with missing values.
Removed 1 constant columns.
--- Cleaning Done ---
Final Rows: 22387
Final Columns: 13
```
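
The counts in a report like this can be cross-checked: the initial row count minus the removed rows should equal the final row count, and likewise for columns. The snippet below is a minimal sketch of that check. The `first_int` helper is hypothetical, and the regular expressions assume the report keeps exactly the wording shown above.

```python
import re

# Report lines copied from the example output above.
report = """\
Rows (excluding header): 129971
Columns: 14
Removed 0 duplicate rows.
Removed 107584 rows with missing values.
Removed 1 constant columns.
Final Rows: 22387
Final Columns: 13
"""


def first_int(pattern, text):
    """Hypothetical helper: return the first captured integer for a pattern."""
    match = re.search(pattern, text, flags=re.MULTILINE)
    if match is None:
        raise ValueError(f"pattern not found: {pattern}")
    return int(match.group(1))


initial_rows = first_int(r"^Rows \(excluding header\): (\d+)", report)
initial_cols = first_int(r"^Columns: (\d+)", report)
duplicates = first_int(r"^Removed (\d+) duplicate rows", report)
missing = first_int(r"^Removed (\d+) rows with missing values", report)
constant_cols = first_int(r"^Removed (\d+) constant columns", report)
final_rows = first_int(r"^Final Rows: (\d+)", report)
final_cols = first_int(r"^Final Columns: (\d+)", report)

# Each final count should equal the initial count minus what was removed.
rows_consistent = initial_rows - duplicates - missing == final_rows
cols_consistent = initial_cols - constant_cols == final_cols
print(rows_consistent, cols_consistent)
```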