{"id":49256151,"url":"https://github.com/ismielabir/pycsvdatacleaner","last_synced_at":"2026-04-25T04:00:55.621Z","repository":{"id":289044823,"uuid":"969953674","full_name":"IsmielAbir/PyCSVDataCleaner","owner":"IsmielAbir","description":"A lightweight Python package to clean CSV files","archived":false,"fork":false,"pushed_at":"2025-04-21T08:05:03.000Z","size":8,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-04-25T03:59:48.413Z","etag":null,"topics":["csv","data-preprocessing","machine-learning","python"],"latest_commit_sha":null,"homepage":"https://pypi.org/project/PyCSVDataCleaner/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/IsmielAbir.png","metadata":{"files":{"readme":"readme.md","changelog":null,"contributing":null,"funding":null,"license":"License","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-04-21T07:55:56.000Z","updated_at":"2026-03-19T21:36:33.000Z","dependencies_parsed_at":"2025-04-21T08:50:43.465Z","dependency_job_id":null,"html_url":"https://github.com/IsmielAbir/PyCSVDataCleaner","commit_stats":null,"previous_names":["ismielabir/pycsvdatacleaner"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/IsmielAbir/PyCSVDataCleaner","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/IsmielAbir%2FPyCSVDataCleaner","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/IsmielAbir%2FPyCSVDataCleaner/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/IsmielAbir%2FPyCSVDataCleaner/releases","manifests_url":"https://repos.ec
osyste.ms/api/v1/hosts/GitHub/repositories/IsmielAbir%2FPyCSVDataCleaner/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/IsmielAbir","download_url":"https://codeload.github.com/IsmielAbir/PyCSVDataCleaner/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/IsmielAbir%2FPyCSVDataCleaner/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32249492,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-25T03:17:44.950Z","status":"ssl_error","status_checked_at":"2026-04-25T03:16:45.208Z","response_time":59,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["csv","data-preprocessing","machine-learning","python"],"created_at":"2026-04-25T04:00:24.810Z","updated_at":"2026-04-25T04:00:55.606Z","avatar_url":"https://github.com/IsmielAbir.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# PyCSVDataCleaner\r\n\r\n[![PyPI](https://img.shields.io/pypi/v/PyCSVDataCleaner)](https://pypi.org/project/PyCSVDataCleaner/)\r\n[![Python Version](https://img.shields.io/pypi/pyversions/PyCSVDataCleaner)](https://pypi.org/project/PyCSVDataCleaner/)\r\n\r\n\r\n**PyCSVDataCleaner** is a simple Python package designed to clean CSV files. 
It helps you preprocess your data by:\r\n- Removing duplicate rows\r\n- Removing rows with missing values\r\n- Removing constant columns\r\n\r\nThe package is easy to use and works with CSV files containing any kind of data. It is intended for automating the data cleaning step of a machine learning or data analysis workflow.\r\n\r\n---\r\n\r\n## Features\r\n\r\n- **Remove Duplicate Rows**: Automatically removes duplicate rows from the dataset.\r\n- **Remove Rows with Missing Values**: Cleans your dataset by eliminating rows with empty cells.\r\n- **Remove Constant Columns**: Removes columns that contain the same value in every row.\r\n\r\n## Installation\r\n\r\nYou can install **PyCSVDataCleaner** via pip:\r\n\r\n```bash\r\npip install PyCSVDataCleaner\r\n```\r\n\r\n## Usage\r\n\r\n```python\r\nfrom PyCSVDataCleaner import PyCSVDataCleaner\r\n\r\ninput_file = 'path_to_your_input_file.csv'\r\n\r\noutput_file = 'path_to_your_output_file.csv'\r\n\r\nPyCSVDataCleaner(input_file, output_file)\r\n```\r\n\r\n## Example Output\r\n\r\nWhen you run the script, the terminal shows how many rows and columns were removed:\r\n\r\n```text\r\nCleaning file: fine_name.csv\r\n\r\n--- Initial Data Info ---\r\nRows (excluding header): 129971\r\nColumns: 14\r\nRemoved 0 duplicate rows.\r\nRemoved 107584 rows with missing values.\r\nRemoved 1 constant columns.\r\n\r\n--- Cleaning Done ---\r\nFinal Rows: 22387\r\nFinal Columns: 13\r\n```\r\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fismielabir%2Fpycsvdatacleaner","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fismielabir%2Fpycsvdatacleaner","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fismielabir%2Fpycsvdatacleaner/lists"}