{"id":27169175,"url":"https://github.com/codehariom/data-cleaning-in-python-of-airbnb","last_synced_at":"2025-04-09T06:31:42.064Z","repository":{"id":213468092,"uuid":"734190612","full_name":"codehariom/Data-Cleaning-In-Python-of-Airbnb","owner":"codehariom","description":"Data Cleaning In Python","archived":false,"fork":false,"pushed_at":"2023-12-21T05:05:41.000Z","size":3261,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2023-12-21T07:50:44.370Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/codehariom.png","metadata":{"files":{"readme":"Readme.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2023-12-21T04:57:11.000Z","updated_at":"2023-12-21T07:50:47.561Z","dependencies_parsed_at":"2023-12-21T07:50:47.239Z","dependency_job_id":"60e4abb0-091c-45a5-987a-7e3256a9bf9e","html_url":"https://github.com/codehariom/Data-Cleaning-In-Python-of-Airbnb","commit_stats":null,"previous_names":["codehariom/data-cleaning-in-python-of-airbnb"],"tags_count":0,"template":null,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/codehariom%2FData-Cleaning-In-Python-of-Airbnb","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/codehariom%2FData-Cleaning-In-Python-of-Airbnb/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/codehariom%2FData-Cleaning-In-Python-of-Airbnb/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/codehariom%2F
Data-Cleaning-In-Python-of-Airbnb/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/codehariom","download_url":"https://codeload.github.com/codehariom/Data-Cleaning-In-Python-of-Airbnb/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247991682,"owners_count":21029753,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-04-09T06:31:41.507Z","updated_at":"2025-04-09T06:31:42.058Z","avatar_url":"https://github.com/codehariom.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Data Cleaning in Python\n\n## Overview\n\nThis repository provides a collection of Python scripts and notebooks for data cleaning tasks. Data cleaning is a crucial step in the data preprocessing pipeline, ensuring that datasets are accurate, consistent, and ready for analysis or machine learning. This collection covers common data cleaning techniques using Python and popular libraries such as Pandas.\n\n## Table of Contents\n\n1. [Installation](#installation)\n2. [Usage](#usage)\n3. [Examples](#examples)\n4. [Contributing](#contributing)\n5. 
[License](#license)\n\n## Installation\n\nClone the repository to your local machine:\n\n```bash\ngit clone https://github.com/codehariom/Data-Cleaning-In-Python-of-Airbnb.git\ncd Data-Cleaning-In-Python-of-Airbnb\n```\n\nCreate a virtual environment and install the required dependencies:\n\n```bash\npython -m venv venv\nsource venv/bin/activate  # On Windows, use 'venv\\Scripts\\activate'\npip install -r requirements.txt\n```\n\n## Usage\n\nNavigate to the `scripts` or `notebooks` directory to find various data cleaning scripts and Jupyter notebooks. Each script or notebook focuses on a specific data cleaning task and provides explanations along with code comments.\n\nTo run a script, use the following command:\n\n```bash\npython script_name.py\n```\n\nTo open a Jupyter notebook, use:\n\n```bash\njupyter notebook notebook_name.ipynb\n```\n\nMake sure to adjust paths and filenames according to your dataset.\n\n## Examples\n\n### 1. Handling Missing Values\n\nThe script `handle_missing_values.py` demonstrates techniques for handling missing values in a dataset using Pandas. It covers methods such as dropping missing values, imputing values, and more.\n\n### 2. Removing Duplicates\n\nThe notebook `remove_duplicates.ipynb` provides a step-by-step guide on identifying and removing duplicate rows from a dataset using Pandas.\n\n### 3. Data Type Conversion\n\nIn the script `convert_data_types.py`, you will find examples of converting data types of columns to the appropriate format using Pandas.\n\n## Contributing\n\nContributions are welcome! If you have additional data cleaning techniques, scripts, or notebooks to share, please follow these steps:\n\n1. Fork the repository.\n2. Create a new branch for your feature: `git checkout -b feature-name`.\n3. Commit your changes: `git commit -m 'Add a new feature'`.\n4. Push to the branch: `git push origin feature-name`.\n5. 
Open a pull request.\n\nPlease make sure to follow the existing coding style and include comments to explain the logic of your code.\n\n## License\n\nThis project is licensed under the GNU General Public License v3.0 (GPL-3.0) - see the [LICENSE](LICENSE) file for details.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcodehariom%2Fdata-cleaning-in-python-of-airbnb","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcodehariom%2Fdata-cleaning-in-python-of-airbnb","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcodehariom%2Fdata-cleaning-in-python-of-airbnb/lists"}