{"id":23557458,"url":"https://github.com/chanmeng666/water-quality-testing-data-analysis","last_synced_at":"2026-04-18T06:38:01.588Z","repository":{"id":266940855,"uuid":"807840766","full_name":"ChanMeng666/water-quality-testing-data-analysis","owner":"ChanMeng666","description":"Statistical analysis and predictive modeling of water quality parameters using Python, pandas, scikit-learn, and statsmodels","archived":false,"fork":false,"pushed_at":"2026-04-07T12:17:16.000Z","size":1365,"stargazers_count":3,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-04-07T12:23:31.946Z","etag":null,"topics":["data-analysis","data-science","data-visualization","environmental-monitoring","jupyter-notebook","machine-learning","pandas","python","scikit-learn","seaborn","statistics","water-quality"],"latest_commit_sha":null,"homepage":"https://github.com/ChanMeng666/water-quality-testing-data-analysis/blob/main/notebooks/water_quality_analysis.ipynb","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ChanMeng666.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null},"funding":{"github":null,"patreon":null,"open_collective":null,"ko_fi":null,"tidelift":null,"community_bridge":null,"liberapay":null,"issuehunt":null,"lfx_crowdfunding":null,"polar":null,"buy_me_a_coffee":"chanmeng66u","thanks_dev":null,"custom":null}},"created_at":"2024-05-29T21:56:03.000Z","updated_at":"2026-04-07T12:17:24.000Z","dependencies_parsed_at":"2025-01-08T21:02:36.234Z","dependency_job_id":null,"html_url":"https://github.com/ChanMeng666/water-quality-testing-data-analysis","commit_stats":{"total_commits":8,"total_committers":1,"mean_commits":8.0,"dds":0.0,"last_synced_commit":"f2ad4a4e2417a9382e1f0c8d3060b75f2d1a2aed"},"previous_names":["chanmeng666/water-quality-testing-data-analysis"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/ChanMeng666/water-quality-testing-data-analysis","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ChanMeng666%2Fwater-quality-testing-data-analysis","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ChanMeng666%2Fwater-quality-testing-data-analysis/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ChanMeng666%2Fwater-quality-testing-data-analysis/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ChanMeng666%2Fwater-quality-testing-data-analysis/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ChanMeng666","download_url":"https://codeload.github.com/ChanMeng666/water-quality-testing-data-analysis/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ChanMeng666%2Fwater-quality-testing-data-analysis/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31959883,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-18T00:39:45.007Z","status":"online","status_checked_at":"2026-04-18T02:00:07.018Z","response_time":103,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-analysis","data-science","data-visualization","environmental-monitoring","jupyter-notebook","machine-learning","pandas","python","scikit-learn","seaborn","statistics","water-quality"],"created_at":"2024-12-26T14:30:52.213Z","updated_at":"2026-04-18T06:38:01.581Z","avatar_url":"https://github.com/ChanMeng666.png","language":"Jupyter Notebook","funding_links":["https://buymeacoffee.com/chanmeng66u"],"categories":[],"sub_categories":[],"readme":"# Water Quality Testing Data Analysis\n\nStatistical analysis and predictive modeling of water quality parameters using Python.\n\n## Overview\n\nThis project analyzes a dataset of 500 water samples across five quality parameters to explore relationships between water quality indicators and build predictive models for conductivity.\n\n**Key findings:**\n\n- Strong positive correlation (r = 0.705) between pH and dissolved oxygen levels\n- Multi-parameter linear regression model predicts conductivity from pH, temperature, turbidity, and dissolved oxygen\n- OLS regression confirms statistically significant relationships between several parameter pairs (p \u003c 0.05)\n\n## Dataset\n\nThe dataset (`data/water_quality_testing.csv`) contains 500 samples with the following parameters:\n\n| Parameter | Unit | Range |\n|---|---|---|\n| pH | pH units | 6.83 - 7.48 |\n| Temperature | °C | 20.3 - 23.6 |\n| Turbidity | NTU | 3.1 - 5.1 |\n| Dissolved Oxygen | mg/L | 6.0 - 9.9 |\n| Conductivity | µS/cm | 316 - 370 |\n\n## Project Structure\n\n```\nwater-quality-testing-data-analysis/\n├── data/\n│   └── water_quality_testing.csv       # Water quality dataset (500 samples)\n├── notebooks/\n│   └── water_quality_analysis.ipynb    # Main analysis notebook\n├── .gitignore\n├── CODE_OF_CONDUCT.md\n├── LICENSE\n├── README.md\n└── requirements.txt\n```\n\n## Getting Started\n\n### Prerequisites\n\n- Python 3.8+\n- pip\n\n### Installation\n\n```bash\ngit clone https://github.com/ChanMeng666/water-quality-testing-data-analysis.git\ncd water-quality-testing-data-analysis\npip install -r requirements.txt\n```\n\n### Usage\n\n```bash\njupyter notebook notebooks/water_quality_analysis.ipynb\n```\n\nRun all cells (Kernel \u003e Restart \u0026 Run All) to reproduce the full analysis.\n\n## Analysis Contents\n\nThe notebook covers the following topics:\n\n1. **Data Loading and Inspection** - Load dataset, examine structure and summary statistics\n2. **Distribution Analysis** - Histograms with KDE for all parameters\n3. **Correlation Analysis** - Correlation matrix heatmap and pair plots\n4. **pH vs Dissolved Oxygen** - Deep dive into the strongest correlation\n5. **Parameter Relationships** - Regression plots for multiple parameter pairs\n6. **Predictive Modeling** - Linear regression for conductivity prediction (two-feature and multi-parameter models)\n7. **Statistical Modeling (OLS)** - Ordinary least squares regression with statsmodels for statistical inference\n8. **Conclusions** - Summary of key findings\n\n## Built With\n\n- [pandas](https://pandas.pydata.org/) - Data manipulation and analysis\n- [NumPy](https://numpy.org/) - Numerical computing\n- [Matplotlib](https://matplotlib.org/) - Static plotting and visualization\n- [seaborn](https://seaborn.pydata.org/) - Statistical data visualization\n- [scikit-learn](https://scikit-learn.org/) - Linear regression modeling\n- [statsmodels](https://www.statsmodels.org/) - OLS regression and statistical testing\n\n## License\n\nThis project is licensed under the MIT License. See [LICENSE](LICENSE) for details.\n\n## Author\n\n**Chan Meng** - [GitHub](https://github.com/ChanMeng666) · [LinkedIn](https://www.linkedin.com/in/chanmeng666/) · [Website](https://chanmeng.org/)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fchanmeng666%2Fwater-quality-testing-data-analysis","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fchanmeng666%2Fwater-quality-testing-data-analysis","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fchanmeng666%2Fwater-quality-testing-data-analysis/lists"}