{"id":34501393,"url":"https://github.com/yashasgm07/averaged-regression-imputation","last_synced_at":"2026-04-24T13:32:14.744Z","repository":{"id":329917727,"uuid":"1120987684","full_name":"Yashasgm07/Averaged-Regression-Imputation","owner":"Yashasgm07","description":"A regression-based missing value imputation system that uses weighted averaging of multiple regression models, supporting both CLI and Flask-based web execution.","archived":false,"fork":false,"pushed_at":"2025-12-22T19:08:27.000Z","size":24,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-12-23T20:35:52.006Z","etag":null,"topics":["cli-application","data-preprocessing","machine-learning","python","regression","webapplication"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Yashasgm07.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-12-22T09:02:48.000Z","updated_at":"2025-12-22T19:08:30.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/Yashasgm07/Averaged-Regression-Imputation","commit_stats":null,"previous_names":["yashasgm07/averaged-regression-imputation"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/Yashasgm07/Averaged-Regression-Imputation","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Yashasgm07%2FAveraged-Regression-
Imputation","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Yashasgm07%2FAveraged-Regression-Imputation/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Yashasgm07%2FAveraged-Regression-Imputation/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Yashasgm07%2FAveraged-Regression-Imputation/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Yashasgm07","download_url":"https://codeload.github.com/Yashasgm07/Averaged-Regression-Imputation/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Yashasgm07%2FAveraged-Regression-Imputation/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":27992996,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-12-24T02:00:07.193Z","response_time":83,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cli-application","data-preprocessing","machine-learning","python","regression","webapplication"],"created_at":"2025-12-24T02:01:34.806Z","updated_at":"2025-12-24T02:01:57.718Z","avatar_url":"https://github.com/Yashasgm07.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Regression-Based Missing Value Imputation System\n\n## 🎓 Degree\nM.E. 
Computer Science – PG Mini Project\n\n## 📌 Project Title\nRegression using Averaged Regression on Single and Multi Variable Models\n\n---\n\n## 🧠 Problem Statement\nHandling missing values is a critical challenge in data analysis. Traditional methods such as mean\nor median imputation ignore relationships between variables, leading to loss of information.\nThis project proposes a regression-based imputation approach that leverages both single-variable\nand multivariable linear regression models to produce accurate and stable imputations.\n\n---\n\n## 📄 Abstract\nThis project implements an averaged regression-based approach to handle missing values in\nstructured datasets. Single-variable and multivariable linear regression models are trained for\neach feature with missing data. Cross-validated mean squared error is used to compute weighted\npredictions, followed by a nearest-neighbor refinement to improve stability and accuracy.\nThe system supports both command-line and web-based execution.\n\n---\n\n## ⚙️ Technologies Used\n- Python\n- Pandas, NumPy\n- Scikit-learn\n- Flask (Web Interface)\n- HTML \u0026 CSS\n- VS Code\n\n---\n\n## 🔄 System Workflow\n1. Load original dataset  \n2. Inject missing values for experimentation  \n3. Apply single-variable regression  \n4. Apply multi-variable regression  \n5. Evaluate models using cross-validation (MSE)  \n6. Compute weighted averaged predictions  \n7. Refine predictions using nearest neighbor  \n8. 
Generate final imputed dataset  \n\n---\n\n## 📁 Output Files\nAfter execution, the following files are generated:\n\n- `student_dataset_original.csv`\n- `student_dataset_with_missing.csv`\n- `student_dataset_imputed_final.csv`\n\nThese files ensure transparency and traceability across all stages of data processing.\n\n---\n\n## ▶️ How to Run (Command Line)\n\n```bash\npip install -r requirements.txt\npython src/main.py\n```\n\n---\n\n## 🌐 How to Run (Web Application)\n\n```bash\npython app.py\n```\n\nOpen a browser and navigate to http://127.0.0.1:5000\n\nSteps:\n1. Upload the CSV dataset  \n2. Click Run Imputation  \n3. Download the final imputed dataset  \n\n---\n\n## 🎯 Key Features\n- Intelligent handling of missing values using regression\n- Weighted averaging based on model performance\n- Preservation of all data processing stages\n- Supports both CLI and web-based execution\n- Easy to demonstrate and explain during viva\n\n---\n\n## 📌 Conclusion\nThe averaged regression-based imputation method produces more reliable and stable results\ncompared to traditional mean-based approaches. The dual execution modes make the system\ninteractive, practical, and suitable for real-world data processing scenarios.\n\n---\n\n## 🔮 Future Scope\n- Extend to non-linear regression models\n- Apply deep learning-based imputation techniques\n- Support large-scale datasets\n- Add visualization dashboards for results\n\n---\n\n## 👨‍💻 Developed By\nYashas G M  \nM.E. Computer Science","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyashasgm07%2Faveraged-regression-imputation","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fyashasgm07%2Faveraged-regression-imputation","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyashasgm07%2Faveraged-regression-imputation/lists"}