{"id":24634464,"url":"https://github.com/manuethomas/traffic-accident-analysis-us","last_synced_at":"2025-03-20T07:46:51.632Z","repository":{"id":272787713,"uuid":"917533618","full_name":"manuethomas/Traffic-Accident-Analysis-US","owner":"manuethomas","description":"The project provides a comprehensive analysis of traffic accidents in the US from 2016-2023 aiming to identify key factors contributing to accidents. The analysis also focussed on finding features that could be used to develop a predictive model","archived":false,"fork":false,"pushed_at":"2025-01-16T15:51:01.000Z","size":3416,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-01-25T09:12:46.198Z","etag":null,"topics":["exploratory-data-analysis","feature-engineering","feature-selection","matpllotlib","numpy","pandas","seaborn"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/manuethomas.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-01-16T07:03:02.000Z","updated_at":"2025-01-16T15:51:02.000Z","dependencies_parsed_at":"2025-01-16T17:14:32.556Z","dependency_job_id":"d5e801c1-a277-4bd1-af46-2047ddafb285","html_url":"https://github.com/manuethomas/Traffic-Accident-Analysis-US","commit_stats":null,"previous_names":["manuethomas/traffic-accident-analysis-us"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/manuethomas%2FTraffic-Accide
nt-Analysis-US","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/manuethomas%2FTraffic-Accident-Analysis-US/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/manuethomas%2FTraffic-Accident-Analysis-US/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/manuethomas%2FTraffic-Accident-Analysis-US/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/manuethomas","download_url":"https://codeload.github.com/manuethomas/Traffic-Accident-Analysis-US/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244574798,"owners_count":20474818,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["exploratory-data-analysis","feature-engineering","feature-selection","matpllotlib","numpy","pandas","seaborn"],"created_at":"2025-01-25T09:12:49.523Z","updated_at":"2025-03-20T07:46:51.612Z","avatar_url":"https://github.com/manuethomas.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cimg src=\"images/banner.png\"\u003e\u003cbr\u003e\u003cbr\u003e\n\n[***Click here to download detailed Report***](https://github.com/manuethomas/traffic-demo/blob/main/Report/Analysis%20of%20Traffic%20Accidents%20in%20the%20US.pdf)\n\n## Problem Statement\n\nThe primary goal of the project was to analyze traffic accident data in the US from 2016 to 2023 using the ”US Accident (2016-2023)” dataset and find out relevant features that can be used to build a predictive system in the future.\n\n- Analyse 
the primary causes of accidents and their associated risk levels\n- Identify key factors that most reliably indicate the occurrence of a major accident\n- Identify relevant features that could help forecast both the probability and severity of accidents before they happen\n- Share insights with stakeholders about where accidents commonly happen, patterns in casualties, and how weather and environmental conditions influence accident occurrence\n\n## Approach\n\n- Dataset Overview\n- Import relevant libraries\n- Load the dataset\n- Data Preprocessing\n    - Removing Unnecessary Features\n    - Transforming Datetime Feature\n    - Handling Missing Values\n    - Outlier Treatment\n- Exploratory Data Analysis \u0026 Feature Engineering\n    - Resampling\n    - Time-Based Feature Analysis\n    - Location-Based Feature Analysis\n    - Weather Feature Analysis\n    - Point of Interest (POI) Feature Analysis\n- Feature Selection\n- Further Analysis\n- Key Insights and Recommendations\n\n## Exploratory Data Analysis \u0026 Feature Engineering\n\n### Time-Based Feature Analysis\n\n- When do most severe accidents occur during the year?\n- How does accident frequency vary between weekdays and weekends?\n- What time of day sees the most accidents?\n- Are night accidents different from day accidents?
\n\n### Location-Based Feature Analysis\n\n- Which states have the highest accident counts?\n- Which states have the most severe accidents?\n- Does public transit usage in a county affect accident rates?\n- Which types of roads are most dangerous?\n- Are accidents concentrated in specific cities?\n- Which timezone sees the most accidents?\n\n### Weather Feature Analysis\n\n- Which weather conditions are most dangerous for driving?\n- What role do visibility, pressure, and wind speed play?\n\n### Point of Interest (POI) Feature Analysis\n\n- How do traffic signals and crossings affect accident severity?\n- Do any specific infrastructure features affect accident severity?\n\n## Key Takeaways\n\n- Summer months + weekends + nighttime = Highest risk for severe accidents\n- Georgia \u0026 Florida lead in severe accidents despite CA/TX/FL having the most total accidents\n- Less public transit = More severe accidents in a county\n- Traffic signals \u0026 crossings cut accident severity significantly\n- Interstate highways are hotspots for severe accidents\n- Accidents are spread across cities rather than concentrated in specific areas\n- Night accidents are fewer but deadlier\n- Rain/snow conditions notably increase accident severity\n\n## References\n\n- Moosavi, Sobhan, Mohammad Hossein Samavatian, Srinivasan Parthasarathy, Radu Teodorescu, and Rajiv Ramnath. “Accident Risk Prediction based on Heterogeneous Sparse Data: New Dataset and Insights.”\n\n- Triveni Sangama Saraswathi Edla. 
\"Exploratory Data Analysis (EDA) of US Accidents and Prediction of Accident Severity in San Francisco Bay Area\"\n\n- Medium: Ronghui Zhou, Ph.D. \"How You Can Avoid Car Accident in 2020\"","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmanuethomas%2Ftraffic-accident-analysis-us","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmanuethomas%2Ftraffic-accident-analysis-us","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmanuethomas%2Ftraffic-accident-analysis-us/lists"}