{"id":25059566,"url":"https://github.com/semanticdata/traffic-studies","last_synced_at":"2026-04-28T16:31:56.574Z","repository":{"id":275651928,"uuid":"926751605","full_name":"semanticdata/traffic-studies","owner":"semanticdata","description":"Comprehensive traffic analysis dashboard for Crystal, Minnesota, built with Streamlit.","archived":false,"fork":false,"pushed_at":"2025-09-25T14:11:18.000Z","size":33644,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-09-25T16:13:25.554Z","etag":null,"topics":["matplotlib","numpy","pandas","plotly","python","seaborn","streamlit"],"latest_commit_sha":null,"homepage":"https://traffic-studies.streamlit.app/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/semanticdata.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-02-03T19:51:16.000Z","updated_at":"2025-09-25T14:11:23.000Z","dependencies_parsed_at":"2025-02-03T21:20:36.096Z","dependency_job_id":"f8e9093c-0a20-4152-8873-fbdcc3d1732c","html_url":"https://github.com/semanticdata/traffic-studies","commit_stats":null,"previous_names":["semanticdata/traffic-studies"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/semanticdata/traffic-studies","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/semanticdata%2Ftraffic-studies","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/semanticdata%2Ftraffic-studies/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/semanticdata%2Ftraffic-studies/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/semanticdata%2Ftraffic-studies/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/semanticdata","download_url":"https://codeload.github.com/semanticdata/traffic-studies/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/semanticdata%2Ftraffic-studies/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32389766,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-28T14:34:11.604Z","status":"ssl_error","status_checked_at":"2026-04-28T14:32:37.009Z","response_time":56,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["matplotlib","numpy","pandas","plotly","python","seaborn","streamlit"],"created_at":"2025-02-06T15:35:14.564Z","updated_at":"2026-04-28T16:31:56.568Z","avatar_url":"https://github.com/semanticdata.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Traffic Studies\n\nA comprehensive traffic analysis dashboard for Crystal, Minnesota, built with Streamlit. This project processes and visualizes traffic data collected from [PicoCount 2500](https://vehiclecounts.com/picocount-2500.html) traffic counters, providing detailed insights into traffic patterns, speed compliance, and vehicle classifications.\n\n## 🌟 Features\n\n- **Interactive Map**: PyDeck-powered location map with clickable traffic study locations and real-time metrics tooltips\n- **Multi-Page Navigation**: Streamlined interface with dedicated map and analysis pages\n- **Interactive Dashboard**: Real-time filtering by location, date range, and time periods\n- **Core Metrics**: Essential key performance indicators including speed compliance, peak hour analysis, and traffic volume\n- **Chart Explanations**: Interactive \"See explanation\" expanders under each visualization with detailed reading guides\n- **Vehicle Classification**: Detailed analysis of 6 vehicle classes from motorcycles to heavy trucks\n- **Speed Analysis**: Compliance monitoring, violation severity tracking, and 85th percentile calculations\n- **Temporal Patterns**: Hourly, daily, and weekly traffic pattern visualization\n- **Enhanced Data Processing**: Advanced validation, vectorized operations, and zero-traffic filtering\n- **Performance Optimization**: Memory-efficient processing with intelligent caching and loading spinners\n- **Data Quality Monitoring**: Comprehensive validation with detailed error reporting and statistics\n\n## 🏗️ Project Structure\n\n```plaintext\ntraffic-studies/\n├── main.py                           # Main Streamlit application with navigation\n├── pages/                            # Multi-page application structure\n│   ├── map_page.py                   # Interactive location map with PyDeck\n│   └── location_analysis.py          # Detailed traffic analysis dashboard\n├── utils/                            # Core processing utilities\n│   ├── data_loader.py                # Enhanced data loading with caching and validation\n│   ├── metrics.py                    # Traffic metrics and KPI calculations with caching\n│   ├── visualizations.py             # Chart generation (matplotlib \u0026 plotly)\n│   ├── parsers/                      # Specialized parsing modules\n│   │   └── traffic_parser.py         # CSV structure detection and parsing\n│   ├── transformers/                 # Data transformation modules\n│   │   └── traffic_transformer.py   # Data cleaning and enrichment\n│   └── validators/                   # Data validation modules\n│       └── data_validator.py        # Traffic data quality validation\n├── tests/                            # Comprehensive test suite\n│   ├── conftest.py                   # Test fixtures and sample data\n│   ├── test_calculation_accuracy.py  # Real-world calculation validation\n│   ├── test_data_loader.py           # Data loading and caching tests\n│   ├── test_metrics.py               # Metrics calculation tests\n│   ├── test_posted_speed.py          # Posted speed extraction tests\n│   └── test_visualizations.py        # Chart generation tests\n├── .streamlit/\n│   └── config.toml                   # Streamlit configuration settings\n├── data/                             # Directory for CSV data files\n│   ├── Locations.csv                 # Location coordinates for map display\n│   └── reports/                      # Directory for PDF reports\n├── styles.css                        # Custom dashboard styling\n├── pyproject.toml                    # Project dependencies and metadata\n└── README.md                         # Project Information\n```\n\n## 🚀 Getting Started\n\n### Prerequisites\n\n- Python 3.13 or higher\n- [uv](https://github.com/astral-sh/uv) - Fast Python package installer and resolver\n\n### Installation\n\n1. **Clone the repository**\n\n   ```bash\n   git clone https://github.com/semanticdata/traffic-studies.git\n   cd traffic-studies\n   ```\n\n2. **Install dependencies**\n\n   ```bash\n   uv sync\n   ```\n\n3. **Add your data files**\n\n   Place your CSV files from TrafficViewer Pro in the `data/` directory\n\n4. **Run the dashboard**\n\n   ```bash\n   uv run streamlit run main.py\n   ```\n\nThe dashboard will open in your web browser at `http://localhost:8501`. You'll start on the **Location Map** page where you can:\n\n- View all traffic study locations on an interactive map\n- Click locations to see instant traffic metrics in tooltips\n- Select locations and navigate to detailed analysis\n\n## 📊 Core Metrics Dashboard\n\n### Essential Key Performance Indicators\n\n- **Total Vehicle Count**: Aggregate count of all vehicles detected\n- **Average Speed**: Combined directional speed analysis\n- **Speed Compliance Rate**: Percentage of vehicles adhering to speed limits\n- **85th Percentile Speed**: Critical speed measurement for traffic engineering\n- **Peak Hour Statistics**: Busiest hour identification and vehicle counts\n- **Dominant Direction Analysis**: Traffic flow direction preferences with percentages\n\n### Traffic Analysis Visualizations\n\nThe dashboard features well-organized visualization sections with interactive explanations to help users understand and interpret the data effectively.\n\n#### 📊 Traffic Volume Analysis\n\n- **Hourly Traffic Volume**: Stacked bar chart showing average vehicles per hour by direction, ideal for identifying peak commute periods\n- **Daily Traffic Patterns**: Bar chart displaying traffic volume by day of week, useful for understanding weekly cycles and planning maintenance schedules\n\n#### 🚗 Speed Analysis\n\n- **Speed Violation Severity**: Categorizes speeding violations by severity levels (0-5, 5-10, 10-15, 15+ mph over limit) to prioritize enforcement efforts\n- **Speed Distribution by Direction**: Dual charts showing vehicle speed distributions for each direction, helping identify speeding patterns\n- **Speed Compliance Analysis**: Compares compliant vs. non-compliant vehicles by direction using green/red color coding\n- **Speeding Patterns by Hour**: Dual-axis charts combining total vehicle count with speeding percentage to optimize enforcement timing\n\n#### 🚛 Vehicle Classification\n\n- **Vehicle Distribution**: Bar chart showing the distribution of 6 FHWA vehicle classes by direction, supporting infrastructure planning and traffic composition analysis\n\n#### 📖 Interactive Chart Explanations\n\nEach visualization includes an expandable \"See explanation\" section that provides:\n\n- **How to read this chart**: Step-by-step guidance for interpreting the visualization\n- **Key patterns to look for**: Important indicators and what they mean\n- **Practical applications**: How to use the data for traffic management decisions\n- **Color coding explanations**: What different colors and elements represent\n\n### Vehicle Classifications\n\nThe dashboard analyzes six FHWA vehicle classes:\n\n- 🏍️ **Class 1**: Motorcycles\n- 🚗 **Class 2**: Passenger Cars\n- 🚐 **Class 3**: Pickups, Vans\n- 🚌 **Class 4**: Buses\n- 🚛 **Class 5**: 2 Axles, 6 Tires\n- 🚛 **Class 6**: 3 Axles\n\n## 🚀 Enhanced Data Processing\n\n### Advanced Data Loading Features\n\n#### **Zero-Traffic Filtering**\n\n- Automatically removes time periods with no traffic activity (both directions = 0)\n- Improves analysis accuracy by focusing on active traffic periods\n- Provides detailed statistics on filtered vs. original data\n\n#### **Comprehensive Data Validation**\n\n- **Volume Validation**: Detects negative values and unrealistic traffic volumes (\u003e1000 vehicles/hour)\n- **Speed Validation**: Validates speed range data for consistency and realistic values  \n- **Temporal Validation**: Checks for missing time periods and irregular intervals\n- **Classification Validation**: Ensures vehicle class data integrity\n- **Cross-Validation**: Verifies total volumes match directional sums\n\n#### **Performance Optimization**\n\n- **Vectorized Operations**: NumPy-based speed compliance calculations for 30-50% performance improvement\n- **Memory Efficiency**: Chunked processing for large datasets to prevent memory issues\n- **Memory Monitoring**: Built-in memory usage tracking and reporting\n\n#### **Enhanced Error Handling**\n\n- **Custom Exceptions**: Specific error types for different failure modes\n  - `TrafficDataError`: Base exception for traffic data processing\n  - `DataValidationError`: Data quality and validation failures\n  - `FileStructureError`: CSV format and structure issues\n- **Detailed Error Messages**: Contextual information for troubleshooting\n- **Graceful Degradation**: Handles partial data and missing columns\n\n#### **Metadata \u0026 Statistics**\n\n- **Filtering Statistics**: Tracks original vs. filtered row counts and percentages\n- **Data Quality Metrics**: Validation results with warnings and error details\n- **Memory Usage**: Real-time memory consumption monitoring\n- **Processing Metadata**: Date ranges, active hours, and data completeness\n\n### Usage Examples\n\n```python\n# Standard enhanced loading with validation\ndf, location, structure = load_data('traffic_data.csv')\n\n# Access filtering statistics\nstats = structure['filtering_stats']\nprint(f\"Removed {stats['removed_rows']} inactive periods ({stats['removal_percentage']:.1f}%)\")\n\n# Check data quality\nquality = structure['data_quality']\nif not quality['is_valid']:\n    print(f\"Data validation errors: {quality['errors']}\")\n\n# Memory-efficient loading for large files\ndf, location, structure = load_large_traffic_data('large_file.csv', chunk_size=10000)\n\n# Monitor memory usage\nmemory_info = get_memory_usage(df)\nprint(f\"Dataset using {memory_info['total_memory']} of memory\")\n```\n\n## 📁 Data Format\n\nThe application expects CSV files exported from TrafficViewer Pro with the following structure:\n\n### Supported File Types\n\n- **ALL.csv**: Complete traffic data with volume, speed, and classification\n- **VOL.csv**: Volume-only data files\n- **Total-SPD.csv**: Speed analysis files with pre-calculated metrics (Mean Speed, 85th Percentile)\n- **Directional SPD files**: Northbound/Southbound or Eastbound/Westbound speed data\n\n### File Structure\n\n- **Metadata rows**: Location, comments, and title information\n- **Date/Time column**: Timestamp for each data point (validated for consistency)\n- **Volume columns**: Directional traffic counts (automatically filtered for zero-traffic periods)\n- **Speed range columns**: Speed distribution data (e.g., \"35-39 MPH - Northbound\")\n- **Classification columns**: Vehicle class counts by direction (validated for data integrity)\n\n### Data Processing Notes\n\n- **Header correction**: Automatically fixes malformed TrafficViewer Pro headers (`\"Total\"\"Mean Speed\"` → `\"Total\",\"Mean Speed\"`)\n- **Reference file detection**: Automatically locates related SPD files for enhanced metrics\n- **Pre-calculated metrics**: Uses validated speed calculations from Total-SPD.csv when available\n- **Files are automatically validated** for structure compatibility and data quality\n- **Zero-traffic time periods** are filtered out to improve analysis accuracy\n- **Memory usage is optimized** for large datasets through chunked processing\n- **Comprehensive error handling** provides detailed feedback for data issues\n\n## 🎯 Use Cases\n\n- **Traffic Engineering**: Speed limit assessment and road safety analysis with detailed compliance metrics\n- **Urban Planning**: Peak hour identification and capacity planning using temporal pattern analysis\n- **Policy Making**: Data-driven traffic management decisions with comprehensive KPI dashboard\n- **Research**: Academic traffic pattern studies with interactive explanations for methodology understanding\n- **Compliance Monitoring**: Speed enforcement effectiveness evaluation with violation severity tracking\n- **Report Generation**: Print-friendly dashboard layout perfect for creating professional traffic reports\n- **Public Presentations**: Clear visualizations with explanations suitable for community meetings and stakeholder presentations\n\n## 🧪 Development and Testing\n\n### Testing\n\n```bash\n# Run all tests\nuv run pytest\n\n# Run tests with coverage report\nuv run pytest --cov=utils --cov-report=term-missing\n\n# Run specific test file\nuv run pytest tests/test_metrics.py\n\n# Run tests with verbose output\nuv run pytest -v\n\n# Install development dependencies (includes pytest)\nuv sync --dev\n```\n\n### Code Quality\n\n```bash\n# Run linting and formatting\nuv run ruff check .\nuv run ruff format .\n```\n\n### Test Coverage\n\nThe test suite includes comprehensive tests for:\n\n- **Metrics calculations**: All 6 core KPIs and helper functions with real data validation\n- **Calculation accuracy**: Tests using actual traffic data files with known expected results\n- **Speed metric validation**: Tests for 85th percentile and mean speed calculations using pre-calculated values\n- **CSV parsing fixes**: Validation of malformed header correction and reference file detection\n- **ADT calculation**: Tests for partial day exclusion and complete day averaging\n- **Enhanced data loading**: CSV parsing, structure detection, validation framework, and error handling\n- **Data validation**: Traffic data quality checks, negative value detection, and temporal validation\n- **Memory efficiency**: Memory usage monitoring and chunked processing\n- **Performance optimization**: Vectorized operations and speed compliance calculations\n- **Visualizations**: Chart generation and matplotlib figure validation\n- **Real-world validation**: Tests against 11 actual traffic data files from Crystal, Minnesota\n- **Edge case handling**: Zero traffic periods, single data points, and boundary conditions\n- **Cross-file consistency**: Ensures calculations are consistent across different data sources\n\n## 🔧 Technical Details\n\n### Dependencies\n\n- **Streamlit**: Multi-page web application framework with navigation\n- **PyDeck**: Interactive map visualization for location selection\n- **Pandas**: Data manipulation and analysis with enhanced validation\n- **Matplotlib**: Static plotting and visualization\n- **Plotly**: Interactive plotting and visualization\n- **Seaborn**: Statistical data visualization\n- **NumPy**: Numerical computing and vectorized operations for performance optimization\n\n### Key Enhancements\n\n- **Zero-traffic filtering**: Automatically removes inactive time periods for cleaner analysis\n- **Accurate metric calculations**: Fixed speed compliance, 85th percentile, and average speed calculations for precision\n- **Pre-calculated speed metrics**: Uses TrafficViewer Pro's validated speed calculations from Total-SPD.csv files\n- **CSV header parsing fixes**: Handles malformed TrafficViewer Pro exports with corrected header parsing\n- **ADT calculation improvements**: Excludes partial days (\u003c20 hours) for more accurate Average Daily Traffic\n- **Data validation**: Comprehensive quality checks with detailed error reporting\n- **Memory optimization**: Efficient processing for large datasets\n- **Enhanced error handling**: Custom exceptions with contextual error messages\n- **Interactive explanations**: Expandable \"See explanation\" sections for each visualization\n- **Print-friendly design**: Professional layout optimized for report generation and presentations\n- **Clean location formatting**: Automatic removal of quotes, commas, and extra whitespace from location names\n\n## 📝 Data Sources\n\nTraffic data is collected using [PicoCount 2500](https://vehiclecounts.com/picocount-2500.html) traffic counters and processed through [TrafficViewer Pro](https://vehiclecounts.com/trafficviewerpro.html) software. The dashboard provides a user-friendly interface for analyzing this data, making it accessible for traffic planning and decision-making purposes.\n\n## 📜 License\n\nThis project is licensed under the [MIT License](LICENSE).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsemanticdata%2Ftraffic-studies","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsemanticdata%2Ftraffic-studies","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsemanticdata%2Ftraffic-studies/lists"}