{"id":21678232,"url":"https://github.com/sharedstreets/sharedstreets-matcher","last_synced_at":"2025-04-12T05:15:39.591Z","repository":{"id":91603875,"uuid":"122989230","full_name":"sharedstreets/sharedstreets-matcher","owner":"sharedstreets","description":"SharedStreets map matching system for very large location data sets","archived":false,"fork":false,"pushed_at":"2020-04-04T23:48:45.000Z","size":4072,"stargazers_count":37,"open_issues_count":13,"forks_count":13,"subscribers_count":7,"default_branch":"master","last_synced_at":"2025-04-12T05:15:28.527Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sharedstreets.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-02-26T15:28:14.000Z","updated_at":"2025-02-11T21:33:53.000Z","dependencies_parsed_at":null,"dependency_job_id":"297e9f8e-3450-422e-a837-7d54b5c5d78c","html_url":"https://github.com/sharedstreets/sharedstreets-matcher","commit_stats":null,"previous_names":[],"tags_count":4,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sharedstreets%2Fsharedstreets-matcher","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sharedstreets%2Fsharedstreets-matcher/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sharedstreets%2Fsharedstreets-matcher/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sharedstreets%2Fsharedstreets-matcher/manifests","o
wner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sharedstreets","download_url":"https://codeload.github.com/sharedstreets/sharedstreets-matcher/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248519557,"owners_count":21117761,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-25T14:27:22.294Z","updated_at":"2025-04-12T05:15:39.565Z","avatar_url":"https://github.com/sharedstreets.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"# SharedStreets Matcher\n\nA tool for converting GPS traces into traffic speed and location observations linked to SharedStreets. \n\n![Traffic Map](docs/images/traffic_example.png)\n\n## Concept\n\nSharedStreets uses map-matched GPS points to generate data on roadway traffic speeds, and to aggregate events along streets (e.g. pick-ups/drop-offs, hard braking, etc.). Data points are snapped to the roadway network, recording the location along a given street and the direction of travel.  \n\nData about events are snapped to specific locations along the roadway and aggregated using SharedStreets' binned linear referencing system (see Data model below). The following image shows pick-up and drop-off events grouped into 10-meter directional bins along street segments.\n\n![Traffic Map](docs/images/pickup_example.png)\n\n\n## Quick start\n\nDownload the latest pre-built binaries from the [GitHub Releases](https://github.com/sharedstreets/sharedstreets-matcher/releases) tab above.\n\nPrepare input data from GPX traces. 
This converts the sample GPX traces to a SharedStreets input file (`osm_gpx/event_data`):\n\n```java -jar [path/to]/ingest-1.1.jar  --input sample_data/gpx --output osm_gpx/ --type gpx -speeds```\n\nCopy the map matcher configuration to the local directory:\n\n```\ncd osm_gpx\nwget https://raw.githubusercontent.com/sharedstreets/sharedstreets-matcher/master/tracker.properties\n```\n\nRun the map matching system:\n\n```\njava -jar [path/to]/sharedstreets-matcher-1.1.jar --input ./event_data --output  ./output_tiles  --debug  ./debug\n```\n\nSharedStreets map tiles will be downloaded and cached as the matcher runs. By default tiles will be cached in `/tmp/shst_tiles/`, but this path can be overridden using the `--tmpTilePath /path/to/tmp/tiles/` option. By default tiles will be sourced from the `osm/planet-2018430` build of SharedStreets data, but the source can be overridden using the `--tileSource [shst-source-id]` option.  \n\nOpen debug trace files using [geojson.io](http://geojson.io/). Colored street edges show travel speed, and blue and red tick marks along edges show the relationship between each GPS point and its matched point along the street edge. Red tick marks are \"failed matches\" due to GPS or map errors (image below shows red ticks caused by missing OSM edges).\n\n![Traffic Map](docs/images/debug_trace.png)\n\n\n### Preparing GPS data\n\nThe SharedStreets Matcher imports GPS records containing:\n\n*  vehicle id (string ID unique to data set)\n*  timestamp ([ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) date string or UNIX timestamp in milliseconds GMT)\n*  latitude (WGS84 decimal degrees)\n*  longitude (WGS84 decimal degrees)\n*  event type (optional label for snapped events e.g. `PICKUP`)\n*  event value (optional numeric value corresponding with event)\n\nBefore matching, GPS data needs to be imported and converted to the [SharedStreets GPS event format](https://github.com/sharedstreets/sharedstreets-matcher/blob/master/ingest/proto/ingest.proto). 
The ingest tool reads trace data from CSV, JSON, and GPX file formats and exports a normalized input data set for use by the matcher (`[output_dir]/event_data`). The ingest tool also identifies the map tiles needed to match against the input traces and generates a list of SharedStreets tile URLs (`[output_dir]/tile_set.txt`).\n\n#### CSV data\n\nThe ingest tool imports CSV data using the format:\n\n`[vehicle_id],[time],[lat],[lon],[optional_event_name],[optional_event_value]`.\n\nA single CSV file can contain multiple interleaved traces, as long as each vehicle is uniquely identified within the file. The order of the records in the CSV file does not impact map matching, as records are grouped by vehicle ID and sorted by time.  \n\nThere are two example CSV files in the directory `sample_data/csv`.  The file `csv_trace1` contains a complete trace with GPS speed data in the optional event data columns. The file `csv_trace2` is the same trace but contains a \"PICKUP\" and \"DROPOFF\" event. These location events are included as points in the debug trace output (see image below).\n\nCommand to load CSV data:\n\n```\njava -jar [path/to]/ingest-1.1.jar  --input sample_data/csv/csv_trace1.csv --output csv_gpx/ --type csv\n```\n\n\n![Traffic Map](docs/images/pickup_event_trace.png)\n\n#### GPX data\n\nThe ingest application can import a directory of GPX files, with each file containing a single vehicle trace. Vehicle events can be flagged in the trace by using the waypoint tag: `\u003cwpt name=\"[event_type]\"\u003e...\u003c/wpt\u003e`\n\nCommand to load GPX data: \n\n```\njava -jar [path/to]/ingest-1.1.jar  --input sample_data/gpx --output osm_gpx/ --type gpx -speeds\n```\n\n\n## Data model \nTo both protect privacy and simplify analysis applications, SharedStreets does not store individual events in output data sets. 
Instead, SharedStreets uses a variety of spatial and temporal aggregation techniques that build on the SharedStreets referencing system.\n\n### Speeds\n\nThe SharedStreets matcher uses histograms to store speed observations. This allows aggregation and calculation of statistics from the distribution of speeds without storing individual observations. Data generated by the matcher is stored in protocol buffer tiles using the [SharedStreets Speed data format](https://github.com/sharedstreets/sharedstreets-ref-system/blob/master/proto/speeds.proto).\n\n```\n   count\n    4 |       *\n    3 |     * *   * *\n    2 |   * * * * * * *\n    1 | * * * * * * * * * *\n       ________________________\n        0 1 2 3 4 5 6 7 8 9\n             Speed (km/h)\n```\n\nMean and variance can be calculated from the distribution. Histograms with absolute counts can be added together to merge observations. Histograms can also be stored using scaled counts (0-100 default) to hide absolute observation counts.\n\n### Linear References\n\nSharedStreets uses a linear referencing data model to describe point and segment features along street references.\n\n```\n   [ SharedStreets Ref ]\n   =============*=======\n                ^ point at distance: 75 along reference\n\n   [ SharedStreets Ref ]\n   ========*******======\n           ^     ^ linear segment from distance 50 to 75 along SharedStreets reference\n```\n\n### \"Binned\" Linear References\n\nEvent data along street edges are aggregated using a \"binned\" linear reference. Each street edge is divided into equal-length bins containing event counts and sums of event values. 
The default bin length is 10 meters, and can be altered using the `--binSize [meters]` flag in the matcher application.\n\n```\n\tReference Length = 100m\n\tNumber of bins = 5\n   ====|====|====|====|====\n     0    1    2    3    4   = bin position (20m/bin)\n     4    8    0    2    0   = bin value (count of grouped linear features)\n```\n\n\n### Dealing with time\n\nSharedStreets uses a weekly cycle to track periodic trends (e.g. traffic speeds, pick-up/drop-off events). Data is currently aggregated into \"hour of week\" periods, with the hour beginning Monday at midnight as \"hour zero\" and the hour ending Sunday at midnight as \"hour 167.\" A complete set of statistics is kept for each time period. The periods are then aggregated as required by downstream analysis.\n\n## Build from source\n\nPrebuilt jar files for the ingest and matching tools are available in the [GitHub Releases](https://github.com/sharedstreets/sharedstreets-matcher/releases) tab above. Building from source requires Gradle 3.x and JDK 1.8+. \n\nBoth jar files can be built using the command:\n\n```\ngradle build allJars\n```\n\n\n## How does it work?\n\n\nThe matcher uses a probabilistic hidden Markov model (HMM) to track GPS points and find the most likely placement and route for each point. By finding likely routes between points, ambiguous and imprecise GPS points can be snapped to specific road segments, indicating direction of travel and distance between points over the road network. \n\n![Traffic Map](docs/images/gps_events.png)\n\nThe SharedStreets Matcher uses SharedStreets references and geometry data to build its internal map. All statistics generated by the matcher use the SharedStreets reference IDs to describe roadway segments and direction of travel. 
\n\nThe internal map matching engine is derived from [BMW's Barefoot map matching library](https://github.com/bmwcarit/barefoot). The original library has been substantially modified to replace the original PostgreSQL loader and OpenStreetMap data model with a SharedStreets tile-based data model, and to improve performance of map matching internals.\n\n## Speed Validation\n\nSharedStreets Matcher can be validated using any GPS data source that measures roadway speeds or odometer distances. Although instantaneous measures of vehicle speed will differ from the average point-to-point speeds generated from map matching, speeds recorded by the GPS device are a useful starting point for validation of the map matching engine. Odometer data allow more precise comparison of matched traces against actual distance traveled.\n\nSpeed information contained in GPX trace data can be imported using the `-speeds` flag during the ingest step. Subsequent debug trace output from the matcher will show both the matched speed and the imported GPS speed. \n \nThe GitHub repository contains several GPX traces from OpenStreetMap with speed data. The chart below shows measured GPS speeds from these traces against SharedStreets matched speeds. The points in red failed to match, based on default matcher settings, due to incorrect OpenStreetMap data and were automatically excluded from SharedStreets output data. ([Complete data set](https://docs.google.com/spreadsheets/d/11VG9GoAROP1gm-EM3M5V2iQCLM3ZsPgKcNC_q32Q8Jk/edit#gid=0))\n\n![Traffic Map](docs/images/speed_plot.png)\n\n\n## Map Validation\n\nMap match failures are frequently the result of incorrect basemap data. Data about failed matches can be used to detect and fix errors in the underlying map.\n\nWhen using the `--dust` option, the SharedStreets matcher tracks the location of map match failures and aggregates failure rates by street edge. 
This data is stored in [\"dust\" tiles](https://github.com/sharedstreets/sharedstreets-matcher/blob/master/proto/dust.proto) for use in downstream map data quality analysis. Additional details on map validation and dust data TK.\n\n\n\n\n## Performance\n\nThe processing rate varies significantly based on input data (e.g. time interval between GPS samples, trace length, and map area), and matcher settings. However, processing rates of ~1500 GPS samples / second / CPU core are typical for SharedStreets input data in urban areas.\n\nSharedStreets Matcher is built using Apache Flink, and auto scales to use all available processors. Flink applications can be deployed on a multi-node cluster for distributed processing. \n\n\n\n\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsharedstreets%2Fsharedstreets-matcher","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsharedstreets%2Fsharedstreets-matcher","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsharedstreets%2Fsharedstreets-matcher/lists"}