{"id":31221126,"url":"https://github.com/cutr-at-usf/gtfsrdb","last_synced_at":"2025-09-21T19:51:04.749Z","repository":{"id":7185461,"uuid":"8488374","full_name":"CUTR-at-USF/gtfsrdb","owner":"CUTR-at-USF","description":"GTFSrDB is a tool to archive gtfs-realtime data to a database.","archived":false,"fork":false,"pushed_at":"2022-07-11T14:53:19.000Z","size":145,"stargazers_count":38,"open_issues_count":2,"forks_count":12,"subscribers_count":7,"default_branch":"master","last_synced_at":"2024-05-16T19:04:43.754Z","etag":null,"topics":["database","gtfs","gtfs-realtime","gtfs-realtime-data","python"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/CUTR-at-USF.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2013-02-28T21:20:23.000Z","updated_at":"2024-05-13T20:32:56.000Z","dependencies_parsed_at":"2022-09-25T04:40:35.026Z","dependency_job_id":null,"html_url":"https://github.com/CUTR-at-USF/gtfsrdb","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/CUTR-at-USF/gtfsrdb","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CUTR-at-USF%2Fgtfsrdb","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CUTR-at-USF%2Fgtfsrdb/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CUTR-at-USF%2Fgtfsrdb/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CUTR-at-USF%2Fgtfsrdb/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/CUTR-at-USF","download_url":"https://codeload.github.com/CUTR-
at-USF/gtfsrdb/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CUTR-at-USF%2Fgtfsrdb/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":276297406,"owners_count":25618236,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-21T02:00:07.055Z","response_time":72,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["database","gtfs","gtfs-realtime","gtfs-realtime-data","python"],"created_at":"2025-09-21T19:51:03.414Z","updated_at":"2025-09-21T19:51:04.743Z","avatar_url":"https://github.com/CUTR-at-USF.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"GTFSrDB - GTFS-realtime to Database\n===================================\n\nGTFSrDB loads GTFS-realtime data to a database.  \n\nGTFSrDB supports all 3 types of GTFS-realtime feeds:\n\n1. [TripUpdates](https://developers.google.com/transit/gtfs-realtime/guides/trip-updates) - specify url with `-t` option\n2. [Service Alerts](https://developers.google.com/transit/gtfs-realtime/guides/service-alerts) - specify url with `-a` option\n3. 
[VehiclePositions](https://developers.google.com/transit/gtfs-realtime/guides/vehicle-positions) - specify url with `-p` option\n\nYou can process multiple types of GTFS-realtime feeds in the same execution by using multiple command line options.\n\nGTFSrDB will run and keep a database up-to-date with the latest GTFSr data. It can also be used to\narchive this data for historical or statistical purposes. GTFSrDB is designed to work in tandem \nwith [gtfsdb](https://github.com/OpenTransitTools/gtfsdb).  GTFSrDB uses SQLAlchemy, so it should work with \nalmost any database system; so far it has been used with SQLite, Postgres, and Microsoft SQL Server. \nJust specify a database URL on the command line with `-d`.\n\n### Example Use\n\n1. **Bay Area Rapid Transit with GTFS-realtime TripUpdates:**\n\n   a. Using SQLite:\n\n       gtfsrdb.py -t http://api.bart.gov/gtfsrt/tripupdate.aspx -d sqlite:///test.db -c\n\n   b. Using Microsoft SQL Server (note you'll need [pyodbc](https://github.com/mkleehammer/pyodbc)):\n\n       gtfsrdb.py -t http://api.bart.gov/gtfsrt/tripupdate.aspx -d mssql+pyodbc://\u003cusername\u003e:\u003cpassword\u003e@\u003cpublic_database_server_name\u003e/\u003cdatabase_name\u003e -c\n\n      For example, with `username=jdoe`, `password=pswd`, `public_database_server_name=my.public.database.org`, and `database_name=gtfsrdb`, the command is:\n\n       gtfsrdb.py -t http://api.bart.gov/gtfsrt/tripupdate.aspx -d mssql+pyodbc://jdoe:pswd@my.public.database.org/gtfsrdb -c\n\n2. **Massachusetts Bay Transportation Authority with GTFS-realtime VehiclePositions:**\n\n   a. Using SQLite:\n  \n       gtfsrdb.py -p http://developer.mbta.com/lib/gtrtfs/Vehicles.pb -d sqlite:///test.db -c\n\n3. **GTFS-realtime VehiclePositions stored as offline protocol buffers**\n\n   a. 
Using MySQL and a shell script:\n\n       #!/bin/sh\n       for file in /path/to/files/*; \n       do \n         python /path/to/gtfsrdb.py --once -p \"file://$file\" -d \"mysql://\u003cusername\u003e:\u003cpassword\u003e@\u003cpublic_database_server_name\u003e/\u003cdatabase_name\u003e\" -c\n       done\n\nThe model for the data is in `model.py`; you should be able to use this \nstandalone with SQLAlchemy to process the data in Python.\n\nThe `-o` command line option instructs GTFSrDB to keep the database up-to-date by\ndeleting outdated trip updates, vehicle positions, and alerts. Omitting this option will cause\neach update to be saved forever (useful for historical purposes). Note\nthat using this option will *ERASE ALL TRIP UPDATES, ALL ALERTS, and ALL VEHICLE POSITIONS* from\nthe database on each iteration - even those that were in the database\nbefore the session was started.\n\nThis is GTFSrDB's biggest strength - if you pass the `-o` option, your\ndatabase will be perpetually up-to-date with the GTFS-realtime feed,\nso you can write scripts and other tools that refer to it without worrying about\nthe plumbing to get the data in place.\n\nOther command line parameters:\n\n* `-1` = Only issue a request once\n* `-w` = Time to wait between requests, in seconds (default=30)\n* `-k` = Kill process after this many minutes\n* `-v` = Print generated SQL (verbose mode)\n* `-l` = When multiple translations are available, prefer this language\n\nIt is recommended that you run `VACUUM ANALYZE` (PostgreSQL) frequently, as GTFSrDB\nperforms many inserts and deletes.\n\nKNOWN LIMITATIONS\n=================\nCurrently, the program does not check for duplicates when inserting into\nthe database. 
Due to the database's relational nature, the following \nfields that are separate messages in GTFSr are collapsed into columns \nin the parent in the SQL database (to avoid creating many joined tables):\n\n* TripUpdate.trip becomes trip_id, route_id, trip_start_time, trip_start_date\n* TripUpdate.vehicle becomes vehicle_id, vehicle_label and\n  vehicle_license_plate\n* StopTimeUpdate.arrival becomes arrival_time, arrival_delay \u0026 \n  arrival_uncertainty\n* StopTimeUpdate.departure becomes departure_time \u0026c.\n* Alert.active_period is condensed to Alert.start and Alert.end; if\n  there are multiple active periods, only the first one is stored.\n* All TranslatedStrings are converted to plain strings, using a) the\n  language specified with the -l option, b) any untranslated string if\n  a string for the language is not found, or c) the only string in the \n  case of a single string in the file.\n* Position.latitude becomes position_latitude\n* Position.longitude becomes position_longitude\n* Position.bearing becomes position_bearing\n* Position.speed becomes position_speed\n* VehicleDescriptor.id becomes vehicle_id\n* VehicleDescriptor.label becomes vehicle_label\n* VehicleDescriptor.license_plate becomes vehicle_license_plate \n\nUSING IT WITH GTFSDB\n====================\n\nIt's not hard to use GTFSrDB in conjunction with GTFSDB. Simply point\nGTFSrDB at a GTFSDB database, and it will add its data into the\ndatabase alongside the static GTFS data. Use the -c option to create\ntables. You can then use SQL's relational features to mash up the data\nany way you want. (Keep in mind that GTFS uses strings for IDs, and\nSQL generally uses numbers. trip_updates and stop_time_updates, as\nwell as alerts and entity_selectors, are related on the oid column,\nwhich is a sequential integer primary key. All of the GTFS ID fields\nare left intact, for joining with the static data. 
You can't just cast\nthe strings to numbers; take a look at BART's stop IDs in the examples\nbelow).\n\nHere are some example queries (all designed to work with the -o option). Note \nthat the first two are for BART, which embeds stop_ids in GTFSr; other agencies \n(e.g., TriMet) specify stops as trip_updates.trip_id and \nstop_time_updates.stop_sequence; you'll need to use slightly more complex\nqueries for those.\n\nThis query shows all of the stop time updates that relate cleanly to the trips table. \nKeep in mind that trips.trip_id = trip_updates.trip_id only works for trips that are not\nfrequency-expanded (i.e., multiple trips with the same trip_id).\n\n    SELECT trips.route_id, trips.trip_id, trips.trip_headsign, trip_updates.schedule_relationship, stop_time_updates.stop_id, stop_time_updates.arrival_delay\n    FROM trip_updates, stop_time_updates, trips\n    WHERE trips.trip_id::text = trip_updates.trip_id::text AND trip_updates.oid = stop_time_updates.trip_update_id\n    ORDER BY stop_time_updates.stop_id;\n\n    route_id | trip_id |     trip_headsign     | schedule_relationship | stop_id | arrival_delay \n    ----------+---------+-----------------------+-----------------------+---------+---------------\n    04       | 66F1    | FREMONT               | SCHEDULED             | 19TH    |             0\n    12       | 67ED1   | EAST DUBLIN           | SCHEDULED             | 24TH    |             0\n    11       | 66DCM2  | DALY CITY             | SCHEDULED             | BAYF    |             0\n    12       | 65ED1   | EAST DUBLIN           | SCHEDULED             | CAST    |             0\n    02       | 83PB1   | PITTSBURG / BAY POINT | SCHEDULED             | COLM    |             0\n    03       | 67R1    | RICHMOND              | SCHEDULED             | COLS    |             0\n    02       | 80PB1   | PITTSBURG / BAY POINT | SCHEDULED             | CONC    |             0\n    12       | 69ED1   | EAST DUBLIN           | SCHEDULED             | DALY    | 
            0\n    12       | 68ED1   | EAST DUBLIN           | SCHEDULED             | DALY    |             0\n    04       | 67F1    | FREMONT               | SCHEDULED             | DELN    |             0\n    11       | 69DCM2  | DALY CITY             | SCHEDULED             | DUBL    |             0\n    11       | 67DCM2  | DALY CITY             | SCHEDULED             | DUBL    |             0\n    11       | 68DCM2  | DALY CITY             | SCHEDULED             | DUBL    |             0\n    03       | 69R1    | RICHMOND              | SCHEDULED             | FRMT    |             0\n    03       | 70R1    | RICHMOND              | SCHEDULED             | FRMT    |             0\n    11       | 64DCM2  | DALY CITY             | SCHEDULED             | GLEN    |             0\n    12       | 66ED1   | EAST DUBLIN           | SCHEDULED             | LAKE    |             0\n    03       | 66R1    | RICHMOND              | SCHEDULED             | MCAR    |             0\n    02       | 81PB1   | PITTSBURG / BAY POINT | SCHEDULED             | MCAR    |             0\n    02       | 85PB1   | PITTSBURG / BAY POINT | SCHEDULED             | MLBR    |             0\n    02       | 82PB1   | PITTSBURG / BAY POINT | SCHEDULED             | POWL    |             0\n    04       | 69F1    | FREMONT               | SCHEDULED             | RICH    |             0\n    04       | 68F1    | FREMONT               | SCHEDULED             | RICH    |             0\n    04       | 65F1    | FREMONT               | SCHEDULED             | SANL    |             0\n    02       | 84PB1   | PITTSBURG / BAY POINT | SCHEDULED             | SFIA    |             0\n    03       | 68R1    | RICHMOND              | SCHEDULED             | UCTY    |             0\n    11       | 65DCM2  | DALY CITY             | SCHEDULED             | WOAK    |             0\n    (27 rows)\n\nThis query gives you an overview of the entire BART system, with average delays for each stop \nwhere 
trains are predicted.  I may spatially enable this database and make a heatmap of where\ndelays are in a given transit system by interpolating between points.\n    \n    SELECT stops.stop_id, stops.stop_name, stops.stop_lat, stops.stop_lon, avg(stop_time_updates.arrival_delay) AS avg\n    FROM stop_time_updates, stops\n    WHERE stops.stop_id::text = stop_time_updates.stop_id::text\n    GROUP BY stops.stop_id, stops.stop_name, stops.stop_lat, stops.stop_lon\n    ORDER BY stops.stop_name;\n\n    stop_id |           stop_name           |   stop_lat   |    stop_lon    |          avg           \n    ---------+-------------------------------+--------------+----------------+------------------------\n    BALB    | Balboa Park BART              | 37.721980868 | -122.447414196 | 0.00000000000000000000\n    CIVC    | Civic Center/UN Plaza BART    | 37.779605587 | -122.413851084 | 0.00000000000000000000\n    COLS    | Coliseum/Oakland Airport BART | 37.754281380 | -122.197788821 | 0.00000000000000000000\n    DALY    | Daly City BART                | 37.706120549 | -122.469080674 | 0.00000000000000000000\n    DUBL    | Dublin/Pleasanton BART        | 37.701673617 | -121.900352519 | 0.00000000000000000000\n    EMBR    | Embarcadero BART              | 37.793022441 | -122.396813153 | 0.00000000000000000000\n    FRMT    | Fremont BART                  | 37.557334282 | -121.976395442 | 0.00000000000000000000\n    FTVL    | Fruitvale BART                | 37.774623806 | -122.224327698 | 0.00000000000000000000\n    GLEN    | Glen Park BART                | 37.732941544 | -122.434114331 | 0.00000000000000000000\n    HAYW    | Hayward Station BART          | 37.670386894 | -122.088002125 | 0.00000000000000000000\n    LAKE    | Lake Merritt BART             | 37.797602372 | -122.265498391 | 0.00000000000000000000\n    MLBR    | Millbrae BART                 | 37.600006000 | -122.386534000 | 0.00000000000000000000\n    NBRK    | North Berkeley BART           | 37.874026140 | 
-122.283881911 | 0.00000000000000000000\n    NCON    | North Concord/Martinez BART   | 38.002576647 | -122.025106029 | 0.00000000000000000000\n    ORIN    | Orinda BART                   | 37.878360870 | -122.183791135 | 0.00000000000000000000\n    PITT    | Pittsburg/Bay Point BART      | 38.018934339 | -121.941904488 | 0.00000000000000000000\n    RICH    | Richmond BART                 | 37.937169908 | -122.353400100 | 0.00000000000000000000\n    SFIA    | San Francisco Int BART        | 37.615900000 | -122.392534000 | 0.00000000000000000000\n    SHAY    | South Hayward BART            | 37.634799539 | -122.057550587 | 0.00000000000000000000\n    UCTY    | Union City BART               | 37.591202687 | -122.017857962 | 0.00000000000000000000\n    WDUB    | West Dublin/Pleasanton BART   | 37.699800000 | -121.928100000 | 0.00000000000000000000\n    WOAK    | West Oakland BART             | 37.804674760 | -122.294582214 | 0.00000000000000000000\n\nA demo for TriMet\n=================\n\nGTFSrDB allows you to connect GTFS-realtime with an SQL database, allowing app developers to use realtime data through SQL, just as easily as they use static data. Rather than worry about plumbing to connect GTFS and GTFS-realtime, they can focus on writing apps.\n\nIt accomplishes two primary tasks:\n\n* Keeping a database up-to-date with the latest realtime data, and\n* Archiving historic real-time data.\n\nIt’s designed to work with GTFSdb; it will coexist with static GTFS data in a database, so you can easily relate them. 
Keep in mind that if you update the GTFS data, you’ll lose archived GTFSr data.\nHere is an example query to find which stops have the largest delays (in seconds, for the TriMet system in Portland, OR):\n\n    SELECT stops.stop_id, stops.stop_name, stops.stop_lat, stops.stop_lon, stop_delays.avg\n    FROM stops, stop_delays\n    WHERE stops.stop_id = stop_delays.stop_id\n    ORDER BY avg DESC;\n\nThe stop_delays view looks like this:\n\n\n    SELECT stop_times.stop_id, avg(stop_time_updates.arrival_delay) AS avg\n    FROM stop_time_updates, stop_times, trip_updates\n    WHERE stop_times.trip_id::text = trip_updates.trip_id::text AND stop_times.stop_sequence = stop_time_updates.stop_sequence AND stop_time_updates.trip_update_id = trip_updates.oid\n    GROUP BY stop_times.stop_id\n    ORDER BY avg(stop_time_updates.arrival_delay) DESC;\n\n(I had to pull in the trip_updates table for TriMet because they don’t have a stop_id in their stop_time_updates; they instead specify trip_id and stop_sequence.)\n\n(I’ve removed the lat and lon columns from the following table for readability.)\n\n    stop_id |            stop_name            |         avg\n    ---------+---------------------------------+----------------------\n    10853   | Parkrose/ Sumner Transit Center | 473.8260869565217391\n    7999    | NE 82nd \u0026 MAX Overpass          | 350.3050847457627119\n    9610    | Willow Creek Transit Center     | 310.2352941176470588\n    5846    | Tigard Transit Center           | 260.2093023255813953\n    12849   | 16200 Block SW Langer           | 244.6111111111111111\n. . 
.\n\n### Demo Project\n\nSee the [gtfsrdb-delay-demo](https://github.com/CUTR-at-USF/gtfsrdb-delay-demo) project for a sample web application that visualizes these delays.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcutr-at-usf%2Fgtfsrdb","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcutr-at-usf%2Fgtfsrdb","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcutr-at-usf%2Fgtfsrdb/lists"}