{"id":16058005,"url":"https://github.com/psfried/flow-gtfs","last_synced_at":"2025-04-05T08:15:14.295Z","repository":{"id":79497289,"uuid":"527577579","full_name":"psFried/flow-gtfs","owner":"psFried","description":"Estuary Flow specs for General Transit Feed data on Columbus municipal busses","archived":false,"fork":false,"pushed_at":"2022-10-01T16:18:15.000Z","size":46,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-02-10T15:50:42.965Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/psFried.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-08-22T13:29:24.000Z","updated_at":"2022-08-22T13:44:18.000Z","dependencies_parsed_at":"2023-03-12T08:19:42.431Z","dependency_job_id":null,"html_url":"https://github.com/psFried/flow-gtfs","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/psFried%2Fflow-gtfs","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/psFried%2Fflow-gtfs/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/psFried%2Fflow-gtfs/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/psFried%2Fflow-gtfs/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/psFried","download_url":"https://codeload.github.com/psFried/flow-gtfs/tar.gz/refs/
heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247305947,"owners_count":20917208,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-09T03:05:51.677Z","updated_at":"2025-04-05T08:15:14.266Z","avatar_url":"https://github.com/psFried.png","language":"JavaScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# General Transit Feed data for Columbus, OH\n\nThis repo houses a Flow catalog that ingests both static and realtime [GTFS data](https://gtfs.org/) from the Central Ohio Transit Authority (COTA) on their municipal busses. This was built in order to demonstrate Flow's ability to work with realtime data.\n\nGTFS defines two separate data formats. A \"schedule\" feed contains relatively slow-changing information about a transit system, such as the routes, stops, and timetables. This data is provided as a collection of CSV files inside of a zip archive, which is fetched periodically.\n\nThere are also three separate realtime feeds, providing information on vehicle positions, trip updates (delays), and service alerts (stop moved). Each of these feeds is an HTTP endpoint which is polled to provide the latest state as a protobuf message.\n\n\n### Capturing the schedule feed\n\nThe schedule feed is [documented here](https://gtfs.org/schedule/).\n\nTo capture the static feed, I built a `source-gtfs` connector which fetches the zip file and outputs the data for each contained file as a separate source stream. 
Thus there's a binding for each file in the zip archive to a separate Flow collection of that type. The connector can handle any zip archive with parseable files in it, so it's not necessarily specific to GTFS data. The intention is to polish up the connector a bit and either publish it as a `source-http-zip-archive` connector, or else incorporate the functionality into the existing `source-http-file` connector.\n\n### Capturing the realtime feed\n\nThe realtime data is [documented here](https://gtfs.org/realtime/), and I've also included the [protobuf definition file](gtfs-realtime.proto) for reference.\n\nTo capture the realtime feed, I used the `source-http-file` connector with a short (30 second) interval. There's a capture each for trip-updates, alerts, and vehicle-positions. I added support for protobuf messages to `flow-parser` to enable it to parse the responses. Protobuf encoding is common enough that it made sense to support it in our parser, anyway.\n\n## What's happening with the captured data?\n\nFor now, it's simply made available as public Flow collections. Feel free to use it for whatever you want, as long as it complies with [COTA's terms of use](https://www.cota.com/data/).\n\nI'd originally intended on putting together derivations to join the static and realtime data and create\na data product tracking the routes and stops with the most frequent delays. I may still finish that\nas I find the time.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpsfried%2Fflow-gtfs","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpsfried%2Fflow-gtfs","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpsfried%2Fflow-gtfs/lists"}