{"id":13857113,"url":"https://github.com/remix/partridge","last_synced_at":"2025-07-13T20:30:42.926Z","repository":{"id":25824073,"uuid":"102028727","full_name":"remix/partridge","owner":"remix","description":"A fast, forgiving GTFS reader built on pandas DataFrames","archived":false,"fork":false,"pushed_at":"2023-12-03T23:03:46.000Z","size":3107,"stargazers_count":152,"open_issues_count":6,"forks_count":22,"subscribers_count":11,"default_branch":"master","last_synced_at":"2024-11-13T13:51:46.527Z","etag":null,"topics":["gtfs","pandas","python"],"latest_commit_sha":null,"homepage":"https://partridge.readthedocs.io","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/remix.png","metadata":{"files":{"readme":"README.rst","changelog":"HISTORY.rst","contributing":"CONTRIBUTING.rst","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2017-08-31T17:51:10.000Z","updated_at":"2024-11-02T22:15:17.000Z","dependencies_parsed_at":"2023-01-14T03:30:05.731Z","dependency_job_id":"d171c90e-42e3-4de6-841a-3c547bb1c5b4","html_url":"https://github.com/remix/partridge","commit_stats":{"total_commits":182,"total_committers":9,"mean_commits":20.22222222222222,"dds":0.1648351648351648,"last_synced_commit":"ac1f9a2b3f73912de111522c3b7a79df02ea98f4"},"previous_names":[],"tags_count":17,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/remix%2Fpartridge","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/remix%2Fpartridge/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/remix%2Fpartridge/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/remix%2Fp
artridge/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/remix","download_url":"https://codeload.github.com/remix/partridge/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":225912591,"owners_count":17544208,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["gtfs","pandas","python"],"created_at":"2024-08-05T03:01:26.230Z","updated_at":"2024-11-22T14:32:03.554Z","avatar_url":"https://github.com/remix.png","language":"Python","funding_links":[],"categories":["Producing Data","Python","Uncategorized"],"sub_categories":["GTFS","Uncategorized"],"readme":"=========\nPartridge\n=========\n\n\n.. image:: https://img.shields.io/pypi/v/partridge.svg\n        :target: https://pypi.python.org/pypi/partridge\n\n.. image:: https://img.shields.io/travis/remix/partridge.svg\n        :target: https://travis-ci.org/remix/partridge\n\n\nPartridge is a Python 3.6+ library for working with `GTFS \u003chttps://developers.google.com/transit/gtfs/\u003e`__ feeds using `pandas \u003chttps://pandas.pydata.org/\u003e`__ DataFrames.\n\nPartridge is heavily influenced by our experience at `Remix \u003chttps://www.remix.com/\u003e`__ analyzing and debugging every GTFS feed we could find.\n\nAt the core of Partridge is a dependency graph rooted at ``trips.txt``. Disconnected data is pruned away according to this graph when reading the contents of a feed.\n\nFeeds can also be filtered to create a view specific to your needs. 
It's most common to filter a feed down to specific dates (``service_id``) or routes (``route_id``), but any field can be filtered.\n\n.. figure:: dependency-graph.png\n   :alt: dependency graph\n\n\nPhilosophy\n----------\n\nThe design of Partridge is guided by the following principles:\n\n**As much as possible**\n\n- Favor speed\n- Allow for extension\n- Succeed lazily on expensive paths\n- Fail eagerly on inexpensive paths\n\n**As little as possible**\n\n- Do anything other than efficiently read GTFS files into DataFrames\n- Take an opinion on the GTFS spec\n\n\nInstallation\n------------\n\n.. code:: console\n\n    pip install partridge\n\n\n**GeoPandas support**\n\n.. code:: console\n\n    pip install partridge[full]\n\n\nUsage\n-----\n\n**Setup**\n\n.. code:: python\n\n    import partridge as ptg\n\n    inpath = 'path/to/caltrain-2017-07-24/'\n\n\nExamples\n--------\n\nThe following is a collection of gists containing Jupyter notebooks with transformations to GTFS feeds that may be useful for intake into software applications.\n\n* `Find the busiest week in a feed and reduce its file size \u003chttps://gist.github.com/csb19815/aadef16178dfcb5ba7a8d88fbf718749\u003e`_\n* `Combine routes by route_short_name \u003chttps://gist.github.com/csb19815/67c0247d1eed2286ca0b323a02a1179f\u003e`_\n* `Merge GTFS with shapefile geometries \u003chttps://gist.github.com/csb19815/535ddb5d36a081abac3430f1a58bd875\u003e`_\n* `Merge multiple agencies into one \u003chttps://gist.github.com/csb19815/682e0f6f30844313213fa5715e48df8c\u003e`_\n* `Rewrite a feed to clean up formatting issues \u003chttps://gist.github.com/csb19815/659c8eba4742cc3f1b8f23d66a760a0c\u003e`_\n* `If a feed has stop_code, replace the contents of stop_id with the contents of stop_code \u003chttps://gist.github.com/csb19815/5bf7923ffb1ce7ec155ac9a94a83ea70\u003e`_\n* `Diff the number of service hours in two feeds \u003chttps://gist.github.com/csb19815/476335cb299ddb3d5a1a4b898424bb35\u003e`_\n* `Investigate the 
distance in meters of each stop to the closest point on a shape \u003chttps://gist.github.com/sgoel/bff9384129974967817404abe80e7c6a\u003e`_\n* `Convert frequencies.txt to an equivalent trips.txt \u003chttps://gist.github.com/invisiblefunnel/6c9f3a9b537d3f0ad192c24777b6ae57\u003e`_\n* `Calculate headway for a stop \u003chttps://gist.github.com/invisiblefunnel/6015e65684325281e65fa9339a78229b\u003e`_\n\n\nInspecting the calendar\n~~~~~~~~~~~~~~~~~~~~~~~\n\n\n**The date with the most trips**\n\n.. code:: python\n\n    date, service_ids = ptg.read_busiest_date(inpath)\n    #  datetime.date(2017, 7, 17), frozenset({'CT-17JUL-Combo-Weekday-01'})\n\n\n**The week with the most trips**\n\n\n.. code:: python\n\n    service_ids_by_date = ptg.read_busiest_week(inpath)\n    #  {datetime.date(2017, 7, 17): frozenset({'CT-17JUL-Combo-Weekday-01'}),\n    #   datetime.date(2017, 7, 18): frozenset({'CT-17JUL-Combo-Weekday-01'}),\n    #   datetime.date(2017, 7, 19): frozenset({'CT-17JUL-Combo-Weekday-01'}),\n    #   datetime.date(2017, 7, 20): frozenset({'CT-17JUL-Combo-Weekday-01'}),\n    #   datetime.date(2017, 7, 21): frozenset({'CT-17JUL-Combo-Weekday-01'}),\n    #   datetime.date(2017, 7, 22): frozenset({'CT-17JUL-Caltrain-Saturday-03'}),\n    #   datetime.date(2017, 7, 23): frozenset({'CT-17JUL-Caltrain-Sunday-01'})}\n\n\n**Dates with active service**\n\n.. code:: python\n\n    service_ids_by_date = ptg.read_service_ids_by_date(inpath)\n\n    date, service_ids = min(service_ids_by_date.items())\n    #  datetime.date(2017, 7, 15), frozenset({'CT-17JUL-Caltrain-Saturday-03'})\n\n    date, service_ids = max(service_ids_by_date.items())\n    #  datetime.date(2019, 7, 20), frozenset({'CT-17JUL-Caltrain-Saturday-03'})\n\n\n**Dates with identical service**\n\n\n.. 
code:: python\n\n    dates_by_service_ids = ptg.read_dates_by_service_ids(inpath)\n\n    busiest_date, busiest_service = ptg.read_busiest_date(inpath)\n    dates = dates_by_service_ids[busiest_service]\n\n    min(dates), max(dates)\n    #  datetime.date(2017, 7, 17), datetime.date(2019, 7, 19)\n\n\nReading a feed\n~~~~~~~~~~~~~~\n\n\n.. code:: python\n\n    _date, service_ids = ptg.read_busiest_date(inpath)\n\n    view = {\n        'trips.txt': {'service_id': service_ids},\n        'stops.txt': {'stop_name': 'Gilroy Caltrain'},\n    }\n\n    feed = ptg.load_feed(inpath, view)\n\n\n**Read shapes and stops as GeoDataFrames**\n\n.. code:: python\n\n    service_ids = ptg.read_busiest_date(inpath)[1]\n    view = {'trips.txt': {'service_id': service_ids}}\n\n    feed = ptg.load_geo_feed(inpath, view)\n\n    feed.shapes.head()\n    #       shape_id                                           geometry\n    #  0  cal_gil_sf  LINESTRING (-121.5661454200744 37.003512297983...\n    #  1  cal_sf_gil  LINESTRING (-122.3944115638733 37.776439059278...\n    #  2   cal_sf_sj  LINESTRING (-122.3944115638733 37.776439059278...\n    #  3  cal_sf_tam  LINESTRING (-122.3944115638733 37.776439059278...\n    #  4   cal_sj_sf  LINESTRING (-121.9031703472137 37.330157067882...\n\n    minlon, minlat, maxlon, maxlat = feed.stops.total_bounds\n    #  -122.412076, 37.003485, -121.566088, 37.77639\n\n\nExtracting a new feed\n~~~~~~~~~~~~~~~~~~~~~\n\n.. 
code:: python\n\n    outpath = 'gtfs-slim.zip'\n\n    service_ids = ptg.read_busiest_date(inpath)[1]\n    view = {'trips.txt': {'service_id': service_ids}}\n\n    ptg.extract_feed(inpath, outpath, view)\n    feed = ptg.load_feed(outpath)\n\n    assert service_ids == set(feed.trips.service_id)\n\n\nFeatures\n--------\n\n-  Surprisingly fast :)\n-  Load only what you need into memory\n-  Built-in support for resolving service dates\n-  Easily extended to support fields and files outside the official spec\n   (TODO: document this)\n-  Handle nested folders and bad data in zips\n-  Predictable type conversions\n\nThank You\n---------\n\nI hope you find this library useful. If you have suggestions for\nimproving Partridge, please open an `issue on\nGitHub \u003chttps://github.com/remix/partridge/issues\u003e`__.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fremix%2Fpartridge","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fremix%2Fpartridge","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fremix%2Fpartridge/lists"}