{"id":15046877,"url":"https://github.com/mkuranowski/impuls","last_synced_at":"2026-03-04T12:31:48.988Z","repository":{"id":239144931,"uuid":"520565449","full_name":"MKuranowski/Impuls","owner":"MKuranowski","description":"Python framework for processing static public transportation data","archived":false,"fork":false,"pushed_at":"2025-04-29T06:49:19.000Z","size":9240,"stargazers_count":1,"open_issues_count":3,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-06-08T21:30:55.240Z","etag":null,"topics":["framework","gtfs","public-transport","python","zig"],"latest_commit_sha":null,"homepage":"https://impuls.readthedocs.io/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/MKuranowski.png","metadata":{"files":{"readme":"readme.md","changelog":null,"contributing":null,"funding":null,"license":"COPYING","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2022-08-02T16:08:15.000Z","updated_at":"2025-04-29T06:49:22.000Z","dependencies_parsed_at":"2024-06-01T08:04:02.531Z","dependency_job_id":"944237ee-6218-4998-a37b-12a80f6efc96","html_url":"https://github.com/MKuranowski/Impuls","commit_stats":null,"previous_names":["mkuranowski/impuls"],"tags_count":9,"template":false,"template_full_name":null,"purl":"pkg:github/MKuranowski/Impuls","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MKuranowski%2FImpuls","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MKuranowski%2FImpuls/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MKuranowski%2FImpuls/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/Gi
tHub/repositories/MKuranowski%2FImpuls/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/MKuranowski","download_url":"https://codeload.github.com/MKuranowski/Impuls/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MKuranowski%2FImpuls/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":261449754,"owners_count":23159802,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["framework","gtfs","public-transport","python","zig"],"created_at":"2024-09-24T20:53:41.778Z","updated_at":"2025-06-23T09:05:22.370Z","avatar_url":"https://github.com/MKuranowski.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"Impuls\n======\n\n[GitHub](https://github.com/MKuranowski/impuls) |\n[Documentation](https://impuls.readthedocs.io/) |\n[Issue Tracker](https://github.com/MKuranowski/impuls/issues) |\n[PyPI](https://pypi.org/project/impuls/)\n\nImpuls is a framework for processing static public transportation data.\nThe internal model used is very close to GTFS.\n\nThe core entity for processing is called a _pipeline_, which is composed of multiple\n_tasks_ that do the actual processing work.\n\nThe data is stored in an sqlite3 database with a very lightweight\nwrapper to map Impuls's internal model into SQL and GTFS.\n\nImpuls has first-class support for pulling in data from external sources, using its\n_resource_ mechanism. 
Resources are cached before the data is processed, which saves\nbandwidth if some of the input data has not changed, or even allows the\nprocessing to stop early if none of the resources have been modified.\n\nA module for dealing with versioned, or _multi-file_ sources is also provided. It allows\nfor easy and very flexible processing of schedules provided in discrete versions into\na single coherent file.\n\nInstallation and compilation\n----------------------------\n\nImpuls is mainly written in python, however a performance-critical part of this library is written\nin zig and bundled as a shared library alongside the python module. To install the library, run the following,\npreferably inside of a [virtual environment](https://docs.python.org/3/library/venv.html):\n\n```\npip install impuls\n```\n\nPre-built binaries are available for most platforms. To build from source\n[zig](https://ziglang.org/learn/getting-started/) needs to be installed.\n\nThe `LoadBusManMDB` task additionally requires [mdbtools](https://github.com/mdbtools/mdbtools)\nto be installed. This package is available in most package managers.\n\nExamples\n--------\n\nSee \u003chttps://impuls.readthedocs.io/en/stable/example.html\u003e for a tutorial and a more detailed\nwalkthrough of Impuls features.\n\nThe `examples` directory contains four example configurations, processing data\nfrom four sources into a GTFS file. 
If you wish to run them, consult the\n[Development](#development) section of the readme to set up the environment correctly.\n\n### Kraków\n\nKraków provides decent GTFS files on \u003chttps://gtfs.ztp.krakow.pl\u003e.\nThe example pipeline removes unnecessary, confusing trip data and fixes\nseveral user-facing strings.\n\nRun with `python -m examples.krakow tram` or `python -m examples.krakow bus`.\nThe result GTFS will be created in `_workspace_krakow/krakow.tram.out.zip` or\n`_workspace_krakow/krakow.bus.out.zip`, respectively.\n\n### PKP IC (PKP Intercity)\n\nPKP Intercity provides their schedules in a single CSV table at \u003cftp://ftps.intercity.pl\u003e.\nUnfortunately, the source data is not openly available. One needs to email PKP Intercity\nthrough the contact provided in the [Polish MMTIS NAP](https://dane.gov.pl/pl/dataset/1739,krajowy-punkt-dostepowy-kpd-multimodalne-usugi-informacji-o-podrozach)\nin order to get the credentials.\n\nThe pipeline starts by manually creating an Agency, then loads the CSV data,\npulls station data from \u003chttps://github.com/MKuranowski/PLRailMap\u003e,\nand adjusts some user-facing data - most importantly, extracting trip legs operated by buses.\n\nRun with `python -m examples.pkpic FTP_USERNAME FTP_PASSWORD`. 
The result GTFS\nwill be created at `_workspace_pkpic/pkpic.zip`.\n\n### Radom\n\nMZDiK Radom provides schedules in an MDB database at \u003chttp://mzdik.pl/index.php?id=145\u003e.\nIt is the first example to use the _multi-file_ pipeline support, as the source files\nare published in discrete versions.\n\nMulti-file pipelines consist of four distinct parts:\n- an _intermediate provider_, which figures out the relevant input (\"intermediate\") feeds\n- an _intermediate tasks factory_, which returns the tasks necessary to load\n    an intermediate feed into the SQLite database\n- a _final tasks factory_, which returns the tasks to perform after merging intermediate feeds\n- any additional _resources_, required by the intermediate or final tasks\n\nCaching is even more involved - not only are the input feeds kept across runs,\nbut the databases resulting from running intermediate pipelines are also preserved.\nIf 3 of 4 feeds requested by the intermediate provider have already been processed,\nthe intermediate pipeline will run only for the single new file, but the final (merging)\npipeline will run on all 4 feeds.\n\nThe intermediate provider for Radom scrapes the aforementioned website to find\navailable databases.\n\nThe pipeline for processing intermediate feeds is a bit more complex: it involves\nloading the MDB database, cleaning up the data (removing virtual stops, generating and\ncleaning calendars) and pulling stop positions from \u003chttp://rkm.mzdik.radom.pl/\u003e.\n\nThe final pipeline simply dumps the merged dataset into a GTFS.\n\nRun with `python -m examples.radom`; the result GTFS will\nbe created at `_workspace_radom/radom.zip`.\n\n### Warsaw\n\nWarsaw is another city which requires multi-file pipelines.\nZTM Warsaw publishes distinct input files for pretty much every other day\nat \u003cftp://rozklady.ztm.waw.pl\u003e. The input datasets are in a completely custom\ntext format, requiring quite involved parsing. 
More details are available at\n\u003chttps://www.ztm.waw.pl/pliki-do-pobrania/dane-rozkladowe/\u003e (in Polish).\n\nThe intermediate provider picks out relevant files from the aforementioned FTP server.\n\nProcessing of intermediate feeds starts with the import of the text file into\nthe database. Rather uniquely, this step also prettifies stop names - as this\nwould be hard to do in a separate task, due to the presence of indicators\n(two-digit codes uniquely identifying a stop around an intersection) in the name field.\nThe pipeline continues by adding version meta-data, merging railway stations into a single\nstops.txt entry (ZTM separates railway departures into virtual stops) and prettifying\nattributes (namely `trip_headsign` and `stop_lat`, `stop_lon` - not all stops have positions\nin the input file). The last steps involve cleaning up unused entities from the database.\n\nThe final pipeline simply dumps the merged dataset into a GTFS, yet again.\n\nAdditional data for stop positions and edge cases for prettifying stop names\ncomes from \u003chttps://github.com/MKuranowski/WarsawGTFS/blob/master/data_curated/stop_names.json\u003e.\n\nRun with `python -m examples.warsaw`; the result GTFS will\nbe created at `_workspace_warsaw/warsaw.zip`.\n\nLicense\n-------\n\nImpuls is distributed under GNU GPL v3 (or any later version).\n\n\u003e © Copyright 2022-2025 Mikołaj Kuranowski\n\u003e\n\u003e Impuls is free software: you can redistribute it and/or modify\n\u003e it under the terms of the GNU General Public License as published by\n\u003e the Free Software Foundation; either version 3 of the License, or\n\u003e (at your option) any later version.\n\u003e\n\u003e Impuls is distributed in the hope that it will be useful,\n\u003e but WITHOUT ANY WARRANTY; without even the implied warranty of\n\u003e MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  
See the\n\u003e GNU General Public License for more details.\n\u003e\n\u003e You should have received a copy of the GNU General Public License\n\u003e along with Impuls. If not, see \u003chttp://www.gnu.org/licenses/\u003e.\n\nImpuls source code and pre-built binaries come with [sqlite3](https://sqlite.org/),\nwhich [is placed in the public domain](https://www.sqlite.org/copyright.html).\n\nDevelopment\n-----------\n\nImpuls uses [meson-python](https://meson-python.readthedocs.io/en/latest/index.html). The\nproject layout is quite unorthodox, as Impuls is neither a pure-python module nor a project\nwith a bog-standard C/C++ extension. Instead, the zig code is compiled into a shared library\nwhich is bundled alongside the python module.\n\nZig allows super easy cross-compilation, while using a shared library allows a single wheel\nto be used across multiple python versions and implementations.\n\nDevelopment requires [python](https://python.org/), [zig](https://ziglang.org/learn/getting-started/)\nand [mdbtools](https://github.com/mdbtools/mdbtools/) (usually all 3 will be available in your\npackage manager repositories) to be installed. To set up the environment on Linux, run:\n\n```terminal\n$ python -m venv --upgrade-deps .venv\n$ . .venv/bin/activate\n$ pip install -Ur requirements.dev.txt\n$ pip install --no-build-isolation -Cbuild-dir=builddir --editable .\n$ ln -s ../../builddir/libextern.so impuls/extern\n```\n\nOn macOS, change the shared library file extension to `.dylib`; on Windows, change it\nto `.dll`.\n\nTo run python tests, simply execute `pytest`. To run zig tests, run `meson test -C builddir`.\n\nTo run the examples, install their dependencies first (`pip install -Ur requirements.examples.txt`),\nthen execute the example module, e.g. 
`python -m examples.krakow`.\n\nmeson-python will automatically recompile the zig library whenever an editable impuls install is\nimported; set the `MESONPY_EDITABLE_VERBOSE` environment variable to `1` to see meson logs for build\ndetails.\n\nBy default, the extern zig library will be built in debug mode. To change that, run\n`meson configure --buildtype=debugoptimized builddir` (buildtype can also be set to `debug` or\n`release`). To recompile the library, run `meson compile -C builddir`.\n\nUnfortunately, meson-python requires all python and zig source files to be listed in meson.build. Python\nfiles need to be listed for packaging to work, while zig source files need to be listed for\nthe build backend to properly detect whether libextern needs to be recompiled.\n\n### Building wheels\n\nZig has been chosen for its excellent cross-compilation support. Thanks to this, building\nall wheels for a release does not require tools like [cibuildwheel](https://github.com/pypa/cibuildwheel),\nvirtual machines, or even any containers. As long as Zig is installed, all wheels can be\nbuilt on a single machine.\n\nBefore building wheels, install a few extra dependencies in the virtual environment:\n`pip install -U build wheel`.\n\nTo build the wheels, simply run `python build_wheels.py`.\n\nSee `python build_wheels.py --help` for all available options. To debug failed builds, run\n`python build_wheels.py --verbose --jobs 1 FAILED_CONFIG_NAME`.\n\nSee [CONFIGURATION in build_wheels.py](/build_wheels.py#L32) for available configurations.\n\nTo build the source distribution, run `python -m build -so dist`.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmkuranowski%2Fimpuls","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmkuranowski%2Fimpuls","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmkuranowski%2Fimpuls/lists"}