{"id":20757310,"url":"https://github.com/piebro/openstreetmap-statistics","last_synced_at":"2025-10-05T18:09:01.091Z","repository":{"id":63732765,"uuid":"570134642","full_name":"piebro/openstreetmap-statistics","owner":"piebro","description":"Monthly updated interactive statistics about OpenStreetMap.","archived":false,"fork":false,"pushed_at":"2024-03-23T13:46:35.000Z","size":27839,"stargazers_count":24,"open_issues_count":9,"forks_count":4,"subscribers_count":5,"default_branch":"master","last_synced_at":"2024-04-15T07:51:33.499Z","etag":null,"topics":["data-visualization","openstreetmap","openstreetmap-data","plotlyjs","python","statistics"],"latest_commit_sha":null,"homepage":"https://piebro.github.io/openstreetmap-statistics/","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/piebro.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-11-24T12:17:35.000Z","updated_at":"2024-05-07T05:35:39.796Z","dependencies_parsed_at":"2024-03-23T11:51:05.579Z","dependency_job_id":"f4d40d5d-9e36-4152-90c7-14bf6560e8d5","html_url":"https://github.com/piebro/openstreetmap-statistics","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/piebro%2Fopenstreetmap-statistics","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/piebro%2Fopenstreetmap-statistics/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/piebro%2Fopenstreetmap-statistics/releases","m
anifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/piebro%2Fopenstreetmap-statistics/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/piebro","download_url":"https://codeload.github.com/piebro/openstreetmap-statistics/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":251495358,"owners_count":21598475,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-visualization","openstreetmap","openstreetmap-data","plotlyjs","python","statistics"],"created_at":"2024-11-17T09:41:26.905Z","updated_at":"2025-10-05T18:09:01.079Z","avatar_url":"https://github.com/piebro.png","language":"Jupyter Notebook","readme":"# OpenStreetMap Statistics\n\nMonthly updated statistics of OpenStreetMap. There is a [website](https://piebro.github.io/openstreetmap-statistics) to browse the generated plots and tables.\n\nThe plots and tables are organized in topics and questions I asked myself about OpenStreetMap. My motivation for this project was that I couldn't find some statistics I was interested in or that the data was outdated. That's why I created these statistics, which are easily updatable with a simple script run locally or with GitHub actions.\n\nThere is also a notebook to create [custom plots](https://piebro.github.io/openstreetmap-statistics/jupyter_lite/retro/notebooks/?path=custom_plots_browser.ipynb) with the data in a browser. 
You can use [this](https://github.com/piebro/openstreetmap-statistics/blob/master/src/custom_data_and_plots.ipynb) notebook if you want to create custom data with custom plots locally.\n\nI'm experimenting with a [website](https://piebro.github.io/openstreetmap-statistics/src/questions) to show the statistics. Many plots are still missing, but I might migrate them in the future and make it the default starting page.\n\n## Methodology\n\nAll data is gathered from an OpenStreetMap [changeset file](https://planet.openstreetmap.org/planet/).\nAccording to the OSM wiki, a [changeset](https://wiki.openstreetmap.org/wiki/Changeset) is a group of edits to the database by a single user over a short period.\nBesides who made the changes and how many edits were made, each changeset can contain additional information, for example which editor was used, the source of the edit, and which imagery was used.\n\nThe methodology used is the same as in https://wiki.openstreetmap.org/wiki/Editor_usage_stats and uses the same terms.\nOne important term which is used a lot is `edits`.\nIn these statistics, an edit is a change made to a node, way or relation.\n\nThat means changing one or multiple tags of one element always counts as one edit.\nIt also means that changing the geometry of a way or relation counts as many edits since the positions of many nodes change.\nThis leads to an overrepresentation of changes in the geometry of ways and relations compared to edits that add or change information on existing nodes.\nIt's important to keep this in mind when looking at and interpreting the data.\n\nAnother aspect is that the `created_by`, `imagery` and `source` tags use filters to determine the editing software and imagery.\nSome categories are opinionated (e.g., should stats for Android and iOS editing apps be counted separately?), and other categories could be very reasonable, depending on the purpose.\nThe filtering process is done with simple rules to make it as transparent as possible and 
easily extendable by anyone.\nThe rules are defined at [src/replace_rules_created_by.json](src/replace_rules_created_by.json) and [src/replace_rules_imagery_and_source.json](src/replace_rules_imagery_and_source.json).\n\n### Editing Software\n\nMost changesets have a `created_by` tag which indicates which editing software was used to make the changes.\nMany `created_by` tags also include the version number or additional irrelevant information for determining the editing software and are therefore filtered.\n\n### Imagery Software\n\nOne optional tag for changesets is the `imagery` tag, which iD, Vespucci and Go Map!! use to add an image source if aerial or other imagery is used.\nMany `imagery` tags also include irrelevant information for determining the used imagery and are therefore filtered.\n\n### Organised Teams\n\nMost mapping is done by individual hobby mappers mapping independently, but there are also organized mapping activities where several people edit the map under specific instructions of others.\nA list of all organized editing teams can be found [here](https://wiki.openstreetmap.org/wiki/Category:Organised_Editing_Teams).\nThe teams list all users (including inactive ones) who are mapping for them for transparency reasons.\n\nThe teams are added to [src/save_organised_teams_contributors.py](src/save_organised_teams_contributors.py), which extracts all user names and saves them in the assets folder.\nThe statistics are gathered with the list of users working at each team.\nIncorrect and out-of-date user lists could be a source of error for this data.\n\n\n## Usage\n\n### Update data\n\nThe code is tested on Ubuntu 20.04 but should work on every Linux distro. 
I'm not sure about Windows or Mac.\n\n```bash\n# Install dependencies for downloading and handling the latest changeset and showing a progress bar\nsudo apt install aria2 osmium-tool pv\n\n# create a virtual environment\npython3 -m venv .venv\nsource .venv/bin/activate\n\n# install python dependencies\npip3 install -r requirements.txt\n```\n\nRun the following commands to get the latest OSM changeset file.\n```bash\nrm $(ls *.osm.bz2)\nwget -N https://planet.openstreetmap.org/planet/changesets-latest.osm.bz2.torrent\naria2c --seed-time 0 --check-integrity changesets-latest.osm.bz2.torrent\n```\n\nNext, you can extract the data and save it in a compressed CSV file like this. `pv` is used to generate a progress bar. The extraction can take some time (on my laptop this takes about 1:30h).\n```bash\nrm -r -d temp\nosmium cat --output-format opl $(ls *.osm.bz2) | pv -s 140M -l | python3 src/changeset_to_parquet.py temp\n```\n\nIf you want to add new topics, plots or tables and iterate faster with a subset of all data, you can use every 500th changeset like this.\n```bash\nosmium cat --output-format opl $(ls *.osm.bz2) | pv -s 140M -l | sed -n '0~500p' | python3 src/changeset_to_parquet.py temp_dev\n```\n\nNext, you can generate the plots and tables with the following command, using `temp_dev` instead of `temp` as the folder name if you extracted a subset. On my laptop this takes about 0:30h and runs with less than 8GB of RAM.\n```bash\npython3 src/parquet_to_json_stats.py temp\n```\n\n### Update notebooks\n\nThere are multiple questions in `src/questions`, and each one has a Jupyter notebook to compute the relevant data for the question. 
To execute all notebooks, run:\n\n```bash\nfor notebook in $(find src/questions -name calculations.ipynb); do\n    jupyter nbconvert --to notebook --execute \"$notebook\" --output calculations.ipynb\ndone\n```\n\n### Update Organised Teams user names\n\nYou can update the list of Organised Teams with their OSM user names in assets/organised_teams_contributors.json with the following command.\n```bash\npython3 src/save_corporation_contributors.py\n```\n\n### Update Jupyter Lite Notebook\n\n```bash\npip install jupyterlite-core==0.1.0 jupyterlab~=3.5.1 jupyterlite-pyodide-kernel==0.0.6\njupyter lite build --contents src/custom_plots_browser.ipynb --output-dir jupyter_lite\n```\n\n### Update background map\n\nYou can update the background map in assets/background_map.png with the following command. It requires two additional Python dependencies (`pip3 install geopandas pillow`) and a shapefile from https://www.naturalearthdata.com/downloads/110m-physical-vectors/.\n```bash\npython3 save_background_map.py \u003cpath-to-ne-110m-land-shape-file.shp\u003e\n```\n\n### Update plotly-custom.min.js\n\nThe custom Plotly bundle is generated by following the instructions at https://github.com/plotly/plotly.js/blob/master/CUSTOM_BUNDLE.md, using the following command.\n```bash\nnpm run custom-bundle -- --traces scatter,bar,histogram2d --transforms none\n```\nThis produces a smaller Plotly file while still being able to generate all needed plots.\n\n\n## Contributing\n\nIf there are other topics and questions about OpenStreetMap you think are interesting and that can be extracted from the changeset data, feel free to open an issue or create a pull request.\nAlso, if you see any typos or other mistakes, feel free to correct them and create a pull request.\n\nAnother valuable way to contribute is to add editing software or imagery sources to [src/replace_rules_created_by.json](src/replace_rules_created_by.json) and 
[src/replace_rules_imagery_and_source.json](src/replace_rules_imagery_and_source.json).\nThe command `python3 src/finde_new_replace_rule_candidates.py temp` can be used to find new impactful candidates to add to the rules.\nAdding rules can make the statistics more accurate, and links help with usability.\n[JSON sorter](https://r37r0m0d3l.github.io/json_sort/) with `four spaces` can be used to sort and format the JSON correctly.\n\nThe project uses [Ruff](https://github.com/astral-sh/ruff) for linting and formatting. Run `ruff check` and `ruff format` in the project root directory to use it.\n[Prettier](https://prettier.io/playground/) is used for linting the JavaScript code with a `print-width` of 120 and a `tab-width` of 4, and [Stylelint](https://stylelint.io/demo/) is used for linting the CSS code.\nFurthermore, [Codespell](https://github.com/codespell-project/codespell) is used to find spelling mistakes and can be run with this command: `codespell src README.md index.html assets/statistic_website.js`.\n\n## Website Statistics\n\nThere is lightweight tracking with [Plausible](https://plausible.io/about) for the [website](https://piebro.github.io/openstreetmap-statistics/) to get information about how many people are visiting. Everyone who is interested can look at these stats here: https://plausible.io/piebro.github.io%2Fopenstreetmap-statistics?period=30d. Only users without an ad blocker are counted, so these statistics underestimate the actual number of visitors. I would guess that quite a few people (including me) visiting the site have an ad blocker.\n\n\n## License\n\nAll code in this project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details. 
All data, maps and plots in this project are licensed under [Attribution 4.0 International (CC BY 4.0)](https://creativecommons.org/licenses/by/4.0/).\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpiebro%2Fopenstreetmap-statistics","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpiebro%2Fopenstreetmap-statistics","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpiebro%2Fopenstreetmap-statistics/lists"}