{"id":44130686,"url":"https://github.com/pintergreg/reverse-engineering-yjmob100k-grid","last_synced_at":"2026-02-08T22:10:06.285Z","repository":{"id":250188339,"uuid":"784273762","full_name":"pintergreg/reverse-engineering-YJMob100K-grid","owner":"pintergreg","description":"Revealing urban area from mobile positioning data","archived":false,"fork":false,"pushed_at":"2025-07-24T08:21:02.000Z","size":43594,"stargazers_count":0,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-09-05T00:46:28.878Z","etag":null,"topics":["humob2023-challenge","mobile-positioning-data","reverse-engineering","urban-mobility","yjmob100k"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":false,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/pintergreg.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2024-04-09T14:26:29.000Z","updated_at":"2025-07-24T08:21:08.000Z","dependencies_parsed_at":"2025-04-03T15:37:39.955Z","dependency_job_id":null,"html_url":"https://github.com/pintergreg/reverse-engineering-YJMob100K-grid","commit_stats":null,"previous_names":["pintergreg/reverse-engineering-yjmob100k-grid"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/pintergreg/reverse-engineering-YJMob100K-grid","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pintergreg%2Freverse-engineering-YJMob100K-grid","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pintergreg%2Freverse-engineering-YJMob
100K-grid/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pintergreg%2Freverse-engineering-YJMob100K-grid/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pintergreg%2Freverse-engineering-YJMob100K-grid/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/pintergreg","download_url":"https://codeload.github.com/pintergreg/reverse-engineering-YJMob100K-grid/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pintergreg%2Freverse-engineering-YJMob100K-grid/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29246444,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-08T21:42:34.334Z","status":"ssl_error","status_checked_at":"2026-02-08T21:41:38.468Z","response_time":57,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["humob2023-challenge","mobile-positioning-data","reverse-engineering","urban-mobility","yjmob100k"],"created_at":"2026-02-08T22:10:05.621Z","updated_at":"2026-02-08T22:10:06.272Z","avatar_url":"https://github.com/pintergreg.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Revealing urban area from mobile positioning data\n\n## Usage\n\nThe [`pyproject.toml`](pyproject.toml) documents the required dependencies. 
It's suggested to use the [Poetry](https://python-poetry.org/) packaging tool. In this case, just issue the `poetry install` command to set up a virtual environment with all the necessary dependencies.\n\nAfter the development environment has been set up, run the notebooks in the following order to reproduce the results.\n\n1. [plot_heatmaps.ipynb](src/plot_heatmaps.ipynb)\n    - this will reproduce the heatmaps [Figure 6] from the [data description paper](https://arxiv.org/abs/2307.03401),\n    - do the inverse-transformed plots, and\n    - some related plots for the paper\n2. [generate_grid.ipynb](src/generate_grid.ipynb)\n   - this will locate the observation area within Japan and generate the grid\n\nThese two notebooks contain the main work. The [detect_homes.ipynb](src/detect_homes.ipynb), [validate_home_detection.ipynb](src/validate_home_detection.ipynb), [calculate_grid_complexity.ipynb](src/calculate_grid_complexity.ipynb), and [plot_grid.ipynb](src/plot_grid.ipynb) notebooks are optional steps to reproduce the figure in the technical validation section of the paper.\n\n### Upscaling grid\n\n- [Upscale heatmap as a template](scale_grid.ipynb)\n    - merges each block of four neighboring cells, summing their activity, which results in lower-resolution template heatmaps\n- [Locate upscaled observation area](locate_rescaled_observation_area.ipynb)\n    - plots the land area of the selected six prefectures proportionally to the upscaled heatmap (grid) and applies template matching\n\n### Other cities\n\n1. [Helsinki](src/helsinki.ipynb)\n    - the Helsinki notebook processes the data for different grid sizes in one run\n2. [London](src/london.ipynb)\n    - the `rx` and `ry` parameters set the grid size; use either 500, 1000, 2000, or 4000\n3. [Toronto](src/london.ipynb)\n    - it is the same notebook as for London, because the dataset is the same\n    - enable the Toronto parameter block\n4. 
[Dallas--Fort Worth](src/dallas.ipynb)\n    - the `RES` parameter sets the H3 resolution; values between 6 and 10 were applied\n\n### User identifiability\n\nA user is considered k-identifiable if their k most frequently visited locations are distinguishable [^zang2011anonymization].\nThe top four locations were determined for every user, then the grid cells were upscaled to 1, 2, 4, 8, and 16 km.\n\nThe following table shows, for each upscaled grid size, the number of users by how many of their top four locations remain distinguishable.\nThe relevant notebook is [here](src/top_cell_identifiability.ipynb).\n\n|   distinguishable cells |   1 km x 1 km |   2 km x 2 km |   4 km x 4 km |   8 km x 8 km |   16 km x 16 km |\n|------------------------:|--------------:|--------------:|--------------:|--------------:|----------------:|\n|                       4 |         35469 |         12882 |          5090 |          1810 |             470 |\n|                       3 |         48228 |         42323 |         28457 |         16752 |            7438 |\n|                       2 |         15582 |         38548 |         50987 |         52608 |           44939 |\n|                       1 |           721 |          6247 |         15466 |         28830 |           47153 |\n\n[^zang2011anonymization]: Hui Zang and Jean Bolot. 2011. Anonymization of location data does not work: a large-scale measurement study. In Proceedings of the 17th annual international conference on Mobile computing and networking (MobiCom '11). Association for Computing Machinery, New York, NY, USA, 145–156. 
https://doi.org/10.1145/2030613.2030630\n\n## Results\n\nThe results are included to be available without executing the code.\nMost notably, the [reproduced grid](output/grid_bl_2449.geojson) (in [EPSG:2449](https://spatialreference.org/ref/epsg/2449/) projection).\n\n### Choropleth maps using the reproduced grid\n\nThe spatial distribution of the activity (first) and the number of unique users (second) per cell using the reproduced grid.\n\n\u003cimg src=\"figures/activity_terrain_2449.png\" alt=\"spatial distribution of activity\" title=\"spatial distribution of activity\" width=\"300\"\u003e\n\u003cimg src=\"figures/user_count_terrain_2449.png\" alt=\"spatial distribution of unique users\" title=\"spatial distribution of unique users\" width=\"300\"\u003e\n\n## Citation\n\nUse the following BibTeX entry to cite the paper.\n\n\u003cdetails\u003e\n  \u003csummary\u003eBibTeX\u003c/summary\u003e\n  \u003cpre\u003e\n@article{pinter2024revealing,\n  title={Revealing urban area from mobile positioning data},\n  author={Pint{\\'e}r, Gerg{\\H{o}}},\n  journal={Scientific Reports},\n  volume={14},\n  number={1},\n  pages={30948},\n  year={2024},\n  publisher={Nature Publishing Group UK London}\n}\n  \u003c/pre\u003e\n\u003c/details\u003e\n\nThe code can be cited via [GitHub](https://github.com/pintergreg/reverse-engineering-YJMob100K-grid).\n\n## Data sources\n\n1. Mobility data: [YJMob100K](https://zenodo.org/records/10836269)\n    - [details](data/yjmob100k/README.md) about how to prepare it\n2. OpenStreetMap data\n    - Copyrighted by OpenStreetMap contributors. 
It is available under the Open Database License (ODbL).\n    - Administrative data is from OpenStreetMap\n        - downloaded from [OSM-Boundaries](https://osm-boundaries.com/)\n            - prefectures (admin level 4), then filtered manually\n            - municipalities (admin level 7), then filtered manually\n            - wards (admin level 8), then filtered to Nagoya\n    - Coastline was downloaded from https://osmdata.openstreetmap.de/data/land-polygons.html\n        - the islands of Japan were extracted using the prefecture boundaries\n3. Census data\n    - The [Population Census 2020, Population, Households, Sex, Age and Marital status, Table 1-1](https://www.e-stat.go.jp/en/stat-search/files?page=1\u0026layout=datalist\u0026toukei=00200521\u0026tstat=000001136464\u0026cycle=0\u0026year=20200\u0026month=24101210\u0026tclass1=000001136466) was downloaded from the\n     Portal Site of Official Statistics of Japan website (https://www.e-stat.go.jp/)\n\n## License\n\n- The code is licensed under [BSD-3-Clause](LICENSE)\n- The documentation and figures are [CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)\n- The shape files are from OpenStreetMap and licensed under the Open Data Commons Open Database License ([ODbL](https://opendatacommons.org/licenses/odbl/1-0/))\n- The census data was downloaded from the Portal Site of Official Statistics of Japan website (https://www.e-stat.go.jp/)\n\nMore details are in the [REUSE.toml](REUSE.toml), based on the [REUSE definition](https://reuse.software/).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpintergreg%2Freverse-engineering-yjmob100k-grid","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpintergreg%2Freverse-engineering-yjmob100k-grid","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpintergreg%2Freverse-engineering-yjmob100k-grid/lists"}