{"id":16857650,"url":"https://github.com/brawer/cadaref","last_synced_at":"2025-10-07T04:50:36.762Z","repository":{"id":255382902,"uuid":"848846579","full_name":"brawer/cadaref","owner":"brawer","description":"Automatic georeferencing of cadastral maps","archived":false,"fork":false,"pushed_at":"2024-11-03T22:47:20.000Z","size":2398,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-18T12:14:48.699Z","etag":null,"topics":["cadastre","georeferencing","geospatial"],"latest_commit_sha":null,"homepage":"","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/brawer.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-08-28T14:10:20.000Z","updated_at":"2024-11-03T22:46:53.000Z","dependencies_parsed_at":"2024-10-13T14:13:27.890Z","dependency_job_id":"2b19c344-9993-46c7-b3d6-bfcf6c92cae8","html_url":"https://github.com/brawer/cadaref","commit_stats":null,"previous_names":["brawer/cadaref"],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/brawer/cadaref","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/brawer%2Fcadaref","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/brawer%2Fcadaref/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/brawer%2Fcadaref/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/brawer%2Fcadaref/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/brawer","downlo
ad_url":"https://codeload.github.com/brawer/cadaref/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/brawer%2Fcadaref/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":278722770,"owners_count":26034461,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-07T02:00:06.786Z","response_time":59,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cadastre","georeferencing","geospatial"],"created_at":"2024-10-13T14:08:54.414Z","updated_at":"2025-10-07T04:50:36.744Z","avatar_url":"https://github.com/brawer.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Cadaref\n\nCadaref is a key component of a [larger\npipeline](https://github.com/brawer/cadaref-zurich), built\nfor the City of Zürich to automatically\n[georeference](https://en.wikipedia.org/wiki/Georeferencing)\nhistorical [cadastral maps](https://en.wikipedia.org/wiki/Cadastre).\nCadaref matches cartographic symbols against geographic points\nto produce cloud-optimized GeoTIFF images.\n\nIn the Zürich project, the cartographic symbols were found\non archived maps by means of Computer Vision, and the geographic\nlocations were taken from a database of survey points. 
It does\nnot really matter though; Cadaref is just matching point sets.\n\nCadaref’s matching algorithm is resilient to noisy data.\nThis helps to overcome the limitations of computer vision\nin symbol recognition, and it also helps to bridge the time gap\nbetween decades-old historical maps and the earliest digitally\navailable data.\n\n![scan](./doc/sample.png)\n\n\n## Context\n\nFor most of the 20th century, each change to a land parcel in Zürich\ngot documented on a paper map. As in the image above,\nchanges to building footprints were documented in the same way.\nToday, this record keeping is fully digital, but the City of Zürich\nstill has about 100K paper maps in its archive. This map collection\ndocuments a significant part of the city’s construction history.\n\nTo preserve this heritage, the archive was scanned to PDF, and then\neach scan got processed by a computer system that tries to find the\nprecise geographic location for each map. As its output, Cadaref\nproduces [Cloud-Optimized GeoTIFF](http://cogeo.org/), an\nindustry-standard file format that is understood by Geographic\nInformation Systems and similar tools.\n\n\n## Pipeline\n\nFor the Zürich project, we built a [driver\npipeline](https://github.com/brawer/cadaref-zurich) whose codebase is\nunlikely to be useful in other projects. Just for context, here’s how\nthis driver pipeline works. Ultimately, it invokes Cadaref, the tool\nyou’re looking at right now.\n\n1. The driver pipeline starts by rendering the archival files,\nwhich are supplied as compressed files in [PDF/A format](https://en.wikipedia.org/wiki/PDF/A), to raster images in TIFF format.\n\n2. The pipeline applies image processing techniques\nsuch as resolution enhancement and\n[Ōtsu thresholding](https://en.wikipedia.org/wiki/Otsu's_method).\n\n3. 
The pipeline uses Computer Vision to recognize\n[cartographic symbols](https://github.com/brawer/cadasym) and stores\nthem in a CSV file [like this](testdata/symbols.csv).\n\n4. Separately, the processing pipeline also tries to find the\napproximate geographic area of the map in question. For example,\n[Optical Character\nRecognition](https://en.wikipedia.org/wiki/Optical_character_recognition)\nis used to extract parcel numbers. The pipeline uses these numbers to find\nwhich fixed points and [survey\nmarkers](https://en.wikipedia.org/wiki/Survey_marker) might have\nexisted in the approximate area of the historical map, at the time the\nmap was drawn. The result is another CSV file [like\nthis](testdata/points.csv).\n\n5. Again utilizing Optical Character Recognition, the pipeline tries\nto extract the map scale, which was often printed on the historical maps.\nIf OCR doesn’t give a credible result, the pipeline falls back to a set\nof common map scales.\n\n6. The pipeline passes the results of its earlier steps\n(the rasterized historical map,\nthe recognized cartographic symbols, the survey markers and fixed points\nlikely to have been depicted by the map, and the detected map scale)\nto Cadaref, which tries to find a suitable projection. If successful,\nCadaref generates a Cloud-Optimized GeoTIFF image with embedded tags\nfor georeferencing.\n\n\n## Algorithm\n\nCadaref’s matching algorithm is described [here](doc/algorithm.md).\n\n\n## Development\n\nPlease feel free to contribute to this project; simply send\na pull request. To set up your development environment,\nhave a look at the [Continuous Build](.github/workflows/ci.yml),\nwhich builds and tests every change. The codebase gets automatically\ntested on two Linux distributions, Ubuntu and Alpine Linux. 
However,\ndevelopment on other platforms such as macOS or Windows should\nbe straightforward.\n\n\n\n## Usage\n\nCadaref is a command-line tool that takes the following arguments:\n\n* `--image` File path to the input image, in TIFF format\n    like [this](testdata/HG3099.tif).\n\n* `--page` Page number to process, in case the input TIFF has multiple pages.\n\n* `--scales` Comma-separated list of map scales, such as `1:200,1:500`.\n  For the Zürich project, we use OCR to extract the map scale that was\n  printed on the map, with a fallback in case the map scale is missing.\n\n* `--symbols` A set of map symbols, in CSV format like\n  [this](testdata/symbols.csv). Typically detected by Computer Vision.\n  Symbol locations are passed in (possibly fractional) pixel coordinates,\n  relative to the top left of the image being processed.\n\n* `--points` A set of points on the globe, in CSV format like\n  [this](testdata/points.csv). Typically extracted from a database\n  of survey markers, or whatever else the paper maps may depict.\n  Locations are passed in geographic coordinates. (The tool currently\n  assumes the Swiss CH1903+/LV95 spatial reference system, but this would\n  be trivial to change).\n\n* `--output` File path to the output image that will be written.\n  The output will be a Cloud-Optimized GeoTIFF with embedded transformation\n  parameters.\n\nIf everything goes well, the tool returns with status code 0.\nIn case of failures, in particular if the matching algorithm could\nnot find a good enough transformation, the tool returns with a non-zero\nexit code.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbrawer%2Fcadaref","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbrawer%2Fcadaref","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbrawer%2Fcadaref/lists"}