https://github.com/brawer/cadasym
Image corpus for Computer Vision on symbols in Swiss cadastral maps
- Host: GitHub
- URL: https://github.com/brawer/cadasym
- Owner: brawer
- License: cc0-1.0
- Created: 2024-03-26T16:39:14.000Z (9 months ago)
- Default Branch: main
- Last Pushed: 2024-07-24T15:36:06.000Z (5 months ago)
- Last Synced: 2024-10-13T14:14:43.705Z (2 months ago)
- Topics: cadastral, cadastre, computer-vision, corpus
- Language: Python
- Homepage:
- Size: 703 KB
- Stars: 1
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Cadasym
Cadasym is a corpus for Computer Vision on symbols in cadastral maps.
**Background:** Whenever a Swiss parcel or building changes its
geometry, land surveyors are required to submit a so-called “mutation
plan” to the local authorities. Today, this is done in a completely
digital workflow, but for most of the 20th century, plans were
submitted on paper. By analyzing the archived plans, we would like to
eventually reconstruct the history of how buildings have developed over
time. At the moment, all images in the corpus are taken from
cadastral mutation plans supplied by the [City of
Zürich](https://www.stadt-zuerich.ch/ted/de/index/geoz.html). In other
Swiss municipalities, the plans should look identical, but those
municipalities likely used different equipment for scanning paper plans
to electronic images.

**Purpose:** The images from this corpus are useful for testing,
evaluating and training computer vision systems. The symbol
recognition task appears ideal for training Convolutional Neural
Networks with synthetic training data; or maybe it’s enough to go with
“old-school” algorithmic computer vision. Whatever solution we end up
using, we’ll need to evaluate its quality.

**Corpus building:** To build the corpus, we wrote an ad-hoc [desktop
application](./corpus_builder) that extracts image snippets from
scanned plans. Human users manually classified the image snippets into
one of the categories shown below.

**Data download:** To download the corpus data, see the ZIP file
in [Releases](https://github.com/brawer/cadasym/releases/).

## Structure
The [released ZIP file](https://github.com/brawer/cadasym/releases/) contains
PNG images, 256×256 pixels in size, where the symbol in question
is located at the exact **center of the image.** Quite often, there are
other symbols drawn nearby, or there is an overlapping line. That complication
is what makes this an interesting problem. The PNG files are currently in one of these
folders:

| Category              | Sample                                           |
| --------------------- | ------------------------------------------------ |
| `white_circle`        | [sample](./doc/samples/white_circle.png)         |
| `double_white_circle` | [sample](./doc/samples/double_white_circle.png)  |
| `black_dot`           | [sample](./doc/samples/black_dot.png)            |
| `double_black_circle` | [sample](./doc/samples/double_black_circle.png)  |
| `small_cross`         | [sample](./doc/samples/small_cross.png)          |
| `large_cross`         | [sample](./doc/samples/large_cross.png)          |
| `triangle`            | [sample](./doc/samples/triangle.png)             |
| `other`               | [sample](./doc/samples/other.png)                |

Note: We’ll likely split the `white_circle` category into several categories by circle size. Because this is rather trivial for a computer (we can just measure
the circle radius), we’ll do this later. Also, we’ll likely add more categories over time.

## License
[Public Domain (CC0-1.0)](https://creativecommons.org/publicdomain/zero/1.0/): To the
extent possible under law, we have waived all copyright and related or
neighboring rights to this work. This work is published from Switzerland.

![Public Domain](https://mirrors.creativecommons.org/presskit/buttons/88x31/svg/cc-zero.svg)
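
## Usage sketch

The folder layout described under Structure, together with the stated goal of evaluating a recognizer, could be exercised roughly as follows. This is only a sketch using the Python standard library, not code from this repository. It assumes the released ZIP has been extracted so that each category folder sits directly below one root directory; the exact layout inside the ZIP is not confirmed here. The helper names `png_size`, `index_corpus` and `evaluate` are made up for illustration.

```python
import struct
from collections import Counter
from pathlib import Path

# Category folder names as listed in the Structure table.
CATEGORIES = (
    "white_circle", "double_white_circle", "black_dot", "double_black_circle",
    "small_cross", "large_cross", "triangle", "other",
)

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(path):
    """Return (width, height) as stored in the IHDR chunk of a PNG file."""
    with open(path, "rb") as f:
        header = f.read(24)
    if len(header) < 24 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError(f"not a PNG file: {path}")
    return struct.unpack(">II", header[16:24])


def index_corpus(root):
    """List (path, category) pairs for every PNG in root/<category>/.

    Assumption: category folders sit directly below `root`.
    """
    samples = []
    for category in CATEGORIES:
        folder = Path(root, category)
        if not folder.is_dir():
            continue  # tolerate categories missing from a release
        for path in sorted(folder.glob("*.png")):
            samples.append((path, category))
    return samples


def evaluate(predict, samples):
    """Score a classifier against the human-assigned folder labels.

    `predict` is any callable that maps an image path to a category name.
    Returns (overall accuracy, Counter of (expected, predicted) pairs).
    """
    confusion = Counter((category, predict(path)) for path, category in samples)
    correct = sum(n for (expected, got), n in confusion.items() if expected == got)
    total = sum(confusion.values())
    return (correct / total if total else 0.0), confusion
```

For instance, `index_corpus("cadasym")` (with `cadasym` as a hypothetical extraction directory) yields the labeled samples, `png_size(path) == (256, 256)` checks the documented image size, and `evaluate(my_classifier, samples)` gives an accuracy plus a confusion count that shows which categories get mixed up. Plain accuracy is just one possible quality measure; the project does not prescribe a metric.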