https://github.com/postmalloc/skeletonide
Skeletonide is a parallel implementation of Zhang-Suen morphological thinning algorithm written in Halide-lang. Use it for fast skeletonization of binary masks on the GPU.
gpu halide-lang halide-pipeline image-processing skeletonization zhang-suen
- Host: GitHub
- URL: https://github.com/postmalloc/skeletonide
- Owner: postmalloc
- License: mit
- Created: 2020-10-12T10:51:09.000Z (about 5 years ago)
- Default Branch: master
- Last Pushed: 2020-10-21T17:44:23.000Z (about 5 years ago)
- Last Synced: 2023-08-19T03:40:52.130Z (over 2 years ago)
- Topics: gpu, halide-lang, halide-pipeline, image-processing, skeletonization, zhang-suen
- Language: C++
- Homepage:
- Size: 139 KB
- Stars: 11
- Watchers: 2
- Forks: 2
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# skeletonide

Skeletonide is a parallel implementation of the Zhang-Suen morphological
thinning algorithm written in Halide-lang. It can be used for fast
skeletonization of binary masks and can also run on the GPU.

Building the project generates an ahead-of-time compiled static
library from the Halide pipeline. The library is then linked with
the caller code to produce a single binary.

Note: The Halide pipeline represents a single pass of the
Zhang-Suen method. The iterations have to be handled by the
caller code; see `spook.cpp` for an example. The number of
iterations is currently hardcoded. It should instead depend on the
completion flags returned by the Halide pipeline.
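
To illustrate what "iterating until the completion flag says nothing changed" could look like, here is a minimal plain C++ sketch of Zhang-Suen thinning on the CPU. It is not the Halide pipeline and does not use its generated API: `Mask`, `thin_step`, `thin_pass`, and `skeletonize` are hypothetical names chosen for this example, and treating one pass as both sub-iterations is an assumption, since the README does not say how the pipeline splits them.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Row-major binary image: 1 = foreground, 0 = background (hypothetical type).
struct Mask {
    int width;
    int height;
    std::vector<std::uint8_t> px;

    // Reads outside the image count as background.
    std::uint8_t at(int x, int y) const {
        if (x < 0 || y < 0 || x >= width || y >= height) return 0;
        return px[static_cast<std::size_t>(y) * width + x];
    }
};

// One Zhang-Suen sub-iteration (step 0 or 1).
// Every pixel is tested against the unmodified input, as a parallel
// implementation would do. Returns true if any pixel was removed.
bool thin_step(Mask& m, int step) {
    std::vector<std::uint8_t> next = m.px;
    bool changed = false;
    for (int y = 0; y < m.height; ++y) {
        for (int x = 0; x < m.width; ++x) {
            if (!m.at(x, y)) continue;
            // Neighbours P2..P9, clockwise starting from north.
            const std::uint8_t p[8] = {
                m.at(x, y - 1),     m.at(x + 1, y - 1), m.at(x + 1, y),
                m.at(x + 1, y + 1), m.at(x, y + 1),     m.at(x - 1, y + 1),
                m.at(x - 1, y),     m.at(x - 1, y - 1)};
            int b = 0;  // number of foreground neighbours
            int a = 0;  // number of 0 -> 1 transitions around the ring
            for (int i = 0; i < 8; ++i) {
                b += p[i];
                a += (p[i] == 0 && p[(i + 1) % 8] == 1);
            }
            const bool directional =
                step == 0
                    ? (p[0] * p[2] * p[4] == 0 && p[2] * p[4] * p[6] == 0)
                    : (p[0] * p[2] * p[6] == 0 && p[0] * p[4] * p[6] == 0);
            if (b >= 2 && b <= 6 && a == 1 && directional) {
                next[static_cast<std::size_t>(y) * m.width + x] = 0;
                changed = true;
            }
        }
    }
    m.px.swap(next);
    return changed;
}

// One full pass: both sub-iterations. The return value plays the role
// of a completion flag (true means "something changed, keep going").
bool thin_pass(Mask& m) {
    const bool first = thin_step(m, 0);
    const bool second = thin_step(m, 1);
    return first || second;
}

// Caller-side loop: repeat passes until one removes nothing.
// Returns the number of passes that changed the image.
int skeletonize(Mask& m) {
    int passes = 0;
    while (thin_pass(m)) ++passes;
    return passes;
}
```

With a loop of this shape the iteration count adapts to the input: a mask that is already one pixel wide finishes after a single check, while thick shapes take as many passes as they need.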
## Usage
See `spook.cpp` for an example. The example benchmarks the time
taken to skeletonize a large image on the GPU. Pipeline code is in
`src/pipeline.cpp`.
## Benchmarks
Skeletonide performs best when run on a GPU. The tests use
scikit-image's horse mask, tiled 10x10 to create a large test image
of shape (3280, 4000). Times are averaged over 100 runs.
| Implementation | CPU (i7-7700HQ) | GPU (GTX 1050m) |
| ------------------------------------- | --------------- | --------------- |
| Scikit-image `morphology.skeletonize` | **2073 ms** | NA |
| Skeletonide | 3786 ms | **210 ms** |
The scheduling of the Halide pipeline can be tuned further through
trial and error to achieve better CPU times. The slow performance on the CPU
is partly explained by the fact that the pipeline represents only a single pass
of the thinning algorithm. There is a significant time penalty in handling the
iterations from the outside, as illustrated in `spook.cpp`.
However, on the GPU, Skeletonide runs roughly 10x faster than scikit-image's
CPU implementation without major modifications. Not a level playing field, obviously.
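
The benchmark setup described in this section (tiling a mask and averaging over repeated runs) can be sketched roughly as follows. This is an assumed reconstruction, not the code in `spook.cpp`: `tile` and `mean_ms` are hypothetical helpers, and the callable passed to `mean_ms` stands in for whatever runs the skeletonization.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Repeat a row-major w x h image reps_x times horizontally and
// reps_y times vertically (hypothetical helper).
std::vector<std::uint8_t> tile(const std::vector<std::uint8_t>& src,
                               int w, int h, int reps_x, int reps_y) {
    const int out_w = w * reps_x;
    const int out_h = h * reps_y;
    std::vector<std::uint8_t> out(static_cast<std::size_t>(out_w) * out_h);
    for (int y = 0; y < out_h; ++y) {
        for (int x = 0; x < out_w; ++x) {
            out[static_cast<std::size_t>(y) * out_w + x] =
                src[static_cast<std::size_t>(y % h) * w + (x % w)];
        }
    }
    return out;
}

// Mean wall-clock time of `fn` over `runs` calls, in milliseconds.
template <typename Fn>
double mean_ms(Fn&& fn, int runs) {
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    for (int i = 0; i < runs; ++i) fn();
    const std::chrono::duration<double, std::milli> total =
        clock::now() - start;
    return total.count() / runs;
}
```

Assuming the shape is given as (rows, columns), a 10x10 tiling that yields (3280, 4000) implies a base mask of 328 rows by 400 columns. When timing a GPU run, the callable would likely need to include a device synchronization or a copy back to the host, otherwise the measurement may capture little more than the kernel launch.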
The output on scikit-image's horse mask:

[Image: mask]

[Image: skeleton]

## Build
```sh
mkdir build
cd build
cmake ..
cmake --build .
# This builds the static library and its header
# in `build/`, plus a single test binary called
# `spook` in `skeletonide`. Run `./spook` to see
# it in action.
```
## License
MIT License