https://github.com/codesoap/pbf-reblob
Reduce the size of OSM's PBF files by increasing blob sizes
compression open-street-map osm pbf
- Host: GitHub
- URL: https://github.com/codesoap/pbf-reblob
- Owner: codesoap
- License: mit
- Created: 2024-10-21T16:12:09.000Z (12 months ago)
- Default Branch: master
- Last Pushed: 2024-10-21T16:49:07.000Z (12 months ago)
- Last Synced: 2025-05-17T09:11:22.148Z (5 months ago)
- Topics: compression, open-street-map, osm, pbf
- Language: Go
- Homepage:
- Size: 37.1 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
pbf-reblob allows you to easily increase the blob size within
OpenStreetMap's PBF files. This reduces the size of PBF files. It works best
with small extracts.

# Demo
Here are the results of file sizes reduced with pbf-reblob. With larger
blobs, zstd compression seems to work better than the default zlib
compression.

[Figure: file sizes after processing with pbf-reblob]

# Installation
You can download the tool from the releases page:
https://github.com/codesoap/pbf-reblob/releases

If you have the Go toolchain installed and prefer to build pbf-reblob
yourself, you can get it by running this:

```bash
go install github.com/codesoap/pbf-reblob@latest
# The binary is now at ~/go/bin/pbf-reblob.
```

# Usage
```console
$ # Change blob size to roughly 16MiB (while not surpassing this blob size):
$ pbf-reblob serbia-latest.osm.pbf serbia-latest-16M.osm.pbf

$ # Change blob size to roughly 32MiB, with zstd compression:
$ pbf-reblob -s 32M -c zstd serbia-latest.osm.pbf serbia-latest-32M.zstd.osm.pbf

$ # Let's see the results:
$ du -h serbia-latest*
190M serbia-latest-16M.osm.pbf
186M serbia-latest-32M.zstd.osm.pbf
194M serbia-latest.osm.pbf

$ pbf-reblob -h
Usage:
    pbf-reblob [-v] [-s ] [-c ]
Options:
  -c string
        output compression; either 'raw', 'zlib' or 'zstd' (default "zlib")
  -s string
        uncompressed blob size limit; suffixes 'k' and 'M' allowed (default "16M")
  -v    verbose
```

# How It Works
PBF files contain numerous blobs of OSM entities. The popular tool
[osmium](https://osmcode.org/osmium-tool/) usually puts one group of
~8000 OSM entities into each blob. However, the file format allows for
uncompressed blob sizes of up to 32MiB, while recommending sizes up to
16MiB. These sizes are not reached with ~8000 OSM entities.

By moving multiple entity groups into a single blob, the number of blobs
in a PBF file can be significantly reduced. This reduces file size, because
each blob contains a list of used strings for the entities in the blob. If
there are multiple small blobs, the same strings will often be stored
multiple times (once for each blob that uses them). By reducing the
number of blobs, the number of duplicate strings can be reduced.

This seems to be most effective with small PBF files. I assume this is
because within a smaller area, there is a higher chance for the same
strings to be reused.

# Side Effects
While no data is lost with this method of compression, the changed blob
size might affect the tools working with PBF files. Most prominently,
many tools will likely use more memory when working with larger blobs.