https://github.com/cheind/hue-depth-encoding

Depth map compression by colorization in vectorized form
https://github.com/cheind/hue-depth-encoding

Last synced: 5 months ago
JSON representation

Depth map compression by colorization in vectorized form

Host: GitHub
URL: https://github.com/cheind/hue-depth-encoding
Owner: cheind
License: mit
Created: 2024-10-17T03:03:33.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2024-10-30T19:41:50.000Z (over 1 year ago)
Last Synced: 2025-09-10T01:33:51.370Z (10 months ago)
Language: Python
Homepage:
Size: 2.81 MB
Stars: 13
Watchers: 1
Forks: 2
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

README

          [![DOI](https://zenodo.org/badge/873968632.svg)](https://doi.org/10.5281/zenodo.14014715)

# hue-depth-encoding

This project provides efficient vectorized Python code to perform depth `<->` color encoding based on

> Sonoda, Tetsuri, and Anders Grunnet-Jepsen.

"Depth image compression by colorization for Intel RealSense depth cameras." Intel, Rev 1.0 (2021).

with the following significant changes:

 1. We stop the hue range at 300° (default) in order to avoid having close and far ranges assigned redish color.

 1. We detect de-valued and de-saturized colors in the decoding and assign it a pre-defined failure depth.

The encoding is designed to transform 16bit single channel images to RGB color images that can be processed by standard (lossy) image codec with *minimized* compression artefacts. This leads to a compression factor of up to 80x while maintaining acceptable depth accuracy for many RGBD systems.

## Method

The underlying method is described in 

> Sonoda, Tetsuri, and Anders Grunnet-Jepsen.

"Depth image compression by colorization for Intel RealSense depth cameras." Intel, Rev 1.0 (2021).

However, the formulae presented appear to be inaccurate and specific important details seem to be left out. Here we describe the base encoding plus our adaptations.

The base method relies on encoding depth as fully saturated/valued colors and assigns increasing depth values increasing hues. For bitdepth $b$ this allows for

```math

n = \lfloor \textrm{Hue}_{\textrm{max}}/ 60\rfloor (2^b - 1) + 1

```

unique colors. In the original method $`\textrm{Hue}_{\textrm{max}}=360°`$, whereas in this implementation $`\textrm{Hue}_{\textrm{max}}=300°`$ to avoid ambiguitis between close and far depth values that would otherwise get assigned a reddish color.

Limiting $`\textrm{Hue}_{\textrm{max}}`$ limits the number of available bits to $`\log_2 n`$ (~10.3bit).

### Depth transformations

The encoding allows for linear and disparity depth normalization. In linear mode, equal depth ratios are preserved in the encoding range, whereas in disparity mode more emphasis is put on closer depth values than on larger ones, leading to more accurare depth resolution closeup.

![](etc/compare_encoding.svg)

## Implementation

This implementation is vectorized using numpy and can handle any image shapes `(*,H,W) <-> (*,H,W,3)`. For improved performance, we precompute encoder and decoder lookup tables to reduce encoding/decoding to a simple lookup. The lookup tables require ~32MB of memory. Use `use_lut=False` switch to rely on the pure vectorized implementation. See benchmarks below for effects.

## Installation

With Python >= 3.11 execute

```shell

git clone https://github.com/cheind/hue-depth-encoding.git

pip install -e .

# or development install adds more packages to run tests/analysis

pip install -e '.[dev]'

# or with gpu codec support in dev mode

pip install -e '.[dev]' --no-binary=av 

```

## Usage

Basic usage

```python

# Import

import huecodec as hc

# Random float depths

d = rng.random((5,240,320), dtype=np.float32)*0.9 + 0.1

# Encode

rgb = hc.depth2rgb(d, zrange=(0.1,1.0), inv_depth=False)

# (5,240,320,3), uint8

# Decode

depth = hc.rgb2depth(rgb, zrange=(0.1,1.0), inv_depth=False)

# (5,240,320), float32

```

## Evaluation

### Encoding/Decoding Roundtrips

The script `python analysis.py` compares encoding and decoding characteristics for different standard video codecs using hue depth encoding. The reported figures have the following meaning:

 - **rmse** [m] root mean square error per depth pixel between groundtruth and transcoded depthmaps

 - ** color -> depth transcoding without any video codecs involved.

```

------------- benchmark: 8 tests ------------

Name (time in ms)              Mean          

---------------------------------------------

enc_perf[LUT-(640x480)]        2.1851 (1.0)    

dec_perf[LUT-(640x480)]        2.2124 (1.01)   

enc_perf[LUT-(1920x1080)]     17.9139 (8.20) 

dec_perf[LUT-(1920x1080)]     16.4741 (7.54)   

enc_perf[noLUT-(640x480)]     22.3938 (10.25)  

dec_perf[noLUT-(640x480)]      6.9320 (3.17)   

dec_perf[noLUT-(1920x1080)]   74.6038 (34.14)  

enc_perf[noLUT-(1920x1080)]  158.0871 (72.35)  

---------------------------------------------

```

```shell

# run tests and benchmarks  

pytest

```

### Note on Video Codecs

When tuning the prameters for a video codec you need to take into account lossy compression that happens on  different levels of encoding:

 - **spatial/temporal** quantization based how images change over time. Usually controlled by the codec `preset/profile` (lossless, low-latency,...) and quality parameters such as `crf`, `qp`.

 - **color space** quantization due to converting RGB to target pixel format. Different codecs support different pixel color formats of which most perform a lossy compression from rgb to target space. 

Print encoder supported options and pixel formats

```shell

# print encoder options

ffmpeg -h encoder=h264_nvenc

```

See packing and compression of color formats

https://github.com/FFmpeg/FFmpeg/blob/master/libavutil/pixfmt.h

### Takeaways

Here are some take aways to consider

 - adjust zrange as tightly as possible to your use-case. This increases filesize but reduces reconstruction loss.

 - prefer loss-less codecs if affordable

 - when using lossy codecs ensure that your application does not rely on depth-edges.

The following plot shows a lossy h264 encoding on real data. Most of the errors are sub-mm, only few > 1cm. 

![](etc/h264-tuned-gpu.2.hist.png)

Turns out these errors are located on depth edges where the lossy codec starts interpolating values.

 

## Notes

The original paper referenced is potentially inaccurate in its equations. This has been noted in varios posts [#10415](https://github.com/IntelRealSense/librealsense/issues/10145),[#11187](https://github.com/IntelRealSense/librealsense/issues/11187),[#10302](https://github.com/IntelRealSense/librealsense/issues/10302).

This implementation is based on the original paper and code from

https://github.com/jdtremaine/hue-codec/.

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/cheind/hue-depth-encoding

Awesome Lists containing this project

README