https://github.com/mrousavy/vision-camera-resize-plugin

A VisionCamera Frame Processor plugin for fast buffer resizing and colorspace (YUV <> RGBA) conversions
buffer camera conversion convert react-native react-native-vision-camera resize rgb tensorflow tflite vision yuv

# vision-camera-resize-plugin

A [VisionCamera](https://github.com/mrousavy/react-native-vision-camera) Frame Processor Plugin for fast and efficient Frame resizing, cropping and pixel-format conversion (YUV -> RGB) using GPU-acceleration, CPU-vector based operations and ARM NEON SIMD acceleration.

## Installation

1. Install [react-native-vision-camera](https://github.com/mrousavy/react-native-vision-camera) (>= 3.8.2) and [react-native-worklets-core](https://github.com/margelo/react-native-worklets-core) (>= 0.2.4) and make sure Frame Processors are enabled.
2. Install vision-camera-resize-plugin:
```sh
yarn add vision-camera-resize-plugin
cd ios && pod install
```

## Usage

Use the `resize` plugin within a Frame Processor:

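```tsx
import { useFrameProcessor } from 'react-native-vision-camera'
import { useResizePlugin } from 'vision-camera-resize-plugin'

// Inside a function component:
const { resize } = useResizePlugin()

const frameProcessor = useFrameProcessor((frame) => {
  'worklet'

  // Scale the Frame to 192x192 and convert it to RGB (uint8)
  const resized = resize(frame, {
    scale: {
      width: 192,
      height: 192
    },
    pixelFormat: 'rgb',
    dataType: 'uint8'
  })

  // The result is a flat array; the first pixel occupies indices 0 to 2
  const firstPixel = {
    r: resized[0],
    g: resized[1],
    b: resized[2]
  }
}, [])
```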

Or outside of a function component:

```tsx
const { resize } = createResizePlugin()

const frameProcessor = createFrameProcessor((frame) => {
  'worklet'

  const resized = resize(frame, {
    // ...
  })
  // ...
})
```

## Pixel Formats

The resize plugin operates in RGB colorspace.

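| Name   | 0 | 1 | 2 | 3 |
|--------|---|---|---|---|
| `rgb`  | R | G | B | R |
| `rgba` | R | G | B | A |
| `argb` | A | R | G | B |
| `bgra` | B | G | R | A |
| `bgr`  | B | G | R | B |
| `abgr` | A | B | G | R |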
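
The table above lists which channel sits at each index of the output. For the 3-channel formats (`rgb`, `bgr`), index 3 is already the first channel of the next pixel. As a minimal sketch of how those offsets translate into a pixel lookup: `readPixel`, `PixelFormat` and `Pixel` are hypothetical names that are not part of the plugin, and the code assumes the buffer is tightly packed and row-major with no row padding.

```typescript
// The six pixel formats from the table; each name spells out its channel order.
type PixelFormat = 'rgb' | 'rgba' | 'argb' | 'bgra' | 'bgr' | 'abgr'

interface Pixel {
  r: number
  g: number
  b: number
  a?: number // only set for 4-channel formats
}

// Hypothetical helper, not part of the plugin's API.
// Assumes a tightly packed, row-major buffer without row padding.
function readPixel(
  data: Uint8Array | Float32Array,
  width: number,
  x: number,
  y: number,
  format: PixelFormat
): Pixel {
  const channels = format.length // 3 or 4 values per pixel
  const base = (y * width + x) * channels

  // Looks up one channel by its position inside the format name.
  const channel = (name: string): number => {
    const value = data[base + format.indexOf(name)]
    if (value === undefined) {
      throw new RangeError(`Pixel (${x}, ${y}) is outside the buffer`)
    }
    return value
  }

  const pixel: Pixel = { r: channel('r'), g: channel('g'), b: channel('b') }
  if (format.indexOf('a') !== -1) {
    pixel.a = channel('a')
  }
  return pixel
}

// Two ARGB pixels in one row: (255, 10, 20, 30) and (255, 40, 50, 60)
const argbRow = new Uint8Array([255, 10, 20, 30, 255, 40, 50, 60])
const secondPixel = readPixel(argbRow, 2, 1, 0, 'argb')
// secondPixel is { r: 40, g: 50, b: 60, a: 255 }
```

Deriving the offsets from the format name keeps the lookup in sync with the table without a separate mapping.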

## Data Types

The resize plugin can convert to either `uint8` or `float32` values:

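| Name      | JS Type        | Value Range | Example size                    |
|-----------|----------------|-------------|---------------------------------|
| `uint8`   | `Uint8Array`   | 0...255     | 1920x1080 RGB Frame = ~6.2 MB   |
| `float32` | `Float32Array` | 0.0...1.0   | 1920x1080 RGB Frame = ~24.9 MB  |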
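
The example sizes follow from width x height x channels x bytes per value. The sketch below reproduces that arithmetic; `bufferSizeInBytes` and `DataType` are illustrative names, not part of the plugin.

```typescript
type DataType = 'uint8' | 'float32'

// Bytes needed to store a single channel value.
const BYTES_PER_VALUE: Record<DataType, number> = {
  uint8: 1,
  float32: 4,
}

// Hypothetical helper: size of a tightly packed output buffer in bytes.
function bufferSizeInBytes(
  width: number,
  height: number,
  channels: number,
  dataType: DataType
): number {
  return width * height * channels * BYTES_PER_VALUE[dataType]
}

const uint8Size = bufferSizeInBytes(1920, 1080, 3, 'uint8') // 6220800 bytes
const float32Size = bufferSizeInBytes(1920, 1080, 3, 'float32') // 24883200 bytes

console.log(`${(uint8Size / 1e6).toFixed(1)} MB vs ${(float32Size / 1e6).toFixed(1)} MB`)
```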

## Cropping

When scaling to a size with a different aspect ratio (e.g. 1920x1080 -> 100x100), the Resize Plugin center-crops the image before scaling it down, so the resulting image matches the target aspect ratio instead of being stretched.

You can customize this by passing a custom `crop` parameter, e.g. to use the top portion of the Frame instead of a center-crop:

```ts
const resized = resize(frame, {
  scale: {
    width: 192,
    height: 192
  },
  crop: {
    y: 0,
    x: 0,
    // 1:1 aspect ratio because we scale to 192x192
    width: frame.width,
    height: frame.width
  },
  pixelFormat: 'rgb',
  dataType: 'uint8'
})
```
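
To make the default behaviour concrete, here is a minimal sketch of how a center-crop rectangle can be computed for a given target size. `centerCrop` and `Rect` are hypothetical names, and the plugin's native implementation may round or clamp differently. Note that the custom `crop` example above uses `frame.width` for both sides, which only fits inside the Frame when the Frame is at least as tall as it is wide.

```typescript
interface Rect {
  x: number
  y: number
  width: number
  height: number
}

// Hypothetical helper: largest centered rectangle inside the Frame that has
// the same aspect ratio as the target size.
function centerCrop(
  frameWidth: number,
  frameHeight: number,
  targetWidth: number,
  targetHeight: number
): Rect {
  const targetAspect = targetWidth / targetHeight
  const frameAspect = frameWidth / frameHeight

  let width = frameWidth
  let height = frameHeight
  if (frameAspect > targetAspect) {
    // Frame is wider than the target: trim the left and right edges
    width = Math.round(frameHeight * targetAspect)
  } else {
    // Frame is taller than the target: trim the top and bottom edges
    height = Math.round(frameWidth / targetAspect)
  }

  return {
    x: Math.round((frameWidth - width) / 2),
    y: Math.round((frameHeight - height) / 2),
    width,
    height,
  }
}

// 1920x1080 -> 100x100 keeps the middle 1080x1080 square
const landscape = centerCrop(1920, 1080, 100, 100)
// landscape is { x: 420, y: 0, width: 1080, height: 1080 }
```

A rectangle computed this way can be passed as `crop` and then shifted, for example by setting `y` to `0` to keep the top of a portrait Frame.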

## Performance

If possible, use one of these two formats:

- `argb` in `uint8`: Can be converted the fastest, but has an additional unused alpha channel.
- `rgb` in `uint8`: Requires one more conversion step from `argb`, but uses 25% less memory due to the removed alpha channel.

All other formats require additional conversion steps, and `float32` buffers have additional memory overhead (4x as big as `uint8`).

When using TensorFlow Lite, try to convert your model to use `argb-uint8` or `rgb-uint8` as its input type.
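
To illustrate why `rgb` costs one more step than `argb` yet takes 25% less memory, the sketch below drops the alpha channel from an ARGB buffer in plain TypeScript. The plugin performs its conversions natively; `argbToRgb` is a hypothetical helper written only to show the idea and is not how the plugin is implemented.

```typescript
// Hypothetical illustration: copy the R, G, B values of every ARGB pixel
// into a new buffer, skipping the leading alpha value.
function argbToRgb(argb: Uint8Array): Uint8Array {
  if (argb.length % 4 !== 0) {
    throw new Error('ARGB buffer length must be a multiple of 4')
  }
  const pixelCount = argb.length / 4
  const rgb = new Uint8Array(pixelCount * 3)
  for (let i = 0; i < pixelCount; i++) {
    // Source pixel i is [A, R, G, B]; copy indices 1..3 of it
    rgb.set(argb.subarray(i * 4 + 1, i * 4 + 4), i * 3)
  }
  return rgb
}

const argbPixels = new Uint8Array([255, 1, 2, 3, 255, 4, 5, 6])
const rgbPixels = argbToRgb(argbPixels)
// rgbPixels holds [1, 2, 3, 4, 5, 6]: 6 bytes instead of 8
```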

## react-native-fast-tflite

The vision-camera-resize-plugin can be used together with [react-native-fast-tflite](https://github.com/mrousavy/react-native-fast-tflite) to prepare the input tensor data.

For example, to use the [efficientdet](https://www.kaggle.com/models/tensorflow/efficientdet/frameworks/tfLite) TFLite model to detect objects inside a Frame, simply add the model to your app's bundle, set up VisionCamera and react-native-fast-tflite, and resize your Frames accordingly.

From the model's description on the website, we understand that the model expects 320 x 320 x 3 buffers as input, where the format is uint8 rgb.

```ts
import { useTensorflowModel } from 'react-native-fast-tflite'

const objectDetection = useTensorflowModel(require('assets/efficientdet.tflite'))
const model = objectDetection.state === "loaded" ? objectDetection.model : undefined

const { resize } = useResizePlugin()

const frameProcessor = useFrameProcessor((frame) => {
  'worklet'

  // The model is undefined until it has finished loading
  if (model == null) return

  // Resize to the 320x320 uint8 RGB input the model expects
  const data = resize(frame, {
    scale: {
      width: 320,
      height: 320,
    },
    pixelFormat: 'rgb',
    dataType: 'uint8'
  })
  const output = model.runSync([data])

  const numDetections = output[0]
  console.log(`Detected ${numDetections} objects!`)
}, [model])
```
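
A quick sanity check before running a model is to compare the resized buffer's length against the model's input shape. The sketch below assumes the 320 x 320 x 3 shape quoted above; `tensorLength` and `matchesInputShape` are hypothetical helpers, not part of either library.

```typescript
// Hypothetical helper: number of values a tensor of the given shape holds.
function tensorLength(shape: number[]): number {
  return shape.reduce((total, dimension) => total * dimension, 1)
}

// Hypothetical helper: true if the buffer has exactly as many values as the
// model input expects.
function matchesInputShape(data: { length: number }, shape: number[]): boolean {
  return data.length === tensorLength(shape)
}

const inputShape = [320, 320, 3] // height x width x channels (uint8 rgb)
const resizedBuffer = new Uint8Array(320 * 320 * 3) // stand-in for the resize() result

console.log(tensorLength(inputShape)) // 307200
console.log(matchesInputShape(resizedBuffer, inputShape)) // true
```

A mismatch here usually points to a wrong `scale`, `pixelFormat` or `dataType` rather than to a problem with the model itself.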

## Benchmarks

I benchmarked vision-camera-resize-plugin on an iPhone 15 Pro, using the following code:

```tsx
const start = performance.now()
const result = resize(frame, {
  scale: {
    width: 100,
    height: 100,
  },
  pixelFormat: 'rgb',
  dataType: 'uint8'
})
const end = performance.now()

// Elapsed time in milliseconds, rounded to two decimals
const diff = (end - start).toFixed(2)
console.log(`Resize and conversion took ${diff}ms`)
```

When running on 1080x1920 YUV Frames, I got the following results:

```
LOG Resize and conversion took 6.48ms
LOG Resize and conversion took 6.06ms
LOG Resize and conversion took 5.89ms
LOG Resize and conversion took 5.97ms
LOG Resize and conversion took 6.98ms
```

That is roughly 6.3 ms per Frame on average, which means the Frame Processor can run at up to ~160 FPS.
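
The FPS figure is the reciprocal of the average time per Frame. A small sketch of that arithmetic, using the five timings from the log above (`averageOf` is an illustrative helper):

```typescript
// Timings copied from the log output above, in milliseconds.
const timingsMs = [6.48, 6.06, 5.89, 5.97, 6.98]

// Arithmetic mean of a non-empty list of numbers.
function averageOf(values: number[]): number {
  if (values.length === 0) {
    throw new Error('Cannot average an empty list')
  }
  return values.reduce((sum, value) => sum + value, 0) / values.length
}

const averageMs = averageOf(timingsMs) // about 6.28 ms
const maxFps = 1000 / averageMs // about 159 Frames per second

console.log(`${averageMs.toFixed(2)}ms per Frame, ~${Math.round(maxFps)} FPS`)
// prints "6.28ms per Frame, ~159 FPS"
```

This is an upper bound for the resize step alone; any model inference or other work inside the Frame Processor adds to the per-Frame time.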

## Adopting at scale


Did this library help you? Consider sponsoring!

This library is provided _as is_; I work on it in my free time.

If you're integrating vision-camera-resize-plugin in a production app, consider [funding this project](https://github.com/sponsors/mrousavy) and contact me to receive premium enterprise support: help with issues, prioritized bugfixes, feature requests, help integrating vision-camera-resize-plugin and/or VisionCamera Frame Processors, and more.

## Contributing

See the [contributing guide](CONTRIBUTING.md) to learn how to contribute to the repository and the development workflow.

## License

MIT

---

Made with [create-react-native-library](https://github.com/callstack/react-native-builder-bob)