https://github.com/rhysdg/whisper-onnx-python
A low-footprint GPU accelerated Speech to Text Python package for the Jetpack 5 era bolstered by an optimized graph
ai chatbot cuda machine-learning onnxruntime speech-to-text whisper
- Host: GitHub
- URL: https://github.com/rhysdg/whisper-onnx-python
- Owner: rhysdg
- License: mit
- Created: 2024-06-24T18:26:50.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2024-06-30T21:50:30.000Z (6 months ago)
- Last Synced: 2024-09-27T06:22:14.802Z (3 months ago)
- Topics: ai, chatbot, cuda, machine-learning, onnxruntime, speech-to-text, whisper
- Language: Python
- Homepage:
- Size: 3.67 MB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
[![Contributors][contributors-shield]](https://github.com/rhysdg/whisper-onnx-python/contributors)
[![MIT][license-shield]][license-url]
[![LinkedIn][linkedin-shield]][linkedin-url]
# Whisper ONNX: An Optimized Speech-to-Text Python Package
Explore the docs » · Report Bug · Request Feature
## Table of Contents
* [About the Project](#about-the-project)
* [Built With](#built-with)
* [The Story so Far](#the-story-so-far)
* [Getting Started](#getting-started)
* [Prerequisites](#prerequisites)
* [Scripts and Tools](#scripts-and-tools)
* [Supplementary Data](#supplementary-data)
* [Proposed Updates](#proposed-updates)
* [Contact](#contact)
## About The Project
### Built With
* [Onnxruntime](https://onnxruntime.ai/)
### The Story So Far
**Coming soon**
## Getting Started:
- To get started, run a pip install either from the repository root or directly from the upstream repo:
```bash
pip install .
# or
pip install git+https://github.com/rhysdg/whisper-onnx-python.git
```
- For Jetpack 5 support with Python 3.11, run the installation script first to grab a pre-built `onnxruntime-gpu` wheel for `aarch64` and a few extra dependencies:
```bash
sh jetson_install.sh
pip install .
```
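After installing, it can be worth confirming that ONNX Runtime actually sees the GPU. The sketch below is not part of this package: the helper name `gpu_providers` is hypothetical, and it relies only on `onnxruntime.get_available_providers()`. It returns an empty list when `onnxruntime` cannot be imported, so it is safe to run before the install has finished.

```python
import importlib


def gpu_providers():
    """Return the GPU execution providers ONNX Runtime reports, if any.

    Hypothetical helper for illustration, not part of whisper-onnx-python.
    """
    try:
        ort = importlib.import_module("onnxruntime")
    except ImportError:
        # onnxruntime / onnxruntime-gpu is not installed in this environment
        return []
    wanted = {"CUDAExecutionProvider", "TensorrtExecutionProvider"}
    # Keep only the GPU-backed providers from the full list
    return [p for p in ort.get_available_providers() if p in wanted]


if __name__ == "__main__":
    found = gpu_providers()
    print("GPU providers:", found if found else "none (CPU only or onnxruntime missing)")
```

If the list comes back empty on a Jetson, the CPU-only `onnxruntime` wheel may have been installed in place of the pre-built `onnxruntime-gpu` one.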
## Example usage:
- Usage currently follows the official package closely, but adds a `trt` switch (still being debugged, so `False` is recommended) and expects either an audio file or a NumPy array:
```python
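import numpy as np
import whisper

# Shared settings passed to both load_model and transcribe
args = {"language": "English",
        "name": "small.en",
        "precision": "fp32",
        "disable_cupy": False}

# Temperature schedule from 0.0 to 1.0 in steps of 0.2
temperature = tuple(np.arange(0, 1.0 + 1e-6, 0.2))

# trt=False is recommended while the TensorRT path is being debugged
model = whisper.load_model(trt=False, **args)
result = model.transcribe(
    'data/test.wav',
    temperature=temperature,
    **args
)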
```
- You can also find an example voice transcription assistant at `examples/example_assistant.py`
- Hold down the space bar in the terminal to start recording
- Release it to start transcription
- This has been tested on Ubuntu 22.04 and on Jetpack 5 on an AGX Xavier, but feel free to open an issue if you run into problems!
```bash
python examples/example_assistant.py
```
## Customisation:
- **Coming soon**
### Notebooks
- **Coming soon**
### Tools and Scripts
- **Coming soon**
### Testing
- Ubuntu 22.04 - RTX 3080, 8-core, Python 3.11 - **Passing**
- AGX Xavier, Jetpack 5.1.3, Python 3.11 - **Passing**
- CI/CD will be expanded as we go; all general instantiation tests pass so far.
### Models & Latency benchmarks
- **Coming soon**
### Similar projects
- Inspired by the work over at:
  - [whisper-onnx-tensorrt](https://github.com/PINTO0309/whisper-onnx-tensorrt)
  - [The original implementation](https://github.com/openai/whisper)
## Latest Updates
- Finished the core Python package
- Added an example assistant
- Added Jetpack support
## Future updates
- CI/CD
- PyPI release
- Benchmarks for Jetson devices
## Contact
- Project link: https://github.com/rhysdg/whisper-onnx-python
- Email: [Rhys]([email protected])

[build-shield]: https://img.shields.io/badge/build-passing-brightgreen.svg?style=flat-square
[contributors-shield]: https://img.shields.io/badge/contributors-2-orange
[license-shield]: https://img.shields.io/badge/License-MIT-blue
[license-url]: LICENSE
[linkedin-shield]: https://img.shields.io/badge/-LinkedIn-black.svg?style=flat-square&logo=linkedin&colorB=555
[linkedin-url]: https://www.linkedin.com/in/rhys-williams-b19472160/