https://github.com/rhysdg/whisper-onnx-python

A low-footprint GPU accelerated Speech to Text Python package for the Jetpack 5 era bolstered by an optimized graph
ai chatbot cuda machine-learning onnxruntime speech-to-text whisper


Host: GitHub
URL: https://github.com/rhysdg/whisper-onnx-python
Owner: rhysdg
License: mit
Created: 2024-06-24T18:26:50.000Z (about 1 year ago)
Default Branch: main
Last Pushed: 2024-06-30T21:50:30.000Z (about 1 year ago)
Last Synced: 2025-02-08T14:34:16.925Z (5 months ago)
Topics: ai, chatbot, cuda, machine-learning, onnxruntime, speech-to-text, whisper
Language: Python
Homepage:
Size: 3.67 MB
Stars: 1
Watchers: 1
Forks: 1
Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE

README

        
[![Contributors][contributors-shield]](https://github.com/rhysdg/whisper-onnx-python/contributors)

[![Apache][license-shield]][license-url]

[![LinkedIn][linkedin-shield]][linkedin-url]




  
# Whisper ONNX: An Optimized Speech-to-Text Python Package


## Table of Contents

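* [About the Project](#about-the-project)
  * [Built With](#built-with)
  * [The Story so Far](#the-story-so-far)
* [Getting Started](#getting-started)
* [Example Usage](#example-usage)
* [Customisation](#customisation)
  * [Notebooks](#notebooks)
  * [Tools and Scripts](#tools-and-scripts)
  * [Testing](#testing)
  * [Models & Latency Benchmarks](#models--latency-benchmarks)
  * [Similar Projects](#similar-projects)
* [Latest Updates](#latest-updates)
* [Future Updates](#future-updates)
* [Contact](#contact)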

## About The Project

### Built With

* [Onnxruntime](https://onnxruntime.ai/)

### The Story So Far

**Coming soon**

## Getting Started:

- To get started, install with pip, either from the root of a local clone or directly from the upstream repository:

  ```bash
  pip install .
  # or
  pip install git+https://github.com/rhysdg/whisper-onnx-python.git
  ```

- For Jetpack 5 support with Python 3.11, run the installation script first. It fetches a pre-built `onnxruntime-gpu` wheel for `aarch64` along with a few extra dependencies:

  ```bash
  sh jetson_install.sh
  pip install .
  ```
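After installing, it can be useful to confirm that the ONNX Runtime build you ended up with actually exposes a GPU execution provider. The following is a minimal sketch rather than part of this package: `pick_provider`, `installed_providers` and the preference order are hypothetical helpers written for illustration, and TensorRT is left out of the preference list only because `trt=False` is the recommended setting for now.

```python
# Assumed preference order for this sketch: CUDA first, CPU as the fallback.
PREFERRED = ("CUDAExecutionProvider", "CPUExecutionProvider")


def pick_provider(available, preferred=PREFERRED):
    """Return the first preferred provider present in `available`, else None."""
    for name in preferred:
        if name in available:
            return name
    return None


def installed_providers():
    """List the providers of the installed onnxruntime build, or [] if it is missing."""
    try:
        import onnxruntime as ort
    except ImportError:
        return []
    return ort.get_available_providers()


if __name__ == "__main__":
    providers = installed_providers()
    print("available:", providers)
    print("selected:", pick_provider(providers))
```

If only `CPUExecutionProvider` is selected on a machine with a GPU, the CPU-only `onnxruntime` wheel was probably installed instead of `onnxruntime-gpu`.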

## Example usage:

- Usage currently follows the official package closely, with the addition of a `trt` switch (still being debugged, so `False` is recommended). The model expects either an audio file or a NumPy array:

  ```python
  import numpy as np
  import whisper

  # Options passed to both load_model and transcribe
  args = {"language": "English",
          "name": "small.en",
          "precision": "fp32",
          "disable_cupy": False}

  # Temperatures from 0.0 to 1.0 in steps of 0.2
  temperature = tuple(np.arange(0, 1.0 + 1e-6, 0.2))

  # trt=False is recommended while the TensorRT path is being debugged
  model = whisper.load_model(trt=False, **args)

  result = model.transcribe(
      "data/test.wav",
      temperature=temperature,
      **args
  )
  ```

- An example voice transcription assistant is also available at `examples/example_assistant.py`:
  - Hold down the space bar at the command line to start recording
  - Release it to start transcription
  - This has been tested on Ubuntu 22.04 and on Jetpack 5 with an AGX Xavier. Feel free to open an issue if you run into problems!

  ```bash
  python examples/example_assistant.py
  ```
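Since a NumPy array is accepted as input as well as a file path, here is one way such an array might be prepared from a WAV file. This is a sketch under stated assumptions: `load_wav_mono_float32` is a hypothetical helper that is not part of this package, it only handles 16-bit PCM files, and it does not resample.

```python
import wave

import numpy as np


def load_wav_mono_float32(path):
    """Read a 16-bit PCM WAV file into a mono float32 array in [-1.0, 1.0)."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("only 16-bit PCM WAV files are handled in this sketch")
        channels = wav.getnchannels()
        sample_rate = wav.getframerate()
        frames = wav.readframes(wav.getnframes())

    # Interpret the raw bytes as little-endian int16 samples and scale them.
    samples = np.frombuffer(frames, dtype="<i2").astype(np.float32) / 32768.0
    if channels > 1:
        # Average the interleaved channels down to mono.
        samples = samples.reshape(-1, channels).mean(axis=1)
    return samples, sample_rate
```

The returned array could then be passed to `model.transcribe` in place of the file path. The official Whisper package expects 16 kHz mono audio, and this sketch assumes the same holds here, so check the returned sample rate before transcribing.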

## Customisation:

- **Coming soon**

### Notebooks

 

- **Coming soon**

### Tools and Scripts

-  **Coming soon**

### Testing

- Ubuntu 22.04 - RTX 3080, 8-core, Python 3.11 - **passing**

- AGX Xavier, Jetpack 5.1.3, Python 3.11 - **passing**

- CI/CD will be expanded as we go. All general instantiation tests pass so far.

### Models & Latency benchmarks

- **Coming soon**

### Similar projects

- Inspired by the work over at:

  - [whisper-onnx-tensorrt](https://github.com/PINTO0309/whisper-onnx-tensorrt)

  - [The original implementation](https://github.com/openai/whisper)

## Latest Updates

- Finished the core Python package

- Added an example assistant

- Added Jetpack support

## Future updates

- CI/CD

- PyPI release

- Benchmarks for Jetson devices

## Contact

- Project link: https://github.com/rhysdg/whisper-onnx-python

- Email: [Rhys]([email protected])

[build-shield]: https://img.shields.io/badge/build-passing-brightgreen.svg?style=flat-square

[contributors-shield]: https://img.shields.io/badge/contributors-2-orange

[license-shield]: https://img.shields.io/badge/License-GNU%20GPL-blue

[license-url]: LICENSE.txt

[linkedin-shield]: https://img.shields.io/badge/-LinkedIn-black.svg?style=flat-square&logo=linkedin&colorB=555

[linkedin-url]: https://www.linkedin.com/in/rhys-williams-b19472160/
