Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/chetanxpro/nodejs-whisper

Introducing NodeJS Bindings for Whisper - the CPU version of OpenAI's Whisper, as initially crafted in C++ by ggerganov.
https://github.com/chetanxpro/nodejs-whisper

ai cpp ml nodejs-whisper openai speech-recognition speech-to-text timestamp whisper whisper-nodejs

Last synced: about 1 month ago
JSON representation

Introducing NodeJS Bindings for Whisper - the CPU version of OpenAI's Whisper, as initially crafted in C++ by ggerganov.

Awesome Lists containing this project

README

        

# nodejs-whisper

Node.js bindings for OpenAI's Whisper model.

[![MIT License](https://img.shields.io/badge/License-MIT-green.svg)](https://choosealicense.com/licenses/mit/)

## Features

- Automatically convert the audio to WAV format with a 16000 Hz frequency to support the whisper model.
- Output transcripts to (.txt .srt .vtt .json .wts .lrc)
- Optimized for CPU (Including Apple Silicon ARM)
- Timestamp precision to single word
- Split on word rather than on token (Optional)
- Translate from source language to english (Optional)
- Convert audio format to wav to support whisper model

## Installation

1. Install make tools

```bash
sudo apt update
sudo apt install build-essential
```

2. Install nodejs-whisper with npm

```bash
npm i nodejs-whisper
```

3. Download whisper model

```bash
npx nodejs-whisper download
```

- NOTE: user may need to install make tool

## Usage/Examples

```javascript
import path from 'path'
import { nodewhisper } from 'nodejs-whisper'

// Need to provide exact path to your audio file.
const filePath = path.resolve(__dirname, 'YourAudioFileName')

await nodewhisper(filePath, {
modelName: 'base.en', //Downloaded models name
autoDownloadModelName: 'base.en', // (optional) autodownload a model if model is not present
verbose: false, // (optional) output more dubugging information
removeWavFileAfterTranscription: false, // (optional) remove wav file once transcribed
withCuda: false // (optional) use cuda for faster processing
whisperOptions: {
outputInCsv: false, // get output result in csv file
outputInJson: false, // get output result in json file
outputInJsonFull: false, // get output result in json file including more information
outputInLrc: false, // get output result in lrc file
outputInSrt: true, // get output result in srt file
outputInText: false, // get output result in txt file
outputInVtt: false, // get output result in vtt file
outputInWords: false, // get output result in wts file for karaoke
translateToEnglish: false, // translate from source language to english
wordTimestamps: false, // word-level timestamps
timestamps_length: 20, // amount of dialogue per timestamp pair
splitOnWord: true, // split on word rather than on token
},
})

// Model list
const MODELS_LIST = [
'tiny',
'tiny.en',
'base',
'base.en',
'small',
'small.en',
'medium',
'medium.en',
'large-v1',
'large',
]
```

## Types

```
interface IOptions {
modelName: string
verbose?: boolean
removeWavFileAfterTranscription?: boolean
withCuda?: boolean
autoDownloadModelName?: string
whisperOptions?: WhisperOptions
}

interface WhisperOptions {
outputInCsv?: boolean
outputInJson?: boolean
outputInJsonFull?: boolean
outputInLrc?: boolean
outputInSrt?: boolean
outputInText?: boolean
outputInVtt?: boolean
outputInWords?: boolean
translateToEnglish?: boolean
timestamps_length?: number
wordTimestamps?: boolean
splitOnWord?: boolean
}

```

## Run locally

Clone the project

```bash
git clone https://github.com/ChetanXpro/nodejs-whisper
```

Go to the project directory

```bash
cd nodejs-whisper
```

Install dependencies

```bash
npm install
```

Start the server

```bash
npm run dev
```

Build project

```bash
npm run build
```

## Made with

- [Whisper OpenAI (using C++ port by: ggerganov)](https://github.com/ggerganov/whisper.cpp)

## Feedback

If you have any feedback, please reach out to us at [email protected]

## Authors

- [@chetanXpro](https://www.github.com/chetanXpro)