turnkey self-hosted offline transcription and diarization service with llm summary
https://github.com/transcriptionstream/transcriptionstream

automation diarization llm mistral-7b ollama speaker-diarization speech-recognition transcription whisper whisperx

### README #######
# transcription stream 11/2023 - https://transcription.stream ******
This project sets up an SSH- and web-accessible platform for transcribing and diarizing audio files. Files dropped via SSH into the transcribe or
diarize folder are run through the corresponding process, and the output is placed in a dated folder, named after the audio file, within transcribed.
Likewise, files uploaded through the ts-web web interface are placed into the same folders, and ts-web reads the results from the transcribed folder.

Expects an nvidia gpu.

### Build and run #######
-Create the transcriptionstream volume
docker volume create --name=transcriptionstream

-Then build the ts-web and ts-gpu images, each from its own folder:
#ts-web (very small very fast minimal build)
docker build -t ts-web:latest .
#ts-gpu (this is going to take a while and end up around 13.8GB - while large, it contains the needed models to run offline)
docker build -t ts-gpu:latest .

-Run the service by starting docker-compose. The console shows frequent updates for running jobs, along with a lot of noisy
output from ts-web.
docker-compose -p transcriptionstream up
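
The steps above can be chained into one small script. This is a minimal sketch, assuming the two build folders are named ts-web and ts-gpu (this README only says "their folders") and that you run it from the directory that contains them. build_and_run is a hypothetical helper name, not something shipped with the project.

```shell
# Sketch only: chains the build and run steps listed above.
# Assumption: the build folders are named ts-web and ts-gpu and sit in the current directory.
build_and_run() {
    # Create the transcriptionstream volume
    docker volume create --name=transcriptionstream || return 1
    # Build each image from inside its own folder; the subshells leave the current directory unchanged
    (cd ts-web && docker build -t ts-web:latest .) || return 1
    (cd ts-gpu && docker build -t ts-gpu:latest .) || return 1
    # Start the stack in the foreground so job updates show in the console
    docker-compose -p transcriptionstream up
}
```

Call build_and_run once the function is defined. It stops at the first step that fails, so a broken ts-gpu build won't go on to start the stack.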

### Notes #######
ports: 22222/ssh 5006/http

Access the SSH server on port 22222. Place audio files you'd like transcribed into the transcribe folder, and files you'd like
diarized into the diarize folder. Completed files are placed into a dated folder, named after the audio file, under transcribed.
user: transcriptionstream
pass: nomoresaastax
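
Dropping a file in over SSH could look like the sketch below. It assumes the transcribe and diarize folders sit directly in the login directory of the transcriptionstream user, which isn't confirmed here. ts_upload is a hypothetical helper name, and dockerip stands in for the address of your Docker host.

```shell
# Sketch only: copy an audio file into one of the drop folders over SSH.
# Assumption: transcribe/ and diarize/ are directly under the login directory.
ts_upload() {
    mode=$1   # "transcribe" or "diarize"
    file=$2   # local audio file to send
    host=$3   # address of the Docker host
    case $mode in
        transcribe|diarize) ;;
        *) echo "mode must be transcribe or diarize" >&2; return 2 ;;
    esac
    # scp takes the port with an uppercase -P and prompts for the password
    scp -P 22222 "$file" "transcriptionstream@$host:$mode/"
}
```

For example, ts_upload diarize meeting.wav dockerip sends a (hypothetical) meeting.wav to the diarize folder.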

Access the web front end at http://dockerip:5006. Please understand I don't know flask, python, or javascript, but was able to put this
together with our friend chatgpt and many questions. It was a fun exercise that I still have future plans for. This version provides:
- audio file upload/download
- actionable task completion alerts (you can click on the alert and the transcription loads in the player)
- HTML5 web player with speed control and transcription highlighting
- in-transcription time scrubbing/scrolling, synced to the audio via the time slider
- lots of things done incorrectly

You shouldn't run this in production. This is functional example code. Did you see the part about ts-gpu taking a while to build?
It's going to take a while (about 15 minutes for me), but again, the benefit is having the models available for offline use.

You should change the password for the transcriptionstream user in the ts-gpu Dockerfile, and update the secret in the ts-web app.py.
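
If you need replacement values, something like this sketch prints a random hex string. gen_secret is a hypothetical helper name. It only generates a value; pasting it into the Dockerfile and app.py is still a manual step, since where those values live in each file isn't described here.

```shell
# Sketch only: print a random hex string to use as a replacement password or secret.
# gen_secret is a hypothetical helper; it does not edit the Dockerfile or app.py.
gen_secret() {
    bytes=${1:-32}   # number of random bytes; the output has twice as many hex characters
    # Read random bytes from the kernel and print them as lowercase hex on one line
    head -c "$bytes" /dev/urandom | od -An -tx1 | tr -d ' \n'
    echo
}
```

Run gen_secret once for the password and once for the secret, so the two values differ.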

The transcription option was changed from openai's whisper to whisperx, mostly because whisperx was already there for diarize.py and whisper was
breaking the build. As a result, the raw text output for transcriptions doesn't display correctly in the console, and probably not in ts-web either,
but the other generated files are good. The transcription option is also set to use the large-v2 model, which is not downloaded during the build, so
the first transcription will probably be delayed while it downloads. As solutions, a RUN line can be added to the ts-gpu Dockerfile so the model is
included in the image build, or you can modify transcribe_example_d.sh to use the medium model.
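
The RUN line option amounts to loading the model once at build time so the download lands in the image. Below is a rough sketch. It assumes whisperx is importable from python3 inside ts-gpu, and that the model cache written at build time is the same one used at run time; neither is confirmed here. preload_model is a hypothetical helper name.

```shell
# Sketch only: trigger the model download by loading the model once.
# Assumptions: python3 can import whisperx, and the default model cache is the one used at run time.
preload_model() {
    model=${1:-large-v2}
    # Loading on cpu with int8 avoids needing the GPU just to fetch the model files
    python3 -c "import whisperx; whisperx.load_model('$model', 'cpu', compute_type='int8')"
}
```

In the ts-gpu Dockerfile, the same python3 command would go after RUN. Rebuilding will take longer and the image will grow by the size of the model.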