Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/aryanxxvii/lark
Speech Assessment API in FastAPI with Hugging Face 🤗
fastapi huggingface llm machine-learning phoneme-recognition pronunciation speech-recognition wav2vec2
- Host: GitHub
- URL: https://github.com/aryanxxvii/lark
- Owner: aryanxxvii
- Created: 2023-08-03T11:27:54.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-01-17T08:02:42.000Z (20 days ago)
- Last Synced: 2025-01-17T09:19:00.180Z (20 days ago)
- Topics: fastapi, huggingface, llm, machine-learning, phoneme-recognition, pronunciation, speech-recognition, wav2vec2
- Language: JavaScript
- Homepage: https://larkapi.vercel.app/
- Size: 183 KB
- Stars: 10
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Lark API Readme
## What is it?
- Lark API is a speech assessment REST API built using FastAPI in Python.
- It provides accuracy scores, speech-to-text transcription, and a projected IELTS pronunciation band.
- It allows English learning apps and websites to assess users' pronunciation and give real-time feedback.

## How does it work?
### ML:
- Lark utilizes the Wav2Vec2 model from Meta for analyzing the speech sample.
- It converts the speech to its phonetic transcription (S2P) using zero-shot cross-lingual phoneme recognition.
- It then compares the recognized phonemes with the ideal pronunciation of the transcribed speech using the Jaro-Winkler string similarity algorithm.

### Backend:
- The API is written completely in FastAPI with MongoDB as the database.
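
As a rough illustration of the scoring step the backend would run, the sketch below compares a recognized phoneme string against an ideal one using a textbook Jaro-Winkler similarity. It is not Lark's actual code: the function names, the `prefix_scale` and `max_prefix` parameters, and the 0 to 100 scaling in `pronunciation_score` are assumptions made for illustration, and the string similarity library listed in the references may differ in details (for example, applying the prefix bonus only above a threshold).

```python
def jaro(s1: str, s2: str) -> float:
    """Textbook Jaro similarity in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Characters match only if they are equal and within this distance.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0

    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2

    return (
        matches / len1
        + matches / len2
        + (matches - transpositions) / matches
    ) / 3


def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1, max_prefix: int = 4) -> float:
    """Jaro similarity boosted for a shared prefix of up to max_prefix characters."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1 - base)


def pronunciation_score(recognized: str, ideal: str) -> float:
    """Hypothetical helper: similarity scaled to 0-100 (the scaling is an assumption)."""
    return round(100 * jaro_winkler(recognized, ideal), 1)


if __name__ == "__main__":
    # Hypothetical phoneme strings; real input would come from the phoneme recognizer.
    print(pronunciation_score("hɛloʊ", "həloʊ"))
```

Because Jaro-Winkler rewards a shared prefix, two phoneme strings that agree at the start score higher than two that agree only at the end, which is worth keeping in mind when interpreting the score.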
### Frontend:
- The Frontend is written using ReactJS and TailwindCSS.
### ML Models used
- [facebook/wav2vec2-xlsr-53-espeak-cv-ft](https://huggingface.co/facebook/wav2vec2-xlsr-53-espeak-cv-ft)
- [facebook/wav2vec2-base-960h](https://huggingface.co/facebook/wav2vec2-base-960h)

## References
- [[2109.11680] Simple and Effective Zero-shot Cross-lingual Phoneme Recognition (arxiv.org)](https://arxiv.org/abs/2109.11680)
- [luozhouyang/python-string-similarity (github.com)](https://github.com/luozhouyang/python-string-similarity#python-string-similarity)
- [Hugging Face JS libraries](https://huggingface.co/docs/huggingface.js/index)