Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/seanghay/kfa
A fast Khmer Forced Aligner powered by Wav2Vec2CTC and Phonetisaurus
https://github.com/seanghay/kfa
alignment cambodia forced-alignment khmer wav2vec2
Last synced: 30 days ago
JSON representation
A fast Khmer Forced Aligner powered by Wav2Vec2CTC and Phonetisaurus
- Host: GitHub
- URL: https://github.com/seanghay/kfa
- Owner: seanghay
- Created: 2024-02-16T06:37:09.000Z (10 months ago)
- Default Branch: main
- Last Pushed: 2024-05-02T03:27:29.000Z (8 months ago)
- Last Synced: 2024-11-05T17:55:12.562Z (about 2 months ago)
- Topics: alignment, cambodia, forced-alignment, khmer, wav2vec2
- Language: Python
- Homepage: https://pypi.org/project/kfa
- Size: 10.1 MB
- Stars: 4
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- awesome-khmer-language - kfa
README
## KFA
[[Google Colab]](https://colab.research.google.com/drive/1-aRxWOzqqsL7Qbgp95dlwN-_cvI41INf?usp=sharing)
A fast Khmer Forced Aligner powered by **Wav2Vec2CTC** and **Phonetisaurus**.
- [ ] Built-in Speech Enhancement
- [x] Word-level Alignment```shell
pip install kfa
```#### CLI
> [!Note]
> `audio.wav` Input audio sample rate should be in 16kHz. Use ffmpeg or any other tools to resample the audio before processing.
>
> `ffmpeg -i audio_orig.wav -ac 1 -ar 16000 audio.wav````shell
kfa -a audio.wav -t text.txt -o alignments.jsonl# Output as Whisper style JSON format
kfa -a audio.wav -t text.txt --format whisper -o alignments.json
```#### Python
```python
from kfa import align, create_session
import librosawith open("test.txt") as infile:
text = infile.read()y, sr = librosa.load("text.wav", sr=16000, mono=True)
session = create_session()for alignment in align(y, sr, text, session=session):
print(alignment)
```#### References
- [MMS: Scaling Speech Technology to 1000+ languages](https://github.com/facebookresearch/fairseq/tree/main/examples/mms)
- [CTC FORCED ALIGNMENT API TUTORIAL](https://pytorch.org/audio/main/tutorials/ctc_forced_alignment_api_tutorial.html)
- [Phonetisaurus](https://github.com/AdolfVonKleist/Phonetisaurus)
- [Fine-Tune Wav2Vec2 for English ASR with 🤗 Transformers](https://huggingface.co/blog/fine-tune-wav2vec2-english)
- [Thai Wav2vec2 model to ONNX model](https://pythainlp.github.io/tutorials/notebooks/thai_wav2vec2_onnx.html)#### License
`Apache-2.0`