Projects in Awesome Lists tagged with forced-alignment

https://github.com/readbeyond/aeneas

aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)

alignment audio cli dtw espeak espeak-ng festival ffmpeg forced-alignment linux macos nlp python smil speech srt text text-to-speech tts windows

Last synced: 14 May 2025

https://github.com/montrealcorpustools/montreal-forced-aligner

Command line utility for forced alignment using Kaldi

acoustic-model forced-alignment grapheme-to-phone kaldi pronunciation-dictionary python

Last synced: 02 Feb 2026

https://github.com/echogarden-project/echogarden

Cross-platform speech toolset, used from the command-line or as a Node.js library. Includes a variety of engines for speech synthesis, speech recognition, forced alignment, speech translation, voice isolation, language detection and more.

command-line forced-alignment language-detection language-identification node-js source-separation speech speech-alignment speech-recognition speech-synthesis speech-to-text speech-translation text-to-speech voice-isolation

Last synced: 15 May 2025

https://github.com/r4victor/syncabook

📖🎧 A tool for creating ebooks with synchronized text and audio (EPUB3 with Media Overlays)

audiobooks ebooks epub3 forced-alignment librivox

Last synced: 09 Apr 2025

https://github.com/mozilla/dsalign

DeepSpeech based forced alignment tool

deepspeech forced-alignment

Last synced: 17 Mar 2025

https://github.com/saurabhshri/CCAligner

🔮 Word by word audio subtitle synchronisation tool and API. Developed under GSoC 2017 with CCExtractor.

aligner api ccextractor cli closed-captions cpp forced-alignment google-summer-of-code gsoc gsoc-2017 karaoke phonetic-transcriptions pocketsphinx speech-recognition subtitle-alignment subtitles transcription word-level-alignment

Last synced: 21 Apr 2025

https://github.com/tabahi/bournemouth-forced-aligner

Extract phoneme-level timestamps from speech audio.

alignment forced-alignment phonemes speech speech-processing speech-recognition text-to-speech timestamps tts tts-dataset word

Last synced: 24 Feb 2026

https://github.com/feldberlin/timething

Timething is a library for aligning text transcripts with their audio recordings.

alignment audio cli forced-alignment huggingface nlp python speech speech-recognition tts

Last synced: 17 Mar 2025

https://github.com/r4victor/afaligner

📈 A forced aligner intended for synchronization of narrated text

forced-alignment

Last synced: 30 Apr 2025

https://github.com/bunyaminergen/callytics

Callytics is an advanced call analytics solution that uses speech recognition and large language model (LLM) technologies to analyze phone conversations from customer service and call centers.

denoising diarization forced-alignment llama3 llm openai opensource sentiment-analysis speech-emotion-recognition speech-processing speech-recognition speech-to-text summary topic-modeling transcription voice-activity-detection voice-recognition

Last synced: 03 Apr 2025

https://github.com/mahtafetrat/manatts-persian-speech-dataset

ManaTTS is the largest open Persian speech dataset with 100+ hours of transcribed audio. Includes data collection pipeline and tools. Suitable for Persian text-to-speech models.

data-collection data-preprocessing dataset-preparation forced-alignment mana-tts persian persian-speech speech-corpus speech-data-collection speech-dataset speech-processing speech-synthesis text-to-speech text-to-speech-dataset tts tts-dataset

Last synced: 08 Apr 2025

https://github.com/proger/uk

Phonograms and syntagmas: processing tools

dataset-generation forced-alignment hmm kaldi speech-recognition ukrainian ukrainian-language

Last synced: 20 Jan 2026

https://github.com/mahmoudashraf97/ctc-forced-aligner

Text to speech alignment using CTC forced alignment

forced-alignment

Last synced: 05 Apr 2025

https://github.com/MahtaFetrat/ManaTTS-Persian-Speech-Dataset

ManaTTS is the largest open Persian speech dataset with 86+ hours of transcribed audio. Includes data collection pipeline and tools. Suitable for Persian text-to-speech models.

data-collection data-preprocessing dataset-preparation forced-alignment mana-tts persian persian-speech speech-corpus speech-data-collection speech-dataset speech-processing speech-synthesis text-to-speech text-to-speech-dataset tts tts-dataset

Last synced: 01 Mar 2025

https://github.com/dcavar/elan2split

Split ELAN Annotation Files and corresponding speech files into a corpus format for common ASR and Forced Aligners

cpp11 elan forced-alignment sox speech-corpus speech-recognition xerxes xml

Last synced: 13 Apr 2025

https://github.com/samuelbradshaw/text-to-timestamps

Python and command-line utility for aligning audio to a transcript.

batch-processing captioning command-line forced-alignment karaoke machine-learning mlx mps python speech-recognition speech-to-text subtitles transcription webvtt

Last synced: 06 Mar 2026

https://github.com/dbklim/webrtcvad_wrapper

A simple Python wrapper to simplify working with WebRTC VAD and its rougher analogue based on RMS and ZCR (useful for processing audio recordings before using them with neural networks).

audio audio-processing dsp forced-alignment python silence-suppression vad vad-detection voice-activity-detection webrtc webrtc-tools webrtc-vad webrtcvad-wrapper

Last synced: 19 Jul 2025

https://github.com/mahtafetrat/gptinformal-persian-speech-dataset

A free licensed Persian TTS dataset including 6+ hours of audio-text pairs with subject

data-collection data-preprocessing dataset-preparation forced-alignment mana-tts manatts persian persian-speech speech-corpus speech-data-collection speech-dataset speech-processing speech-synthesis text-to-speech text-to-speech-dataset tts tts-dataset

Last synced: 19 Jan 2026

https://github.com/mahtafetrat/mana-forced-aligner

A robust forced alignment tool for low-resource languages using multiple ASR models and CER-based matching. Built for noisy data and imperfect transcripts.

asr forced-alignment low-resource-languages mana-tts manatts open-source speech-alignment speech-dataset tts

Last synced: 19 Jun 2025

https://github.com/seanghay/kfa

A fast Khmer Forced Aligner powered by Wav2Vec2CTC and Phonetisaurus

alignment cambodia forced-alignment khmer wav2vec2

Last synced: 13 Jul 2025

https://github.com/wxjiao/bert-text-features

BERT-Text-Features for Tokenized Transcripts from P2FA.

bert-embeddings forced-alignment p2fa text-features

Last synced: 22 Jul 2025

https://github.com/mahtafetrat/virgoolinformal-speech-dataset

A dataset of informal Persian audio and text chunks, along with a fully open processing pipeline, suitable for ASR and TTS tasks. Created from crawled content on virgool.io.

asr asr-evaluation forced-alignment persian persian-speech-corpus persian-speech-dataset persian-speech-recognition persian-text-to-speech speech-data-collection speech-dataset speech-processing tts

Last synced: 15 Apr 2025

https://github.com/krmanik/tiny-aligner

Accurate, fast, tiny. A CTC-based acoustic model (930K parameters) for Chinese + English audio forced alignment, running in browser.

forced-alignment grapheme-to-phone pronunciation-dictionary python

Last synced: 23 Jun 2026

https://github.com/morehardy/echoalign-asr-mlx

Local Apple Silicon CLI for ASR, subtitles, WebVTT/SRT, and timestamp-aligned JSON with MLX + Qwen3

apple-silicon asr automatic-speech-recognition cli forced-alignment local-ai mlx python qwen3 speech-recognition srt subtitles webvtt

Last synced: 25 Jun 2026
