Projects in Awesome Lists tagged with speech-processing

https://github.com/speechbrain/speechbrain

A PyTorch-based Speech Toolkit

asr audio audio-processing deep-learning huggingface language-model pytorch speaker-diarization speaker-recognition speaker-verification speech-enhancement speech-processing speech-recognition speech-separation speech-to-text speech-toolkit speechrecognition spoken-language-understanding transformers voice-recognition

Last synced: 13 Jan 2025

https://github.com/pyannote/pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

overlapped-speech-detection pretrained-models pytorch speaker-change-detection speaker-diarization speaker-embedding speaker-recognition speaker-verification speech-activity-detection speech-processing voice-activity-detection

Last synced: 14 Jan 2025

https://github.com/snakers4/silero-vad

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

onnx onnx-runtime onnxruntime pytorch speech speech-processing vad voice-activity-detection voice-commands voice-control voice-detection voice-recognition

Last synced: 16 Jan 2025

https://github.com/microsoft/torchscale

Foundation Architecture for (M)LLMs

computer-vision machine-learning multimodal natural-language-processing pretrained-language-model speech-processing transformer translation

Last synced: 15 Jan 2025

https://github.com/r9y9/wavenet_vocoder

WaveNet vocoder

neural-vocoder python pytorch speech speech-processing speech-synthesis wavenet wavenet-vocoder

Last synced: 17 Jan 2025

https://github.com/linto-ai/whisper-timestamped

Multilingual Automatic Speech Recognition with word-level timestamps and confidence

asr attention-is-all-you-need attention-mechanism attention-model attention-network attention-seq2seq attention-visualization deep-learning machine-learning multilingual-models python python3 pytorch speaker-diarization speech speech-processing speech-recognition speech-to-text transformers whisper

Last synced: 14 Jan 2025

https://github.com/r9y9/deepvoice3_pytorch

PyTorch implementation of convolutional neural networks-based text-to-speech synthesis models

end-to-end machine-learning multi-speaker python pytorch speech-processing speech-synthesis tts

Last synced: 17 Jan 2025

https://github.com/resemble-ai/resemble-enhance

AI powered speech denoising and enhancement

denoise speech-denoising speech-enhancement speech-processing

Last synced: 15 Jan 2025

https://github.com/coqui-ai/open-speech-corpora

💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies

speech-emotion-recognition speech-processing speech-recognition speech-separation speech-synthesis speech-to-text stt text-to-speech tts voice-activity-detection voice-cloning voice-recognition

Last synced: 03 Dec 2024

https://github.com/mravanelli/sincnet

SincNet is a neural architecture for efficiently processing raw audio samples.

artificial-intelligence asr audio audio-processing cnn convolutional-neural-networks deep-learning digital-signal-processing filtering neural-networks python pytorch signal-processing speaker-identification speaker-recognition speaker-verification speech-processing speech-recognition timit waveform

Last synced: 19 Jan 2025

https://github.com/midas-research/audino

Open source audio annotation tool for humans

annotation-tool audio-annotation audio-processing datasets machine-learning python speech-processing

Last synced: 17 Jan 2025

https://github.com/haoheliu/voicefixer

General Speech Restoration

declipping denoise dereverberation mel speech speech-analysis speech-enhancement speech-processing speech-synthesis super-resolution tts vocoder

Last synced: 14 Jan 2025

https://github.com/ictnlp/streamspeech

StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.

all-in-one asr audio-processing machine-translation non-autoregressive seamless simultaneous-translation speech speech-enhancement speech-processing speech-recognition speech-synthesis speech-to-text speech-translation streaming-audio text-to-audio text-to-speech translation tts voice

Last synced: 17 Jan 2025

https://github.com/Ryuk17/SpeechAlgorithms

You can find the speech algorithms you want here

speech-processing

Last synced: 01 Nov 2024

https://github.com/drethage/speech-denoising-wavenet

A neural network for end-to-end speech denoising

deep-learning end-to-end machine-learning neural-networks speech speech-denoising speech-processing wavenet

Last synced: 22 Nov 2024

https://github.com/x-lance/slam-llm

Speech, Language, Audio, Music Processing with Large Language Model

audio-processing large-language-model multimodal-large-language-models music-processing peft speech-processing

Last synced: 18 Jan 2025

https://github.com/huawei-noah/speech-backbones

This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab.

speech-processing speech-recognition speech-synthesis

Last synced: 18 Jan 2025

https://github.com/Audio-WestlakeU/FullSubNet

PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."

audio band denoising full-band narrow-band noise-reduction paper pretrained-model pytorch reproducible-research single-channel speech speech-enhancement speech-processing speech-separation sub-band

Last synced: 22 Nov 2024

https://github.com/nyrahealth/crisperwhisper

Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection

asr audio detection filler recognition speech speech-processing speech-recognition timestamps transcription verbatim whisper

Last synced: 17 Jan 2025

https://github.com/ddlbojack/speech-resources

Speech-related labs, companies, resources, internships, and more; recommendations and self-nominations welcome

speech speech-processing

Last synced: 21 Nov 2024

https://github.com/pliang279/multibench

[NeurIPS 2021] Multiscale Benchmarks for Multimodal Representation Learning

computer-vision deep-learning healthcare machine-learning multimodal-learning natural-language-processing representation-learning robotics speech-processing

Last synced: 12 Jan 2025

https://github.com/arjo129/uspeech

Speech recognition toolkit for the Arduino

arduino signal speech-processing speech-recognition

Last synced: 24 Nov 2024

https://github.com/microsoft/unispeech

UniSpeech - Large Scale Self-Supervised Learning for Speech

diarization pytorch speaker-verification speech speech-diarization speech-processing speech-recognition speech-separation

Last synced: 19 Jan 2025

https://github.com/r9y9/pysptk

A Python wrapper for the Speech Signal Processing Toolkit (SPTK).

digital-signal-processing dsp python python-wrapper speech speech-processing speech-synthesis sptk

Last synced: 20 Jan 2025

https://github.com/santi-pdp/pase

Problem Agnostic Speech Encoder

deep-learning multi-task-learning pytorch self-supervised-learning speech-processing unsupervised-learning waveform-analysis

Last synced: 25 Oct 2024

https://github.com/SuperKogito/spafe

spafe: Simplified Python Audio Features Extraction

audio audio-analysis beat dsp features-extraction filterbank frequencies frequency frequency-analysis gammatone-filterbanks mfcc music music-information-retrieval pitch python signal-processing sound speech-processing time-frequency-analysis voice

Last synced: 22 Nov 2024

https://github.com/gemengtju/Tutorial_Separation

This repo summarizes tutorials, datasets, papers, code, and tools for speech separation and speaker extraction tasks. Pull requests are welcome.

deep-learning deep-neural-networks signal-processing speech-analysis speech-processing speech-separation

Last synced: 02 Nov 2024

https://github.com/novoic/surfboard

Novoic's audio feature extraction library

alzheimers-disease audio audio-processing feature-extraction healthcare machine-learning parkinsons-disease python signal-processing speech-processing

Last synced: 04 Nov 2024

https://github.com/r9y9/nnmnkwii

Library to build speech synthesis systems designed for easy and fast prototyping.

machine-learning python speech-processing speech-synthesis text-to-speech voice-conversion

Last synced: 20 Jan 2025

https://github.com/speechbrain/speechbrain.github.io

The SpeechBrain project aims to build a novel speech toolkit fully based on PyTorch. With SpeechBrain users can easily create speech processing systems, ranging from speech recognition (both HMM/DNN and end-to-end), speaker recognition, speech enhancement, speech separation, multi-microphone speech processing, and many others.

beamforming deep-learning deeplearning librispeech neural-network neural-networks speaker-identification speaker-recognition speaker-verification speech speech-analysis speech-api speech-emotion-recognition speech-processing speech-recognition speech-recognizer speech-separation speech-to-text speechrecognition timit

Last synced: 13 Nov 2024

https://github.com/seanwood/gcc-nmf

Real-time GCC-NMF Blind Speech Separation and Enhancement

cross-correlation dictionary-learning gcc gcc-nmf generalized-cross-correlation ipython-notebook low-latency machine-learning nmf real-time real-time-processing speaker speech speech-enhancement speech-processing speech-separation tdoa unsupervised-machine-learning

Last synced: 15 Jan 2025

https://github.com/haoxiangsnr/A-Convolutional-Recurrent-Neural-Network-for-Real-Time-Speech-Enhancement

A minimal unofficial PyTorch implementation of "A Convolutional Recurrent Neural Network for Real-Time Speech Enhancement" (CRN)

cnn cnn-rnn pytorch real-time rnn speech-enhancement speech-processing

Last synced: 22 Nov 2024

https://github.com/nvidia/cleanunet

Official PyTorch Implementation of CleanUNet (ICASSP 2022)

noise-reduction speech-denoising speech-enchacement speech-processing

Last synced: 14 Jan 2025

https://github.com/Yuan-ManX/audio-development-tools

This is a list of sound, audio and music development tools which contains machine learning, audio generation, audio signal processing, sound synthesis, spatial audio, music information retrieval, music generation, speech recognition, speech synthesis, singing voice synthesis and more.

artificial-intelligence audio audio-generation audio-processing deep-learning dsp machine-learning music music-generation signal-processing speech speech-processing speech-synthesis

Last synced: 27 Oct 2024

https://github.com/haoheliu/voicefixer_main

General Speech Restoration

machine-learning speech speech-analysis speech-enhancement speech-processing speech-synthesis speech-to-text tts

Last synced: 14 Jan 2025

https://github.com/gtreshchev/runtimespeechrecognizer

Cross-platform, real-time, offline speech recognition plugin for Unreal Engine. Based on OpenAI's Whisper, via whisper.cpp.

audio-processing openai speech-detection speech-processing speech-recognition speech-to-text ue4 ue4-plugin ue5 ue5-plugin unreal-engine unreal-engine-4 unreal-engine-5 voice-recognition whis whisper whisper-ai whisper-cpp

Last synced: 15 Jan 2025

https://github.com/r9y9/ttslearn

ttslearn: Library for Pythonで学ぶ音声合成 (Text-to-speech with Python)

attention-mechanism book deep-learning digital-signal-processing dnn neural-networks python python-tts seq2seq speech speech-processing speech-synthesis text-to-speech tts wavenet wavenet-vocoder

Last synced: 20 Jan 2025

https://github.com/AkojimaSLP/Beamforming-for-speech-enhancement

Simple delay-and-sum, MVDR, and CGMM-MVDR beamforming

beamforming cgmm-mvdr delay-sum mvdr python signal-processing speech-enhancement speech-processing speech-recognition

Last synced: 02 Nov 2024

https://github.com/jtkim-kaist/Speech-enhancement

Deep neural network based speech enhancement toolkit

speech-enhancement speech-processing

Last synced: 02 Nov 2024

https://github.com/tomchang25/whisper-auto-transcribe

Auto transcribe tool based on whisper

asr deep-learning gradio gradio-interface language-model pytorch speech-processing speech-recognition speech-to-text text-to-speech video-captioning voice-activity-detection

Last synced: 20 Nov 2024

https://github.com/innfactory/react-native-dialogflow

A React-Native Bridge for the Google Dialogflow (API.AI) SDK

api-ai apiai dialogflow google react-native speak speech speech-processing speech-to-function text-recognition voice

Last synced: 15 Jan 2025

https://github.com/jefflai108/pytorch-kaldi-neural-speaker-embeddings

A lightweight neural speaker embedding extractor based on Kaldi and PyTorch.

kaldi learnable-dictionary-encoding pytorch speaker-identification speaker-recognition speaker-verification speech-processing

Last synced: 27 Nov 2024

https://github.com/albertaparicio/tfg-voice-conversion

Deep Learning-based Voice Conversion system

deep-learning deep-neural-networks gplv3 keras numpy python speaker speech speech-processing tensorflow voice-conversion

Last synced: 27 Oct 2024

https://github.com/ahkarami/great-deep-learning-books

A Great Collection of Deep Learning (e)Books

books convolutional-neural-networks deep-learning deep-neural-networks ebooks keras machine-learning mxnet natural-language-processing pytorch recurrent-neural-networks reinforcement-learning speech-processing tensorflow

Last synced: 20 Jan 2025

https://github.com/haoheliu/torchsubband

PyTorch implementation of subband decomposition

deep-learning music-source-separation signal-processing speech-enhancement speech-processing speech-recognition

Last synced: 20 Jan 2025

https://github.com/r9y9/sptk

A modified version of Speech Signal Processing Toolkit (SPTK)

speech-processing

Last synced: 03 Dec 2024

https://github.com/vocalpy/vak

A neural network framework for researchers studying acoustic communication

animal-communication animal-vocalizations bioacoustic-analysis bioacoustics birdsong python python3 pytorch spectrograms speech-processing torch torchvision vocalizations

Last synced: 17 Jan 2025

https://github.com/ga642381/SpeechGen

"SpeechGen: Unlocking the Generative Power of Speech Language Models with Prompts"

deep-learning large-language-models prompt speech-generation speech-llm speech-processing

Last synced: 28 Nov 2024

https://github.com/mwv/vad

Voice Activity Detector

python speech-processing

Last synced: 14 Nov 2024

https://github.com/grausof/keras-sincnet

Keras (tensorflow) implementation of SincNet (Mirco Ravanelli, Yoshua Bengio - https://github.com/mravanelli/SincNet)

artificial-intelligence asr audio audio-processing cnn convolutional-neural-networks deep-learning digital-signal-processing filtering keras machine-learning neural-network speaker-recognition speaker-verification speech-processing speech-recognition tensorflow timit waveform

Last synced: 10 Dec 2024

https://github.com/SIP-Lab/CNN-VAD

A Convolutional Neural Network based Voice Activity Detector for Smartphones

deep-learning deep-neural-networks digital-signal-processing smartphone speech-processing

Last synced: 14 Nov 2024

https://github.com/inevolin/discordearsbot

A speech-to-text framework and bot for Discord. Take control of your Discord server using speech and voice commands. Can also be useful for hearing impaired and deaf people.

discord discord-bot discord-js hearing-aids hearing-impaired speech speech-processing speech-recognition speech-synthesis speech-to-text stt

Last synced: 13 Jan 2025

https://github.com/wq2012/simpleder

A lightweight library to compute Diarization Error Rate (DER).

diarization machine-learning metrics speaker-diarization speech-processing speech-recognition

Last synced: 07 Nov 2024

https://github.com/fulldecent/formant-analyzer

iOS application for finding formants in spoken sounds

app application ios language language-learning mature speech-processing speech-recognition speech-therapy swift

Last synced: 16 Jan 2025

https://github.com/markparker5/stark

S.T.A.R.K. - Speech And Text Algorithmic Recognition Kit

cross-platform framework natural-language natural-language-processing natural-language-understanding python python3 speech-processing speech-recognition voice voice-assistant voice-commands voice-control voice-interface voice-recognition

Last synced: 11 Nov 2024

https://github.com/clement-pages/gryannote

Provide Gradio custom components to make the diarization-based audio labeling process easier and faster.

annotation-processing annotation-tool audio gradio gradio-custom-component interspeech2024 pyannote speaker-diarization speech-processing

Last synced: 17 Jan 2025

https://github.com/spokestack/spokestack-ios

Spokestack: give your iOS app a voice interface!

asr hacktoberfest ios natural-language-understanding speech-api speech-processing speech-recognition speech-synthesis speech-to-text swift tensorflow text-to-speech vad voice-activity-detection voice-assistant voice-recognition voice-synthesis wakeword wakeword-activation

Last synced: 28 Sep 2024

https://github.com/vectominist/spin

Official code for Interspeech 2023 paper "Self-supervised Fine-tuning for Improved Content Representations by Speaker-invariant Clustering"

clustering disentanglement self-supervised-learning speech-processing speech-recognition

Last synced: 02 Dec 2024

https://github.com/montrealcorpustools/polyglotdb

Language data store and linguistic query API

acoustics database influxdb neo4j rest-api speech-analysis speech-processing

Last synced: 20 Jan 2025

https://github.com/declare-lab/speech-adapters

Code and datasets for our ICASSP 2023 paper, "Evaluating parameter-efficient transfer learning approaches on SURE benchmark for speech understanding"

adapter asr speech-processing speech-recognition speech-synthesis speech-to-text tts

Last synced: 08 Nov 2024

https://github.com/ardauzunoglu/rte-speech-generator

Natural Language Processing to generate new speeches for the President of Turkey.

natural-language-processing nlp politics python speech-processing tensorflow turkce turkish turkish-nlp

Last synced: 12 Nov 2024

https://github.com/k2kobayashi/shifter

Pitch shifter using WSOLA and resampling, implemented in Python 3

signal-processing speech speech-processing voice-control voice-conversion

Last synced: 03 Dec 2024

https://github.com/aydinnyunus/linuxvoiceassistant

Linux Voice Assistant to Make Your Work Easier

assistant assistant-chat-bots google google-assistant google-assistant-apps google-assistant-desktop python python3 speech-processing speech-recognition speech-to-text tkinter tkinter-graphic-interface tkinter-gui tkinter-python voice voice-assistant voice-commands voice-control voice-conversion

Last synced: 11 Nov 2024

https://github.com/bunyaminergen/callytics

Callytics is an advanced call analytics solution that leverages speech recognition and large language models (LLMs) technologies to analyze phone conversations from customer service and call centers.

denoising diarization forced-alignment llama3 llm openai opensource sentiment-analysis speech-emotion-recognition speech-processing speech-recognition speech-to-text summary topic-modeling transcription voice-activity-detection voice-recognition

Last synced: 08 Jan 2025

https://github.com/navalnica/be_nlp_speech_resources

Links to Belarusian NLP and Speech resources

asr belarus belarusian belarusian-language natural-language-processing nlp speech speech-processing speech-recognition speech-synthesis speech-to-text stt text-to-speech tts

Last synced: 12 Jan 2025

https://github.com/mycrazycracy/tf-kaldi-speaker

Neural speaker recognition/verification system based on Kaldi and TensorFlow

kaldi kaldi-asr machine-learning neural-network speaker-identification speaker-recognition speaker-verification speech-processing tensorflow

Last synced: 13 Nov 2024

https://github.com/ryota-komatsu/speaker_disentangled_hubert

Official repository of the IEEE SLT 2024 paper "Self-Supervised Syllable Discovery Based on Speaker-Disentangled HuBERT"

self-supervised-learning speech speech-processing

Last synced: 14 Jan 2025

https://github.com/r9y9/world.jl

A lightweight Julia wrapper for WORLD, a high-quality speech analysis, modification, and synthesis system

julia julia-wrapper speech-processing

Last synced: 03 Dec 2024

https://github.com/tabahi/formantanalyzer.js

Extract formant features such as frequency, power, energy, and bandwidth of formants at syllable or word level from audio sources in a web browser using WebAudio API.

audio-analysis audio-processing feature feature-engineering feature-extraction formant formant-detection music music-visualizer signal-processing spectrum-analyzer speech-processing

Last synced: 20 Dec 2024

https://github.com/bhattbhavesh91/wav2vec2-huggingface-demo

Speech-to-text with self-supervised learning, based on the wav2vec 2.0 framework and using Hugging Face Transformers

facebook-wav2vec self-supervised-learning speech speech-processing speech-recognition speech-to-text unsupervised-learning wav2vec

Last synced: 16 Nov 2024

https://github.com/liamdugan/speech-to-speech

Code for the INTERSPEECH 2023 paper "Learning When to Speak: Latency and Quality Trade-offs for Simultaneous Speech-to-Speech Translation with Offline Models"

simultaneous-translation speech speech-processing speech-to-speech speech-translation

Last synced: 27 Oct 2024

https://github.com/farzadforuozanfar/speech-recognition

I recorded 10 utterances of the same words in my own voice and compared them with 10 utterances from another person. I was able to find a threshold level that recognizes my own voice.

distance dtw dtw-algorithm jupyter-notebook python3 speech-processing speech-recognition speech-to-text

Last synced: 25 Nov 2024

https://github.com/gogyzzz/beamformit_matlab

A MATLAB implementation of CHiME4 baseline Beamformit

beamforming beamformit beamformit-step matlab speech-enhancement speech-processing speech-recognition

Last synced: 02 Nov 2024

https://github.com/ringabout/scim

[WIP] Speech recognition toolbox written in Nim. Based on Arraymancer.

arraymancer audio digital-signal-processing mfcc nim scientific-computing speech-analysis speech-processing speech-recognition wav

Last synced: 25 Nov 2024

https://github.com/r9y9/melgeneralizedcepstrums.jl

Mel-Generalized Cepstrum analysis

julia speech-processing

Last synced: 03 Dec 2024

https://github.com/tabahi/webspeechanalyzer

JS speech analyzer for fast speech analysis and labeling

audio-analysis audio-processing feature feature-engineering feature-extraction formant-detection music music-information-retrieval music-visualizer phonemes signal-processing spectrum spectrum-analyzer speech speech-analysis speech-processing speech-recognition

Last synced: 14 Nov 2024

https://github.com/shunsukeaihara/pyssp

Python speech signal processing library

python2 python3 signal-processing speech-processing

Last synced: 07 Nov 2024

https://github.com/inevolin/discordspeechbot

A speech-to-text bot for Discord with music commands and more, built with Node.js. Ideal for controlling your Discord server using voice commands; it can also be useful for hearing-impaired people.

discord discord-bot discord-js music music-player speech speech-processing speech-recognition speech-to-text stt

Last synced: 12 Nov 2024

https://github.com/shaqayeql/Notq

persian-speech-processing persian-speech-recognition persian-speech-to-text speech-processing speech-recognition speech-to-text

Last synced: 20 Nov 2024

https://github.com/alecokas/bilatticernn-confidence

Confidence Estimation for Black Box Automatic Speech Recognition Systems Using Lattice Recurrent Neural Networks https://arxiv.org/abs/1910.11933 or https://ieeexplore.ieee.org/document/9053264

asr attention confidence-estimates confidence-estimation confidence-scores confusion-networks lattice latticernn lattices low-resource-languages lstm pytorch pytorch-implementation speech-processing speech-recognition

Last synced: 13 Oct 2024

https://github.com/wq2012/vb_diarization

VB Diarization with Eigenvoice and HMM Priors, refactored

machine-learning speaker-diarization speech-processing speech-recognition

Last synced: 14 Oct 2024

https://github.com/pprablanc/ppsrt

A Python algorithm to change the pitch of the voice in real time

lpc pitch pitch-shift python python-algorithm real-time signal-processing speech-processing voice

Last synced: 10 Nov 2024

https://github.com/r9y9/sptk.jl

A thin Julia wrapper for Speech Signal Processing Toolkit (SPTK) API

julia julia-wrapper speech-processing

Last synced: 03 Dec 2024

https://github.com/viig99/esolafast

Fast C++ implementation of ESOLA using KFRLib, can be used for online time-stretch augmentation during SpeechToText training.

asr esola kfr pybind11 python-bindings speech speech-augmentation speech-processing speech-recognition speech-to-text time-stretch

Last synced: 11 Nov 2024

https://github.com/onolab-tmu/libss

A Python library for blind source separation.

audio-processing blind-source-separation independent-component-analysis independent-vector-analysis speech-processing

Last synced: 30 Nov 2024

https://github.com/buaadreamer/slpkiller

Learning Speech and Language Processing

ai language-processing nlp speech-processing

Last synced: 05 Jan 2025

https://github.com/wvangansbeke/speech-subband-coding

Subband filtering with ADPCM

adpcm c filterbank quantization speech-processing

Last synced: 07 Nov 2024

https://github.com/mahtafetrat/manatts-persian-speech-dataset

ManaTTS is the largest open Persian speech dataset with 86+ hours of transcribed audio. Includes data collection pipeline and tools. Suitable for Persian text-to-speech models.

data-collection data-preprocessing dataset-preparation forced-alignment mana-tts persian persian-speech speech-corpus speech-data-collection speech-dataset speech-processing speech-synthesis text-to-speech text-to-speech-dataset tts tts-dataset

Last synced: 06 Nov 2024

https://github.com/mastashake08/speech-kit

Simplifies the Speech Synthesis and Speech Recognition engines for JavaScript. Listen for commands and perform callback actions, make the browser speak, and transcribe your speech!

grammar grammar-rules speech speech-processing speech-recognition speech-synthesis speech-to-text