Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Projects in Awesome Lists tagged with speech-processing
A curated list of projects in awesome lists tagged with speech-processing.
https://github.com/speechbrain/speechbrain
A PyTorch-based Speech Toolkit
asr audio audio-processing deep-learning huggingface language-model pytorch speaker-diarization speaker-recognition speaker-verification speech-enhancement speech-processing speech-recognition speech-separation speech-to-text speech-toolkit speechrecognition spoken-language-understanding transformers voice-recognition
Last synced: 13 Jan 2025
https://github.com/pyannote/pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
overlapped-speech-detection pretrained-models pytorch speaker-change-detection speaker-diarization speaker-embedding speaker-recognition speaker-verification speech-activity-detection speech-processing voice-activity-detection
Last synced: 14 Jan 2025
https://github.com/snakers4/silero-vad
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
onnx onnx-runtime onnxruntime pytorch speech speech-processing vad voice-activity-detection voice-commands voice-control voice-detection voice-recognition
Last synced: 16 Jan 2025
https://github.com/microsoft/torchscale
Foundation Architecture for (M)LLMs
computer-vision machine-learning multimodal natural-language-processing pretrained-language-model speech-processing transformer translation
Last synced: 15 Jan 2025
https://github.com/r9y9/wavenet_vocoder
WaveNet vocoder
neural-vocoder python pytorch speech speech-processing speech-synthesis wavenet wavenet-vocoder
Last synced: 17 Jan 2025
https://github.com/linto-ai/whisper-timestamped
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
asr attention-is-all-you-need attention-mechanism attention-model attention-network attention-seq2seq attention-visualization deep-learning machine-learning multilingual-models python python3 pytorch speaker-diarization speech speech-processing speech-recognition speech-to-text transformers whisper
Last synced: 14 Jan 2025
https://github.com/r9y9/deepvoice3_pytorch
PyTorch implementation of convolutional neural networks-based text-to-speech synthesis models
end-to-end machine-learning multi-speaker python pytorch speech-processing speech-synthesis tts
Last synced: 17 Jan 2025
https://github.com/resemble-ai/resemble-enhance
AI powered speech denoising and enhancement
denoise speech-denoising speech-enhancement speech-processing
Last synced: 15 Jan 2025
https://github.com/coqui-ai/open-speech-corpora
💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies
speech-emotion-recognition speech-processing speech-recognition speech-separation speech-synthesis speech-to-text stt text-to-speech tts voice-activity-detection voice-cloning voice-recognition
Last synced: 03 Dec 2024
https://github.com/mravanelli/sincnet
SincNet is a neural architecture for efficiently processing raw audio samples.
artificial-intelligence asr audio audio-processing cnn convolutional-neural-networks deep-learning digital-signal-processing filtering neural-networks python pytorch signal-processing speaker-identification speaker-recognition speaker-verification speech-processing speech-recognition timit waveform
Last synced: 19 Jan 2025
https://github.com/midas-research/audino
Open source audio annotation tool for humans
annotation-tool audio-annotation audio-processing datasets machine-learning python speech-processing
Last synced: 17 Jan 2025
https://github.com/haoheliu/voicefixer
General Speech Restoration
declipping denoise dereverberation mel speech speech-analysis speech-enhancement speech-processing speech-synthesis super-resolution tts vocoder
Last synced: 14 Jan 2025
https://github.com/ictnlp/streamspeech
StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.
all-in-one asr audio-processing machine-translation non-autoregressive seamless simultaneous-translation speech speech-enhancement speech-processing speech-recognition speech-synthesis speech-to-text speech-translation streaming-audio text-to-audio text-to-speech translation tts voice
Last synced: 17 Jan 2025
https://github.com/Ryuk17/SpeechAlgorithms
You can find the speech algorithms you want here
Last synced: 01 Nov 2024
https://github.com/drethage/speech-denoising-wavenet
A neural network for end-to-end speech denoising
deep-learning end-to-end machine-learning neural-networks speech speech-denoising speech-processing wavenet
Last synced: 22 Nov 2024
https://github.com/x-lance/slam-llm
Speech, Language, Audio, Music Processing with Large Language Model
audio-processing large-language-model multimodal-large-language-models music-processing peft speech-processing
Last synced: 18 Jan 2025
https://github.com/huawei-noah/speech-backbones
This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab.
speech-processing speech-recognition speech-synthesis
Last synced: 18 Jan 2025
https://github.com/Audio-WestlakeU/FullSubNet
PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."
audio band denoising full-band narrow-band noise-reduction paper pretrained-model pytorch reproducible-research single-channel speech speech-enhancement speech-processing speech-separation sub-band
Last synced: 22 Nov 2024
https://github.com/nyrahealth/crisperwhisper
Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection
asr audio detection filler recognition speech speech-processing speech-recognition timestamps transcription verbatim whisper
Last synced: 17 Jan 2025
https://github.com/pliang279/multibench
[NeurIPS 2021] Multiscale Benchmarks for Multimodal Representation Learning
computer-vision deep-learning healthcare machine-learning multimodal-learning natural-language-processing representation-learning robotics speech-processing
Last synced: 12 Jan 2025
https://github.com/arjo129/uspeech
Speech recognition toolkit for the Arduino
arduino signal speech-processing speech-recognition
Last synced: 24 Nov 2024
https://github.com/microsoft/unispeech
UniSpeech - Large Scale Self-Supervised Learning for Speech
diarization pytorch speaker-verification speech speech-diarization speech-processing speech-recognition speech-separation
Last synced: 19 Jan 2025
https://github.com/r9y9/pysptk
A Python wrapper for Speech Signal Processing Toolkit (SPTK).
digital-signal-processing dsp python python-wrapper speech speech-processing speech-synthesis sptk
Last synced: 20 Jan 2025
https://github.com/santi-pdp/pase
Problem Agnostic Speech Encoder
deep-learning multi-task-learning pytorch self-supervised-learning speech-processing unsupervised-learning waveform-analysis
Last synced: 25 Oct 2024
https://github.com/SuperKogito/spafe
spafe: Simplified Python Audio Features Extraction
audio audio-analysis beat dsp features-extraction filterbank frequencies frequency frequency-analysis gammatone-filterbanks mfcc music music-information-retrieval pitch python signal-processing sound speech-processing time-frequency-analysis voice
Last synced: 22 Nov 2024
https://github.com/gemengtju/Tutorial_Separation
This repo summarizes tutorials, datasets, papers, code, and tools for speech separation and speaker extraction tasks. Pull requests are welcome.
deep-learning deep-neural-networks signal-processing speech-analysis speech-processing speech-separation
Last synced: 02 Nov 2024
https://github.com/novoic/surfboard
Novoic's audio feature extraction library
alzheimers-disease audio audio-processing feature-extraction healthcare machine-learning parkinsons-disease python signal-processing speech-processing
Last synced: 04 Nov 2024
https://github.com/r9y9/nnmnkwii
Library to build speech synthesis systems designed for easy and fast prototyping.
machine-learning python speech-processing speech-synthesis text-to-speech voice-conversion
Last synced: 20 Jan 2025
https://github.com/speechbrain/speechbrain.github.io
The SpeechBrain project aims to build a novel speech toolkit fully based on PyTorch. With SpeechBrain users can easily create speech processing systems, ranging from speech recognition (both HMM/DNN and end-to-end), speaker recognition, speech enhancement, speech separation, multi-microphone speech processing, and many others.
beamforming deep-learning deeplearning librispeech neural-network neural-networks speaker-identification speaker-recognition speaker-verification speech speech-analysis speech-api speech-emotion-recognition speech-processing speech-recognition speech-recognizer speech-separation speech-to-text speechrecognition timit
Last synced: 13 Nov 2024
https://github.com/seanwood/gcc-nmf
Real-time GCC-NMF Blind Speech Separation and Enhancement
cross-correlation dictionary-learning gcc gcc-nmf generalized-cross-correlation ipython-notebook low-latency machine-learning nmf real-time real-time-processing speaker speech speech-enhancement speech-processing speech-separation tdoa unsupervised-machine-learning
Last synced: 15 Jan 2025
https://github.com/haoxiangsnr/A-Convolutional-Recurrent-Neural-Network-for-Real-Time-Speech-Enhancement
A minimal unofficial implementation of "A Convolutional Recurrent Neural Network for Real-Time Speech Enhancement" (CRN) using PyTorch
cnn cnn-rnn pytorch real-time rnn speech-enhancement speech-processing
Last synced: 22 Nov 2024
https://github.com/nvidia/cleanunet
Official PyTorch Implementation of CleanUNet (ICASSP 2022)
noise-reduction speech-denoising speech-enchacement speech-processing
Last synced: 14 Jan 2025
https://github.com/Yuan-ManX/audio-development-tools
This is a list of sound, audio and music development tools which contains machine learning, audio generation, audio signal processing, sound synthesis, spatial audio, music information retrieval, music generation, speech recognition, speech synthesis, singing voice synthesis and more.
artificial-intelligence audio audio-generation audio-processing deep-learning dsp machine-learning music music-generation signal-processing speech speech-processing speech-synthesis
Last synced: 27 Oct 2024
https://github.com/haoheliu/voicefixer_main
General Speech Restoration
machine-learning speech speech-analysis speech-enhancement speech-processing speech-synthesis speech-to-text tts
Last synced: 14 Jan 2025
https://github.com/gtreshchev/runtimespeechrecognizer
Cross-platform, real-time, offline speech recognition plugin for Unreal Engine. Based on OpenAI's Whisper, via whisper.cpp.
audio-processing openai speech-detection speech-processing speech-recognition speech-to-text ue4 ue4-plugin ue5 ue5-plugin unreal-engine unreal-engine-4 unreal-engine-5 voice-recognition whis whisper whisper-ai whisper-cpp
Last synced: 15 Jan 2025
https://github.com/r9y9/ttslearn
ttslearn: Library for Pythonで学ぶ音声合成 (Text-to-speech with Python)
attention-mechanism book deep-learning digital-signal-processing dnn neural-networks python python-tts seq2seq speech speech-processing speech-synthesis text-to-speech tts wavenet wavenet-vocoder
Last synced: 20 Jan 2025
https://github.com/AkojimaSLP/Beamforming-for-speech-enhancement
Simple delay-sum, MVDR, and CGMM-MVDR beamforming
beamforming cgmm-mvdr delay-sum mvdr python signal-processing speech-enhancement speech-processing speech-recognition
Last synced: 02 Nov 2024
https://github.com/jtkim-kaist/Speech-enhancement
Deep neural network based speech enhancement toolkit
speech-enhancement speech-processing
Last synced: 02 Nov 2024
https://github.com/tomchang25/whisper-auto-transcribe
Auto transcribe tool based on whisper
asr deep-learning gradio gradio-interface language-model pytorch speech-processing speech-recognition speech-to-text text-to-speech video-captioning voice-activity-detection
Last synced: 20 Nov 2024
https://github.com/innfactory/react-native-dialogflow
A React-Native Bridge for the Google Dialogflow (API.AI) SDK
api-ai apiai dialogflow google react-native speak speech speech-processing speech-to-function text-recognition voice
Last synced: 15 Jan 2025
https://github.com/jefflai108/pytorch-kaldi-neural-speaker-embeddings
Lightweight neural speaker embedding extraction based on Kaldi and PyTorch.
kaldi learnable-dictionary-encoding pytorch speaker-identification speaker-recognition speaker-verification speech-processing
Last synced: 27 Nov 2024
https://github.com/albertaparicio/tfg-voice-conversion
Deep Learning-based Voice Conversion system
deep-learning deep-neural-networks gplv3 keras numpy python speaker speech speech-processing tensorflow voice-conversion
Last synced: 27 Oct 2024
https://github.com/ahkarami/great-deep-learning-books
A Great Collection of Deep Learning (e)Books
books convolutional-neural-networks deep-learning deep-neural-networks ebooks keras machine-learning mxnet natural-language-processing pytorch recurrent-neural-networks reinforcement-learning speech-processing tensorflow
Last synced: 20 Jan 2025
https://github.com/haoheliu/torchsubband
PyTorch implementation of subband decomposition
deep-learning music-source-separation signal-processing speech-enhancement speech-processing speech-recognition
Last synced: 20 Jan 2025
https://github.com/r9y9/sptk
A modified version of Speech Signal Processing Toolkit (SPTK)
Last synced: 03 Dec 2024
https://github.com/vocalpy/vak
A neural network framework for researchers studying acoustic communication
animal-communication animal-vocalizations bioacoustic-analysis bioacoustics birdsong python python3 pytorch spectrograms speech-processing torch torchvision vocalizations
Last synced: 17 Jan 2025
https://github.com/ga642381/SpeechGen
"SpeechGen: Unlocking the Generative Power of Speech Language Models with Prompts"
deep-learning large-language-models prompt speech-generation speech-llm speech-processing
Last synced: 28 Nov 2024
https://github.com/grausof/keras-sincnet
Keras (tensorflow) implementation of SincNet (Mirco Ravanelli, Yoshua Bengio - https://github.com/mravanelli/SincNet)
artificial-intelligence asr audio audio-processing cnn convolutional-neural-networks deep-learning digital-signal-processing filtering keras machine-learning neural-network speaker-recognition speaker-verification speech-processing speech-recognition tensorflow timit waveform
Last synced: 10 Dec 2024
https://github.com/SIP-Lab/CNN-VAD
A Convolutional Neural Network based Voice Activity Detector for Smartphones
deep-learning deep-neural-networks digital-signal-processing smartphone speech-processing
Last synced: 14 Nov 2024
https://github.com/inevolin/discordearsbot
A speech-to-text framework and bot for Discord. Take control of your Discord server using speech and voice commands. Can also be useful for hearing impaired and deaf people.
discord discord-bot discord-js hearing-aids hearing-impaired speech speech-processing speech-recognition speech-synthesis speech-to-text stt
Last synced: 13 Jan 2025
https://github.com/wq2012/simpleder
A lightweight library to compute Diarization Error Rate (DER).
diarization machine-learning metrics speaker-diarization speech-processing speech-recognition
Last synced: 07 Nov 2024
https://github.com/fulldecent/formant-analyzer
iOS application for finding formants in spoken sounds
app application ios language language-learning mature speech-processing speech-recognition speech-therapy swift
Last synced: 16 Jan 2025
https://github.com/markparker5/stark
S.T.A.R.K. - Speech And Text Algorithmic Recognition Kit
cross-platform framework natural-language natural-language-processing natural-language-understanding python python3 speech-processing speech-recognition voice voice-assistant voice-commands voice-control voice-interface voice-recognition
Last synced: 11 Nov 2024
https://github.com/clement-pages/gryannote
Provides Gradio custom components that make diarization-based audio labeling easier and faster.
annotation-processing annotation-tool audio gradio gradio-custom-component interspeech2024 pyannote speaker-diarization speech-processing
Last synced: 17 Jan 2025
https://github.com/spokestack/spokestack-ios
Spokestack: give your iOS app a voice interface!
asr hacktoberfest ios natural-language-understanding speech-api speech-processing speech-recognition speech-synthesis speech-to-text swift tensorflow text-to-speech vad voice-activity-detection voice-assistant voice-recognition voice-synthesis wakeword wakeword-activation
Last synced: 28 Sep 2024
https://github.com/vectominist/spin
Official code for Interspeech 2023 paper "Self-supervised Fine-tuning for Improved Content Representations by Speaker-invariant Clustering"
clustering disentanglement self-supervised-learning speech-processing speech-recognition
Last synced: 02 Dec 2024
https://github.com/montrealcorpustools/polyglotdb
Language data store and linguistic query API
acoustics database influxdb neo4j rest-api speech-analysis speech-processing
Last synced: 20 Jan 2025
https://github.com/declare-lab/speech-adapters
Code and datasets for the ICASSP 2023 paper "Evaluating parameter-efficient transfer learning approaches on SURE benchmark for speech understanding"
adapter asr speech-processing speech-recognition speech-synthesis speech-to-text tts
Last synced: 08 Nov 2024
https://github.com/ardauzunoglu/rte-speech-generator
Natural Language Processing to generate new speeches for the President of Turkey.
natural-language-processing nlp politics python speech-processing tensorflow turkce turkish turkish-nlp
Last synced: 12 Nov 2024
https://github.com/k2kobayashi/shifter
Pitch shifter using WSOLA and resampling, implemented in Python 3
signal-processing speech speech-processing voice-control voice-conversion
Last synced: 03 Dec 2024
https://github.com/aydinnyunus/linuxvoiceassistant
Linux Voice Assistant to Make Your Work Easier
assistant assistant-chat-bots google google-assistant google-assistant-apps google-assistant-desktop python python3 speech-processing speech-recognition speech-to-text tkinter tkinter-graphic-interface tkinter-gui tkinter-python voice voice-assistant voice-commands voice-control voice-conversion
Last synced: 11 Nov 2024
https://github.com/bunyaminergen/callytics
Callytics is an advanced call analytics solution that uses speech recognition and large language model (LLM) technologies to analyze phone conversations from customer service and call centers.
denoising diarization forced-alignment llama3 llm openai opensource sentiment-analysis speech-emotion-recognition speech-processing speech-recognition speech-to-text summary topic-modeling transcription voice-activity-detection voice-recognition
Last synced: 08 Jan 2025
https://github.com/navalnica/be_nlp_speech_resources
Links to Belarusian NLP and Speech resources
asr belarus belarusian belarusian-language natural-language-processing nlp speech speech-processing speech-recognition speech-synthesis speech-to-text stt text-to-speech tts
Last synced: 12 Jan 2025
https://github.com/mycrazycracy/tf-kaldi-speaker
Neural speaker recognition/verification system based on Kaldi and TensorFlow
kaldi kaldi-asr machine-learning neural-network speaker-identification speaker-recognition speaker-verification speech-processing tensorflow
Last synced: 13 Nov 2024
https://github.com/ryota-komatsu/speaker_disentangled_hubert
Official repository of the IEEE SLT 2024 paper "Self-Supervised Syllable Discovery Based on Speaker-Disentangled HuBERT"
self-supervised-learning speech speech-processing
Last synced: 14 Jan 2025
https://github.com/r9y9/world.jl
A lightweight Julia wrapper for WORLD - a high-quality speech analysis, modification and synthesis system
julia julia-wrapper speech-processing
Last synced: 03 Dec 2024
https://github.com/tabahi/formantanalyzer.js
Extract formant features such as frequency, power, energy, and bandwidth of formants at syllable or word level from audio sources in a web browser using WebAudio API.
audio-analysis audio-processing feature feature-engineering feature-extraction formant formant-detection music music-visualizer signal-processing spectrum-analyzer speech-processing
Last synced: 20 Dec 2024
https://github.com/bhattbhavesh91/wav2vec2-huggingface-demo
Speech-to-text with self-supervised learning, based on the wav2vec 2.0 framework and Hugging Face Transformers
facebook-wav2vec self-supervised-learning speech speech-processing speech-recognition speech-to-text unsupervised-learning wav2vec
Last synced: 16 Nov 2024
https://github.com/liamdugan/speech-to-speech
Code for the INTERSPEECH 2023 paper "Learning When to Speak: Latency and Quality Trade-offs for Simultaneous Speech-to-Speech Translation with Offline Models"
simultaneous-translation speech speech-processing speech-to-speech speech-translation
Last synced: 27 Oct 2024
https://github.com/farzadforuozanfar/speech-recognition
I recorded 10 utterances of the same words in my own voice and compared them with 10 recordings from another person. From this I was able to find a threshold level that recognizes my own voice.
distance dtw dtw-algorithm jupyter-notebook python3 speech-processing speech-recognition speech-to-text
Last synced: 25 Nov 2024
https://github.com/gogyzzz/beamformit_matlab
A MATLAB implementation of CHiME4 baseline Beamformit
beamforming beamformit beamformit-step matlab speech-enhancement speech-processing speech-recognition
Last synced: 02 Nov 2024
https://github.com/ringabout/scim
[WIP] Speech recognition toolbox written in Nim. Based on Arraymancer.
arraymancer audio digital-signal-processing mfcc nim scientific-computing speech-analysis speech-processing speech-recognition wav
Last synced: 25 Nov 2024
https://github.com/r9y9/melgeneralizedcepstrums.jl
Mel-Generalized Cepstrum analysis
Last synced: 03 Dec 2024
https://github.com/tabahi/webspeechanalyzer
JS speech analyzer for fast speech analysis and labeling
audio-analysis audio-processing feature feature-engineering feature-extraction formant-detection music music-information-retrieval music-visualizer phonemes signal-processing spectrum spectrum-analyzer speech speech-analysis speech-processing speech-recognition
Last synced: 14 Nov 2024
https://github.com/shunsukeaihara/pyssp
Python speech signal processing library
python2 python3 signal-processing speech-processing
Last synced: 07 Nov 2024
https://github.com/inevolin/discordspeechbot
A speech-to-text bot for Discord with music commands and more, built with Node.js. Ideal for controlling your Discord server using voice commands; it can also be useful for hearing-impaired people.
discord discord-bot discord-js music music-player speech speech-processing speech-recognition speech-to-text stt
Last synced: 12 Nov 2024
https://github.com/alecokas/bilatticernn-confidence
Confidence Estimation for Black Box Automatic Speech Recognition Systems Using Lattice Recurrent Neural Networks https://arxiv.org/abs/1910.11933 or https://ieeexplore.ieee.org/document/9053264
asr attention confidence-estimates confidence-estimation confidence-scores confusion-networks lattice latticernn lattices low-resource-languages lstm pytorch pytorch-implementation speech-processing speech-recognition
Last synced: 13 Oct 2024
https://github.com/wq2012/vb_diarization
VB Diarization with Eigenvoice and HMM Priors, refactored
machine-learning speaker-diarization speech-processing speech-recognition
Last synced: 14 Oct 2024
https://github.com/pprablanc/ppsrt
A Python algorithm to change the pitch of the voice in real time
lpc pitch pitch-shift python python-algorithm real-time signal-processing speech-processing voice
Last synced: 10 Nov 2024
https://github.com/r9y9/sptk.jl
A thin Julia wrapper for Speech Signal Processing Toolkit (SPTK) API
julia julia-wrapper speech-processing
Last synced: 03 Dec 2024
https://github.com/viig99/esolafast
Fast C++ implementation of ESOLA using KFRLib, can be used for online time-stretch augmentation during SpeechToText training.
asr esola kfr pybind11 python-bindings speech speech-augmentation speech-processing speech-recognition speech-to-text time-stretch
Last synced: 11 Nov 2024
https://github.com/onolab-tmu/libss
A Python library for blind source separation.
audio-processing blind-source-separation independent-component-analysis independent-vector-analysis speech-processing
Last synced: 30 Nov 2024
https://github.com/buaadreamer/slpkiller
Learning Speech and Natural Language Processing
ai language-processing nlp speech-processing
Last synced: 05 Jan 2025
https://github.com/wvangansbeke/speech-subband-coding
Subband filtering with ADPCM
adpcm c filterbank quantization speech-processing
Last synced: 07 Nov 2024
https://github.com/mahtafetrat/manatts-persian-speech-dataset
ManaTTS is the largest open Persian speech dataset with 86+ hours of transcribed audio. Includes data collection pipeline and tools. Suitable for Persian text-to-speech models.
data-collection data-preprocessing dataset-preparation forced-alignment mana-tts persian persian-speech speech-corpus speech-data-collection speech-dataset speech-processing speech-synthesis text-to-speech text-to-speech-dataset tts tts-dataset
Last synced: 06 Nov 2024
https://github.com/mastashake08/speech-kit
Simplifying the Speech Synthesis and Speech Recognition engines for JavaScript. Listen for commands and perform callback actions, make the browser speak, and transcribe your speech!
grammar grammar-rules speech speech-processing speech-recognition speech-synthesis speech-to-text
Last synced: 12 Nov 2024
https://github.com/hpez/emotion-recognition
Emotion recognition from speech - Android
ai android neural-networks speech-processing
Last synced: 24 Nov 2024
https://github.com/daanzu/py-silero-vad-lite
Lightweight wrapper for Silero VAD using an internal ONNX Runtime, with no Python package dependencies
python speech speech-processing vad voice voice-activity-detection
Last synced: 08 Nov 2024
https://github.com/tabahi/mel-spectrum-analyzer
Online web-based mel-spectrum, power spectrum, and FFT analyzer for speech and music processing
fft-analysis frequency-analysis music-information-retrieval music-visualizer signal-processing spectrum-analyzer spectrum-scale speech-processing speech-recognition
Last synced: 14 Nov 2024