Projects in Awesome Lists tagged with librosa

https://github.com/librosa/librosa

Python library for audio and music analysis

Last synced: 12 May 2025

https://github.com/neptunehub/audiomuse-ai

AudioMuse-AI is an Open Source Dockerized environment that brings automatic playlist generation to Jellyfin, Navidrome, LMS, Lyrion and Emby. Using powerful tools like Librosa and ONNX, it performs sonic analysis on your audio files locally, allowing you to curate the perfect playlist for any mood or occasion without relying on external APIs.

clap docker emby jellyfin jellyfin-plugin k3s kubernetes librosa llm lyrion music navidrome ollama onnx open-source open-source-project playlist self-hosted smart-playlists sonic-analysis

Last synced: 05 Apr 2026

https://github.com/x4nth055/emotion-recognition-using-speech

Building and training a speech emotion recognizer that predicts human emotions using Python, scikit-learn, and Keras

deep-learning emotion-detection emotion-recognition emotion-recognizer feature-extraction gradient-boosting keras kneighborsclassifier librosa machine-learning mfcc mlp-classifier neural-networks random-forest-classifier recurrent-neural-networks sklearn speech-emotion-recognition support-vector-machine

Last synced: 04 Apr 2025

https://github.com/marcogdepinto/emotion-classification-from-audio-files

Understanding emotions from audio files using neural networks and multiple datasets.

audio audio-processing classification-report datascience deep-learning deep-neural-networks emotion emotion-classification-ravdess keras keras-neural-networks librosa livingstone machine-learning python python3 ravdess-dataset song songs speech tensorflow

Last synced: 05 Apr 2025

https://github.com/marcogdepinto/Emotion-Classification-Ravdess

Understanding emotions from audio files using neural networks and multiple datasets.

audio audio-processing classification-report datascience deep-learning deep-neural-networks emotion emotion-classification-ravdess keras keras-neural-networks librosa livingstone machine-learning python python3 ravdess-dataset song songs speech tensorflow

Last synced: 12 Mar 2025

https://github.com/demfier/multimodal-speech-emotion-recognition

Lightweight and Interpretable ML Model for Speech Emotion Recognition and Ambiguity Resolution (trained on IEMOCAP dataset)

iemocap librosa lstm multimodal-emotion-recognition pandas python3 pytorch scikit-learn speech-emotion-recognition

Last synced: 01 May 2025

https://github.com/scherroman/mugen

A command-line music video generator based on rhythm

amv audio command-line librosa montage moviepy mugen music-video python remix rhythm tesseract video

Last synced: 13 Apr 2026

https://github.com/kaist-maclab/pytsmod

An open-source Python library for audio time-scale modification.

audio dsp librosa music numpy python scipy time-scale tsm

Last synced: 04 Apr 2025

https://github.com/KAIST-MACLab/PyTSMod

An open-source Python library for audio time-scale modification.

audio dsp librosa music numpy python scipy time-scale tsm

Last synced: 14 Jul 2025

https://github.com/spotify/realbook

Easier audio-based machine learning with TensorFlow.

audio cqt librosa machine-learning mel-spectrogram spectrograms stft tensorflow

Last synced: 04 Apr 2025

https://github.com/yeyupiaoling/audioclassification-paddlepaddle

Audio classification implemented with PaddlePaddle, supporting models such as EcapaTdnn, PANNS, TDNN, Res2Net, and ResNetSE, along with multiple preprocessing methods

audio-classification ecapa-tdnn librosa paddlepaddle panns res2net resnet-se tdnn urbansound8k

Last synced: 06 Sep 2025

https://github.com/GianlucaPaolocci/Sound-classification-on-Raspberry-Pi-with-Tensorflow

This project presents a simple method to train an MLP neural network on audio signals. The trained model can be exported to a Raspberry Pi (2 or later suggested) to classify audio signals recorded with a USB microphone.

audio-analysis audio-signals dataset librosa machine-learning multilayer-perceptron-network raspberry raspberry-pi sound-classification tensorflow tensorflow-models

Last synced: 07 Apr 2025

https://github.com/gianlucapaolocci/sound-classification-on-raspberry-pi-with-tensorflow

This project presents a simple method to train an MLP neural network on audio signals. The trained model can be exported to a Raspberry Pi (2 or later suggested) to classify audio signals recorded with a USB microphone.

audio-analysis audio-signals dataset librosa machine-learning multilayer-perceptron-network raspberry raspberry-pi sound-classification tensorflow tensorflow-models

Last synced: 02 May 2025

https://github.com/ztrimus/speech-emotion-recognition

Predicting various emotions in human speech signals by detecting the speech components affected by emotion.

audio-files colab-notebook convolutional-neural-networks data-science deep-learning emotion-detection emotion-recognition jupyter-notebook keras librosa lstm natural-language-processing neural-network python3 pytorch rnn speech-emotion-recognition speech-recoginition supervised-learning voice

Last synced: 14 Apr 2025

https://github.com/rohankrgupta/Orca-call-Classifier-Machine-learning

Advanced ML project: an orca call classifier using mel-spectrograms as audio representations to detect killer whales

advanced-machine-learning feature keras-tensorflow librosa mel-spectrograms opencv

Last synced: 11 Mar 2025

https://github.com/bits-bytes-nn/sound-anomaly-detection-with-autoencoders

MIMII Sound Anomaly Detection with AutoEncoders

anomaly-detection autoencoder bokeh librosa matplotlib sagemaker tensorflow variational-autoencoder

Last synced: 06 Jul 2025

https://github.com/olilarkin/librosa.cpp

C++17 port of librosa with wasm and SPM package. Done with agents. YMMV.

audio-analysis audio-analyzer beat-tracking librosa music-information-retrieval pitch-detection spectrogram swift wasm

Last synced: 28 Jun 2026

https://github.com/danyalimran93/Music-Genre-Classification

Classifying English music (.mp3) files using Music Information Retrieval (MIR), Digital/Audio Signal Processing (DSP), and Machine Learning (ML) strategies

audio-signal-processing librosa machine-learning music-genre music-information-retrieval

Last synced: 26 Aug 2025

https://github.com/adzialocha/tomomibot

Artificial intelligence bot for live voice improvisation

keras librosa machine-learning music

Last synced: 05 Apr 2025

https://github.com/albincorreya/chromacoverid

Methods to compute various chroma audio features and audio similarity measures particularly for the task of cover song identification

audio-processing audio-similarity-measures chroma cover-song-detection cover-song-identification essentia librosa music-information-retrieval

Last synced: 23 Aug 2025

https://github.com/clolsonus/VirtualChoir

Automatically sync, mix, and draw virtual choir videos from raw tracks of individual recordings. You may need some singing skills but you don't need video editing skills or additional software.

audacity audio-tracks choir librosa opencv pydub python sync video-tracks virtual-choirs

Last synced: 09 May 2025

https://github.com/anujdutt9/audio-scene-classification

Scene Classification using Audio in the nearby Environment.

audio-classification deep-learning keras librosa python36 tensorflow

Last synced: 04 May 2025

https://github.com/swapnilkumbhar/dsp-project

Digital Signal Processing mini project: Autotune

digital-signal-processing librosa pitch-correction

Last synced: 13 May 2025

https://github.com/skempin/audio-peak-detection

Python script utilising Librosa to log the timings of audio peaks in an MP3 file

audio-analysis audio-applications librosa mp3 python python-2 wav

Last synced: 09 Apr 2025

https://github.com/xi/infinity-player

infinite jukebox clone using librosa

audio librosa numpy

Last synced: 29 Jul 2025

https://github.com/andi611/conditional-specgan-tensorflow

Text-to-Speech Synthesis by Generating Spectrograms using Generative Adversarial Network

audio-synthesis conditional-gan digital-signal-processing gan librosa machine-learning nlp nlp-machine-learning tensorflow tts

Last synced: 13 Apr 2025

https://github.com/musa11971/manhuw

Recognizing and identifying Quran reciters from audio recordings.

librosa machine-learning python quran speaker-identification speaker-recognition

Last synced: 12 Aug 2025

https://github.com/ryoha000/librosapp

A C++ implementation of stft, melspectrogram and mel_to_stft

librosa melspectrogram spectrogram stft

Last synced: 14 Oct 2025

https://github.com/librosa/data

Example (audio) data for use with librosa

audio librosa

Last synced: 01 May 2025

https://github.com/kookmin-sw/capstone-2022-15

IN4U - an interview practice web service

aws django interview interview-practice librosa mysql-database opencv react stylegan2 wav2lip

Last synced: 06 May 2025

https://github.com/matlab-deep-learning/use-a-python-speech-command-recognition-system-to-matlab

Use a Python speech command recognition system in MATLAB

audio audio-processing co-execution deep deeplearning example librosa matlab pytorch speech-recognition

Last synced: 07 May 2025

https://github.com/bhattsameer/eyeshield

Data Transmission Between two devices using Sound

datasharing filesharing librosa matplotlib pied-piper pyaudio pydub python python3 sound soundcompare sounddatatransmission soundjoin soundsplit textsharing wave

Last synced: 19 Apr 2025

https://github.com/korseby/py3tag

Write tags to audio files (mp3, flac, and m4a are supported) based on their filenames

audioread flac-files librosa m4a-files mp3-files mutagen python-libraries python3 scheme

Last synced: 11 Apr 2025

https://github.com/mehdihosseinimoghadam/signal-processing

Signal Processing with Python and Librosa

griffinlim librosa melspectrogram python signal-processing spectrogram torchaudio variational-autoencoder vector-quantization voice voice-reconstruction voice-synthesis vq-vae

Last synced: 22 Jul 2025

https://github.com/zionc27/speech-emotion-recognition

Speech Emotion Recognition (SER) using Deep neural networks CNN and RNN

clstm cnn ipython-notebook keras librosa lstm machine-learning python rnn speech speech-emotion-classification speech-emotion-recognition tensorflow

Last synced: 09 May 2025

https://github.com/georgiosioannoucoder/vera

Voice Emotion Recognition of Audio (VERA) is an open-source project created for the Data Science track for the program CUNY Tech Prep (CTP) in Cohort 8. 🔊

audio-classification classification cnn-model data-science emotion emotion-recognition librosa machine-learning speech-emotion-recognition voice-emotion

Last synced: 09 Aug 2025

https://github.com/inishchith/soundanalysis

2nd Runner-Up @MumbaiHackathon 2017

librosa music numpy python3 scipy sound-clips

Last synced: 20 Aug 2025

https://github.com/rupeshs/audio-regen

audio fourier-transform librosa matlabplot python signalprocessing spectrogram stft

Last synced: 10 Jul 2025

https://github.com/agentmaker/paddle-librosa

Paddle-Librosa provides Paddle implementation of some librosa functions

librosa paddlepaddle

Last synced: 24 Mar 2025

https://github.com/ritzvik/singerrecognization

librosa machine-learning multiprocessing multithreading neural-network python3 sound-processing tensorflow tensorflow-experiments

Last synced: 25 Apr 2026

https://github.com/parthvadhadiya/tensorflow-speech-recognition-challenge

This repository contains an end-to-end Python script to train on speech data provided by Google, evaluate test data, and submit to the competition

competition kaggle-competition keras librosa spectrum speech-data speech-recognition tensorflow

Last synced: 13 Apr 2026

https://github.com/matlab-deep-learning/convert-librosa-audio-feature-extraction-to-matlab

Convert librosa Audio Feature Extraction To MATLAB

audio deep-learning librosa matlab matlab-deep-learning pytorch

Last synced: 02 Jan 2026

https://github.com/akash-rajak/volume-suggester

Python script that suggests the volume at which a music audio file should be played for a better listening experience.

audio-feature-extraction audio-loudness ffmpeg librosa matplotlib mutagen numpy os path pyaudio pydub pynput python3 subprocess tkinter volume-suggester wave

Last synced: 18 Feb 2026

https://github.com/kr1shnasomani/tonesense

Speech emotion recognition from audio clips using CNN

deep-learning keras librosa matplotlib neural-network numpy pandas scikit-learn tensorflow

Last synced: 15 Apr 2026

https://github.com/talkuhulk/music-genres-classification

Tensorflow implementation of music-genres-classification with InceptionResnetV2

audio-classification classification cnn-tensorflow genres-classification inception-resnet-v2 librosa python tensorflow

Last synced: 16 May 2026

https://github.com/akashmodak97/genre-detection-of-bengali-songs-based-on-audio-data

Genre detection of Rabindranath Tagore's Bengali songs based on audio data.

bengali-songs deep-learning deep-neural-networks feature-extraction genre-classification keras-tensorflow librosa lstm mfcc-features music-classification neural-network python3 song-genre tensorflow2

Last synced: 08 Oct 2025

https://github.com/pprattis/automatic-speech-recognision-system-ASR

A Python script that implements an automatic speech recognition system.

asr automatic-speech-recognition computer-science dtw dynamic-time-warping fir-filter librosa mel-frequency-cepstral-coefficients mfcc nyquist program python short-time-fourier-transform short-time-signal-analysis signal signal-processing student

Last synced: 28 Sep 2025

https://github.com/georgiosioannoucoder/vera-deployed

Voice Emotion Recognition of Audio (VERA) is an open-source project created for the Data Science track for the program CUNY Tech Prep (CTP) in Cohort 8. This is the deployed version of Vera. 🔊

audio-classification classification cnn-model data-science emotion emotion-recognition librosa machine-learning speech-emotion-recognition voice-recognition

Last synced: 17 Apr 2026

https://github.com/pprattis/automatic-speech-recognision-system-asr

A Python script that implements an automatic speech recognition system.

asr automatic-speech-recognition computer-science dtw dynamic-time-warping fir-filter librosa mel-frequency-cepstral-coefficients mfcc nyquist program python short-time-fourier-transform short-time-signal-analysis signal signal-processing student

Last synced: 07 Sep 2025

https://github.com/harmanveer2546/music-genre-classification

Classifying the genre of a piece of music using deep neural networks.

cnn deep-learning ipython keras knn labelencoder librosa music numpy pandas pickle python scipy sequential svc tensorflow

Last synced: 06 Apr 2026

https://github.com/pomilon/sonir

A modular, physics-based audio visualizer engine. Turns music into deterministic 2D geometric worlds using signal processing and AI stem separation.

audio-visualizer creative-coding demucs dsp generative-art librosa music-visualization physics-based pygame python

Last synced: 18 May 2026

https://github.com/djdhairya/speech-emotion-recognition-

Predicting various emotions in human speech signals by detecting the speech components affected by emotion.

audio-processing convolutional-neural-networks data-science deep-learning emotion-detection emotion-recognition kersa librosa natural-language-processing nuralnetwork overfitting python pytorch rnn speech-emotion-recognition speech-processing speech-recognition voice

Last synced: 13 Apr 2025

https://github.com/kitsuya0828/inpersonation-app

An application that automatically scores your impersonations

dtw-algorithm librosa python3 streamlit

Last synced: 04 Apr 2025

https://github.com/natgluons/chronosense

Personalized Sleep Optimizer App, a machine learning project that analyzes sleep audio using librosa, PyTorch, and scikit-learn to detect disturbances and optimize sleep quality through personalized recommendations.

audio-analysis audio-classification audio-processing chronobiology librosa sleep-analysis sleep-research sleep-tracker torchaudio

Last synced: 26 Jun 2025

https://github.com/machinelearningzuu/data-engineering-process-of-audio-data

This repository covers the feature engineering process for audio signals in both the time domain and the frequency domain. It also contains Jupyter notebook implementations that use Python and librosa.

audio-processing librosa machine-learning python

Last synced: 29 Apr 2026

https://github.com/thekartikeyamishra/voicecloner

The Voice Cloner is a Python-based project that leverages Tacotron 2 and WaveGlow models for text-to-speech (TTS) synthesis and basic voice cloning. This project supports 22 official Indian languages, including Sanskrit, making it versatile for multilingual text input.

ai indic-transliteration librosa machine-learning numpy nvidia-pyindex nvidia-tacotron2 nvidia-waveglow python torch torchaudio

Last synced: 03 Feb 2026

https://github.com/joshuamhtsang/yt2spec

Convert audio into spectrograms.

ffmpeg flask flask-restful librosa python3 youtube-dl

Last synced: 06 May 2026

https://github.com/trilokida/speaker-identification-from-voice

cnn-classification keras-neural-networks librosa lstm-neural-network mlp-classifier rnn-keras speaker speaker-identification speech-recognition voice

Last synced: 04 Jun 2026

https://github.com/palak-463/tablataalrecognitionsystem

Software built using Python which makes use of CNN and FNN to detect the Taals of the Tabla, an Indian classical music instrument. 🎛️

cnn deep-learning flask fnn librosa numpy os pickle python scikit-learn

Last synced: 11 Apr 2026

https://github.com/alihassanml/speech-recognition-system

This project implements a speech recognition system using the LibriSpeech dataset and the `librosa` library for feature extraction, alongside a deep learning model built with TensorFlow/Keras.

deep-learning librosa speech-recognition speech-to-text

Last synced: 31 Mar 2025

https://github.com/ptrpaws/augaudio

A simple audio data augmentation package

audio data-augmentation librosa python python3 simple

Last synced: 11 Jun 2026

https://github.com/cwklurks/hook-gen

Turn your drum loops into melodies. Upload a beat, and this tool generates matching hooks that lock to the groove and key.

algorit docker fast-api hmic-com librosa music-generation music-theory nextjs python react typescript

Last synced: 06 Apr 2026

https://github.com/vasugi2003/fusion-ai---multimodal-persuvasiveness-prediction

Developed a system to predict persuasiveness using multi-modal data (text, images, audio). Utilized BERT for text embeddings, ResNet for image features, and Librosa for audio analysis. Fused data from all modalities for enhanced prediction accuracy.

ai bert-model fusion librosa multimodal-deep-learning python resnet-50 tensorflow

Last synced: 11 Apr 2026

https://github.com/adzialocha/notebook

Jupyter notebooks for random experiments with audio processing, data analysis and machine learning

jupyter-notebook keras learning librosa music21 scikit-learn

Last synced: 15 Apr 2026

https://github.com/davidegat/infiniloop

Real-time local AI music generator that crafts seamless loops for uninterrupted instrumental playback.

ai-generated ai-music ai-music-generator audio audio-loops audio-processing audio-samples librosa linux loops meta-ai music music-player python

Last synced: 20 Apr 2026

https://github.com/kitsuyaazuma/inpersonation-app

An application that automatically scores your impersonations

dtw-algorithm librosa python3 streamlit

Last synced: 27 Apr 2026

https://github.com/niranjanchaudhari0929/prediction-of-insect-species-using-acoustic-features

Prediction model built to predict the insect species using the acoustic data gathered.

librosa matplotlib pandas sklearn

Last synced: 06 May 2026

https://github.com/santiviquez/noisy-human-recognition

Recognizes non-speech human sounds such as clapping, footsteps, brushing teeth, drinking, sipping, laughing, etc.

audio-classification audio-recognition librosa pytorch

Last synced: 11 May 2026

https://github.com/limitlessgreen/signavis

Interactive waveform + spectrogram audio player for analysis, annotation, and embeddable previews (Web, Streamlit, Gradio)

audio-annotation audio-player audio-visualization bioacoustics birdnet ecology fft javascript jupyter labeling-tool librosa mel-spectrogram pcen pypi python scientific-visualization spectrogram waveform webgl xeno-canto

Last synced: 11 May 2026

https://github.com/rijoslal/mickey

Mickey is an ML web app that captures emotions in music using LSTM and GRU-based neural networks built with TensorFlow. It features a FastAPI backend with Jinja templates for the frontend, and uses Librosa for audio processing. The system analyzes music to classify emotions, making it a powerful tool for mood-based music recommendations.

fastapi html-css-javascript jinja2-templates librosa sklearn tensorflow

Last synced: 12 May 2026

https://github.com/dhavaltaunk08/gender-classification

I did this project during my internship at IIT Guwahati. It aimed to perform gender classification in video streaming.

deep-learning librosa opencv-python python scikit-learn

Last synced: 14 May 2026

https://github.com/saadarazzaq/speech-to-text-transformer

ASR with Facebook's Wav2Vec2 model for accurate 🎙️ to 📝 conversion.

asr huggingface-transformer librosa speech-recognition speech-to-text transformer wav2vec2-base-960h wav2vec2ctc wav2vec2tokenizer

Last synced: 17 Mar 2025

https://github.com/ziadasem/audio-processing-for-ml

Audio processing with Python and librosa for ML

audio-processing librosa machine-learning python

Last synced: 06 May 2026

https://github.com/alex1iv/asr_ru_numbers

Automatic Speech Recognition (ASR) system for Russian digits

audio-processing librosa numpy speech-recognition tensorflow

Last synced: 13 Apr 2026

https://github.com/sudemc/firstvoiceproject

🎵 Music Instrument Separation and Visualization Project. This project separates a music track into its instrument and vocal components using Spleeter and Librosa. It also visualizes the spectral and temporal analysis of the audio signals.

audio-processing deep-learning librosa machine-learning music musicanalysis python spleeter visualization

Last synced: 09 Apr 2025

https://github.com/georgiosioannoucoder/vera-deployed-v2

Voice Emotion Recognition of Audio (VERA) is an open-source project created for the Data Science track for the program CUNY Tech Prep (CTP) in Cohort 8. This is the 2nd deployed version of VERA. 🔊

audio-classification classification cnn-model data-science emotion emotion-recognition librosa machine-learning speech-emotion-recognition voice-emotion

Last synced: 16 May 2026

https://github.com/sanatren/signal_processing_and_speech_recognition

The system processes audio signals, extracts relevant features, and employs neural networks to recognize and classify speech patterns

fourier-transform librosa mel-spectrograms signal-processing speech-recognition

Last synced: 17 Feb 2026

https://github.com/najdbinrabah/deep-learning-with-tensorflow-and-keras

This project explores emotion recognition in audio data, focusing on feature extraction techniques while also comparing the performance of LSTM and 1D CNN models.

1d-convolutional-neural-network audio-analysis chroma-features convolutional-neural-networks data-science deep-learning deep-neural-networks emotion-detection feature-extraction keras librosa long-short-term-memory-network machine-learning mel-frequency-cepstral-coefficients mel-spectrogram multiclass-classification python recurrent-neural-networks tensorflow

Last synced: 07 May 2026

https://github.com/kavayk29/audio-classification-using-python-library

This is an audio classification project that uses Python libraries such as librosa to create visual representations of audio files and NumPy to build data arrays for manipulation, then extracts features for classification to train and test a CNN model.

librosa matplotlib-pyplot mfcc-features numpy pandas sklearn-library

Last synced: 07 May 2026

https://github.com/wxjiao/librosa-audio-features

Temporal audio features extraction by Librosa.

audio-features librosa

Last synced: 22 Jul 2025

https://github.com/codersacademy006/speech-recognition-system

The objective of this DLM (Deep Learning Model) is to recognize the emotions from speech.

deep-learning emotion-detection emotion-recognition emotion-recognizer feature-extraction gradient-boosting keras kneighborsclassifier librosa machine-learning mfcc mlp-classifier neural-networks random-forest-classifier recurrent-neural-networks sklearn speech-emotion-recognition support-vector-machine

Last synced: 31 May 2026

https://github.com/chaman2003/parkinson-detection

AI-powered Parkinson's Disease Detection System leveraging smartphone sensors (voice and motion) for real-time analysis. Combines ensemble machine learning models (SVM, Random Forest, Gradient Boosting, XGBoost) with advanced feature extraction to provide accurate early detection, sub-second processing, and detailed reporting.

ai flask html-css-javascript librosa ml numpy pandas pydup python scikit-learn

Last synced: 08 Apr 2026

https://github.com/khushijtrivedi/speech

The Assistive Speech Technology System is designed to enhance communication by analyzing and processing various speech and audio inputs.

ajax bigru-crf bootstrap flask flask-server html-css-javascript librosa python restapi-framework voice-recognition whisper

Last synced: 16 Feb 2026

https://github.com/zvikinoza/masr

Mini Automatic Speech Recognition

fourier-transform keras-tensorflow librosa sound-processing speech-recognition speech-to-text

Last synced: 21 Mar 2025

https://github.com/sagartr/deep-audio-classifier-using-machine-learning

Languages used: Python. Developed and implemented a deep audio classifier using CNNs and LSTMs to accurately categorize diverse audio signals, achieving high accuracy and robustness. Utilized Python and TensorFlow for model development and training, incorporating data augmentation techniques to enhance performance.

audio-processing capuchin librosa python tensorflow tensorflow-models

Last synced: 21 Jan 2026

https://github.com/qhuyitb/engine-audio-retrieval-system

Audio retrieval system for vehicle sound similarity search using MFCC-based feature extraction, FastAPI, and Qdrant vector database.

audio-retrieval audio-similarity fastapi information-retrieval librosa python qdrant

Last synced: 18 May 2026

https://github.com/mr-vaibh/speechalyze

librosa machine-learning opensmile praat pydub python sklearn speech-defect speech-processing stutter-detection vosk

Last synced: 09 May 2026

https://github.com/harmanveer-2546/music-genre-classification

Classifying the genre of a piece of music using deep neural networks.

cnn deep-neural-networks keras labelencoder librosa matplotlib music numpy pandas pickle python scipy seaborn sequential-models tensorflow

Last synced: 12 Feb 2026

https://github.com/psavarmattas/speechtotext

We build a very simple speech recognition system that takes voice as input and produces the corresponding text.

facebook-api ipython librosa machine-learning numpy python pytorch soundfile transformers

Last synced: 02 Apr 2026

https://github.com/psyb0t/docker-audiolla

Self-hosted audio API in one Docker container. Stem separation, mastering, BPM/key match, fingerprinting, similarity, EQ, sidechain duck, MIDI composition + rendering, MIR analysis, effects chain, loudness normalization. REST + MCP. CPU and CUDA. Drive it from a shell, DAW pipeline, or LLM agent.

audio audio-fingerprinting bpm-detection demucs docker fastapi fluidsynth key-detection librosa llm-agents loudness mastering matchering mcp midi midi-generation music-production pedalboard self-hosted stem-separation

Last synced: 07 Jun 2026

https://github.com/archismwanchatterjee/parkinson_detection

audio-processing bagging-classifier classification-algorithm ensemble-learning feature-extraction feature-selection librosa mrmr networkx-graph streamlit

Last synced: 18 May 2026

https://github.com/costopoulos/ntua-dsp

📶 NTUA ECE Digital Signal Processing Course Source Codes and Reports

dsp filters fourier-transform librosa numpy pywt scipy short-time-signal-analysis stft

Last synced: 19 Apr 2026

https://github.com/sugarcane-mk/finetuning_wav2vec2

This repo provides a step-by-step process, from scratch, for fine-tuning Facebook's wav2vec2-large model using transformers

asr asr-model cuda facebook fairseq fine-tuning finetuning huggingface librosa python torch transformers wav2vec2 wav2vec2-large-960h

Last synced: 09 May 2026

https://github.com/hallowshaw/speech-emotion-recognition-with-mfcc

A project to classify emotions like happiness, sadness, and anger from speech using MFCCs, machine learning models, and visualizations for audio features and model performance.

crema-d kaggle-dataset librosa lstm matplotlib mel-frequency-cepstral-coefficient mfcc mfcc-algorithm python ravdees savee scikit-learn seaborn sentiment-analyser sentiment-analysis speech-emotion-regonition speech-sentiment-analysis tess voice-emotion-recognition voice-sentiment-analysis