{"id":56228,"url":"https://github.com/andreimatveyeu/awesome-python-audio","name":"awesome-python-audio","description":"Awesome Python resources related to audio and music","projects_count":128,"last_synced_at":"2026-04-04T20:00:29.497Z","repository":{"id":186952535,"uuid":"676046543","full_name":"andreimatveyeu/awesome-python-audio","owner":"andreimatveyeu","description":"Awesome Python resources related to audio and music","archived":false,"fork":false,"pushed_at":"2026-02-14T14:10:13.000Z","size":65,"stargazers_count":104,"open_issues_count":0,"forks_count":13,"subscribers_count":5,"default_branch":"main","last_synced_at":"2026-03-22T03:43:12.950Z","etag":null,"topics":["audio","awesome-list","awesome-lists","awesome-python","python","python-audio","python-music"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"cc-by-4.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/andreimatveyeu.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2023-08-08T10:04:47.000Z","updated_at":"2026-03-12T18:22:15.000Z","dependencies_parsed_at":"2026-03-07T16:02:34.296Z","dependency_job_id":null,"html_url":"https://github.com/andreimatveyeu/awesome-python-audio","commit_stats":null,"previous_names":["andreimatveyeu/awesome-python-audio"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/andreimatveyeu/awesome-python-audio","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/andreimatveyeu%2Fawesome
-python-audio","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/andreimatveyeu%2Fawesome-python-audio/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/andreimatveyeu%2Fawesome-python-audio/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/andreimatveyeu%2Fawesome-python-audio/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/andreimatveyeu","download_url":"https://codeload.github.com/andreimatveyeu/awesome-python-audio/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/andreimatveyeu%2Fawesome-python-audio/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31411638,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-04T19:29:44.979Z","status":"ssl_error","status_checked_at":"2026-04-04T19:29:11.535Z","response_time":60,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"readme":"# Awesome Python Audio and Music 🎵\n\nA curated list of Python tools, libraries, and resources for audio and music processing, analysis, synthesis, and playback.\n\n## Audio Processing \u0026 I/O\n\n- [audioread](https://github.com/beetbox/audioread): Cross-library audio decoding (GStreamer + Core Audio + MAD + FFmpeg)\n- 
[audiomentations](https://github.com/iver56/audiomentations): Audio data augmentation library for machine learning\n- [babycat](https://github.com/babycat-io/babycat): Audio manipulation library for Rust, Python, WebAssembly, and C\n- [matchering](https://github.com/sergree/matchering): Open source audio matching and mastering\n- [Matchering-cli](https://github.com/sergree/matchering-cli): Command line application for Matchering 2.0\n- [noisereduce](https://github.com/timsainb/noisereduce): Noise reduction using spectral gating\n- [numpy \u0026 scipy.io.wavfile](https://docs.scipy.org/doc/scipy/reference/io.html): Read/write and manipulate WAV files\n- [pedalboard](https://github.com/spotify/pedalboard): Spotify's library for audio effects and processing\n- [PyDub](https://github.com/jiaaro/pydub): Manipulate audio with a simple, easy-to-use high-level interface\n- [pyAudioProcessing](https://github.com/jsingh811/pyAudioProcessing): Audio feature extraction, classification, and segmentation\n- [SoundDevice](https://github.com/spatialaudio/python-sounddevice): Play and record audio with Python\n- [soundfile](https://github.com/bastibe/SoundFile): Read and write sound files using libsndfile\n- [torch-audiomentations](https://github.com/asteroid-team/torch-audiomentations): Fast GPU audio data augmentation for PyTorch\n- [torchaudio](https://github.com/pytorch/audio): Audio data manipulation and transformation powered by PyTorch\n- [wave](https://docs.python.org/3/library/wave.html): Read and write WAV files (Python standard library)\n\n## Analysis \u0026 Feature Extraction\n\n- [aubio](https://github.com/aubio/aubio): Library for audio and music analysis, including pitch and beat detection\n- [audioFlux](https://github.com/libAudioFlux/audioFlux): Library for audio and music analysis and feature extraction\n- [Essentia](https://github.com/MTG/essentia): C++ library with Python bindings for audio analysis and MIR\n- [librosa](https://github.com/librosa/librosa): Python package 
for music and audio analysis\n- [Madmom](https://github.com/CPJKU/madmom): Audio signal processing library focused on MIR tasks\n- [mir_eval](https://github.com/craffel/mir_eval): Evaluation functions for MIR and audio signal processing algorithms\n- [mirdata](https://github.com/mir-dataset-loaders/mirdata): Python library for working with MIR datasets\n- [nnAudio](https://github.com/KinWaiCheuk/nnAudio): GPU audio processing using PyTorch neural networks\n- [pyAudioAnalysis](https://github.com/tyiannak/pyAudioAnalysis): Audio feature extraction, classification, segmentation, and visualization\n- [Pyo](https://github.com/belangeo/pyo): Python DSP module with synthesis and analysis capabilities\n- [scipy.signal](https://docs.scipy.org/doc/scipy/reference/signal.html): Signal processing routines for SciPy\n- [timeside](https://github.com/Parisson/TimeSide): Framework for audio analysis, imaging, transcoding, and streaming\n\n## Audio Embeddings \u0026 Representations\n\n- [CLAP (LAION)](https://github.com/LAION-AI/CLAP): Contrastive Language-Audio Pretraining for zero-shot audio classification\n- [CLAP (Microsoft)](https://github.com/microsoft/CLAP): Learning audio concepts from natural language supervision\n- [OpenL3](https://github.com/marl/openl3): Open-source deep audio and image embeddings\n- [panns-inference](https://github.com/qiuqiangkong/panns_inference): Pretrained audio neural networks for audio tagging and sound event detection\n- [wav2vec2](https://huggingface.co/docs/transformers/model_doc/wav2vec2): Self-supervised speech representations from Facebook AI\n\n## Speech Processing\n\n### Speech-to-Text\n\n- [Whisper](https://github.com/openai/whisper): OpenAI's robust multilingual speech recognition model\n- [faster-whisper](https://github.com/SYSTRAN/faster-whisper): CTranslate2 reimplementation of Whisper, up to 4x faster\n- [WhisperX](https://github.com/m-bain/whisperX): Whisper with word-level timestamps and speaker diarization\n- 
[SpeechRecognition](https://github.com/Uberi/speech_recognition): Library for performing speech recognition with multiple backends\n- [Vosk](https://github.com/alphacep/vosk-api): Offline speech recognition API supporting 20+ languages\n- [SpeechBrain](https://github.com/speechbrain/speechbrain): PyTorch toolkit for speech processing and conversational AI\n- [pyannote-audio](https://github.com/pyannote/pyannote-audio): Neural speaker diarization and voice activity detection\n\n### Text-to-Speech\n\n- [Coqui TTS](https://github.com/coqui-ai/TTS): Deep learning toolkit for Text-to-Speech\n- [Bark](https://github.com/suno-ai/bark): Transformer-based text-to-audio model with emotions and non-speech sounds\n- [pyttsx3](https://github.com/nateshmbhat/pyttsx3): Offline text-to-speech conversion library\n\n## Source Separation\n\n- [Demucs](https://github.com/facebookresearch/demucs): State-of-the-art music source separation from Meta\n- [audio-separator](https://github.com/nomadkaraoke/python-audio-separator): Easy stem separation using MDX-Net, VR Arch, and Demucs models\n- [Asteroid](https://github.com/asteroid-team/asteroid): PyTorch-based audio source separation toolkit for researchers\n- [sound-separation](https://github.com/google-research/sound-separation): Google's toolkit for sound separation using deep learning\n- [Spleeter](https://github.com/deezer/spleeter): Deezer source separation library (note: Demucs is now generally preferred)\n\n## Music Transcription \u0026 Pitch\n\n- [basic-pitch](https://github.com/spotify/basic-pitch): Spotify's lightweight neural network for polyphonic pitch detection\n- [CREPE](https://github.com/marl/crepe): Monophonic pitch tracker using a deep convolutional neural network\n- [torchcrepe](https://github.com/maxrmorrison/torchcrepe): PyTorch implementation of the CREPE pitch tracker\n- [MT3](https://github.com/magenta/mt3): Multi-instrument automatic music transcription from Google Magenta\n- 
[piano_transcription_inference](https://github.com/qiuqiangkong/piano_transcription_inference): High-resolution piano transcription with pedal detection\n\n## Music Generation \u0026 AI\n\n- [AudioCraft](https://github.com/facebookresearch/audiocraft): Meta's library for the MusicGen, AudioGen, EnCodec, and MAGNeT models\n- [Stable Audio Tools](https://github.com/Stability-AI/stable-audio-tools): Generative models for conditional audio generation from Stability AI\n- [Riffusion](https://github.com/riffusion/riffusion-hobby): Real-time music generation using Stable Diffusion on spectrograms\n- [Magenta](https://github.com/magenta/magenta): Google's machine learning for music and art generation\n- [musicautobot](https://github.com/bearpelican/musicautobot): Music generation with transformers using fastai\n- [NSynth](https://github.com/magenta/magenta/tree/main/magenta/models/nsynth): Neural audio synthesis model from Magenta\n\n## Synthesis \u0026 Sound Design\n\n- [ctcsound](https://github.com/csound/ctcsound): Python bindings for Csound using ctypes\n- [Mido](https://github.com/mido/mido): MIDI objects for Python\n- [Pippi](https://git.sr.ht/~hecanjog/pippi): Computer music composition library\n- [pyfluidsynth](https://github.com/nwhitehead/pyfluidsynth): Python bindings for the FluidSynth software synthesizer\n- [Python-audio](https://github.com/mgeier/python-audio): Jupyter notebooks about audio signal processing with Python\n- [Renardo](https://github.com/e-lie/renardo): Maintained fork of FoxDot for live coding music in Python\n- [sc3nb](https://github.com/interactive-sonification/sc3nb): SuperCollider integration for Python and Jupyter notebooks\n\n## Music Theory \u0026 Composition\n\n- [Abjad](https://github.com/Abjad/abjad): Python API for building LilyPond music notation files\n- [AthenaCL](https://github.com/ales-tsurko/athenaCL): Algorithmic composition tool (Python 3 fork)\n- [maelzel](https://github.com/gesellkammer/maelzel): Framework for computer music in 
Python\n- [MIDIUtil](https://github.com/MarkCWirt/MIDIUtil): Pure Python library for creating multi-track MIDI files\n- [mingus](https://github.com/bspaans/python-mingus): Advanced music theory and notation package\n- [music21](https://github.com/cuthbertLab/music21): Toolkit for computer-aided musical analysis\n- [MusPy](https://github.com/salu133445/muspy): Toolkit for symbolic music generation\n- [pretty-midi](https://github.com/craffel/pretty-midi): MIDI data handling and manipulation library\n- [pychord](https://github.com/yuma-m/pychord): Handle and transform musical chords\n- [scamp](https://github.com/MarcTheSpark/scamp): Suite for Computer-Assisted Music in Python\n\n## Playback \u0026 Services\n\n- [audiostream](https://github.com/kivy/audiostream): Audio API for streaming raw data to speakers\n- [beets](https://github.com/beetbox/beets): Music library manager and MusicBrainz tagger\n- [discord.py](https://github.com/Rapptz/discord.py): Python wrapper for the Discord API with music streaming support\n- [freesound-python](https://github.com/MTG/freesound-python): Freesound API wrapper for audio retrieval and analysis\n- [miniaudio](https://github.com/irmen/pyminiaudio): Python bindings for the miniaudio audio playback library\n- [Mopidy](https://github.com/mopidy/mopidy): Extensible music server written in Python\n- [Mopidy-YouTube](https://github.com/natumbri/mopidy-youtube): Mopidy extension for playing music from YouTube\n- [mpv](https://github.com/jaseg/python-mpv): Python interface to the mpv media player\n- [MusicBot](https://github.com/just-some-bots/MusicBot): Discord music bot written in Python\n- [PyAV](https://github.com/PyAV-org/PyAV): Pythonic bindings for FFmpeg libraries\n- [pygame.mixer](https://github.com/pygame/pygame): Pygame module for sound loading and playback\n- [pyglet](https://github.com/pyglet/pyglet): Cross-platform windowing and multimedia library\n- [pyradio](https://github.com/coderholic/pyradio): Command line internet radio player\n- 
[Spotipy](https://github.com/spotipy-dev/spotipy): Python client for the Spotify Web API\n\n## Datasets\n\n### Audio\n\n- [AudioSet](https://research.google.com/audioset/): Large-scale dataset of manually annotated audio events\n- [Birdsong](https://www.kaggle.com/c/birdsong-recognition): Dataset of annotated bird songs and calls\n- [Common Voice](https://commonvoice.mozilla.org/): Mozilla's open source multilingual speech dataset\n- [ESC-50](https://github.com/karolpiczak/ESC-50): Environmental sound classification dataset\n- [Free Spoken Digit Dataset](https://github.com/Jakobovski/free-spoken-digit-dataset): Dataset of spoken digits in English\n- [Freesound Dataset](https://datasets.freesound.org/fsd/): Collaborative dataset of audio samples from Freesound\n- [LibriSpeech](https://www.openslr.org/12): Large corpus of read English speech for ASR research\n- [RAVDESS](https://zenodo.org/record/1188976): Audio-visual dataset of emotional speech and song\n- [Speech Commands](https://www.tensorflow.org/datasets/catalog/speech_commands): Dataset for speech command recognition\n- [TIDIGITS](https://catalog.ldc.upenn.edu/LDC93S10): Spoken digit dataset for speech recognition\n- [UrbanSound8K](https://urbansounddataset.weebly.com/urbansound8k.html): 8000 urban sound samples in 10 classes\n- [VCTK](https://datashare.ed.ac.uk/handle/10283/3443): Multispeaker speech dataset for voice technologies\n- [VoxCeleb](https://www.robots.ox.ac.uk/~vgg/data/voxceleb/): Large-scale speaker identification dataset\n\n### Music\n\n- [Beatport EDM Key](https://zenodo.org/record/1101082): Electronic dance music tracks with musical key labels\n- [DALI](https://github.com/gabolsgabs/DALI): Dataset of lyrics and audio with time alignments\n- [DEAM](https://cvml.unige.ch/databases/DEAM/): MediaEval dataset for music emotion recognition\n- [FMA](https://github.com/mdeff/fma): Free Music Archive dataset for music analysis\n- [GiantMIDI-Piano](https://github.com/bytedance/GiantMIDI-Piano): 
Large-scale MIDI dataset of classical piano music\n- [hsmusic](https://github.com/Didayolo/hsmusic): Huge symbolic music dataset\n- [IRMAS](https://www.upf.edu/web/mtg/irmas): Instrument recognition in musical audio signals\n- [Jamendo Audio Tagging](https://github.com/MTG/mtg-jamendo-dataset): Multi-label audio tagging dataset\n- [LAION-Audio-630K](https://github.com/LAION-AI/audio-dataset): Large collection of audio-text pairs for CLAP training\n- [MAESTRO](https://magenta.tensorflow.org/datasets/maestro): MIDI and audio dataset for music transcription and generation\n- [MagnaTagATune](http://mirg.city.ac.uk/codeapps/the-magnatagatune-dataset): Dataset for music annotation and audio tagging\n- [MedleyDB](https://medleydb.weebly.com/): Dataset for multi-track mixing research\n- [Musdb18](https://sigsep.github.io/datasets/musdb.html): Dataset for music source separation\n- [MusicCaps](https://www.kaggle.com/datasets/googleai/musiccaps): Dataset of music clips with rich text descriptions\n- [MusicNet](https://zenodo.org/record/5120004): Dataset of classical music with instrument labels\n- [NSynth](https://magenta.tensorflow.org/datasets/nsynth): Large-scale dataset of annotated musical notes\n- [Open MIC](https://zenodo.org/record/1432913): Open Music Instrument Classification dataset\n- [RWC Music Database](https://staff.aist.go.jp/m.goto/RWC-MDB/): Musical instrument sound, genre, and rhythm databases\n- [symbolic-music-datasets](https://github.com/wayne391/symbolic-music-datasets): Collection of symbolic music datasets\n- [The Million Song Dataset](http://millionsongdataset.com/): Massive collection of audio features and metadata\n\n## Tutorials\n\n- [librosa tutorial - Introduction](https://iq.opengenus.org/introduction-to-librosa/): Introductory librosa tutorial covering spectrograms and remixing\n- [librosa tutorial - Visualization](https://www.analyticsvidhya.com/blog/2021/06/visualizing-sounds-librosa/): Visualizing sounds using librosa and matplotlib\n- [PyDub 
tutorial](https://www.geeksforgeeks.org/working-with-wav-files-in-python-using-pydub/): Working with WAV files using PyDub\n- [Whisper tutorial](https://github.com/openai/whisper/discussions/categories/guides): Using OpenAI Whisper for speech-to-text\n- [AudioCraft tutorial](https://github.com/facebookresearch/audiocraft/blob/main/docs/MUSICGEN.md): Getting started with MusicGen and AudioGen\n- [Hugging Face Audio Course](https://huggingface.co/learn/audio-course): Comprehensive course on audio ML with transformers\n\n","created_at":"2024-02-18T22:16:04.187Z","updated_at":"2026-04-04T20:00:29.497Z","primary_language":"Python","list_of_lists":false,"displayable":true,"categories":["Datasets","Analysis \u0026 Feature Extraction","Audio Processing \u0026 I/O","Music Generation \u0026 AI","Tutorials","Source Separation","Playback \u0026 Services","Speech Processing","Synthesis and Generation","Music Theory \u0026 Composition","Music Transcription \u0026 Pitch","Analysis and Visualization","Music Theory and Composition","Audio Manipulation","Synthesis \u0026 Sound Design","Audio Embeddings \u0026 Representations","Playback and Streaming"],"sub_categories":["Music","Audio","Text-to-Speech","librosa","PyDub","Speech-to-Text"],"projects_url":"https://awesome.ecosyste.ms/api/v1/lists/andreimatveyeu%2Fawesome-python-audio/projects"}