awesome-python-audio

Awesome Python resources related to audio and music
https://github.com/andreimatveyeu/awesome-python-audio


Analysis and Visualization
- audio-fingerprint-identifying-python - an app that can identify a song using audio fingerprints, spectrum analysis, and the Fast Fourier transform
- AudioLazy
- AudioOwl
- BregmanToolkit
- paura
- Realtime_PyAudio_FFT
Analysis & Feature Extraction
- scipy.signal
- aubio
- audioFlux
- Essentia
- librosa
- Madmom
- pyAudioAnalysis
- Pyo
- timeside
- mirdata
- nnAudio
- mir_eval
Audio Embeddings & Representations
- CLAP (LAION) - Contrastive Language-Audio Pretraining for zero-shot audio classification
- CLAP (Microsoft)
- OpenL3 - Open-source deep audio and image embeddings
- panns-inference
- wav2vec2 - Self-supervised speech representations from Facebook AI
Audio Manipulation
Audio Processing & I/O
- numpy & scipy.io.wavfile
- wave
- audiomentations
- audioread - Cross-library audio decoding (GStreamer + Core Audio + MAD + FFmpeg)
- babycat
- matchering
- Matchering-cli
- noisereduce
- pedalboard
- pyAudioProcessing
- PyDub
- torchaudio
- SoundDevice
- torch-audiomentations
Datasets
- Audio
  - AudioSet - Large-scale dataset of manually annotated audio events
  - Birdsong
  - Freesound Dataset
  - RAVDESS - Audio-visual dataset of emotional speech and song
  - TIDIGITS
  - UrbanSound8K
  - VCTK
  - Speech Commands
  - ESC-50
  - Free Spoken Digit Dataset
  - VoxCeleb - Large-scale speaker identification dataset
  - Common Voice
  - LibriSpeech
- Music
  - AcousticBrainz
  - Beatport EDM Key
  - DEAM
  - IRMAS
  - MAESTRO
  - MedleyDB - Multi-track mixing research
  - Musdb18
  - NSynth - Large-scale dataset of annotated musical notes
  - Open MIC
  - RWC Music Database
  - The Million Song Dataset
  - DALI
  - FMA
  - hsmusic
  - Jamendo Audio Tagging - Multi-label audio tagging dataset
  - symbolic-music-datasets
  - MagnaTagATune
  - GiantMIDI-Piano - Large-scale MIDI dataset of classical piano music
  - LAION-Audio-630K - Audio-text pairs for CLAP training
  - MusicCaps
  - MusicNet
Music Generation & AI
- NSynth
- Magenta
- musicautobot
- AudioCraft
- Stable Audio Tools
- Riffusion - Real-time music generation using stable diffusion on spectrograms
Music Theory and Composition
- Music
  - Musical-scales
  - MusicMaker - Audio Description Language
  - pyHarmonySearch
  - PyTheory
Music Theory & Composition
- Abjad
- AthenaCL
- maelzel
- MIDIUtil - Multi-track MIDI files
- mingus
- music21 - Computer-aided musical analysis
- MusPy
- pychord
- scamp - Suite for Computer-Assisted Music in Python
- pretty-midi
- xenharmlib - Non-standard tuning systems
Music Transcription & Pitch
- basic-pitch
- CREPE
- torchcrepe
- MT3 - Multi-instrument automatic music transcription from Google Magenta
- piano_transcription_inference - High-resolution piano transcription with pedal detection
Playback and Streaming
- Music
  - pyAV
Playback & Services
- freesound-python
- audiostream
- beets
- discord.py
- Mopidy
- Mopidy-YouTube
- mpv
- MusicBot
- pygame.mixer
- pyglet - Cross-platform windowing and multimedia library
- pyradio
- miniaudio
- pyAV
- Spotipy
Source Separation
- Spleeter
- pydsm
- Demucs - State-of-the-art music source separation from Meta
- audio-separator - MDX-Net, VR Arch, and Demucs models
- Asteroid - PyTorch-based audio source separation toolkit for researchers
Speech Processing
- Speech-to-Text
  - Whisper
  - faster-whisper
  - WhisperX - Word-level timestamps and speaker diarization
  - SpeechRecognition
  - Vosk
  - SpeechBrain
  - pyannote-audio
- Text-to-Speech
  - Coqui TTS - Text-to-Speech
  - Bark - Transformer-based text-to-audio model with emotions and non-speech sounds
  - pyttsx3 - Text-to-speech conversion library
Synthesis and Generation
- Music
  - Audioguide
  - FoxDot - Python-based live coding environment for sound synthesis
  - PySynth
  - Python-musical
  - WaveGAN
  - Nsynth
Synthesis & Sound Design
- ctcsound
- Mido
- Pippi
- pyfluidsynth
- Python-audio
- Renardo
- sc3nb
Tutorials

Programming Languages

Python (79), Jupyter Notebook (15), C (5), C++ (2), Rust (1)

Categories

Datasets (43), Audio Processing & I/O (14), Playback & Services (14), Analysis & Feature Extraction (12), Music Theory & Composition (11), Speech Processing (10), Tutorials (8), Synthesis & Sound Design (7), Synthesis and Generation (6), Music Generation & AI (6), Analysis and Visualization (6), Source Separation (5), Music Transcription & Pitch (5), Audio Embeddings & Representations (5), Music Theory and Composition (4), Audio Manipulation (4), Playback and Streaming (1)

Sub Categories

Text-to-Speech (51), Music (41), Audio (18), Speech-to-Text (7), PyDub (2), librosa (1)

Keywords

python (41), audio (26), music (18), deep-learning (13), music-information-retrieval (11), machine-learning (11), dsp (8), pytorch (8), audio-processing (8), dataset (7), sound (7), speech-recognition (6), speech-to-text (5), midi (4), signal-processing (4), audio-analysis (4), mir (4), scipy (3), augmentation (3), spectral-analysis (3), pitch (3), speech (3), speaker-verification (3), asr (3), sound-processing (3), mfcc (3), synthesis (3), pretrained-models (3), python-library (3), python3 (3), spectrum (2), audio-data-augmentation (2), audio-data (2), vst (2), whisper (2), audio-effects (2), data-augmentation (2), speaker-diarization (2), speaker-recognition (2), speech-enhancement (2), speech-processing (2), speech-separation (2), c (2), voice-recognition (2), analysis (2), text-to-speech (2), feature-extraction (2), mopidy (2), tensorflow (2), cli (2)