awesome-audio

A curated list of awesome audio technology resources for developers
https://github.com/DolbyIO/awesome-audio

How-To Playback Audio
- Oboe - C++ library that wraps OpenSL ES and AAudio for high performance audio operations
- AudioTrack - Android class that streams PCM audio buffers to audio hardware for playback
- ExoPlayer - library for local or streaming playback of audio and video
- MediaPlayer - class for controlling playback of a pre-existing audio or video file
- Cross-Browser Audio Basics - tutorial for creating an HTML5 audio player
- PyAudio - Python bindings for PortAudio to interface with audio drivers to record or play back audio (Open-Source/MIT)
How-To Analyze Audio
- PyAudio Analysis - Python package for audio analysis and feature extraction
- Dolby.io Media Analyze API - services to analyze an audio file to identify codec, clipping, loudness, sound classification, silence, etc. Also has options for Speech and Diagnostics.
- MATLAB DSP System Toolbox - application for designing, simulating, and analyzing signal processing systems
- Librosa - Python package for music and audio analysis
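The kinds of measurements these tools report (clipping, level, silence) can be illustrated with a minimal sketch in plain Python. This is not how any of the listed packages or services work internally; the function name `analyze_pcm` and both thresholds are hypothetical, and the level shown is a simple RMS value in dBFS, not a standards-based loudness measurement.

```python
import math


def analyze_pcm(samples, clip_threshold=0.999, silence_dbfs=-60.0):
    """Summarize a block of float samples in the range [-1.0, 1.0].

    clip_threshold and silence_dbfs are illustrative defaults, not
    values taken from any standard or from the tools listed above.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    # Level in dB relative to full scale; -inf for digital silence.
    rms_dbfs = 20 * math.log10(rms) if rms > 0 else float("-inf")
    # Count samples at or above the threshold as likely clipped.
    clipped = sum(1 for s in samples if abs(s) >= clip_threshold)
    return {
        "peak": peak,
        "rms_dbfs": rms_dbfs,
        "clipped_samples": clipped,
        "is_silent": rms_dbfs < silence_dbfs,
    }


# One second of a 440 Hz sine at half amplitude, sampled at 8 kHz.
tone = [0.5 * math.sin(2 * math.pi * 440 * n / 8000) for n in range(8000)]
report = analyze_pcm(tone)
```

For real files, a package such as Librosa can load the samples; the sketch only assumes a list of floats.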
How-To Edit Audio
- Dolby.io Media Enhance API - services to enhance media such as correcting audio impurities like noise, sibilance, equalization, tonality, loudness
- Dolby.io Media Transcode API - Convert and assemble content that looks and sounds great no matter the device or where it’s viewed, with support for high resolution, high frame rates, and web and streaming formats.
- Avid Pro Tools - music software for audio recording, composing, editing, and mastering
- iZotope - audio software for music production and post production, composing, editing, and mastering
- FL Studio - DAW for macOS and Windows
- Ableton Live - DAW for macOS and Windows
- Nuendo - DAW for macOS and Windows that supports Dolby Atmos and other forms of spatial audio
- Logic Pro - Logic Pro is a digital audio workstation and MIDI sequencer software application for macOS
- GarageBand - free tool for macOS users to record and edit audio
- Audacity - Audacity is a free and open-source digital audio editor and recording application software
- Reaper - proprietary cross-platform DAW
- Bitwig Studio - cross-platform DAW made by ex-Ableton employees
- Ardour - Ardour is a hard disk recorder and digital audio workstation application that runs on Linux, macOS, FreeBSD and Microsoft Windows.
- LMMS - free, open-source, cross-platform DAW
- Dolby.io Media Music Mastering API - Get professional-sounding audio masters that keep your creative intent intact with the powerful Music Mastering API from Dolby.io — the result of thousands of hours of musical analysis.
How-To Send Real-Time Audio
- HLS Streaming - HLS lets you deploy content using ordinary web servers and content delivery networks. HLS is designed for reliability and dynamically adapts to network conditions by optimizing playback for the available speed of wired and wireless connections.
- WebRTC API - capture and stream audio / video media between browsers without requiring an intermediary
- PulseAudio - PulseAudio is a sound server system for POSIX OSes
- JACK - JACK Audio Connection Kit is a professional sound server API and pair of daemon implementations to provide real-time, low-latency connections for both audio and MIDI data between applications
- Loopback - Cable-free audio routing for Mac
- Dolby.io Communications API - services with SDKs for adding audio and video conferencing and communications
How-To Read and Write Audio Files
- GStreamer - library for constructing graphs of media-handling components
- mpv - mpv is a free (as in freedom) media player for the command line. It supports a wide variety of media file formats, audio and video codecs, and subtitle types.
- VLC - VLC is a free and open source cross-platform multimedia player and framework that plays most multimedia files as well as DVDs, Audio CDs, VCDs, and various streaming protocols.
- Handbrake - HandBrake is a tool for converting video from nearly any format to a selection of modern, widely supported codecs.
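For a quick look at what reading and writing an audio file involves, here is a minimal sketch that uses only Python's standard-library `wave` module rather than any of the tools listed above. The helper names `write_sine_wav` and `read_wav_info` are hypothetical, and the format (mono, 16-bit PCM, 8 kHz) is an arbitrary choice for illustration.

```python
import io
import math
import struct
import wave


def write_sine_wav(target, freq_hz=440.0, seconds=0.1, rate=8000):
    """Write a mono 16-bit PCM sine tone to a path or binary file object."""
    n_frames = round(seconds * rate)
    # Pack each sample as a little-endian signed 16-bit integer.
    frames = b"".join(
        struct.pack(
            "<h",
            int(32767 * 0.5 * math.sin(2 * math.pi * freq_hz * n / rate)),
        )
        for n in range(n_frames)
    )
    with wave.open(target, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 2 bytes per sample = 16-bit
        wav.setframerate(rate)
        wav.writeframes(frames)
    return n_frames


def read_wav_info(source):
    """Return (channels, sample width in bytes, sample rate, frame count)."""
    with wave.open(source, "rb") as wav:
        return (
            wav.getnchannels(),
            wav.getsampwidth(),
            wav.getframerate(),
            wav.getnframes(),
        )


# Round-trip through an in-memory buffer instead of a file on disk.
buffer = io.BytesIO()
written = write_sine_wav(buffer)
buffer.seek(0)
info = read_wav_info(buffer)
```

Passing a filename such as `"tone.wav"` instead of the buffer writes to disk. Compressed formats and containers need a library or tool such as those listed above.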
How-To Record Audio
- AudioRecord - Android class for reading buffers of raw audio data from audio hardware
- MediaRecorder - Android class that records encoded audio or video and saves the recording to an output file
- MediaRecorder - Web API for processing a stream of media content such as audio tracks
- AVFoundation - framework for audiovisual assets, control devices, audio processing, and system audio interactions
- AVCapture - device, input, session, and output classes for a graph processing architecture allowing buffer analysis and processing (including video support)
- AVFAudio - foundation framework to play, record, and process audio and configure system behavior
- AVAudioRecorder - class to record audio to a file and may be simplest when getting started
- AVAudioEngine - group of audio nodes to generate and process audio signals for input and output; does not natively support video capture but highly configurable processing nodes
- Audio Toolbox - framework to record or play audio, convert formats, parse audio streams, and configure your audio session
- Core Audio APIs
How-To Transcribe Audio into Text
- AWS Transcribe - speech to text capabilities
- Google Speech-to-Text - convert speech into text
- Symbl.ai Transcription over WebSockets - speech to text
- Rev - convert audio and video to text, with transcription done by humans
- rev.ai - AI branch of Rev.com
- Speechmatics - The Most Accurate and Inclusive Speech Technology.
- Deepgram - Build better voice applications with faster, more accurate transcription through AI Speech Recognition.
- Picovoice - Picovoice is the end-to-end platform for adding voice to anything on your terms.
- sonix - Automated transcription in 35+ languages.
- fireflies.ai - Process audio, generate transcripts, and extract actionable data.
- AssemblyAI - Transcribe and understand audio with a single AI-powered API
- descript.com - use transcripts to cut and edit video
- Otter.ai - Generate rich notes for meetings, interviews, lectures, and other important voice conversations
How-To Turn Text into Voice and Speech
- Aflorithmic API.audio - Simple APIs to transform text to speech, add sound design and make it sound beautiful at scale.
- Amazon Polly - Turn text into lifelike speech using deep learning
- Google Cloud Text to Speech - Convert text into natural-sounding speech using an API powered by the best of Google’s AI technologies.
- IBM Watson Text to Speech - Convert text into natural-sounding speech in a variety of languages and voices
- Azure Text to Speech - A Speech service feature that converts text to lifelike speech
How-To Visualize Audio
- headliner.app - create engaging social video with audio editing, transcription, and visualization
- getaudiogram.com - create engaging social video with audio visualizations
Audio Plugin Development Tools
- JUCE - JUCE is an open-source cross-platform C++ application framework for desktop and mobile applications, including VST, VST3, AU, AUv3, RTAS and AAX audio plug-ins.
- react-juce - React-JUCE is a hybrid JavaScript/C++ framework that enables a React.js frontend for a JUCE application or plugin.
- iPlug2 - iPlug 2 is a simple-to-use C++ framework for developing cross platform audio plug-ins/apps and targeting multiple plug-in APIs with the same minimalistic code.
- Plug'n Script - Blue Cat's Plug'n Script is an audio and MIDI scripting plug-in and application that can be programmed to build custom effects or virtual instruments, without quitting your favorite DAW software.
- Faust - Faust (Functional Audio Stream) is a functional programming language for sound synthesis and audio processing with a strong focus on the design of synthesizers, musical instruments, audio effects, etc. Faust targets high-performance signal processing applications and audio plug-ins for a variety of platforms and standards.
- SOUL - The SOUL project is creating a new language and infrastructure for writing and deploying audio code. It aims to unlock improvements in latency, performance, portability and ease-of-development that aren't possible with the current mainstream techniques that are being used.
- Max - Max is an infinitely flexible space to create your own interactive software
Awesome Lists
- awesome-scientific-audio - Python for scientific audio
- awesome-webaudio - curated list of awesome webaudio packages and resources
Collections
- Internet Archive Audio Archive - over 14 million recordings of music, concerts, audiobooks, radio, etc.
- Library of Congress Audio Recordings - over 20,000 audio recordings of historical or cultural significance
Conferences and Events
- Audio Developers Conference - ADC is an annual event celebrating audio development technologies from music applications and game audio to audio processing and embedded systems. ADC's mission is to help attendees acquire and develop new skills.
- Web Audio Conference - WAC is an international conference dedicated to web audio technologies and applications. The conference addresses academic research, artistic research, development, design, evaluation and standards concerned with emerging audio-related web technologies such as Web Audio API, WebRTC, WebSockets and JavaScript.
Experiences and Places
- Audium - sound art event in a theatre of sound-sculpted space (San Francisco)
- ASMR University - art & science of autonomous sensory meridian response
- Exploratorium Listen Exhibit - making sense of sound (San Francisco)
Groups
- Audio Engineering Society - AES is an international organization that unites audio engineers, creative artists, scientists, and students promoting advances in audio and disseminating new knowledge and research with many local communities
- International Society for Music Information Retrieval - ISMIR is a non-profit seeking to advance access, organization, and understanding of music information
- Women's Audio Mission - WAM is a non-profit built and run by women to inspire and educate on the subject of audio in music and media
Podcasts
- Audio Programmer Podcast - all things audio programming, including DSP (digital signal processing), coding, and audio tech.
- Dissect - Long form music analysis of albums that goes track by track discussing music theory and artist intention
- Game Audio Podcast - aims to provide sound designers, composers, and everyone else interested in game audio a biweekly show
- Song Exploder - music podcast where musicians take apart their songs and tell the story of how they were made
- Twenty Thousand Hertz - the stories behind the world's most recognizable and interesting sounds
Social Forums
- Music and Audio Professionals - LinkedIn group for audio engineers, music arrangers, music composers, etc.
- r/audioengineering - products, practices, and stories about the profession or hobby of recording, editing, and producing audio
- Signal Processing StackExchange - question-and-answer site for practitioners of the art and science of signal, image, and video processing
- The Audio Programmer Discord - We invite you to the Audio Programmer community, where you can connect with other audio programmers, ask questions about coding and choosing the right career path, find job opportunities and more!
Social Networks
- Display - social platform for creators
- Lava - social network for audio
Video Channels
- The Audio Programmer - SOUL tutorials, JUCE tutorials, teaching audio programming for beginners, etc.
Courses
- Audio Signal Processing - audio signal methodologies for music. Topics include spectral processing techniques; analyzing, synthesizing, and transforming audio signals; and Python (Coursera)
- Digital Media Foundations - Audio Made Simple. Topics include creating space with channels, measuring the power of sound, capturing tone as frequency, and phase (LinkedIn Learning)
- Communication Acoustics - a comprehensive course that starts from the basics (what sound is and how it propagates) and gradually builds toward the human auditory system, psychoacoustics (connecting the physical world to how we perceive sounds), speech acoustics (the human speech production system), and finally electroacoustics (the world of loudspeakers and microphones) (edX)
- Fundamentals of Audio and Music Engineering - basic concepts of acoustics and electronics and how they can be applied to understand musical sound and make music with electronic instruments. Topics include sound waves, musical sound, basic electronics, and applications of these basic principles in amplifier and speaker design (Coursera)
Journals
- Computer Music Journal - a peer-reviewed academic journal that covers a wide range of topics related to digital audio signal processing and electroacoustic music
- IEEE/ACM Transactions on Audio, Speech, and Language Processing - dedicated to innovative theory and methods for processing signals representing audio, speech and language, and their applications. This includes analysis, synthesis, enhancement, transformation, classification and interpretation of such signals as well as the design, development, and evaluation of associated signal processing systems
- Journal of the Acoustical Society of America - a monthly peer-reviewed scientific journal covering aspects of acoustics
- Journal of the Audio Engineering Society - peer-reviewed journal devoted to audio technology
- SMPTE Motion Imaging Journal - the key publication of the Society, providing peer-reviewed articles on topics in 3D, imaging processing, display technologies, audio, compression, digital cinema, and much more
Tutorials and Blogs
- Designing Sound - tutorials on the art & technique of sound design
- ProAudioGirl - Amy Tucker's blog covering audio for filmmakers, dialog editing basics, hacks & tricks, etc.
- The Ear Training Guide for Audio Producers - NPR training guide to help identify problematic audio and prevent most common problems
- Using ffmpeg to manipulate audio and video files - How to tame the "Swiss army knife" of audio and video manipulation…
Standards
- AES Standards - 2-channel digital audio, MADI, analog XLR pin-out, networked audio, etc.
- ATSC A/85 - Advanced Television Systems Committee (ATSC) Techniques for establishing and maintaining audio loudness for digital television
- EBU R.128 - European Broadcasting Union (EBU) loudness normalisation and permitted maximum level of audio signals
- ITU-R BS.1770 - International Telecommunication Union (ITU) algorithms to measure audio programme loudness and true-peak audio level
- ITU-R BS.2159-7 - International Telecommunication Union (ITU) multi-channel speaker configurations for home and broadcast applications
- MPEG Advanced Audio Coding - AAC wideband perceptual audio coding algorithm that provides state-of-the-art levels of compression for audio signals
- SMPTE Audio Standards - collection of standards related to audio
Data
- AudioSet - large-scale dataset of manually annotated audio events with sound ontology
- CSTR VCTK - speech data uttered by 110 English speakers with various accents reading about 400 sentences from newspapers
- Freesound - Freesound is a collaborative database of Creative Commons Licensed sounds.
- Mozilla Common Voice - open-source, multi-language dataset of voices to train speech-enabled applications with 68 validated hours and 18 languages
- Netflix Open Content - test titles with documentary, live action, and animation films
- Spoken Wikipedia Corpora - SWC is comprised of spoken articles in multiple languages
