{"id":17036857,"url":"https://github.com/faroit/python_audio_loading_benchmark","last_synced_at":"2025-04-12T12:31:58.291Z","repository":{"id":39737986,"uuid":"165292565","full_name":"faroit/python_audio_loading_benchmark","owner":"faroit","description":"Benchmark popular audio i/o packages ","archived":false,"fork":false,"pushed_at":"2023-12-19T09:08:02.000Z","size":1056,"stargazers_count":140,"open_issues_count":9,"forks_count":10,"subscribers_count":5,"default_branch":"master","last_synced_at":"2025-03-26T07:11:30.853Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/faroit.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-01-11T18:43:03.000Z","updated_at":"2025-03-23T21:28:23.000Z","dependencies_parsed_at":"2023-12-19T10:31:34.969Z","dependency_job_id":"3ddc0251-773e-4576-b427-f0b4896bcc06","html_url":"https://github.com/faroit/python_audio_loading_benchmark","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/faroit%2Fpython_audio_loading_benchmark","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/faroit%2Fpython_audio_loading_benchmark/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/faroit%2Fpython_audio_loading_benchmark/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/faroit%2Fpython_audio_loading_benchmark/manifests","owner_url":"htt
ps://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/faroit","download_url":"https://codeload.github.com/faroit/python_audio_loading_benchmark/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248566567,"owners_count":21125687,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-14T08:52:12.968Z","updated_at":"2025-04-12T12:31:57.922Z","avatar_url":"https://github.com/faroit.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Python Audio-Loading Benchmark\n\nThe aim of this repository is to evaluate the loading performance of various audio I/O packages interfaced from Python.\n\nThis is relevant for machine learning models, which today often process raw (time-domain) audio and assemble batches on the fly. It is therefore important to load the audio as fast as possible. At the same time, a library should ideally support a variety of uncompressed and compressed audio formats and be capable of loading only chunks of audio (seeking). 
The latter is especially important for models that cannot easily work with samples of variable length (convnets).\n\n## Tested Libraries\n\n| Library                                                                                                                                 | Version | Short-Name/Code  | Out Type          | Supported codecs                                    | Excerpts/Seeking |\n| --------------------------------------------------------------------------------------------------------------------------------------- | ------- | ---------------- | ----------------- | --------------------------------------------------- | ---------------- |\n| [scipy.io.wavfile](https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.io.wavfile.read.html#scipy.io.wavfile.read)        | 1.9.3   | `scipy`          | Numpy             | PCM (only 16 bit)                                   | ❌               |\n| [scipy.io.wavfile memmap](https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.io.wavfile.read.html#scipy.io.wavfile.read) | 1.9.3   | `scipy_mmap`     | Numpy             | PCM (only 16 bit)                                   | ✅               |\n| [soundfile](https://pysoundfile.readthedocs.io/en/latest/) ([libsndfile](http://www.mega-nerd.com/libsndfile/))                         | 0.12.1  | `soundfile`      | Numpy             | PCM, Ogg, Flac, MP3                                 | ✅               |\n| [pydub](https://github.com/jiaaro/pydub)                                                                                                | 0.25.1  | `pydub`          | Python Array      | PCM, MP3, OGG or other FFMPEG/libav supported codec | ❌               |\n| [aubio](https://github.com/aubio/aubio)                                                                                                 | 0.4.9   | `aubio`          | Numpy Array       | PCM, MP3, OGG or other avconv supported codec       | ✅               |\n| 
[audioread](https://github.com/beetbox/audioread) ([FFMPEG](https://www.ffmpeg.org/))                                                   | 2.1.9   | `ar_ffmpeg`      | Numpy Array       | all of FFMPEG                                       | ❌               |\n| [librosa](https://librosa.org/)                                                                                                         | 0.10.0   | `librosa`        | Numpy Array       | all of soundfile                                   | ✅               |\n| [tensorflow `tf.io.audio.decode_wav`](https://www.tensorflow.org/api_docs/python/tf/contrib/ffmpeg/decode_audio)                        | 2.11.0   | `tf_decode_wav`  | Tensorflow Tensor | PCM (only 16 bit)                                   | ❌               |\n| [tensorflow-io `from_audio`](https://www.tensorflow.org/io/api_docs/python/tfio/v0/IOTensor#from_audio)                                 | 0.30.0  | `tfio_fromaudio` | Tensorflow Tensor | PCM, Ogg, Flac                                      | ✅               |\n| [torchaudio](https://github.com/pytorch/audio) (sox_io)                                                                                 | 0.13.1   | `torchaudio`     | PyTorch Tensor    | all codecs supported by Sox                         | ✅               |\n| [torchaudio](https://github.com/pytorch/audio) (soundfile)                                                                              | 0.13.1   | `torchaudio`     | PyTorch Tensor    | all codecs supported by Soundfile                   | ✅               |\n| [soxbindings](https://github.com/pseeth/soxbindings)                                                                                    | 0.9.0   | `soxbindings`    | Numpy Tensor      | all codecs supported by Soundfile                   | ✅               |\n| [stempeg](https://github.com/faroit/stempeg)                                                                                            | 0.2.3   | `stempeg`      
  | Numpy Tensor      | all codecs supported by FFMPEG                      | ✅               |\n\n### Not included\n\n- **[audioread (coreaudio)](https://github.com/beetbox/audioread/blob/master/audioread/macca.py)**: only available on macOS.\n- **[audioread (gstreamer)](https://github.com/beetbox/audioread/blob/master/audioread/gst.py)**: too difficult to install.\n- **[madmom](https://github.com/CPJKU/madmom)**: same ffmpeg interface as `ar_ffmpeg`.\n- **[pymad](https://github.com/jaqx0r/pymad)**: supports only MP3 and is also very slow.\n- **[python builtin `wave`](https://docs.python.org/3.7/library/wave.html)**: codec support is too limited.\n\n## Results\n\nThe benchmark loads a number of (single channel) audio files of different lengths (between 1 and 151 seconds) and measures the time until the audio is converted to a tensor. Depending on the target tensor type (either `numpy`, `pytorch` or `tensorflow`), a different number of libraries were compared. For example, when a library's output type is `numpy` and the target tensor type is `tensorflow`, the loading time includes the cast operation to the target tensor. Furthermore, multiprocessing was disabled for data loaders, so, especially for deep learning applications, the measured loading speed does not necessarily represent the batch loading speed.\n\n**All results shown below depict loading time in seconds.**\n\n### Load to Numpy Tensor\n\n![](results/benchmark_np.png)\n\n### Load to PyTorch Tensor\n\n![](results/benchmark_pytorch.png)\n\n### Load to Tensorflow Tensor\n\n![](results/benchmark_tf.png)\n\n### Getting metadata information\n\nIn addition to loading the file, one might also be interested in extracting\nmetadata. To benchmark this, we requested, for every file, the metadata for\n_sampling rate_, _channels_, _samples_, and _duration_, each in a separate,\nconsecutive call; a library is not allowed to open the file once and extract all\nmetadata together. 
Note that we have excluded `pydub` from the benchmark\nresults on metadata as it was significantly slower than the other tools.\n\n![](results/benchmark_metadata.png)\n\n## Running the Benchmark\n\n### Generate sample data\n\nTo test the loading speed, we generate random (noise) audio data of different durations and encode it to **PCM 16-bit WAV**, **MP3 CBR**, or **MP4**.\nThe data is generated by a shell script. To generate the data in the folder `AUDIO`, run\n\n```bash\nbash generate_audio.sh\n```\n\n### Setting up using Docker\n\nBuild the Docker container using\n\n```bash\ndocker build -t audio_benchmark .\n```\n\nThis installs the package requirements for all audio libraries.\nAfterwards, mount the data directory into the Docker container and run `run.sh` inside the\ncontainer, e.g.:\n\n```bash\ndocker run -v /home/user/repos/python_audio_loading_benchmark/:/app \\\n    -it audio_benchmark:latest /bin/bash run.sh\n```\n\n### Setting up in a virtual environment\n\nCreate a virtual environment and install the necessary dependencies with\n\n```bash\nvirtualenv --python=/usr/bin/python3 --no-site-packages _env\nsource _env/bin/activate\npip install -r requirements.txt\npip install git+https://github.com/pytorch/audio.git\n```\n\n### Benchmarking\n\nRun the benchmark with\n\n```bash\nbash run.sh\n```\n\nand plot the result with\n\n```bash\npython plot.py\n```\n\nThis generates PNG files in the `results` folder.\n\n## Authors\n\n@faroit, @hagenw\n\n## Contribution\n\nWe encourage interested users to contribute to this repository in the issue section and via pull requests. Notifications of new tools and new versions of existing packages are particularly welcome. 
Since benchmarks are subjective, I (@faroit) will rerun the benchmark on our server.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffaroit%2Fpython_audio_loading_benchmark","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffaroit%2Fpython_audio_loading_benchmark","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffaroit%2Fpython_audio_loading_benchmark/lists"}