https://github.com/gabyfle/soundml
A high level DSP library in the OCaml language
- Host: GitHub
- URL: https://github.com/gabyfle/soundml
- Owner: gabyfle
- License: apache-2.0
- Created: 2024-05-16T15:20:50.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-01-23T05:47:04.000Z (8 months ago)
- Last Synced: 2025-02-14T21:04:41.408Z (8 months ago)
- Topics: audio-processing, ffmpeg, ocaml, owl
- Language: OCaml
- Homepage: https://soundml.gabyfle.dev
- Size: 15.3 MB
- Stars: 15
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
SoundML
A small, very high-level library for performing basic operations on audio files in OCaml
· Documentation · Report Bug · Request Feature ·
## About the Project
> [!WARNING]
> The project is still in development and is not yet ready for use.

## Getting Started
### Installation
This project uses opam as its package manager:
```bash
opam install soundml
```
## Roadmap
The project is still a work in progress.
* [x] Read and write audio
* [x] Audio slicing (in a similar way to Owl's slicing)
* [ ] Onset detection algorithms
* [ ] Spectral analysis
    * [x] Generic spectrogram helper function
    * [x] Unify the spectrogram parameters inside a config module
    * [x] Mel spectrogram
    * [x] MFCC spectrogram
    * [ ] Chroma spectrogram (*WIP*)
    * [x] Constant and linear detrend
* [ ] Time domain analysis
    * [x] RMS computation
    * [x] Zero crossing rate
* [ ] Effects module (*WIP*)
    * [ ] Pitch shifting
    * [ ] Time stretching
    * [ ] Filters (low-pass, high-pass, band-pass, etc.)
* [ ] Write test files for the whole library
## Features
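Two of the time-domain features SoundML covers, RMS and zero crossing rate, have short textbook definitions. The sketch below writes them in plain OCaml on a `float array` to show what the quantities mean. It does not use SoundML's API: the function names `rms` and `zero_crossing_rate` are hypothetical, and the normalisation chosen for the zero crossing rate (dividing by the number of adjacent sample pairs) is only one common convention, so SoundML's own results may differ.

```ocaml
(* Illustrative sketch only: plain OCaml, not SoundML's actual API. *)

(* Root mean square: square root of the mean of the squared samples.
   Returns 0.0 for an empty array. *)
let rms (samples : float array) : float =
  let n = Array.length samples in
  if n = 0 then 0.0
  else
    let sum_sq = Array.fold_left (fun acc x -> acc +. (x *. x)) 0.0 samples in
    sqrt (sum_sq /. float_of_int n)

(* Zero crossing rate: fraction of adjacent sample pairs whose signs differ.
   Assumption: zero counts as non-negative. *)
let zero_crossing_rate (samples : float array) : float =
  let n = Array.length samples in
  if n < 2 then 0.0
  else begin
    let crossings = ref 0 in
    for i = 1 to n - 1 do
      if (samples.(i - 1) >= 0.0) <> (samples.(i) >= 0.0) then incr crossings
    done;
    float_of_int !crossings /. float_of_int (n - 1)
  end

let () =
  (* A signal that flips sign at every sample. *)
  let signal = [| 1.0; -1.0; 1.0; -1.0 |] in
  Printf.printf "rms = %f, zcr = %f\n" (rms signal) (zero_crossing_rate signal)
```

In practice these features are usually computed per frame over a sliding window rather than over the whole signal; the whole-array version is kept here for brevity.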
- Written natively in OCaml for an idiomatic OCaml developer experience
- Easily read and write audio files in various formats (WAV, MP3, etc.)
- Audio slicing
- Feature extraction (MFCC, mel spectrogram, zero crossing rate, etc.)
- Audio effects (pitch shifting, time stretching, filters, etc.), currently a work in progress
## Requirements
SoundML requires OCaml 5.1.0 or later. You can install the compiler by following the instructions on the [OCaml website](https://ocaml.org/docs/install.html). The project is built with the Dune build system.
This library heavily relies on the Owl and ocaml-ffmpeg libraries.
| Name | Version | Description |
| ----------------------------------------------------------------------------------------------------- | -------- | -------------------------------------------------------------------------------------------------- |
| [**Owl**](https://github.com/owlbarn/owl) - *OCaml Scientific Computing* | `>= 1.1` | Library for scientific computing in OCaml. Used for the heavy computations (FFT, IFFT, etc.). |
| [**ocaml-ffmpeg**](https://github.com/savonet/ocaml-ffmpeg) - *OCaml bindings to the FFmpeg library.* | `>= 1.2` | OCaml bindings for FFmpeg. Used to read and write audio data. |
## Inspirations
This project is heavily inspired by other amazing open-source libraries such as:
| Name | Inspiration | Reference |
| ------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| [**librosa**](https://github.com/librosa/librosa) | General functionalities for audio signal processing | McFee, Brian, Colin Raffel, Dawen Liang, Daniel PW Ellis, Matt McVicar, Eric Battenberg, and Oriol Nieto. "librosa: Audio and music signal analysis in python." In Proceedings of the 14th python in science conference, pp. 18-25. 2015. |
| [**pydub**](https://github.com/jiaaro/pydub) | Ease of use, audio slicing using milliseconds and manipulation | - |
| [**NumPy**](https://numpy.org/) | Numerous implementations of SoundML's algorithms were taken directly from NumPy | - |
| [**Matplotlib**](https://matplotlib.org/) | The spectral helper used to compute spectrograms and the linear detrend function were both taken from the [`matplotlib.mlab`](https://github.com/matplotlib/matplotlib/blob/main/lib/matplotlib/mlab.py#L213-L373) module. | - |

Don't hesitate to check out the amazing work done by the authors and contributors of these libraries!
## License
Distributed under the Apache License Version 2.0. See LICENSE for more information.
## Acknowledgements
* Logo generated with DALL-E by OpenAI