https://github.com/zabir-nabil/audioperm

A Python library for generating different permutations of audible segments from audio files.



Host: GitHub
URL: https://github.com/zabir-nabil/audioperm
Owner: zabir-nabil
License: MIT
Created: 2021-07-06T23:52:55.000Z (almost 5 years ago)
Default Branch: main
Last Pushed: 2022-06-13T06:50:53.000Z (about 4 years ago)
Last Synced: 2025-06-26T08:18:44.544Z (12 months ago)
Topics: audio-augmentation, audio-classification, audio-processing, augmentation, speaker-recognition, speech-augmentation
Language: Jupyter Notebook
Homepage:
Size: 7.55 MB
Stars: 13
Watchers: 2
Forks: 2
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

README

---

### Audioperm

A Python library for generating different permutations of audible segments from audio files.

```console

pip install audioperm

```

#### Use cases

* Silence Removal from Audio

* Audio / Speech augmentation

* Word segmentation

* Word level permutation generation

* Add new synthetic data for deep learning

* Speaker recognition, Speaker verification, Audio classification, Audio fingerprinting

**Documentation**: https://zabir-nabil.github.io/audioperm/

**Source Code**: https://github.com/zabir-nabil/audioperm

---

#### Word segmentation

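```python
from audioperm import AudioPerm
from audioperm.utils import save_audio

ap = AudioPerm("i_love_cats.m4a")
label = "i love cats"

# Split the recording into audible (word-level) segments.
words = ap.word_segments()
label_words = label.split()

# Save each segment under its word label. zip() stops at the shorter
# sequence, so a segment/label count mismatch cannot raise an IndexError.
for name, segment in zip(label_words, words):
    save_audio(segment, name + ".wav")
```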

```
cats.wav  i_love_cats.m4a  i.wav  love.wav
```
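
The README does not describe how `word_segments` finds word boundaries. As a rough illustration only, one common approach is energy-threshold segmentation: split the signal into short frames, compute each frame's RMS energy, and keep contiguous runs of frames above a threshold. The function name `audible_segments`, the frame length, and the threshold below are assumptions made for this sketch and are not part of audioperm's API.

```python
import numpy as np

def audible_segments(samples, sample_rate, frame_ms=20, threshold=0.02):
    """Hypothetical helper: split a mono float NumPy signal into audible runs.

    A frame counts as audible when its RMS energy exceeds `threshold`.
    """
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    n_frames = len(samples) // frame_len
    # Drop the trailing partial frame and view the signal as (frames, samples).
    frames = samples[: n_frames * frame_len].reshape(n_frames, frame_len)
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    audible = rms > threshold

    segments = []
    start = None
    for i, is_audible in enumerate(audible):
        if is_audible and start is None:
            start = i  # an audible run begins at this frame
        elif not is_audible and start is not None:
            segments.append(samples[start * frame_len : i * frame_len])
            start = None
    if start is not None:  # the signal ended while still audible
        segments.append(samples[start * frame_len : n_frames * frame_len])
    return segments

# Synthetic input: 200 samples of tone, 200 of silence, 200 of tone.
sr = 1000
tone = 0.5 * np.sin(2 * np.pi * 50 * np.arange(200) / sr).astype(np.float32)
silence = np.zeros(200, dtype=np.float32)
signal = np.concatenate([tone, silence, tone])
segments = audible_segments(signal, sr)
```

On real speech, a fixed threshold is fragile; the value usually has to be tuned to the recording level and background noise.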

#### Word-level permutation

```python
from audioperm import AudioPerm
from audioperm.utils import save_audio

ap = AudioPerm("i_love_cats.m4a")

# Segment the audio first; return_words=False skips returning the segments.
ap.word_segments(return_words=False)

# Generate 5 reordered versions of the segmented audio.
perm_sentences = ap.permute(n_permutations=5)

for i, s in enumerate(perm_sentences):
    save_audio(s, f"perm_{i}.wav")
```

```
cats.wav  i.wav  i_love_cats.m4a  love.wav  perm_0.wav  perm_1.wav  perm_2.wav  perm_3.wav  perm_4.wav
```
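
How `permute` chooses its orderings is not documented here. The following is a minimal sketch of the general idea, assuming each word segment is a 1-D NumPy array: sample distinct orderings of the segment indices and concatenate the segments in each order. `permute_segments` and its `seed` parameter are hypothetical names used only for illustration.

```python
import itertools
import random

import numpy as np

def permute_segments(segments, n_permutations, seed=None):
    """Hypothetical helper: return up to n_permutations distinct reorderings."""
    rng = random.Random(seed)
    # Enumerating every ordering grows factorially, so this is only
    # reasonable for short utterances with a handful of segments.
    orders = list(itertools.permutations(range(len(segments))))
    chosen = rng.sample(orders, min(n_permutations, len(orders)))
    return [np.concatenate([segments[i] for i in order]) for order in chosen]

# Three stand-in "words" with different lengths and constant values.
words = [np.full(2, 1.0), np.full(3, 2.0), np.full(4, 3.0)]
sentences = permute_segments(words, n_permutations=5, seed=0)
```

Three segments allow only six orderings, so asking for more than six would return all six rather than repeating any.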

#### `permutations` on multiple files

```python
from audioperm import read_audio, word_segments, permutations

# Load several files at once.
audios = read_audio(["bangla_demo.wav", "i_love_cats.m4a"])

# Segment the loaded audio, then generate 5 permutations.
out = word_segments(audios)
perms = permutations(out, n_permutations=5)
```

#### Fixed-length segments

* Generate fixed-length audible segments (with optional permutation/augmentation)

```python
from audioperm import fixed_len_segments

# Save segments under "fls_out" without returning them.
fixed_len_segments("bangla_demo.wav", return_segments=False, save_path="fls_out", save=True, segment_size=0.5)

# Return at most 5 permuted segments in memory without saving.
out = fixed_len_segments("bangla_demo.wav", return_segments=True, max_segments=5, permute=True, save=False, segment_size=0.5)
```
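
To make the slicing step concrete, here is a minimal sketch that cuts a sample array into equal-length chunks. It assumes `segment_size` is a duration in seconds and that a trailing partial chunk is dropped; the README confirms neither, and the sketch skips the silence removal implied by "audible segments". `fixed_length_chunks` is a hypothetical name, not the library function.

```python
import numpy as np

def fixed_length_chunks(samples, sample_rate, segment_size=0.5, max_segments=None):
    """Hypothetical helper: cut samples into chunks of segment_size seconds."""
    seg_len = int(round(segment_size * sample_rate))
    if seg_len <= 0:
        raise ValueError("segment_size is too small for this sample rate")
    n_full = len(samples) // seg_len  # the trailing partial chunk is dropped
    if max_segments is not None:
        n_full = min(n_full, max_segments)
    return [samples[i * seg_len : (i + 1) * seg_len] for i in range(n_full)]

# 2.2 seconds of audio at 16 kHz gives four full 0.5-second chunks.
sr = 16000
audio = np.zeros(35200, dtype=np.float32)
chunks = fixed_length_chunks(audio, sr, segment_size=0.5)
```

Padding the last chunk with zeros instead of dropping it is an equally plausible design; which one suits depends on whether the downstream model tolerates padded input.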

### Support

> **Tested with:** `python3.6` `python3.7` `python3.8`

> **Internal audio representation:** `PCM 16` `float32`
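
For readers unfamiliar with the two representations, the sketch below shows one common convention for converting between 16-bit PCM integers and float32 samples in the range [-1, 1). The scaling factor of 32768 and both function names are assumptions for illustration; the README does not state how audioperm converts between the two.

```python
import numpy as np

def pcm16_to_float32(pcm):
    """Scale int16 samples to float32 in [-1.0, 1.0)."""
    return pcm.astype(np.float32) / 32768.0

def float32_to_pcm16(audio):
    """Scale float samples back to int16, clipping values that overflow."""
    scaled = np.round(audio * 32768.0)
    return np.clip(scaled, -32768, 32767).astype(np.int16)

pcm = np.array([0, 16384, -32768, 32767], dtype=np.int16)
as_float = pcm16_to_float32(pcm)
round_trip = float32_to_pcm16(as_float)
```

With this symmetric scaling, an int16 signal survives the round trip unchanged, while a float sample of exactly 1.0 is clipped to 32767.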

> **TO-DO:**

 - [ ] multi-channel audio

 - [ ] augmentation

 - [ ] multi-processing

 - [ ] gpu-support

### Others

> **To run the code:** [Google Colab](https://colab.research.google.com/github/zabir-nabil/audioperm/blob/main/notebooks/audioperm_demo.ipynb)

> Any contribution is welcome. 

  - [Contributors](https://github.com/zabir-nabil/audioperm/graphs/contributors)

  - [Contribution guide](https://github.com/zabir-nabil/audioperm/blob/main/CONTRIBUTE.md)
