https://github.com/micah5/pyaudioclassification
🎶 dead simple audio classification
- Host: GitHub
- URL: https://github.com/micah5/pyaudioclassification
- Owner: micah5
- License: mit
- Created: 2018-10-15T05:03:30.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2019-11-14T15:52:21.000Z (about 5 years ago)
- Last Synced: 2024-10-12T10:06:06.007Z (3 months ago)
- Topics: audio-classification, audio-processing, keras, neural-network
- Language: Python
- Homepage:
- Size: 19.5 MB
- Stars: 134
- Watchers: 7
- Forks: 23
- Open Issues: 8
Metadata Files:
- Readme: README.md
- License: LICENSE
# pyAudioClassification
Dead simple audio classification

![PyPI - Python Version](https://img.shields.io/badge/python-3.1.0-blue.svg)
[![PyPI](https://img.shields.io/badge/pypi-v0.1.3-blue.svg)](https://pypi.org/project/pyaudioclassification/)
## Who is this for? 👩💻 👨💻
People who just want to classify some audio quickly, without having to dive into the world of audio analysis.
If you need something a little more involved, check out [pyAudioAnalysis](https://github.com/tyiannak/pyAudioAnalysis) or [panotti](https://github.com/drscotthawley/panotti).

## Quick install
```
pip install pyaudioclassification
```

### Requirements
* __Python 3__
* Keras
* Tensorflow
* librosa
* NumPy
* Soundfile
* tqdm
* matplotlib

## Quick start
```python
from pyaudioclassification import feature_extraction, train, predict

# data_path and audio_path are placeholders for your own paths
features, labels = feature_extraction(data_path)
model = train(features, labels)
pred = predict(model, audio_path)
```

Or, if you're feeling reckless, you could just string them together like so:
```python
# data_path and audio_path are placeholders; * unpacks (features, labels)
pred = predict(train(*feature_extraction(data_path)), audio_path)
```

A full example with saving, loading and some dummy data can be found [here](https://github.com/98mprice/pyAudioClassification/blob/master/example/test.py).

---
Read below for a more detailed look at each of these calls.
## Detailed Guide
### Step 1: Preprocessing 🐶 🐱
First, add all your audio files to a directory with the following structure:
```
data/
├── <class_name>/
│   ├── <audio_file>
│   └── ...
└── ...
```

For example, if you were trying to classify dog and cat sounds, it might look like this:
```
data/
├── cat/
│   ├── cat1.ogg
│   ├── cat2.ogg
│   ├── cat3.wav
│   └── cat4.wav
└── dog/
    ├── dog1.ogg
    ├── dog2.ogg
    ├── dog3.wav
    └── dog4.wav
```

Great, now we need to preprocess this data. Just call `feature_extraction()` and it'll return our input and target data.
Something like this:
```python
features, labels = feature_extraction('/Users/mac2015/data/')
```

(If you don't want to print to stdout, just pass `verbose=False` as an argument.)

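
Before running feature extraction, it can be worth checking that your directory matches the layout described above. The following is a minimal sketch using only the standard library. `count_files_per_class` is a hypothetical helper, not part of pyAudioClassification, and the extension set is an assumption based on the `.ogg` and `.wav` files in the example.

```python
from pathlib import Path

# Assumption: only the extensions shown in the cat/dog example are counted.
AUDIO_EXTENSIONS = {'.ogg', '.wav'}

def count_files_per_class(data_dir):
    """Return {class_name: number_of_audio_files} for each subdirectory."""
    counts = {}
    for class_dir in sorted(Path(data_dir).iterdir()):
        if not class_dir.is_dir():
            continue  # skip stray files at the top level
        counts[class_dir.name] = sum(
            1 for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in AUDIO_EXTENSIONS
        )
    return counts
```

For the cat/dog layout shown earlier, this would report four files for `cat` and four for `dog`. A class with a count of zero usually means a misplaced folder or an unexpected file extension.
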
---
Depending on how much data you have, this process could take a while, so it might be a good idea to save the result. You can save and load with [NumPy](https://www.numpy.org/):
```python
import numpy as np
filename = 'features'  # placeholder name; pick your own
np.save('%s.npy' % filename, features)
features = np.load('%s.npy' % filename)
```

### Step 2: Training 💪
Next step is to train your model on the data. You can just call...
```python
model = train(features, labels)
```
...but depending on your dataset, you might need to play around with some of the hyper-parameters to get the best results.

#### Options
* `epochs`: The number of iterations. Default is `50`.
* `lr`: Learning rate. Increase to speed up training time, decrease to get more accurate results (if your loss is 'jumping'). Default is `0.01`.
* `optimiser`: Choose any of [these](https://keras.io/optimizers/). Default is `'SGD'`.
* `print_summary`: Prints a summary of the model you'll be training. Default is `False`.
* `loss_type`: Classification type. Default is `categorical` for >2 classes, and `binary` otherwise.

You can pass any of these as keyword arguments, for example `train(features, labels, lr=0.05)`.

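
If you want to try several option values in one go, a small loop over combinations can help. This is only a sketch: `sweep` is a hypothetical helper, not part of pyAudioClassification, and it assumes the training function accepts the options above as keyword arguments, as in the `lr=0.05` example. It does not evaluate the models; how you compare them is up to you.

```python
from itertools import product

def sweep(train_fn, features, labels, grid):
    """Call train_fn once per combination of the options in grid.

    grid maps an option name (e.g. 'lr') to a list of values to try.
    Returns a list of (options, model) pairs.
    """
    names = list(grid)
    results = []
    for values in product(*(grid[name] for name in names)):
        options = dict(zip(names, values))
        results.append((options, train_fn(features, labels, **options)))
    return results
```

For example, `sweep(train, features, labels, {'lr': [0.01, 0.05], 'epochs': [50, 100]})` would train four models, one per combination. Keep the grid small, since every combination is a full training run.
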
---
Again, you probably want to save your model once it's done training. You can do this with Keras:
```python
from keras.models import load_model

model.save('my_model.h5')
model = load_model('my_model.h5')
```

### Step 3: Prediction 🙏 🙌
Now the fun part: try your trained model on new data!

```python
pred = predict(model, audio_path)  # audio_path is a placeholder
```

Your `audio_path` should point to a new, untested audio file.
#### Binary
If you have 2 classes (or if you explicitly selected `'binary'` as the type), `pred` will just be a single number for each file.

The closer it is to 0, the more confident the prediction is in the first class, and the closer it is to 1, the more confident it is in the second class.
So for our cat/dog example, if it returns `0.2` it's 80% sure the sound is a cat, and if it returns `0.8` it's 80% sure it's a dog.
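
The rule above can be written as a small helper. This is a sketch, not part of pyAudioClassification: `interpret_binary` is a hypothetical name, it assumes you have already pulled a plain float out of `pred`, and the 0.5 cut-off is an assumption.

```python
def interpret_binary(score, class_names):
    """Map a score in [0, 1] to (label, confidence).

    class_names is (first_class, second_class): scores near 0 mean the
    first class and scores near 1 mean the second.
    """
    first, second = class_names
    if score >= 0.5:  # assumed decision threshold
        return second, score
    return first, 1.0 - score
```

With `('cat', 'dog')` as the class names, a score of `0.2` maps to `cat` with a confidence of about 0.8, and a score of `0.8` maps to `dog` with the same confidence.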
#### Categorical
If you have more than 2 classes (or if you explicitly selected `'categorical'` as the type), `pred` will be an array for each sound file.

It'll look something like this:
```
[[1.6454633e-06 3.7017996e-11 9.9999821e-01 1.5900606e-07]]
```

The index of each item in the array will correspond to the prediction for that class.

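
To turn such an array into a single answer, you can pick the index with the highest value and look up its class name. The sketch below uses plain Python lists. `top_class` is a hypothetical helper, and the class order in `names` is an assumption taken from the leaderboard example in this README; use the order your own model was trained with.

```python
def top_class(probabilities, class_names):
    """Return (class_name, probability) for the highest-scoring class."""
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return class_names[best], probabilities[best]

# Values copied from the sample output above (one row of pred).
row = [1.6454633e-06, 3.7017996e-11, 9.9999821e-01, 1.5900606e-07]
# Assumed class order; replace with the order used by your own model.
names = ['rooster', 'pig', 'cow', 'frog']
print(top_class(row, names))
```

Here the largest value sits at index 2, so the helper selects `'cow'`.
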
---
You can pretty print the predictions by showing them in a leaderboard, like so:

```python
print_leaderboard(pred, )
```
It looks like this:

```
1. Cow 100.0% (index 2)
2. Rooster 0.0% (index 0)
3. Frog 0.0% (index 3)
4. Pig 0.0% (index 1)
```

## References
* Large parts of the code (particularly the feature extraction) are based on [mtobeiyf/audio-classification](https://github.com/mtobeiyf/audio-classification)
* [panotti](https://github.com/drscotthawley/panotti)