{"id":13861478,"url":"https://github.com/micah5/pyAudioClassification","last_synced_at":"2025-07-14T09:32:19.859Z","repository":{"id":57455708,"uuid":"153055678","full_name":"micah5/pyAudioClassification","owner":"micah5","description":"🎶 dead simple audio classification","archived":false,"fork":false,"pushed_at":"2019-11-14T15:52:21.000Z","size":20443,"stargazers_count":134,"open_issues_count":8,"forks_count":22,"subscribers_count":7,"default_branch":"master","last_synced_at":"2025-03-30T22:32:18.634Z","etag":null,"topics":["audio-classification","audio-processing","keras","neural-network"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/micah5.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-10-15T05:03:30.000Z","updated_at":"2024-09-10T18:06:20.000Z","dependencies_parsed_at":"2022-09-01T05:02:32.380Z","dependency_job_id":null,"html_url":"https://github.com/micah5/pyAudioClassification","commit_stats":null,"previous_names":["98mprice/pyaudioclassification"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/micah5/pyAudioClassification","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/micah5%2FpyAudioClassification","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/micah5%2FpyAudioClassification/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/micah5%2FpyAudioClassification/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/micah5%2FpyAudioClassification/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/Git
Hub/owners/micah5","download_url":"https://codeload.github.com/micah5/pyAudioClassification/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/micah5%2FpyAudioClassification/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":265269367,"owners_count":23737831,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["audio-classification","audio-processing","keras","neural-network"],"created_at":"2024-08-05T06:01:23.306Z","updated_at":"2025-07-14T09:32:14.839Z","avatar_url":"https://github.com/micah5.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"# pyAudioClassification\nDead simple audio classification\n\n![PyPI - Python Version](https://img.shields.io/badge/python-3.1.0-blue.svg)\n[![PyPI](https://img.shields.io/badge/pypi-v0.1.3-blue.svg)](https://pypi.org/project/pyaudioclassification/)\n## Who is this for? 
👩‍💻 👨‍💻\nPeople who just want to classify some audio quickly, without having to dive into the world of audio analysis.\nIf you need something a little more involved, check out [pyAudioAnalysis](https://github.com/tyiannak/pyAudioAnalysis) or [panotti](https://github.com/drscotthawley/panotti)\n\n## Quick install\n```\npip install pyaudioclassification\n```\n\n### Requirements\n* __Python 3__\n* Keras\n* Tensorflow\n* librosa\n* NumPy\n* Soundfile\n* tqdm\n* matplotlib\n\n## Quick start\n```python\nfrom pyaudioclassification import feature_extraction, train, predict\nfeatures, labels = feature_extraction(\u003cdata_path\u003e)\nmodel = train(features, labels)\npred = predict(model, \u003cdata_path\u003e)\n```\n\nOr, if you're feeling reckless, you could just string them together like so:\n```python\npred = predict(train(feature_extraction(\u003ctraining_data_path\u003e)), \u003cprediction_data_path\u003e)\n```\n\nA full example with saving, loading \u0026 some dummy data can be found [here](https://github.com/98mprice/pyAudioClassification/blob/master/example/test.py).\n\n---\n\nRead below for a more detailed look at each of these calls.\n\n## Detailed Guide\n### Step 1: Preprocessing 🐶 🐱\nFirst, add all your audio files to a directory in the following structure\n```\ndata/\n├── \u003cclass_name\u003e/\n│   ├── \u003cfile_name\u003e\n│   └── ...\n└── ...\n```\n\nFor example, if you were trying to classify dog and cat sounds it might look like this\n```\ndata/\n├── cat/\n│   ├── cat1.ogg\n│   ├── cat2.ogg\n│   ├── cat3.wav\n│   └── cat4.wav\n└── dog/\n    ├── dog1.ogg\n    ├── dog2.ogg\n    ├── dog3.wav\n    └── dog4.wav\n```\n\nGreat, now we need to preprocess this data. 
Just call `feature_extraction(\u003cdata_path\u003e)` and it'll return our input and target data.\nSomething like this:\n```python\nfeatures, labels = feature_extraction('/Users/mac2015/data/')\n```\n\n(If you don't want to print to stdout, just pass `verbose=False` as an argument.)\n\n---\nDepending on how much data you have, this process could take a while... so it might be a good idea to save the extracted features. You can save and load them with [NumPy](https://www.numpy.org/):\n```python\nimport numpy as np\n\nnp.save('%s.npy' % \u003cfile_name\u003e, features)\nfeatures = np.load('%s.npy' % \u003cfile_name\u003e)\n```\n\n### Step 2: Training 💪\nThe next step is to train your model on the data. You can just call...\n```python\nmodel = train(features, labels)\n```\n...but depending on your dataset, you might need to play around with some of the hyper-parameters to get the best results.\n\n#### Options\n* `epochs`: The number of training iterations (epochs). Default is `50`.\n\n* `lr`: Learning rate. Increase to speed up training, decrease to get more accurate results (for example, if your loss is 'jumping'). Default is `0.01`.\n\n* `optimiser`: Choose any of [these](https://keras.io/optimizers/). Default is `'SGD'`.\n\n* `print_summary`: Prints a summary of the model you'll be training. Default is `False`.\n\n* `loss_type`: Classification type. Default is `categorical` for \u003e2 classes, and `binary` otherwise.\n\nYou can pass any of these as optional arguments, for example `train(features, labels, lr=0.05)`.\n\n---\nAgain, you probably want to save your model once it's done training. 
You can do this with Keras:\n```python\nfrom keras.models import load_model\n\nmodel.save('my_model.h5')\nmodel = load_model('my_model.h5')\n```\n\n### Step 3: Prediction 🙏 🙌\nNow the fun part: try your trained model on new data!\n\n```python\npred = predict(model, \u003cdata_path\u003e)\n```\n\nYour `\u003cdata_path\u003e` should point to a new, untested audio file.\n\n#### Binary\nIf you have 2 classes (or if you explicitly selected `'binary'` as the type), `pred` will just be a single number for each file.\n\nThe closer it is to 0, the more confident the prediction is for the first class, and the closer it is to 1, the more confident it is for the second class.\n\nSo for our cat/dog example, if it returns `0.2` it's 80% sure the sound is a cat, and if it returns `0.8` it's 80% sure it's a dog.\n\n#### Categorical\nIf you have more than 2 classes (or if you explicitly selected `'categorical'` as the type), `pred` will be an array for each sound file.\n\nIt'll look something like this:\n```\n[[1.6454633e-06 3.7017996e-11 9.9999821e-01 1.5900606e-07]]\n```\n\nThe index of each item in the array corresponds to the prediction for that class.\n\n---\nYou can pretty print the predictions by showing them in a leaderboard, like so:\n\n```python\nprint_leaderboard(pred, \u003ctraining_data_path\u003e)\n```\nIt looks like this:\n\n```\n1. Cow 100.0% (index 2)\n2. Rooster 0.0% (index 0)\n3. Frog 0.0% (index 3)\n4. Pig 0.0% (index 1)\n```\n\n## References\n* Large parts of the code (particularly the feature extraction) are based on [mtobeiyf/audio-classification](https://github.com/mtobeiyf/audio-classification)\n* [panotti](https://github.com/drscotthawley/panotti)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmicah5%2FpyAudioClassification","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmicah5%2FpyAudioClassification","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmicah5%2FpyAudioClassification/lists"}