https://github.com/sabaudian/music_genre_classification_project

Audio Pattern Recognition project - Music Genres Classification
artificial-intelligence audio-analysis audio-classification audio-processing genre-classification genres-classification k-nearest-neighbours k-nn machine-learning music-genre-classification music-information-retrieval neural-network preprocessing preprocessing-data python random-forest random-forest-classification svm svm-classifier


# Music Genre Classification Project

This repository implements musical genre recognition using both supervised and unsupervised learning.

_Figure: project architecture (`apr_project_architecture`)._

## Dependencies:
- numpy: https://numpy.org
- librosa: https://librosa.org/doc/latest/index.html
- matplotlib: https://matplotlib.org
- pydub: https://pypi.org/project/pydub/
- pandas: https://pandas.pydata.org
- scikit-learn: https://scikit-learn.org/stable/
```
$ pip install -r requirements.txt
```

## Information:
The dataset used to build this project is the well-known GTZAN dataset, downloaded from Kaggle (_**link to dataset:** https://www.kaggle.com/datasets/andradaolteanu/gtzan-dataset-music-genre-classification_).

The **utils** directory contains all the classes used for preprocessing the dataset and performing data augmentation
(I did not use the csv file available at the link above; I built my own).

- _**features_computation:**_ computes the various features to extract from the audio files.
- _**features_extractions:**_ writes the computed features to a csv file in a dedicated directory.
- _**features_visualizations:**_ visualizes the individual audio signals and the extracted features, with a comparison across the different genres.
- _**prepare_dataset:**_ checks the duration of the audio files and performs data augmentation (each 30 s file -> ten 3 s chunks).
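
The chunking step described for **prepare_dataset** can be illustrated with a minimal sketch. This is not the repository's code: the function name `split_into_chunks`, its parameters, and the toy 100 Hz signal are assumptions made for illustration. It works on any sliceable sequence of samples, for example the NumPy array returned by `librosa.load`.

```python
def split_into_chunks(samples, sample_rate, chunk_seconds=3.0):
    """Split a 1-D sequence of audio samples into equal, non-overlapping chunks.

    Trailing samples that do not fill a whole chunk are dropped.
    """
    chunk_len = int(sample_rate * chunk_seconds)
    if chunk_len <= 0:
        raise ValueError("chunk length must be at least one sample")
    n_chunks = len(samples) // chunk_len
    return [samples[i * chunk_len:(i + 1) * chunk_len] for i in range(n_chunks)]


# Toy signal (hypothetical): 30 seconds at a 100 Hz sample rate.
sample_rate = 100
signal = list(range(30 * sample_rate))

chunks = split_into_chunks(signal, sample_rate)
print(len(chunks), len(chunks[0]))  # 10 chunks of 300 samples each
```

Each chunk keeps the genre label of its source file, so one 30 s track yields ten labelled training examples.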

Then there are the core classes of the project:

- _**main:**_ main class of the project; it calls all the others.
- _**genres_ul_functions:**_ performs k-means clustering and evaluates the result.
- _**genres_sl_functions:**_ runs several classification algorithms (Neural Network, Random Forest, K-Nearest Neighbors, Support Vector Machine) and evaluates their performance with confusion matrices, ROC curves, and metrics (accuracy, F1-score, ...).
- _**plot_functions:**_ defines all the plot functions.
- _**constants:**_ contains all the constants used in the project.
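
To make the evaluation metrics concrete, here is a small sketch of how accuracy and F1-score can be derived from true and predicted labels. It is an illustration only, not the code in **genres_sl_functions**: the function names and the toy labels are hypothetical, and macro averaging of the per-genre F1 is an assumption, since the README does not state which averaging is used.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (macro averaging, assumed)."""
    # counts[(true, predicted)] holds one cell of the confusion matrix.
    counts = Counter(zip(y_true, y_pred))
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = counts[(label, label)]
        fp = sum(counts[(other, label)] for other in labels if other != label)
        fn = sum(counts[(label, other)] for other in labels if other != label)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels (hypothetical), not taken from the GTZAN experiments.
y_true = ["rock", "rock", "jazz", "jazz", "pop", "pop"]
y_pred = ["rock", "jazz", "jazz", "jazz", "pop", "rock"]

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.3f}")
```

In practice, scikit-learn's `accuracy_score` and `f1_score(..., average="macro")` compute the same quantities.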

## Performance Summary:

| | MULTILAYER PERCEPTRON | RANDOM FOREST | K-NEAREST NEIGHBORS | SUPPORT VECTOR MACHINE |
| - | --------------------- | ------------- | ------------------- | ---------------------- |
| ACCURACY (%) | 84.80 | 79.33 | 89.80 | 89.40 |
| F1-SCORE | 0.85 | 0.79 | 0.90 | 0.89 |
| EXECUTION TIME (sec) | 63.72 | 52.76 | 7.73 | 21.40 |