https://github.com/aitor-alvarez/mir-song-dataset-collection
Scripts to create Music Information Retrieval datasets from streaming services for singer identification tasks
audio-signal-processing dataset-generation deep-learning-dataset machine-learning-dataset music-information-retrieval singer-identification-tasks
Last synced: 2 months ago
- Host: GitHub
- URL: https://github.com/aitor-alvarez/mir-song-dataset-collection
- Owner: aitor-alvarez
- Created: 2023-10-02T23:47:53.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2023-11-26T19:12:46.000Z (over 1 year ago)
- Last Synced: 2025-01-25T16:44:17.001Z (4 months ago)
- Topics: audio-signal-processing, dataset-generation, deep-learning-dataset, machine-learning-dataset, music-information-retrieval, singer-identification-tasks
- Language: Python
- Homepage:
- Size: 16.6 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# MIR-song-dataset-collection
The scripts in this repository search the iTunes API for the artists provided in a list (see step 2 under Instructions) and download a 30-second preview of each of their songs.
The resulting dataset can be used to train deep learning models for singer identification tasks.
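As an illustration of the search step, here is a minimal sketch that queries the public iTunes Search API for an artist's songs and keeps the entries that include a preview URL. It is not the code in `main.py`; the function names (`build_search_url`, `extract_previews`, `search_artist`) and the returned field names are hypothetical and chosen only for this example.

```python
import json
import urllib.parse
import urllib.request

ITUNES_SEARCH_URL = "https://itunes.apple.com/search"


def build_search_url(artist, limit=50):
    """Build an iTunes Search API URL that looks up songs by artist name."""
    params = {
        "term": artist,
        "media": "music",
        "entity": "song",
        "attribute": "artistTerm",  # match the term against artist names
        "limit": limit,
    }
    return ITUNES_SEARCH_URL + "?" + urllib.parse.urlencode(params)


def extract_previews(payload):
    """Keep only the results that carry a preview URL (hypothetical row layout)."""
    rows = []
    for item in payload.get("results", []):
        if item.get("previewUrl"):
            rows.append({
                "artist": item.get("artistName", ""),
                "track": item.get("trackName", ""),
                "preview_url": item["previewUrl"],
            })
    return rows


def search_artist(artist, limit=50):
    """Query the API over the network and return the preview rows."""
    with urllib.request.urlopen(build_search_url(artist, limit), timeout=30) as resp:
        return extract_previews(json.load(resp))
```

Calling `search_artist("Nina Simone", limit=5)` would return a list of dictionaries, one per song that has a preview. Results depend on what the API returns at the time of the request.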
## Instructions
1. Clone the current repository: ``` git clone https://github.com/aitor-alvarez/MIR-song-dataset-collection.git ```
2. Create a file listing the artists/performers. Name the file ```artists.txt``` and give it a single column: the first row is the header ```artist```, and each following row holds one artist's name.
3. Execute the following command to parse the list of artists and search the iTunes API for their songs: ```python main.py -a artists.txt ```. This creates a file named dataset.csv containing the songs returned for those artists.
4. Feel free to edit dataset.csv if you want to exclude songs.
5. To download the 30-second previews, execute: ``` python main.py -d ```
6. All previews will be downloaded into a subfolder within this repository named ```songs/```
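
The file handling in steps 2, 5 and 6 can be sketched as follows. This is only an illustration under stated assumptions, not the repository's implementation: the helper names (`read_artists`, `preview_path`, `download_preview`), the file naming scheme, and the `.m4a` extension are all hypothetical.

```python
import re
import shutil
import urllib.request
from pathlib import Path


def read_artists(path="artists.txt"):
    """Read one artist per line, skipping blank lines and the 'artist' header."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    names = [line.strip() for line in lines if line.strip()]
    if names and names[0].lower() == "artist":
        names = names[1:]
    return names


def preview_path(artist, track, folder="songs"):
    """Build a filesystem-safe target path (assumed naming scheme and extension)."""
    safe = re.sub(r"[^\w\- ]+", "_", f"{artist} - {track}").strip()
    return Path(folder) / f"{safe}.m4a"


def download_preview(url, dest):
    """Stream one preview from `url` into `dest`, creating the folder if needed."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(url, timeout=30) as resp, open(dest, "wb") as out:
        shutil.copyfileobj(resp, out)
```

An ```artists.txt``` file containing the lines `artist`, `Nina Simone` and `Aretha Franklin` would yield `["Nina Simone", "Aretha Franklin"]` from `read_artists()`. Reading the file line by line, rather than with a CSV parser, keeps names that contain commas intact.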