{"id":13584966,"url":"https://github.com/gabolsgabs/DALI","last_synced_at":"2025-04-07T06:32:04.987Z","repository":{"id":38848187,"uuid":"136622322","full_name":"gabolsgabs/DALI","owner":"gabolsgabs","description":" DALI: a large Dataset of synchronised Audio, LyrIcs and vocal notes.","archived":false,"fork":false,"pushed_at":"2020-06-11T11:07:17.000Z","size":32472,"stargazers_count":349,"open_issues_count":2,"forks_count":34,"subscribers_count":11,"default_branch":"master","last_synced_at":"2024-11-06T02:38:10.380Z","etag":null,"topics":["dataset","deep-learning","ismir","music-information-retrieval","singing-voice","teacher-student"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/gabolsgabs.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"citations/DALI_v1.0.bib","codeowners":null,"security":null,"support":null}},"created_at":"2018-06-08T13:26:30.000Z","updated_at":"2024-10-13T14:23:19.000Z","dependencies_parsed_at":"2022-09-20T11:37:29.125Z","dependency_job_id":null,"html_url":"https://github.com/gabolsgabs/DALI","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gabolsgabs%2FDALI","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gabolsgabs%2FDALI/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gabolsgabs%2FDALI/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gabolsgabs%2FDALI/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/gabolsgabs","download_url":"https://codeload.github.com/gabo
lsgabs/DALI/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247607159,"owners_count":20965925,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["dataset","deep-learning","ismir","music-information-retrieval","singing-voice","teacher-student"],"created_at":"2024-08-01T15:04:37.975Z","updated_at":"2025-04-07T06:32:04.973Z","avatar_url":"https://github.com/gabolsgabs.png","language":"Python","funding_links":[],"categories":["Python","Multimodal Embedding Space","Audio Datasets","Datasets"],"sub_categories":["Creative Uses of Generative AI Image Synthesis Tools","Music"],"readme":"\n[horizontal]: ./docs/images/horizontal.png\n[vertical]: ./docs/images/vertical.png\n[p1]: ./docs/images/p1.png\n[l1]: ./docs/images/l1.png\n[w1]: ./docs/images/w1.png\n[Example]: ./docs/images/Example.png\n\n\n# WELCOME TO THE DALI DATASET: a large **D**ataset of synchronised **A**udio, **L**yr**I**cs and vocal notes.\n\nYou can find a detailed explanation of how DALI has been created at:\n***[Meseguer-Brocal_2018]*** [G. Meseguer-Brocal, A. Cohen-Hadria and G. Peeters. DALI: a large Dataset of synchronized Audio, LyrIcs and notes, automatically created using teacher-student machine learning paradigm. 
In ISMIR Paris, France, 2018.](http://ismir2018.ircam.fr/doc/pdfs/35_Paper.pdf)\n\nCite this [paper](https://zenodo.org/record/1492443):\n\n\u003e@inproceedings{Meseguer-Brocal_2018,\n\tAuthor = {Meseguer-Brocal, Gabriel and Cohen-Hadria, Alice and Peeters, Geoffroy},\n\tBooktitle = {19th International Society for Music Information Retrieval Conference},\n\tEditor = {ISMIR},\n\tMonth = {September},\n\tTitle = {DALI: a large Dataset of synchronized Audio, LyrIcs and notes, automatically created using teacher-student machine learning paradigm.},\n\tYear = {2018}}\n\n\n\nHere's an example of the kind of information DALI contains:\n\n![alt text][Example]\n\n\nDALI has two main elements:\n\n## 1- The dataset - dali_data\n\nThe dataset itself. It is denoted as **dali_data** and it is presented as a collection of **gz** files.\nYou can find the different DALI_data versions in [here](https://github.com/gabolsgabs/DALI/blob/master/versions/).\n\n## 2- The code for working with DALI - dali_code\nThe code, denoted as **dali_code**, for reading and working with dali_data.\nIt is stored in this repository and presented as a python package.\nDali_code has its own versions controlled by this github.\nThe release and stable versions can be found at [pypi](https://pypi.org/project/DALI-dataset/).\n\nrepository\u003cbr\u003e\n├── code\u003cbr\u003e\n│   ├── DALI\u003cbr\u003e\n│   │   ├── \\_\\_init\\_\\_.py\u003cbr\u003e\n│   │   ├── Annotations.py\u003cbr\u003e\n│   │   ├── main.py\u003cbr\u003e\n│   │   ├── utilities.py\u003cbr\u003e\n│   │   ├── extra.py\u003cbr\u003e\n│   │   ├── download.py\u003cbr\u003e\n│   │   ├── vizualization.py\u003cbr\u003e\n│   └── setup.py\u003cbr\u003e\n\n\n# NEWS:\n\nGround-Truth for version 1.0 updated with 105 songs.\nRemember that DALI is a ongoing project. 
There are many things to solve.\n\nCurrently we are working on:\n* the second generation of the singing voice detection system.\n* solving errors in individual notes.\n* solving global note errors (songs where all the notes are offset by the same interval).\n* solving errors in local note alignments.\n\nIf you have any suggestions or improvements, please contact us at: dali [dot] dataset [at] gmail [dot] com\n\nFor any problem with the package that deals with the annotations, open an issue in this repository.\n\nThank you.\n\n# TUTORIAL:\n\nFirst of all, [download](https://github.com/gabolsgabs/DALI/blob/master/versions/) your Dali_data version and clone this repository.\n\n\n## 0- Installing Dali_code.\nFor the release and stable versions just run the command:\n\n  \u003e  pip install dali-dataset\n\nFor non-release and unstable versions, you can install them manually by going to the folder DALI/code and running:\n\n  \u003e  pip install .\n\nYou can upgrade DALI to future versions with:\n\n  \u003e  pip install dali-dataset --upgrade\n\nDALI can be uninstalled with:\n\n  \u003e  pip uninstall dali-dataset\n\nRequirements: **numpy** and **youtube_dl**\n\n**NOTE**: the version of the code in pip only refers to the code itself. The different versions of the Dali_data can be found above.\n\n\n## 1- Loading DALI_data.\n\nDALI is presented as a set of **gz** files.\nEach gz contains the annotations of a particular song.\nWe use a unique id for each entry.\nYou can load your dali_data version as follows:\n\n    import DALI as dali_code\n    dali_data_path = 'full_path_to_your_dali_data'\n    dali_data = dali_code.get_the_DALI_dataset(dali_data_path, skip=[], keep=[])\n\nThis function can also be used to load a subset of the DALI dataset by providing the ids of the entries you either want to **skip** or to **keep**.\n\n\n**NOTE**: Loading DALI might take a few minutes depending on your computer and Python version. 
Python3 is faster than python2.\n\nAdditionally, each DALI version contains a DALI_DATA_INFO.gz:\n\n    dali_info = dali_code.get_info(dali_data_path + 'info/DALI_DATA_INFO.gz')\n    print(dali_info[0]) -\u003e array(['DALI_ID', 'NAME', 'YOUTUBE', 'WORKING'])\n\nThis file matches the unique DALI id with the artist_name-song_title, the YouTube url and a bool that says whether the YouTube link is working or not.  \n\n\u003c!--- This file is updated with --\u003e\n\n## 1.1- An annotation instance.\n\n_dali_data_ is a dictionary where each key is a unique id and the value is an instance of the class DALI/Annotations, namely **an annotation instance**.\n\n    entry = dali_data['a_dali_unique_id']\n    type(entry) -\u003e DALI.Annotations.Annotations\n\nEach annotation instance has two attributes: **info** and **annotations**.\n\n    entry.info --\u003e {'id': 'a_dali_unique_id',\n                    'artist': 'An Artist',\n                    'title': 'A song title',\n                    'dataset_version': 1.0,     **# dali_data version**\n                    'ground-truth': False,     \n                    'scores': {'NCC': 0.8098520072498807,\n                               'manual': 0.0},  **# Not ready yet**\n                    'audio': {'url': 'a youtube url',\n                              'path': 'None',   \n                              **# Up to you to modify it to point to your local audio file**\n                              'working': True},\n                    'metadata': {'album': 'An album title',\n                                 'release_date': 'A year',\n                                 'cover': 'link to an image with the cover',\n                                 'genres': ['genre_0', ... 
, 'genre_n'],\n                                 # The number of genres depends on the song\n                                 'language': 'a language'}}\n\n    entry.annotations --\u003e {'annot': {'the annotations themselves'},\n                           'type': 'horizontal' or 'vertical',\n                           'annot_param': {'fr': float(frame rate used in the annotation process),\n                                          'offset': float(offset value)}}\n\n\n## 1.2- Saving as json.\n\nYou can export annotations to a json file and import them back.\n\n        import os\n        path_save = 'my_full_save_path'\n        name = 'my_annot_name'\n        # export\n        entry.write_json(path_save, name)\n        # import\n        my_json_entry = dali_code.Annotations()\n        my_json_entry.read_json(os.path.join(path_save, name+'.json'))\n\n\n## 1.3- Ground-truth.\n\nEach dali_data version has its own [ground-truth file](https://github.com/gabolsgabs/DALI/tree/master/versions/).\nThe annotations that are part of the ground-truth are entries of the dali_data with the offset and fr parameters manually annotated.\n\nYou can easily load a ground-truth file:\n\n    gt_file = 'full_path_to_my_ground_truth_file'\n    # you can load the ground-truth\n    gt = dali_code.utilities.read_gzip(gt_file)\n    type(gt) --\u003e dict\n    gt['a_dali_unique_id'] --\u003e {'offset': float(a_number),\n                                'fr': float(a_number)}\n\nYou can also load a **dali_gt** with all the entries of the dali_data that are part of the ground-truth, with their annotations updated to the manually annotated offset and fr parameters:\n\n    # dali_gt only with ground_truth songs\n    gt = dali_code.utilities.read_gzip(gt_file)\n    dali_gt = dali_code.get_the_DALI_dataset(dali_data_path, gt_file, keep=gt.keys())\n    len(dali_gt) -\u003e == len(gt)\n\n\nYou can also load the whole dali_data and update the songs that are part of the ground truth with the manually verified offset and fr parameters.\n\n    # Two options:\n    # 1- once you have your dali_data\n    dali_data = dali_code.update_with_ground_truth(dali_data, gt_file)\n\n    # 2- while reading the dataset\n    dali_data = dali_code.get_the_DALI_dataset(dali_data_path, gt_file=gt_file)\n\n\nNOTE 1: Please be sure you have the latest [ground truth version](https://github.com/gabolsgabs/DALI/tree/master/versions/).\n\n# 2- Getting the audio.\n\nYou can retrieve the audio for each annotation (if available) using the function dali_code.get_audio():\n\n    path_audio = 'full_path_to_store_the_audio'\n    errors = dali_code.get_audio(dali_info, path_audio, skip=[], keep=[])\n    errors -\u003e ['dali_id', 'youtube_url', 'error']\n\nThis function can also be used to download a subset of the DALI dataset by providing the ids of the entries you either want to **skip** or to **keep**.\n\n\n# 3- Working with DALI.\n\nAnnotations are in:\n\u003e entry.annotations['annot']\n\nand they are presented in two different formats: **'horizontal'** or **'vertical'**.\nYou can easily change the format using the functions:\n\n      entry.horizontal2vertical()\n      entry.vertical2horizontal()\n\n## 3.1- Horizontal.\nIn this format each level of granularity is stored individually.\nIt is the default format.\n\n![alt text][horizontal]\n\n    entry.vertical2horizontal() --\u003e 'Annot are already in a vertical format'\n    entry.annotations['type'] --\u003e 'horizontal'\n    entry.annotations['annot'].keys() --\u003e ['notes', 'lines', 'words', 'paragraphs']\n\nEach level contains a list of annotations where each element has:\n\n    my_annot = entry.annotations['annot']['notes']\n    my_annot[0] --\u003e {'text': 'wo', # the annotation itself.\n                     'time': [12.534, 12.659], # the beginning and end of the segment in seconds.\n                     'freq': [466.1637615180899, 466.1637615180899], # The range of frequency the text information is covering. 
At the lowest level, syllables, it corresponds to the vocal note.\n                     'index': 0} # link with the upper level. For example, index 0 at the 'words' level means that that particular word belongs to the first line ([0]). The paragraphs level has no index key.\n\n### 3.1.1- Visualizing an annotation file.\n\nYou can export the annotations of each individual level to an xml or text file to visualize them with Audacity or AudioSculpt. The pitch information is only presented in the xml files for AudioSculpt.\n\n        my_annot = entry.annotations['annot']['notes']\n        path_save = 'my_save_path'\n        name = 'my_annot_name'\n        dali_code.write_annot_txt(my_annot, name, path_save)\n        # import the txt file in your Audacity\n        dali_code.write_annot_xml(my_annot, name, path_save)\n        # import Rhythm XML file in AudioSculpt\n\n\n### 3.1.2- Examples.\nThis format is meant to be used for working with each level individually.\n\u003e Example 1: recovering the main vocal melody.\n\nLet's use the extra function dali_code.annot2vector() that transforms the annotations into a vector. 
There are two types of vectors:\n\n- type='voice': each frame has a value 1 or 0 for voice or not voice.\n- type='melody': each frame has the freq value of the main vocal melody.\n\n      my_annot = entry.annotations['annot']['notes']\n      time_resolution = 0.014\n      # this end time is just an example; you should use the end of your audio file\n      end_of_the_song = entry.annotations['annot']['notes'][-1]['time'][1] + 10\n      melody = dali_code.annot2vector(my_annot, end_of_the_song, time_resolution, type='melody')\n\n**NOTE: have a look at dali_code.annot2vector_chopping() for computing a vector chopped with respect to a given window and hop size.**\n\n\u003e Example 2: find the audio frames that define each paragraph.\n\nLet's use the other extra function dali_code.annot2frames() that transforms time in seconds into time in frames.\n\n      my_annot = entry.annotations['annot']['paragraphs']\n      paragraphs = [i['time'] for i in dali_code.annot2frames(my_annot, time_resolution)]\n      paragraphs --\u003e [(49408, 94584), ..., (3080265, 3299694)]\n\n\n**NOTE**: dali_code.annot2frames() can also be used in the vertical format but not dali_code.annot2vector().\n\n## 3.2- Vertical.\nIn this format the different levels of granularity are hierarchically connected:\n\n![alt text][vertical]\n\n      entry.horizontal2vertical()\n      entry.annotations['type'] --\u003e 'vertical'\n      entry.annotations['annot'].keys() --\u003e ['hierarchical']\n      my_annot = entry.annotations['annot']['hierarchical']\n\nEach element of the list is a paragraph.\n\n      my_annot[0] --\u003e {'freq': [277.1826309768721, 493.8833012561241], # The range of frequency the text information is covering\n                       'time': [12.534, 41.471500000000006], # the beginning and end of the time segment.\n                       'text': [line_0, line_1, ..., line_n]}\n\n![alt text][p1]\n\nwhere 'text' contains all the lines of the paragraph. 
Each line follows the same format:\n\n      lines_1paragraph = my_annot[0]['text']\n      lines_1paragraph[0] --\u003e {'freq': [...], 'time': [...],\n                               'text': [word_0, word_1, ..., word_n]}\n\n![alt text][l1]\n\nAgain, each word contains all the notes to be sung for that word:\n\n      words_1line_1paragraph = lines_1paragraph[0]['text']\n      words_1line_1paragraph[0] --\u003e {'freq': [...], 'time': [...],\n                                     'text': [note_0, note_1, ..., note_n]}\n\n![alt text][w1]\n\nOnly the deepest level directly has the text information.\n\n      notes_1word_1line_1paragraph = words_1line_1paragraph[1]['text']\n      notes_1word_1line_1paragraph[0] --\u003e {'freq': [...], 'time': [...],\n                                           'text': 'note text'}\n\nYou can always get the text at a specific point with dali_code.get_text(), e.g.:\n\n      dali_code.get_text(lines_1paragraph) --\u003e ['text word_0', 'text word_1', ..., text_word_n]\n      # words in the first line of the first paragraph\n\n      dali_code.get_text(my_annot[0]['text']) --\u003e ['text word_0', 'text word_1', ..., text_word_n]\n      # words in the first paragraph\n\n### 3.2.2- Examples.\nThis organization is meant to be used for working with specific hierarchical blocks.\n\n\u003e Example 1: working only with the fourth paragraph (index 3).\n\n      my_paragraph = my_annot[3]['text']\n      text_paragraph = dali_code.get_text(my_paragraph)\n\nAdditionally, you can easily retrieve all its individual information with the function dali_code.unroll():\n\n      lines_in_paragraph, _ = dali_code.unroll(my_paragraph, depth=0, output=[])\n      words_in_paragraph, _ = dali_code.unroll(my_paragraph, depth=1, output=[])\n      notes_in_paragraph, _ = dali_code.unroll(my_paragraph, depth=2, output=[])\n\n\u003e Example 2: working only with the first line of the fourth paragraph (index 3)\n\n      my_line = my_annot[3]['text'][0]['text']\n      text_line = 
dali_code.get_text(my_line)\n      words_in_line, _ = dali_code.unroll(my_line, depth=0, output=[])\n      notes_in_line, _ = dali_code.unroll(my_line, depth=1, output=[])\n\n# 4- Correcting Annotations.\n\nSo far, we only address global alignment problems. You can change this alignment by modifying the offset and frame rate parameters. The original ones are stored at:\n\n      print(entry.annotations['annot_param'])\n      {'offset': float(a_number), 'fr': float(a_number)}\n\nIf you find a better set of parameters, you can modify the annotations using the function dali_code.change_time():\n\n      dali_code.change_time(entry, new_offset, new_fr)\n      # The default new_offset and new_fr are entry.annotations['annot_param']\n\nWe encourage you to send us your parameters in order to improve DALI.\n\n_____\nYou can contact us at:\n\n\u003e dali dot dataset at gmail dot com\n\nThis research has received funding from the French National Research Agency under the contract ANR-16-CE23-0017-01 (WASABI project).\n\n\u003ca rel=\"license\" href=\"http://creativecommons.org/licenses/by-nc-sa/4.0/\"\u003e\u003cimg alt=\"Creative Commons License\" style=\"border-width:0\" src=\"https://i.creativecommons.org/l/by-nc-sa/4.0/88x31.png\" /\u003e\u003c/a\u003e\u003cbr /\u003eThis work is licensed under a \u003ca rel=\"license\" href=\"http://creativecommons.org/licenses/by-nc-sa/4.0/\"\u003eCreative Commons Attribution-NonCommercial-ShareAlike 4.0 International License\u003c/a\u003e.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgabolsgabs%2FDALI","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgabolsgabs%2FDALI","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgabolsgabs%2FDALI/lists"}