{"id":25215580,"url":"https://github.com/deezer/nlp4musa_melscribe","last_synced_at":"2025-10-28T17:14:50.077Z","repository":{"id":266864521,"uuid":"862992511","full_name":"deezer/nlp4musa_melscribe","owner":"deezer","description":"Code and data to reproduce the experiments presented in the article \"Harnessing High-Level Song Descriptors towards Natural Language-Based Music Recommendation\" (NLP4MusA2024)","archived":false,"fork":false,"pushed_at":"2024-12-06T15:34:05.000Z","size":1778,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":5,"default_branch":"main","last_synced_at":"2024-12-06T16:35:53.680Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/deezer.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-09-25T14:31:43.000Z","updated_at":"2024-12-06T15:41:30.000Z","dependencies_parsed_at":"2024-12-06T16:38:08.143Z","dependency_job_id":"7218f028-1e3d-4508-8d62-5b9bb0f03879","html_url":"https://github.com/deezer/nlp4musa_melscribe","commit_stats":null,"previous_names":["deezer/nlp4musa_melscribe"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deezer%2Fnlp4musa_melscribe","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deezer%2Fnlp4musa_melscribe/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deezer%2Fnlp4musa_melscribe/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/
hosts/GitHub/repositories/deezer%2Fnlp4musa_melscribe/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/deezer","download_url":"https://codeload.github.com/deezer/nlp4musa_melscribe/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":238161490,"owners_count":19426669,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-02-10T18:14:58.663Z","updated_at":"2025-10-25T14:31:12.953Z","avatar_url":"https://github.com/deezer.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# nlp4musa_melscribe\n\nThis repository provides Python code to reproduce the experiments from the article [**Harnessing High-Level Song Descriptors towards Natural Language-Based Music Recommendation**](https://arxiv.org/abs/2411.05649), accepted for publication to [**NLP4MusA 2024**](https://sites.google.com/view/nlp4musa-2024/home).\n\nFor a summary of this project, please consult the [poster](https://github.com/deezer/nlp4musa_melscribe/blob/main/presentation/poster.pdf) or [slides](https://github.com/deezer/nlp4musa_melscribe/blob/main/presentation/slides.pdf).\n\n\n## Setup\n\n```sh\ngit clone https://github.com/deezer/nlp4musa_melscribe.git\ncd nlp4musa_melscribe\n```\n\nInstall the requirements:\n\n```bash\npip install -r requirements.txt\n```\n\n**LP-MusicCaps** datasets are available for download on Hugging Face ([MC](https://huggingface.co/datasets/seungheondoh/LP-MusicCaps-MC), [MTT](https://huggingface.co/datasets/seungheondoh/LP-MusicCaps-MTT), 
[MSD](https://huggingface.co/datasets/seungheondoh/LP-MusicCaps-MSD)).\nLoad each dataset and export every split to a CSV file, as shown below for **LP-MusicCaps-MTT**:\n```python\nfrom datasets import load_dataset\n\nds = load_dataset(\"seungheondoh/LP-MusicCaps-MTT\")\nds['test'].to_csv('data/LP-MusicCaps-MTT/test.csv')\nds['train'].to_csv('data/LP-MusicCaps-MTT/train.csv')\nds['valid'].to_csv('data/LP-MusicCaps-MTT/valid.csv')\n```\n**LP-MusicCaps-MSD** is a gated dataset, so you must be authenticated on Hugging Face to access it.\n\nDownload the fine-tuned models (the cross-encoder and the bi-encoder) from [Zenodo](https://zenodo.org/records/14289764):\n```bash\nwget https://zenodo.org/records/14289764/files/models.zip\nunzip models.zip -d models/\n```\n\n## Reproduce paper results\n\n### Evaluate our model\n```bash\npython src/eval_our_model.py --output_path results/results_our_model.json --sources  lpms-mtt lpms-msd lpms-mc lpms-mc-rephrased --input_path data/ --our_model_path models/bi-encoder-lpmusicaps-msmarco-bert-base-dot-v5/\n```\n\n### Evaluate text encoder baselines\n```bash\npython src/eval_text_encoders.py --output_path results/results_text_encoders.json --sources  lpms-mtt lpms-msd lpms-mc lpms-mc-rephrased --input_path data/\n```\n\n### Evaluate text encoder baselines from multimodal models\n\nSet up the baselines (*depending on the environment, `pip install -e src/music-text-representation/` may fail with an error about the package `sklearn`; as suggested by the error, replace `sklearn` with `scikit-learn` in the cloned repository's `setup.py` before installing*):\n\n```bash\nwget https://huggingface.co/lukewys/laion_clap/resolve/main/music_audioset_epoch_15_esc_90.14.pt -P src/laion-clap/\ngit clone https://github.com/seungheondoh/music-text-representation.git src/music-text-representation/\npip install -e src/music-text-representation/\nwget https://zenodo.org/record/7322135/files/mtr.tar.gz -P src/music-text-representation/\ntar -zxvf 
src/music-text-representation/mtr.tar.gz -C src/music-text-representation/\n```\n\nRun the evaluation script:\n```bash\npython src/eval_text_encoder_multimodal_models.py --output_path results/results_text_encoders_multimodal.json --sources  lpms-mtt lpms-msd lpms-mc lpms-mc-rephrased --input_path data/ --ttmr_model_path src/music-text-representation/mtr/ --clap_model_path src/laion-clap/music_audioset_epoch_15_esc_90.14.pt\n```\n\n## Fine-tune a model from scratch\n\nGenerate training data:\n```bash\npython src/generate_train_data.py --input_path data/ --output_path data/training_gpl --sources lpms-mtt lpms-msd --random_seed=42 --docs_per_query=3\n```\n\nTrain the model with the Generative Pseudo Labeling (GPL) method:\n```bash\npython -m gpl.train --path_to_generated_data \"data/training_gpl\" --base_ckpt \"msmarco-bert-base-dot-v5\" --gpl_score_function \"cos_sim\" --batch_size_gpl 4 --gpl_steps 140000 --output_dir \"models/nlp4musa_seed42\" --retrievers \"msmarco-distilbert-base-v3\" \"msmarco-MiniLM-L-6-v3\" --retriever_score_functions \"cos_sim\" --negatives_per_query 30 --cross_encoder \"models/cross-encoder-musiccaps-ms-marco-MiniLM-L-6-v2/\" --qgen_prefix \"qgen\" --max_seq_length 512\n```\n\nAs described in the paper, we fine-tuned a domain-specific cross-encoder, `models/cross-encoder-musiccaps-ms-marco-MiniLM-L-6-v2`, on human-annotated data from the MusicCaps dataset. The cross-encoder predicts a similarity score between a longer music-related text (e.g., a song description or a user request) and a music descriptor (e.g., a tag). It serves as the teacher that generates soft labels for the training data, which are then used to train the bi-encoder.\n\n## Paper\n\nPlease cite our paper if you use this data or code in your work:\n```\n@InProceedings{Epure2024Harnessing,\n \ttitle={Harnessing High-Level Song Descriptors towards Natural Language-Based Music Recommendation},\n  \tauthor={Epure, Elena V. 
and Meseguer-Brocal, Gabriel and Afchar, Darius and Hennequin, Romain},\n  \tbooktitle={Proceedings of the 3rd Workshop on NLP for Music and Audio (NLP4MusA2024)},\n  \tmonth={November},\n  \tyear={2024},\n  \tpublisher = {Association for Computational Linguistics},\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdeezer%2Fnlp4musa_melscribe","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdeezer%2Fnlp4musa_melscribe","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdeezer%2Fnlp4musa_melscribe/lists"}