{"id":13861539,"url":"https://github.com/amanteur/BandSplitRNN-PyTorch","last_synced_at":"2025-07-14T09:32:22.911Z","repository":{"id":172382377,"uuid":"603770207","full_name":"amanteur/BandSplitRNN-PyTorch","owner":"amanteur","description":"Unofficial PyTorch implementation of Music Source Separation with Band-split RNN","archived":false,"fork":false,"pushed_at":"2024-06-10T04:43:16.000Z","size":27192,"stargazers_count":157,"open_issues_count":4,"forks_count":18,"subscribers_count":5,"default_branch":"main","last_synced_at":"2024-11-22T21:45:50.956Z","etag":null,"topics":["music-information-retrieval","pytorch","source-separation"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/amanteur.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-02-19T14:27:46.000Z","updated_at":"2024-11-20T20:17:48.000Z","dependencies_parsed_at":"2024-11-22T21:43:33.451Z","dependency_job_id":null,"html_url":"https://github.com/amanteur/BandSplitRNN-PyTorch","commit_stats":null,"previous_names":["amanteur/bandsplitrnn-pytorch"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/amanteur/BandSplitRNN-PyTorch","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amanteur%2FBandSplitRNN-PyTorch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amanteur%2FBandSplitRNN-PyTorch/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amanteur%2FBandSplitRNN-PyTorch/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amanteur%2FBandSplitRNN-PyTorch/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/amanteur","download_url":"https://codeload.github.com/amanteur/BandSplitRNN-PyTorch/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amanteur%2FBandSplitRNN-PyTorch/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":265269380,"owners_count":23737833,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["music-information-retrieval","pytorch","source-separation"],"created_at":"2024-08-05T06:01:24.670Z","updated_at":"2025-07-14T09:32:20.501Z","avatar_url":"https://github.com/amanteur.png","language":"Python","readme":"# BandSplitRNN Pytorch\n\nUnofficial PyTorch implementation of the paper [Music Source Separation with Band-split RNN](https://arxiv.org/pdf/2209.15174.pdf).\n\n![architecture](images/architecture.png)\n\n---\n## Table of Contents\n\n1. [Changelog](#changelog)\n1. [Dependencies](#dependencies)\n2. [Inference](#inference)\n3. [Train your model](#trainmodel)\n   1. 
   1. [Dataset preprocessing](#preprocessing)
   2. [Training](#train)
   3. [Evaluation](#eval)
5. [Repository structure](#structure)
6. [Citing](#cite)

---
<a name="changelog"/>

# Changelog

- **29.07.2023**
  - Made some updates to the code.
  - A big thanks to [unemployed-denizen](https://github.com/unemployed-denizen) for finding a bug in the code.
    The new checkpoint, with higher scores than before, has been uploaded to the table below.

---
<a name="dependencies"/>

# Dependencies

Python version: **3.10**.
To install dependencies, run:
```
pip install -r requirements.txt
```
Additionally, **ffmpeg** should be installed in the venv.
If using `conda`, you can run:
```
conda install -c conda-forge ffmpeg
```
All scripts should be run from the `src` directory.

The `train`/`evaluation`/`inference` pipelines support GPU acceleration.
To enable it, set the following environment variable:
```
export CUDA_VISIBLE_DEVICES={DEVICE_NUM}
```

---
<a name="inference"/>

## Inference

To run inference on your file(s), you first need to download a checkpoint.

Available checkpoints:

| Target                                                                                       | Epoch | uSDR (hop=0.5) | cSDR (hop=0.5) |
|----------------------------------------------------------------------------------------------|-------|----------------|----------------|
| [Vocals](https://drive.google.com/file/d/14FzOPUcf4BKym1kCRqRH1PeJWSvv4qHE/view?usp=sharing) | 139   | 6.883 ± 2.488  | 6.665 ± 2.717  |
| Bass                                                                                         | -     | -              | -              |
| Drums                                                                                        | -     | -              | -              |
| Other                                                                                        | -     | -              | -              |

After downloading the `.pt` file, put it into the `./saved_models/{TARGET}/` directory.

Afterwards, run the following script:

```
python3 inference.py [-h] -i IN_PATH -o OUT_PATH [-t TARGET] [-c CKPT_PATH] [-d DEVICE]

options:
  -h, --help            show this help message and exit
  -i IN_PATH, --in-path IN_PATH
                        Path to the input directory/file with .wav/.mp3 extensions.
  -o OUT_PATH, --out-path OUT_PATH
                        Path to the output directory. Files will be saved in .wav format with sr=44100.
  -t TARGET, --target TARGET
                        Name of the target source to extract.
  -c CKPT_PATH, --ckpt-path CKPT_PATH
                        Path to model's checkpoint. If not specified, the .ckpt from SAVED_MODELS_DIR/{target} is used.
  -d DEVICE, --device DEVICE
                        Device name - either 'cuda' or 'cpu'.
```
You can customize inference by changing `audio_params` in the `./saved_models/{TARGET}/hparams.yaml` file. Here is a `vocals` example:
```
python3 inference.py -i ../example/example.mp3 -o ../example/ -t vocals
```

Work on training better checkpoints is still ongoing; at the moment only a (fairly weak) vocals extraction model is available.
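
Since only the `vocals` checkpoint is published so far, a common post-processing trick for a quick two-stem split is to subtract the extracted vocals from the original mixture to approximate the accompaniment. Below is a minimal sketch of that step; it is not part of this repository, and both file names are placeholders for your actual input mixture and for whatever `.wav` `inference.py` saved to the output directory.

```
# Approximate a two-stem split: accompaniment ≈ mixture - extracted vocals.
# NOTE: the file names are placeholders; point them at your real input mixture
# and at the .wav that inference.py wrote to the output directory.
import soundfile as sf

mixture, sr = sf.read("../example/example.wav")           # original mixture
vocals, sr_v = sf.read("../example/example_vocals.wav")   # output of inference.py
assert sr == sr_v == 44100, "inference.py saves .wav files at sr=44100"

# Lengths can differ by a few samples after decoding/padding, so align them.
n = min(len(mixture), len(vocals))
accompaniment = mixture[:n] - vocals[:n]

sf.write("../example/example_accompaniment.wav", accompaniment, sr)
```
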

---
<a name="trainmodel"/>

## Train your model

This section describes the model training pipeline.

---

<a name="preprocessing"/>

### Dataset preprocessing

The authors used the `MUSDB18-HQ` dataset to train the initial source separation model.
You can access it via [Zenodo](https://zenodo.org/record/3338373#.Y_jrMC96D5g).

After downloading, set the path to this dataset as an environment variable
(you'll need to specify it before running the `train` and `evaluation` pipelines):
```
export MUSDB_DIR={MUSDB_DIR}
```

To speed up training, instead of loading whole files, we can precompute the indices of the fragments we need to extract.
These indices are selected with the Source Activity Detection (SAD) algorithm proposed in the paper.

To read the `musdb18` dataset and extract salient fragments for the `target` source, use the following script:
```
python3 prepare_dataset.py [-h] -i INPUT_DIR -o OUTPUT_DIR [--subset SUBSET] [--split SPLIT] [--sad_cfg SAD_CFG] [-t TARGET [TARGET ...]]

options:
  -h, --help            show this help message and exit
  -i INPUT_DIR, --input_dir INPUT_DIR
                        Path to directory with musdb18 dataset
  -o OUTPUT_DIR, --output_dir OUTPUT_DIR
                        Path to directory where output .txt file is saved
  --subset SUBSET       Train/test subset of the dataset to process
  --split SPLIT         Train/valid split of train dataset. Used if subset=train
  --sad_cfg SAD_CFG     Path to Source Activity Detection config file
  -t TARGET [TARGET ...], --target TARGET [TARGET ...]
                        Target source. SAD will save salient fragments of this source's audio.
```
Output is saved to the `{OUTPUT_DIR}/{TARGET}_{SUBSET}.txt` file. Each line of the file has the following structure:
```
{MUSDB18 TRACKNAME}\t{START_INDEX}\t{END_INDEX}\n
```

---
<a name="train"/>

### Training

The training pipeline is built on a combination of `PyTorch-Lightning` and `hydra`.
All configuration files are stored in the `src/conf` directory in `hydra`-friendly format.

To start training a model with the given configuration, simply run:
```
python train.py
```
To configure the training process, follow the `hydra` [override instructions](https://hydra.cc/docs/advanced/override_grammar/basic/).
By default, the model is trained to extract `vocals`.
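
Training draws its fragments from the SAD index files produced in the previous step rather than from whole tracks. As a rough, self-contained illustration of that index format (this is not the repository's dataloader code; treating `START_INDEX`/`END_INDEX` as sample positions and the `{MUSDB_DIR}/train/{TRACKNAME}/{TARGET}.wav` stem layout are assumptions), one salient fragment could be loaded like this:

```
# Illustration only: read one entry of a SAD index file produced by
# prepare_dataset.py and crop the corresponding fragment from a MUSDB18-HQ stem.
# Assumed (not taken from this repo): indices are sample positions and stems
# live at {MUSDB_DIR}/train/{TRACKNAME}/{TARGET}.wav.
import os
import soundfile as sf

musdb_dir = os.environ["MUSDB_DIR"]
target = "vocals"

with open(f"files/{target}_train.txt") as f:           # {TARGET}_{SUBSET}.txt
    trackname, start, end = f.readline().rstrip("\n").split("\t")
    stem_path = os.path.join(musdb_dir, "train", trackname, f"{target}.wav")
    fragment, sr = sf.read(stem_path, start=int(start), stop=int(end))
    print(trackname, fragment.shape, sr)
```
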
To train a model to extract other sources, use the following commands:
```
python train.py train_dataset.target=bass model=bandsplitrnnbass
python train.py train_dataset.target=drums model=bandsplitrnndrums
python train.py train_dataset.target=other
```

After training starts, a logging folder is created for the experiment at the following path:
```
src/logs/bandsplitrnn/${now:%Y-%m-%d}_${now:%H-%M}/
```
This folder has the following structure:
```
├── tb_logs
│   └── tensorboard_log_file    - main tensorboard log file
├── weights
│   └── *.ckpt                  - lightning model checkpoint files
├── hydra
│   └── config.yaml             - hydra configuration and override files
└── train.log                   - logging file for train.py
```

---
<a name="eval"/>

### Evaluation

To evaluate a trained model with the given configuration, use the following script:

```
python3 evaluate.py [-h] -d RUN_DIR [--device DEVICE]

options:
  -h, --help            show this help message and exit
  -d RUN_DIR, --run-dir RUN_DIR
                        Path to the directory with checkpoints, configs, etc.
  --device DEVICE       Device name - either 'cuda' or 'cpu'.
```

This script creates `test.log` in the `RUN_DIR` directory and writes the `uSDR` and `cSDR` metrics there
for the test subset of the MUSDB18 dataset.

---
<a name="structure"/>

## Repository structure
The structure of this repository is as follows:
```
├── src
│   ├── conf                        - hydra configuration files
│   │   └── **/*.yaml
│   ├── data                        - directory with data processing modules
│   │   └── *.py
│   ├── files                       - output files from the prepare_dataset.py script
│   │   └── *.txt
│   ├── model                       - directory with modules of the model
│   │   ├── modules
│   │   │   └── *.py
│   │   ├── __init__.py
│   │   ├── bandsplitrnn.py         - file with the model itself
│   │   └── pl_model.py             - file with the PyTorch-Lightning module for the training and validation pipeline
│   ├── utils                       - directory with utilities for evaluation and inference pipelines
│   │   └── *.py
│   ├── evaluate.py                 - script for the evaluation pipeline
│   ├── inference.py                - script for the inference pipeline
│   ├── prepare_dataset.py          - script for the dataset preprocessing pipeline
│   ├── separator.py                - separator class, used in the evaluation and inference pipelines
│   └── train.py                    - script for the training pipeline
├── example                         - test example for inference.py
│   └── *.wav
├── .gitignore
├── README.md
└── requirements.txt
```

---
<a name="cite"/>

## Citing

To cite the paper, please use:
```
@misc{https://doi.org/10.48550/arxiv.2209.15174,
  doi = {10.48550/ARXIV.2209.15174},
  url = {https://arxiv.org/abs/2209.15174},
  author = {Luo, Yi and Yu, Jianwei},
  keywords = {Audio and Speech Processing (eess.AS), Machine Learning (cs.LG), Sound (cs.SD), Signal Processing (eess.SP), FOS: Electrical engineering, electronic engineering, information engineering, FOS: Computer and information sciences},
  title = {Music Source Separation with Band-split RNN},
  publisher = {arXiv},
  year = {2022},
  copyright = {Creative Commons Attribution Non Commercial Share Alike 4.0 International}
}
```