{"id":13532187,"url":"https://github.com/JusperLee/Dual-Path-RNN-Pytorch","last_synced_at":"2025-04-01T20:31:33.128Z","repository":{"id":65969303,"uuid":"236737267","full_name":"JusperLee/Dual-Path-RNN-Pytorch","owner":"JusperLee","description":"Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation implemented by Pytorch","archived":false,"fork":false,"pushed_at":"2023-02-14T07:49:35.000Z","size":97,"stargazers_count":415,"open_issues_count":34,"forks_count":66,"subscribers_count":4,"default_branch":"master","last_synced_at":"2024-11-02T19:33:55.261Z","etag":null,"topics":["deep-learning","pytorch","rnn-model","speech-separation","speech-separation-algorithm"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/JusperLee.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2020-01-28T13:06:33.000Z","updated_at":"2024-11-01T11:56:34.000Z","dependencies_parsed_at":"2023-07-21T11:48:06.492Z","dependency_job_id":null,"html_url":"https://github.com/JusperLee/Dual-Path-RNN-Pytorch","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JusperLee%2FDual-Path-RNN-Pytorch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JusperLee%2FDual-Path-RNN-Pytorch/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JusperLee%2FDual-Path-RNN-Pytorch/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JusperLee%2FDual-Path-R
NN-Pytorch/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/JusperLee","download_url":"https://codeload.github.com/JusperLee/Dual-Path-RNN-Pytorch/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246709923,"owners_count":20821297,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","pytorch","rnn-model","speech-separation","speech-separation-algorithm"],"created_at":"2024-08-01T07:01:08.876Z","updated_at":"2025-04-01T20:31:28.112Z","avatar_url":"https://github.com/JusperLee.png","language":"Python","funding_links":[],"categories":["Speech Separation (single channel)"],"sub_categories":["NN-based separation"],"readme":"# Dual-path-RNN-Pytorch\nDual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation implemented by Pytorch\n\nIf you have any questions, you can ask them through the issue.\n\nIf you find this project helpful, you can give me a star generously.\n\nDemo Pages: [Results of pure speech separation model](https://cslikai.cn/project/Pure-Audio/)\n# Plan\n\n- [x] 2020-02-01: Reading article “[Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation](https://arxiv.org/abs/1910.06379 \"Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation\")”. Zhihu Article link \"[阅读笔记”Dual-path RNN for Speech Separation“](https://zhuanlan.zhihu.com/p/104606356 \"阅读笔记”Dual-path RNN for Speech Separation“\")\". 
Blog Article link \"[阅读笔记《Dual-path RNN for speech separation》](https://www.likai.show/archives/dual-path-rnn \"阅读笔记《Dual-path RNN for speech separation》\")\". Both articles are interpretations of the paper. If you have any questions, feel free to discuss them with me.\n\n- [x] 2020-02-02: Complete data preprocessing and dataset code. Dataset Code: [/data_loader/Dataset.py](https://github.com/JusperLee/Dual-path-RNN-Pytorch/blob/master/data_loader/Dataset.py)\n\n- [x] 2020-02-03: Complete Conv-TasNet Framework (Update **/model/model.py, Trainer_Tasnet.py, Train_Tasnet.py**)\n\n- [x] 2020-02-07: Complete training code (Update **/model/model_rnn.py**). Test parameters and some details are still being adjusted.\n\n- [x] 2020-02-08: Fixed a bug in the code.\n\n- [x] 2020-02-11: Complete testing code.\n\n# Dataset\nWe used the WSJ0 dataset as our training, test, and validation sets. Below are a sample mixture, the WSJ0 download link, and the code for creating the mixed audio.\n- [Audio mix Sample](https://www.merl.com/demos/deep-clustering/media/female-female-mixture.wav)\n\n- [WSJ0 Dataset](https://catalog.ldc.upenn.edu/LDC93S6A)\n\n- [Create Dataset](https://www.merl.com/demos/deep-clustering/create-speaker-mixtures.zip)\n\n# Training\n## Training for Conv-TasNet model\n1. First, you need to generate the scp file using the following command. The content of the scp file is \"filename \u0026\u0026 path\".\n```shell\npython create_scp.py\n```\n\n2. Then you can modify the training and model parameters through \"[config/Conv_Tasnet/train.yml](https://github.com/JusperLee/Dual-Path-RNN-Pytorch/tree/master/config/Conv_Tasnet)\".\n```shell\ncd config/Conv_Tasnet\nvim train.yml\n```\n\n3. Then use the following command in the root directory to train the model.\n```shell\npython train_Tasnet.py --opt config/Conv_Tasnet/train.yml\n```\n## Training for Dual Path RNN model\n1. First, you need to generate the scp file using the following command. 
The content of the scp file is \"filename \u0026\u0026 path\".\n```shell\npython create_scp.py\n```\n\n2. Then you can modify the training and model parameters through \"[config/Dual_RNN/train.yml](https://github.com/JusperLee/Dual-Path-RNN-Pytorch/tree/master/config/Dual_RNN \"config / Dual_RNN / train.yml\")\".\n```shell\ncd config/Dual_RNN\nvim train.yml\n```\n\n3. Then use the following command in the root directory to train the model.\n```shell\npython train_rnn.py --opt config/Dual_RNN/train.yml\n```\n\n# Inference\n\n## Conv-TasNet\nYou need to modify the default parameters in the test_tasnet.py file, such as the test files and the model to test.\n### For multi-audio\n```shell\npython test_tasnet.py\n```\n### For single-audio\n```shell\npython test_tasnet_wav.py\n```\n## Dual-Path-RNN\nYou need to modify the default parameters in the test_dualrnn.py file, such as the test files and the model to test.\n### For multi-audio\n```shell\npython test_dualrnn.py\n```\n### For single-audio\n```shell\npython test_dualrnn_wav.py\n```\n\n# Pretrained Model\n\n## Conv-TasNet\n\n[Conv-TasNet model](https://drive.google.com/open?id=1MRe4jiwgtAFZErjz-LWuuyEG8VGSU0YS \"Google Drive\")\n\n## Dual-Path-RNN\n[Dual-Path-RNN model](https://drive.google.com/open?id=1TInJB-idggkKJ5YkNvnrTopum_HgX3_o \"Google Drive\")\n\n# Result\n\n## Conv-TasNet\n![](https://github.com/JusperLee/Dual-Path-RNN-Pytorch/blob/master/log/Conv_Tasnet/loss.png)\n\nFinal Results: **15.8690** is 0.57 higher than **15.3** in the paper.\n\n## Dual-Path-RNN\n\nFinal Results: **18.98** is 0.18 higher than **18.8** in the paper.\n\n# Reference\n1. Luo Y, Chen Z, Yoshioka T. Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation[J]. arXiv preprint arXiv:1910.06379, 2019.\n2. 
[Conv-TasNet code](https://github.com/JusperLee/Conv-TasNet \"Conv-TasNet code\") \u0026\u0026 [Dual-RNN code](https://github.com/yluo42/TAC/blob/master/utility/models.py \"Dual-RNN code\")\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FJusperLee%2FDual-Path-RNN-Pytorch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FJusperLee%2FDual-Path-RNN-Pytorch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FJusperLee%2FDual-Path-RNN-Pytorch/lists"}