{"id":23726000,"url":"https://github.com/voidful/asr-trainer","last_synced_at":"2025-09-04T02:31:25.163Z","repository":{"id":45222749,"uuid":"433659000","full_name":"voidful/asr-trainer","owner":"voidful","description":"one script for xls-r/xlsr/whisper fine-tuning ","archived":false,"fork":false,"pushed_at":"2023-06-29T07:17:09.000Z","size":90,"stargazers_count":41,"open_issues_count":0,"forks_count":13,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-04-05T00:25:06.706Z","etag":null,"topics":["xls-r"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/voidful.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-12-01T02:40:56.000Z","updated_at":"2025-03-05T12:57:17.000Z","dependencies_parsed_at":"2023-01-22T07:02:29.872Z","dependency_job_id":null,"html_url":"https://github.com/voidful/asr-trainer","commit_stats":null,"previous_names":["voidful/asr-trainer"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/voidful/asr-trainer","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/voidful%2Fasr-trainer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/voidful%2Fasr-trainer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/voidful%2Fasr-trainer/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/voidful%2Fasr-trainer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/voidful","download_url":"https://codeload.github.com/voidful/asr-trainer/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosys
te.ms/api/v1/hosts/GitHub/repositories/voidful%2Fasr-trainer/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":273541903,"owners_count":25124056,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-04T02:00:08.968Z","response_time":61,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["xls-r"],"created_at":"2024-12-31T00:18:07.932Z","updated_at":"2025-09-04T02:31:24.815Z","avatar_url":"https://github.com/voidful.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# one script for xls-r/xlsr/whisper fine-tuning\n\nscript modified from https://huggingface.co/blog/fine-tune-xlsr-wav2vec2\n\n- fixes large memory usage during eval metric calculation\n- adds CER and WER for evaluation\n\n## install requirements\n\n`pip install -r requirements.txt`\n\n## example usage - common voice\n```\npython -m torch.distributed.launch --nproc_per_node=2 \\\ntrain.py \\\n--train_subset zh-TW \\\n--train_split train \\\n--test_split validation \\\n--tokenize_config voidful/wav2vec2-large-xlsr-53-tw-gpt \\\n--model_config facebook/wav2vec2-xls-r-300m \\\n--batch 10 \\\n--group_by_length \\\n--max_input_length_in_sec 20\n```\n\n```\npython -m torch.distributed.launch --nproc_per_node=2 \\\ntrain.py \\\n--train_subset zh-TW \\\n--model_config openai/whisper-base \\\n--batch 10 \\\n--group_by_length \\\n--max_input_length_in_sec 20\n```\n\n## example usage - custom set\n\n### custom data format `data.csv`:\n\n```csv\npath,text\n/xxx/2.wav,被你拒絕而記仇\n/xxx/4.wav,電影界的人\n/xxx/7.wav,其實我最近在想\n```\n\n```\npython -m torch.distributed.launch --nproc_per_node=2 \\\ntrain.py \\\n--custom_set ./data.csv \\\n--tokenize_config voidful/wav2vec2-large-xlsr-53-tw-gpt \\\n--model_config facebook/wav2vec2-xls-r-300m \\\n--batch 3 \\\n--grad_accum 15 \\\n--max_input_length_in_sec 15 \\\n--eval_step 10000\n```\n\n```shell\npython -m torch.distributed.launch --nproc_per_node=2 \\\ntrain.py --tokenize_config facebook/hubert-large-ls960-ft \\\n--model_config ntu-spml/distilhubert \\\n--group_by_length \\\n--train_set librispeech_asr \\\n--train_subset all \\\n--train_split train.clean.100+train.clean.360+train.other.500 \\\n--test_split test.other \\\n--learning_rate 0.0003 \\\n--batch 30 \\\n--logging_steps 10 \\\n--eval_steps 60 \\\n--epoch 150 \\\n--use_auth_token True \\\n--output_dir ./model_sweep \\\n--overwrite_output_dir\n```\n\nTrain Whisper on a custom dataset:\n```shell\npython train.py --tokenize_config openai/whisper-base \\\n--model_config openai/whisper-base \\\n--group_by_length \\\n--custom_set_train cm_train_unified.csv \\\n--custom_set_test cm_dev_unified.csv \\\n--output_dir ./whisper-custom \\\n--overwrite_output_dir\n```\n\n## sweep usage\n\n`python -m wandb sweep ./sweep_xxx.yaml`   \n`python -m wandb agent xxxxxxxxxxxxxx`\n\n### test command\n\n```shell\npython train.py --tokenize_config facebook/hubert-large-ls960-ft --model_config ntu-spml/distilhubert --group_by_length --train_set hf-internal-testing/librispeech_asr_dummy --train_split validation --train_subset clean --test_split validation --test_subset clean --learning_rate 0.0003 --batch 30 --logging_steps 10 --eval_steps 60 --epoch 150 --use_auth_token True --output_dir ./model_test --overwrite_output_dir --batch 3\n\npython train.py --tokenize_config openai/whisper-tiny --model_config openai/whisper-tiny --group_by_length --train_set hf-internal-testing/librispeech_asr_dummy --train_split validation --train_subset clean --test_split validation --test_subset clean --learning_rate 0.0003 --batch 30 --logging_steps 10 --eval_steps 60 --epoch 150 --use_auth_token True --output_dir ./model_test --overwrite_output_dir --batch 3\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvoidful%2Fasr-trainer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvoidful%2Fasr-trainer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvoidful%2Fasr-trainer/lists"}