https://github.com/voidful/asr-trainer

one script for xls-r/xlsr/whisper fine-tuning
- Host: GitHub
- URL: https://github.com/voidful/asr-trainer
- Owner: voidful
- Created: 2021-12-01T02:40:56.000Z
- Default Branch: main
- Last Pushed: 2023-06-29T07:17:09.000Z
- Last Synced: 2023-06-29T09:20:34.731Z
- Topics: xls-r
- Language: Python
- Size: 87.9 KB
- Stars: 29
- Watchers: 3
- Forks: 8
- Open Issues: 0

Metadata Files:
- Readme: README.md
# one script for xls-r/xlsr/whisper fine-tuning
Script adapted from https://huggingface.co/blog/fine-tune-xlsr-wav2vec2, with two changes:
- fixes the large memory usage during eval metric calculation
- adds CER and WER metrics for evaluation (a sketch follows the install step)

## install requirements
`pip install -r requirements.txt`
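The two changes above correspond to common HF Trainer patterns; below is a minimal sketch, assuming the `evaluate` library for the metrics and the Trainer's `preprocess_logits_for_metrics` hook for the memory fix. Function names are illustrative, not taken from `train.py`:

```python
# Sketch: CER/WER scoring with the HF `evaluate` library, plus the usual fix
# for eval memory blow-up: reduce logits to token ids before the Trainer
# accumulates them across eval steps.
import evaluate

wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")

def preprocess_logits_for_metrics(logits, labels):
    # Keeping only the argmax ids avoids holding full logit tensors in memory.
    return logits.argmax(dim=-1)

def compute_metrics(pred, processor):
    # Replace the -100 padding used for loss masking, then decode to text.
    pred.label_ids[pred.label_ids == -100] = processor.tokenizer.pad_token_id
    pred_str = processor.batch_decode(pred.predictions)
    label_str = processor.batch_decode(pred.label_ids, group_tokens=False)
    return {
        "wer": wer_metric.compute(predictions=pred_str, references=label_str),
        "cer": cer_metric.compute(predictions=pred_str, references=label_str),
    }
```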
## example usage - common voice
```shell
python -m torch.distributed.launch --nproc_per_node=2 \
train.py \
--train_subset zh-TW \
--train_split train \
--test_split validation \
--tokenize_config voidful/wav2vec2-large-xlsr-53-tw-gpt \
--model_config facebook/wav2vec2-xls-r-300m \
--batch 10 \
--group_by_length \
--max_input_length_in_sec 20
```

The same Common Voice subset, fine-tuning Whisper instead:

```shell
python -m torch.distributed.launch --nproc_per_node=2 \
train.py \
--train_subset zh-TW \
--model_config openai/whisper-base \
--batch 10 \
--group_by_length \
--max_input_length_in_sec 20
```
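Both commands cap clip length with `--max_input_length_in_sec`. A minimal sketch of that kind of duration filtering with the `datasets` library; the dataset id and column names below mirror the legacy Common Voice layout and are assumptions about what `train.py` does internally:

```python
# Sketch: drop Common Voice clips longer than --max_input_length_in_sec.
# The dataset id and column names are assumptions, not read from train.py.
from datasets import Audio, load_dataset

max_sec = 20
ds = load_dataset("common_voice", "zh-TW", split="train")
ds = ds.cast_column("audio", Audio(sampling_rate=16_000))  # XLS-R expects 16 kHz
ds = ds.filter(
    lambda ex: len(ex["audio"]["array"]) / ex["audio"]["sampling_rate"] <= max_sec
)
print(f"{len(ds)} clips of at most {max_sec}s remain")
```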
## example usage - custom set

### custom data format `data.csv`:
```csv
path,text
/xxx/2.wav,被你拒絕而記仇
/xxx/4.wav,電影界的人
/xxx/7.wav,其實我最近在想
```
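A CSV in this shape loads directly with the `datasets` CSV builder; whether `train.py` uses exactly this loader is an assumption:

```python
# Sketch: reading the data.csv format above into a Dataset.
# Column names "path" and "text" come from the example; the loader choice
# is an assumption about train.py's internals.
from datasets import load_dataset

ds = load_dataset("csv", data_files={"train": "./data.csv"})["train"]
print(ds.column_names)  # ['path', 'text']
```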
```shell
python -m torch.distributed.launch --nproc_per_node=2 \
train.py \
--custom_set ./data.csv \
--tokenize_config voidful/wav2vec2-large-xlsr-53-tw-gpt \
--model_config facebook/wav2vec2-xls-r-300m \
--batch 3 \
--grad_accum 15 \
--max_input_length_in_sec 15 \
--eval_step 10000
```

Train DistilHuBERT on the full LibriSpeech corpus:

```shell
python -m torch.distributed.launch --nproc_per_node=2 \
train.py --tokenize_config facebook/hubert-large-ls960-ft \
--model_config ntu-spml/distilhubert \
--group_by_length \
--train_set librispeech_asr \
--train_subset all \
--train_split train.clean.100+train.clean.360+train.other.500 \
--test_split test.other \
--learning_rate 0.0003 \
--batch 30 \
--logging_steps 10 \
--eval_steps 60 \
--epoch 150 \
--use_auth_token True \
--output_dir ./model_sweep \
--overwrite_output_dir
```
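The `+`-joined `--train_split` value above uses the `datasets` split syntax, which concatenates splits at load time; a minimal sketch, assuming the script forwards the string straight to `load_dataset`:

```python
# Sketch: how a "+"-joined split string resolves via the datasets split
# syntax; that train.py passes it through verbatim is an assumption.
from datasets import load_dataset

ds = load_dataset(
    "librispeech_asr",
    "all",
    split="train.clean.100+train.clean.360+train.other.500",
)
print(len(ds))  # the three training splits, concatenated
```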
Train Whisper on a custom dataset:

```shell
python train.py --tokenize_config openai/whisper-base \
--model_config openai/whisper-base \
--group_by_length \
--custom_set_train cm_train_unified.csv \
--custom_set_test cm_dev_unified.csv \
--output_dir ./whisper-custom \
--overwrite_output_dir
```
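After training finishes, the checkpoint saved under `--output_dir` can be sanity-checked with the transformers ASR pipeline; a minimal sketch (the audio path is a placeholder):

```python
# Sketch: quick inference against the fine-tuned checkpoint from --output_dir.
# "sample.wav" is a placeholder file, not part of the repo.
from transformers import pipeline

asr = pipeline("automatic-speech-recognition", model="./whisper-custom")
print(asr("sample.wav")["text"])
```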
## sweep usage

`python -m wandb sweep ./sweep_xxx.yaml`

`python -m wandb agent xxxxxxxxxxxxxx`
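The repo's `sweep_xxx.yaml` files are not shown here; purely as an illustration, a sweep of the same shape can be defined with the wandb Python API (metric name and search space below are invented):

```python
# Sketch: a sweep definition equivalent in shape to a sweep_xxx.yaml.
# Metric name and parameter values are invented for illustration.
import wandb

sweep_config = {
    "program": "train.py",
    "method": "random",
    "metric": {"name": "eval_wer", "goal": "minimize"},
    "parameters": {
        "learning_rate": {"values": [1e-4, 3e-4, 1e-3]},
        "batch": {"values": [8, 16, 32]},
    },
}
sweep_id = wandb.sweep(sweep_config, project="asr-trainer")
print(sweep_id)  # then: python -m wandb agent <sweep_id>
```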
### test command

```shell
python train.py --tokenize_config facebook/hubert-large-ls960-ft \
--model_config ntu-spml/distilhubert --group_by_length \
--train_set hf-internal-testing/librispeech_asr_dummy \
--train_split validation --train_subset clean \
--test_split validation --test_subset clean \
--learning_rate 0.0003 --logging_steps 10 --eval_steps 60 --epoch 150 \
--use_auth_token True --output_dir ./model_test --overwrite_output_dir \
--batch 3

python train.py --tokenize_config openai/whisper-tiny \
--model_config openai/whisper-tiny --group_by_length \
--train_set hf-internal-testing/librispeech_asr_dummy \
--train_split validation --train_subset clean \
--test_split validation --test_subset clean \
--learning_rate 0.0003 --logging_steps 10 --eval_steps 60 --epoch 150 \
--use_auth_token True --output_dir ./model_test --overwrite_output_dir \
--batch 3
```