Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/jacoblincool/wft
Run Whisper fine-tuning with ease: it works on MPS, CUDA, and CPU without code changes.
- Host: GitHub
- URL: https://github.com/jacoblincool/wft
- Owner: JacobLinCool
- License: MIT
- Created: 2024-10-23T16:11:58.000Z (2 months ago)
- Default Branch: main
- Last Pushed: 2024-10-30T21:10:31.000Z (about 2 months ago)
- Last Synced: 2024-11-28T20:17:47.028Z (25 days ago)
- Topics: fine-tuning, whisper
- Language: Python
- Homepage: https://pypi.org/project/wft
- Size: 54.7 KB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Whisper Fine-Tuning (WFT)
**WFT** is a Python library designed to streamline the fine-tuning process of OpenAI's Whisper models on custom datasets. It simplifies dataset preparation, model fine-tuning, and result saving.
## Features
- **Hugging Face Integration**: Set your organization (or user) name, and everything syncs automatically to the Hugging Face Hub.
- **Easy Dataset Preparation and Preprocessing**: Quickly prepare and preprocess datasets for fine-tuning.
- **Fine-Tuning Using LoRA (Low-Rank Adaptation)**: Fine-tune Whisper models with efficient LoRA techniques.
- **Flexible Configuration Options**: Customize various aspects of the fine-tuning process.
- **Evaluation Metrics**: Supports Character Error Rate (CER) or Word Error Rate (WER) for model evaluation (see the standalone sketch after this list).
- **TensorBoard Logging**: Track your training progress in real time with TensorBoard.
- **Automatic Model Merging and Saving**: Automatically merge fine-tuned weights and save the final model.
- **Resume Training**: Resume training seamlessly from interrupted runs.
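For reference, CER and WER here follow the standard definitions implemented by the Hugging Face `evaluate` library. A minimal, self-contained sketch (illustration only, not part of WFT; requires the `evaluate` and `jiwer` packages):

```python
# Illustration only: standalone CER/WER computation with Hugging Face `evaluate`.
# WFT reports these metrics itself during evaluation; this just shows what they measure.
import evaluate

cer_metric = evaluate.load("cer")
wer_metric = evaluate.load("wer")

predictions = ["hello world"]
references = ["hello word"]

print(cer_metric.compute(predictions=predictions, references=references))  # character-level error rate
print(wer_metric.compute(predictions=predictions, references=references))  # word-level error rate
```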
## Installation

Install WFT using pip:
```bash
pip install wft
```

## Quick Start
Fine-tune a Whisper model on a custom dataset with just a few steps:
1. **Select a Baseline Model**: Choose a pre-trained Whisper model.
2. **Select a Dataset**: Use a dataset that includes audio and transcription columns.
3. **Start Training**: Use default training arguments to quickly fine-tune the model.

Here's an example:
```python
from wft import WhisperFineTuner

id = "whisper-large-v3-turbo-zh-TW-test-1"
ft = (
WhisperFineTuner(id)
.set_baseline("openai/whisper-large-v3-turbo", language="zh", task="transcribe")
.prepare_dataset(
"mozilla-foundation/common_voice_16_1",
src_subset="zh-TW",
src_audio_column="audio",
src_transcription_column="sentence",
)
.train() # Use default training arguments
)
```

That's it! You can now fine-tune your Whisper model easily.
To enable Hugging Face integration and push your training logs and model to Hugging Face, set the `org` parameter when initializing `WhisperFineTuner`:
```python
id = "whisper-large-v3-turbo-zh-TW-test-1"
org = "JacobLinCool" # Organization to push the model to Hugging Faceft = (
WhisperFineTuner(id, org)
.set_baseline("openai/whisper-large-v3-turbo", language="zh", task="transcribe")
.prepare_dataset(
"mozilla-foundation/common_voice_16_1",
src_subset="zh-TW",
src_audio_column="audio",
src_transcription_column="sentence",
)
.train() # Use default training arguments
.merge_and_push() # Merge the model and push it to Hugging Face
)
```

This will automatically push your training logs and the fine-tuned model to your Hugging Face account under the specified organization.
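Once the merged model is on the Hub (or saved locally), it loads like any other Whisper checkpoint. A minimal inference sketch with the `transformers` pipeline; the repo id below is a placeholder for wherever you pushed or saved the merged model:

```python
# Illustration only (not part of WFT): run inference with the fine-tuned model.
# Replace "username/merged_model" with your Hub repo id or a local path.
from transformers import pipeline

asr = pipeline("automatic-speech-recognition", model="username/merged_model")
print(asr("sample.wav")["text"])  # any local audio file path works here
```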
## Usage Guide
### 1. Set Baseline Model and Prepare Dataset
You can use a local dataset or a dataset from Hugging Face:
```python
ft = (
WhisperFineTuner(id)
.set_baseline("openai/whisper-large-v3-turbo", language="zh", task="transcribe")
.prepare_dataset(
"mozilla-foundation/common_voice_16_1",
src_subset="zh-TW",
src_audio_column="audio",
src_transcription_column="sentence",
)
)
```

To upload the preprocessed dataset to Hugging Face:
```python
ft.push_dataset("username/dataset_name")
```

You can also prepare or load an already processed dataset:
```python
ft = (
WhisperFineTuner(id)
.set_baseline("openai/whisper-large-v3-turbo", language="zh", task="transcribe")
.prepare_dataset(
"username/preprocessed_dataset",
"mozilla-foundation/common_voice_16_1",
src_subset="zh-TW",
src_audio_column="audio",
src_transcription_column="sentence",
)
)
```
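For the local-dataset option mentioned at the start of this section, the README does not show how the data is handed to `prepare_dataset`, so check the project documentation for that step. As a reference, a local folder of audio clips plus transcriptions can be loaded with the Hugging Face `datasets` library like this (illustration only; the path and column names are hypothetical):

```python
# Illustration only: build a local audio dataset with Hugging Face `datasets`.
# Expects a folder of audio files plus a metadata.csv containing a "file_name"
# column and a transcription column (e.g. "transcription").
from datasets import Audio, load_dataset

local_ds = load_dataset("audiofolder", data_dir="path/to/clips")
local_ds = local_ds.cast_column("audio", Audio(sampling_rate=16_000))  # Whisper expects 16 kHz audio
print(local_ds)
```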
### 2. Configure Fine-Tuning
Set the evaluation metric and LoRA configuration:
```python
ft.set_metric("cer") # Use CER (Character Error Rate) for evaluation# Custom LoRA configuration to fine-tune different parts of the model
from peft import LoraConfigcustom_lora_config = LoraConfig(
r=32,
lora_alpha=16,
target_modules=["q_proj", "v_proj"],
lora_dropout=0.05,
bias="none",
)ft.set_lora_config(custom_lora_config)
```You can also set custom π οΈ training arguments:
```python
from transformers import Seq2SeqTrainingArguments

custom_training_args = Seq2SeqTrainingArguments(
    output_dir=ft.dir,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=2,  # effective batch size of 16 per device
    learning_rate=1e-4,
    num_train_epochs=3,
)
ft.set_training_args(custom_training_args)
```

### 3. Train the Model
To begin fine-tuning:
```python
ft.train()
```
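Training progress is logged for TensorBoard (see Features). A minimal sketch for viewing the logs, assuming the event files end up under the run directory `ft.dir` (the exact subdirectory may differ):

```python
# Illustration only: open TensorBoard on the run directory.
# From a terminal the equivalent is:  tensorboard --logdir <run_dir>
from tensorboard import program

tb = program.TensorBoard()
tb.configure(argv=[None, "--logdir", ft.dir])  # assumed log location; adjust if needed
print(f"TensorBoard running at {tb.launch()}")
```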
### 4. Save or Push the Fine-Tuned Model
Merge LoRA weights with the baseline model and save it:
```python
ft.merge_and_save(f"{ft.dir}/merged_model")

# Or push to Hugging Face
ft.merge_and_push("username/merged_model")
```

## Advanced Usage
### Custom LoRA Configuration
Adjust the LoRA configuration to fine-tune different model parts:
```python
custom_lora_config = LoraConfig(
    r=32,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    bias="none",
)
ft.set_lora_config(custom_lora_config)
```

### Custom Training Arguments
Specify custom training settings:
```python
from transformers import Seq2SeqTrainingArguments

custom_training_args = Seq2SeqTrainingArguments(
    output_dir=ft.dir,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=2,
    learning_rate=1e-4,
    num_train_epochs=3,
)
ft.set_training_args(custom_training_args)
```

### Run Custom Actions After Steps Using `.then()`
Add actions to be executed after each step:
```python
ft = (
WhisperFineTuner(id)
.set_baseline("openai/whisper-large-v3-turbo", language="zh", task="transcribe")
.then(lambda ft: print(f"{ft.baseline_model=}"))
.prepare_dataset(
"mozilla-foundation/common_voice_16_1",
src_subset="zh-TW",
src_audio_column="audio",
src_transcription_column="sentence",
)
.then(lambda ft: print(f"{ft.dataset=}"))
.set_metric("cer")
.then(lambda ft: setattr(ft.training_args, "num_train_epochs", 5))
.train()
)
```

### Resume Training From a Checkpoint
If training is interrupted, you can resume:
```python
ft = (
WhisperFineTuner(id)
.set_baseline("openai/whisper-large-v3-turbo", language="zh", task="transcribe")
.prepare_dataset(
"mozilla-foundation/common_voice_16_1",
src_subset="zh-TW",
src_audio_column="audio",
src_transcription_column="sentence",
)
.set_metric("cer")
.train(resume=True)
)
```

> **Note**: If no checkpoint is found, training will start from scratch without failure.
## Contributing
We welcome contributions! Feel free to submit a pull request.
## License
This project is licensed under the MIT License.
## Why are there so many emojis in this README?
Because ChatGPT told me to do so.