https://github.com/wasertech/transcorerlm
Transformer as Scorer (Language Model) for STT acoustic models.
- Host: GitHub
- URL: https://github.com/wasertech/transcorerlm
- Owner: wasertech
- License: mpl-2.0
- Created: 2023-03-02T08:17:21.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2023-05-11T07:25:14.000Z (over 1 year ago)
- Last Synced: 2024-10-30T06:40:43.806Z (about 2 months ago)
- Language: Python
- Size: 135 KB
- Stars: 2
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
- License: LICENSE
# TranScorerLM
Transformer as Scorer (Language Model) for STT acoustic models.
## Get started
```zsh
# Install with pip
❯ pip install git+https://github.com/wasertech/TranScorerLM.git

# Use TranScorer to convert an acoustic representation to text
❯ transcorer -f 'audio.wav'
Some weights of Wav2Vec2ForCTC were not initialized from the model checkpoint at facebook/wav2vec2-base-960h and are newly initialized: ['wav2vec2.masked_spec_embed']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Loading TranScorer......Took 2.2624789050023537 second(s).
Loading audio.wav......Took 0.00048493899521417916 second(s).
Tokenizing......Took 0.0008750690030865371 second(s).
Decoding speech......Took 0.21528533000673633 second(s).
CAN I TEST YOU
```
When `--file` points to a valid audio file, that file is used to get the acoustic representation.
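
Before passing a file to `--file`, it can help to check what the audio actually contains. The sketch below is not part of TranScorerLM: `describe_wav` and `is_ready_for_wav2vec2` are hypothetical helpers built on Python's standard `wave` module, so they only understand PCM WAV files. The 16 kHz target comes from the `facebook/wav2vec2-base-960h` checkpoint named in the log above, which was trained on 16 kHz speech. The mono requirement is an assumption, and whether TranScorer resamples or downmixes internally is not stated here.

```python
import wave

# Assumption: the checkpoint expects 16 kHz input, as facebook/wav2vec2-base-960h does.
EXPECTED_RATE = 16_000


def describe_wav(path):
    """Return channel count, sample rate and duration of a PCM WAV file."""
    with wave.open(str(path), "rb") as wav:
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "duration_s": wav.getnframes() / rate,
        }


def is_ready_for_wav2vec2(path):
    """True when the file is a readable WAV, mono, and sampled at EXPECTED_RATE."""
    try:
        info = describe_wav(path)
    except (wave.Error, EOFError, FileNotFoundError):
        # Not a WAV file, truncated, or missing.
        return False
    return info["channels"] == 1 and info["sample_rate"] == EXPECTED_RATE


if __name__ == "__main__":
    print(is_ready_for_wav2vec2("audio.wav"))
```

A file that fails this check may still work if the tool converts audio on load. The check only flags inputs worth a second look.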
You can also use the *`scorer`* `python` module.
```python
from pathlib import Path

from scorer.transformer import TranScorer

ts = TranScorer("path/to/scorer.transform")
to_transcribe = "path/to/audio.wav"
transcript = ts.transcribe(to_transcribe)
print(f"{to_transcribe=} -> {transcript=}")
```
You could also ask the scorer to tokenize something for you.
```zsh
python -m scorer.transformer.tokenizer \
--lang "english" \
--data_path "parent/path/to/tokenizer/data/*.txt" \
--test "This test sentence will be tokenized."
```
## Training a new scorer from `facebook/wav2vec2-large-xlsr-53`
Start training a scorer using `trainscorer`.
```zsh
# trainscorer -> python -m scorer.train
trainscorer \
--model_name_or_path facebook/wav2vec2-base \
--dataset_name wav2txt \
--dataset_config_name wav2txt \
--output_dir wav2txt/models \
--overwrite_output_dir \
--remove_unused_columns False \
--do_train y \
--do_eval y \
--fp16 \
--learning_rate 3e-5 \
--max_length_seconds 1 \
--attention_mask False \
--warmup_ratio 0.1 \
--num_train_epochs 5 \
--per_device_train_batch_size 32 \
--gradient_accumulation_steps 4 \
--per_device_eval_batch_size 32 \
--dataloader_num_workers 4 \
--logging_strategy steps \
--logging_steps 10 \
--evaluation_strategy epoch \
--save_strategy epoch \
--load_best_model_at_end True \
--metric_for_best_model accuracy \
--save_total_limit 3 \
--seed 0
```
To publish the trained model on the Hugging Face Hub, add the `--push_to_hub` flag along with `--hub_token "${HUB_API_TOKEN}" --push_to_hub_model_id "${MODEL_ID}" --use_auth_token y`. Here `$HUB_API_TOKEN` is a valid Hugging Face API token with write access, and `$MODEL_ID` is the name of the repository to create for your model.
You can also train directly from a `python` script.
```python
import sys

from scorer.train import train

if __name__ == "__main__":
    try:
        train()
        sys.exit(0)
    except KeyboardInterrupt:
        # Exit with a non-zero status when training is interrupted by the user.
        sys.exit(1)
```
Training will create a custom tokenizer using the available sentences.
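
How the project builds this tokenizer is not shown in this README. As a rough illustration only, the sketch below assumes a character-level vocabulary of the kind commonly paired with CTC acoustic models. `build_char_vocab` and `encode` are hypothetical helpers, and the `[PAD]`, `[UNK]` and `|` tokens are arbitrary choices for the example, not names taken from the `scorer` package.

```python
def build_char_vocab(sentences, pad="[PAD]", unk="[UNK]", word_delimiter="|"):
    """Map every character seen in `sentences` to an integer id."""
    chars = set()
    for sentence in sentences:
        # Upper-case the text and mark word boundaries with an explicit token.
        chars.update(sentence.upper().replace(" ", word_delimiter))
    # Special tokens come first; characters are sorted for a stable mapping.
    tokens = [pad, unk] + sorted(chars)
    return {token: index for index, token in enumerate(tokens)}


def encode(sentence, vocab, unk="[UNK]", word_delimiter="|"):
    """Turn a sentence into ids, falling back to `unk` for unseen characters."""
    text = sentence.upper().replace(" ", word_delimiter)
    return [vocab.get(char, vocab[unk]) for char in text]


if __name__ == "__main__":
    vocab = build_char_vocab(["can i test you"])
    print(vocab)
    print(encode("test", vocab))
```

Sorting the characters keeps the id assignment reproducible across runs, which matters when a saved model is reloaded with a rebuilt vocabulary.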
## License
This project is distributed under [Mozilla Public License 2.0](LICENSE).
## Contribute
Please read [CONTRIBUTE](CONTRIBUTE.md) before anything else.