https://github.com/imvladikon/wav2vec2-hebrew

Speech Recognition for Hebrew (using wav2vec2 models)
hebrew speech-recognition

Host: GitHub
URL: https://github.com/imvladikon/wav2vec2-hebrew
Owner: imvladikon
Created: 2022-01-25T14:29:21.000Z (over 3 years ago)
Default Branch: main
Last Pushed: 2023-05-08T07:12:02.000Z (over 2 years ago)
Last Synced: 2025-09-07T12:40:02.000Z (about 1 month ago)
Topics: hebrew, speech-recognition
Language: Python
Homepage:
Size: 211 KB
Stars: 5
Watchers: 1
Forks: 1
Open Issues: 1
Metadata Files:
- Readme: README.md

# Hebrew Speech Recognition with Wav2Vec2

## Usage

### Without package installation (using the `transformers` library)

```python
from transformers import (
    AutomaticSpeechRecognitionPipeline,
    AutoFeatureExtractor,
    Wav2Vec2ForCTC,
    AutoTokenizer,
)

pretrained_model_name_or_path = "imvladikon/wav2vec2-xls-r-300m-hebrew"

asr = AutomaticSpeechRecognitionPipeline(
    feature_extractor=AutoFeatureExtractor.from_pretrained(
        pretrained_model_name_or_path
    ),
    model=Wav2Vec2ForCTC.from_pretrained(
        pretrained_model_name_or_path
    ),
    tokenizer=AutoTokenizer.from_pretrained(
        pretrained_model_name_or_path
    ),
)

filename = "audio.wav"
print(asr(filename))
```

Splitting long audio files into smaller chunks is not implemented yet.
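Until chunking is available, long recordings can be split by hand before transcription. The following is a minimal sketch, not part of this repository: `chunk_boundaries` is a hypothetical helper, and the 16 kHz sample rate, 20-second chunk length, and 2-second overlap are assumed values chosen only for illustration.

```python
from typing import List, Tuple


def chunk_boundaries(
    num_samples: int,
    sample_rate: int = 16000,      # assumed model sample rate
    chunk_seconds: float = 20.0,   # assumed chunk length
    overlap_seconds: float = 2.0,  # assumed overlap between neighbours
) -> List[Tuple[int, int]]:
    """Return (start, end) sample offsets of overlapping chunks."""
    chunk = int(chunk_seconds * sample_rate)
    step = chunk - int(overlap_seconds * sample_rate)
    if chunk <= 0 or step <= 0:
        raise ValueError("chunk_seconds must be positive and exceed overlap_seconds")

    boundaries = []
    start = 0
    while start < num_samples:
        end = min(start + chunk, num_samples)
        boundaries.append((start, end))
        if end == num_samples:
            break  # the last chunk reached the end of the audio
        start += step
    return boundaries


# A 50-second recording at 16 kHz yields three overlapping chunks.
print(chunk_boundaries(50 * 16000))
```

Each `waveform[start:end]` slice could then be transcribed separately. Words that fall inside an overlap may appear in two neighbouring transcripts, so the joined text would still need deduplication.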

### With package installation

```bash
pip install git+https://github.com/imvladikon/wav2vec2-hebrew
```

#### Speech recognition

```python
from wav2vec2_hebrew import HebrewSpeechRecognitionPipeline

asr = HebrewSpeechRecognitionPipeline()

filename = "./samples/bereshit011.wav"
output = asr(filename)
print(output)
# [{'text': 'בראשית ברא אלוהים את השמייים ואת הארץ'}]
```

#### Alignment

```python
import torchaudio
from wav2vec2_hebrew import HebrewWav2Vec2Aligner

filename = "./samples/bereshit011.wav"
text = "בראשית ברא אלוהים את השמיים ואת הארץ"

aligner = HebrewWav2Vec2Aligner(input_sample_rate=16000, use_cuda=True)

# align audio segments to the text (sentence by sentence)
first_sentence = aligner.align_data(filename, text)[0]
# {'sentence': 'בראשית ברא אלוהים את השמיים ואת הארץ',
#  'segments': [Segment(label='בראשית', start=6750.516853932584, end=18644.284644194755, score=0.16025335497152965)...]}

# show the segments in an IPython notebook (uses IPython.display.Audio)
waveform, sample_rate = torchaudio.load(filename)
aligner.show_segments(waveform, first_sentence)
```
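The `start` and `end` values in the output above look like sample offsets rather than seconds, although the README does not state the unit. Under that assumption, and with the 16 kHz rate passed to the aligner, a rough conversion to timestamps might look like the sketch below. The `Segment` dataclass is a stand-in that mirrors the printed fields, not the package's own class, and `segments_to_seconds` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Segment:
    # Stand-in mirroring the fields printed above (not the package's class).
    label: str
    start: float
    end: float
    score: float


def segments_to_seconds(
    segments: List[Segment], sample_rate: int = 16000
) -> List[Tuple[str, float, float]]:
    """Convert start/end offsets to seconds, assuming they are sample offsets."""
    return [(s.label, s.start / sample_rate, s.end / sample_rate) for s in segments]


example = [
    Segment("בראשית", 6750.516853932584, 18644.284644194755, 0.16025335497152965)
]
timings = segments_to_seconds(example)
for label, start_s, end_s in timings:
    print(f"{label}: {start_s:.2f}s to {end_s:.2f}s")
```

If the assumption holds, the first word spans roughly 0.42 s to 1.17 s of the recording. If the values turn out to be model frame indices instead, the divisor would need to change accordingly.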

## Training process

Training logs and details are available in the [train](train) folder.

### Datasets

* https://huggingface.co/datasets/imvladikon/hebrew_speech_kan

* https://huggingface.co/datasets/imvladikon/hebrew_speech_coursera

### Weights

* [imvladikon/wav2vec2-xls-r-300m-hebrew](https://huggingface.co/imvladikon/wav2vec2-xls-r-300m-hebrew)

* [imvladikon/wav2vec2-xls-r-300m-lm-hebrew](https://huggingface.co/imvladikon/wav2vec2-xls-r-300m-lm-hebrew)
