https://github.com/lakshmi-bashyam/neurallm2arpa
Implementation of a conversion system: neural language models to back-off n-gram models for decoding in speech recognition systems
- Host: GitHub
- URL: https://github.com/lakshmi-bashyam/neurallm2arpa
- Owner: Lakshmi-bashyam
- Created: 2021-10-18T15:05:33.000Z (almost 4 years ago)
- Default Branch: master
- Last Pushed: 2022-08-04T09:27:06.000Z (about 3 years ago)
- Last Synced: 2023-12-06T17:35:31.234Z (almost 2 years ago)
- Topics: arpa, lstm, n-grams, nlp, speech-recognition
- Language: Python
- Size: 300 KB
- Stars: 1
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Convert Neural LM to back-off n-gram
## How to Run
To train the RNNLM, first add the corpus to the data folder.
1. Train an n-gram LM on the corpus with the SRILM toolkit, then use it to generate sentences:
```
ngram-count -text corpus.txt -order 6 -lm new_lm.arpa -vocab new_vocab
ngram -lm new_lm.arpa -gen [no_of_sentences] > gen.txt
```
2. To train the RNNLM on the sentences generated above, run
```
python train.py
```
Evaluation result: test perplexity is ~52.
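The training script itself is not reproduced in this README. Purely as an illustrative sketch, an LSTM language model trained on the SRILM-generated sentences could look like the following, assuming PyTorch, a whitespace-tokenized `gen.txt`, and toy hyperparameters; none of the names or settings below are taken from the repository's actual `train.py`.
```python
# Minimal sketch of an LSTM language-model trainer (illustrative only;
# the repository's train.py may differ in tokenization, model, and hyperparameters).
import torch
import torch.nn as nn

class LSTMLM(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hidden=256):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, x, state=None):
        h, state = self.lstm(self.emb(x), state)
        return self.out(h), state

# Build a word-level vocabulary from the SRILM-generated corpus (assumed to be gen.txt).
words = open("gen.txt").read().split()
vocab = {w: i for i, w in enumerate(sorted(set(words)))}
ids = torch.tensor([vocab[w] for w in words])

model = LSTMLM(len(vocab))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

seq_len = 32
for step in range(100):  # demo-sized training loop
    i = torch.randint(0, len(ids) - seq_len - 1, (1,)).item()
    x = ids[i:i + seq_len].unsqueeze(0)          # input tokens
    y = ids[i + 1:i + seq_len + 1].unsqueeze(0)  # next-token targets
    logits, _ = model(x)
    loss = loss_fn(logits.view(-1, len(vocab)), y.view(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()
```
Perplexity is the exponential of the mean per-token cross-entropy, so the reported value of ~52 corresponds to an average loss of about 3.95 nats per token.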
3. To extract RNNLM probabilities, run
```
python predict.py
```
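`predict.py` is likewise not shown here. A minimal sketch of what "extracting RNNLM probabilities" can mean in this setting, in line with the conversion approaches in the references below: query the neural LM for the next-word distribution after each n-gram history. The sketch reuses the hypothetical `model` and `vocab` from the training sketch above; the function name and the example history are illustrative, not the repository's actual code.
```python
# Minimal sketch of extracting next-word probabilities from the trained LSTM
# (illustrative; the repository's predict.py may do this differently).
import torch

def next_word_probs(model, vocab, history):
    """Return P(w | history) for every word w in the vocabulary."""
    model.eval()
    ids = torch.tensor([[vocab[w] for w in history]])
    with torch.no_grad():
        logits, _ = model(ids)
    # Softmax over the distribution predicted after the last history token.
    return torch.softmax(logits[0, -1], dim=0)

# Example query (assumes "the" and "cat" occur in the training vocabulary).
inv_vocab = {i: w for w, i in vocab.items()}
probs = next_word_probs(model, vocab, ["the", "cat"])
top = torch.topk(probs, k=5)
for p, i in zip(top.values, top.indices):
    print(f"p({inv_vocab[i.item()]} | the cat) = {p:.4f}")
```
Distributions like these, queried for every n-gram history up to the chosen order, are what a conversion to an ARPA back-off model consumes.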
## References

[1] M. Singh, Y. Oualil, and D. Klakow, "Approximated and domain-adapted LSTM language models for first-pass decoding in speech recognition," in INTERSPEECH 2017, pp. 2720-2724. doi: 10.21437/Interspeech.2017-147.

[2] H. Adel, K. Kirchhoff, N. T. Vu, D. Telaar, and T. Schultz, "Comparing approaches to convert recurrent neural networks into backoff language models for efficient decoding," in INTERSPEECH 2014, 15th Annual Conference of the International Speech Communication Association, Singapore, September 14-18, 2014, pp. 651-655.