https://github.com/idiap/torgo_asr

A Kaldi recipe for training automatic speech recognition systems on the Torgo corpus of dysarthric speech
https://github.com/idiap/torgo_asr

Last synced: about 1 year ago
JSON representation

A Kaldi recipe for training automatic speech recognition systems on the Torgo corpus of dysarthric speech

Host: GitHub
URL: https://github.com/idiap/torgo_asr
Owner: idiap
License: apache-2.0
Created: 2020-03-06T12:39:26.000Z (over 6 years ago)
Default Branch: master
Last Pushed: 2023-09-22T10:02:05.000Z (over 2 years ago)
Last Synced: 2025-03-23T01:02:28.406Z (about 1 year ago)
Language: Shell
Homepage:
Size: 4.02 MB
Stars: 17
Watchers: 5
Forks: 3
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: COPYING

Awesome Lists containing this project

README

# Torgo ASR

## Description
This is a Kaldi recipe to build automatic speech recognition systems on the
[Torgo corpus](http://www.cs.toronto.edu/~complingweb/data/TORGO/torgo.html) of
dysarthric speech.

## Setup

Update the `KALDI_ROOT` and `DATA_ORIG` variables in `path.sh` to point to the
correct locations for your Kaldi installation and the Torgo corpus. Then run
the following:

```sh
source path.sh
ln -s $KALDI_ROOT/egs/wsj/s5/{steps,utils} .
```

Some scripts in `local/` also require the following Python packages:

```
invoke numpy pandas python-Levenshtein
```

## Usage

The following instructions allow to train ASR systems on Torgo and to reproduce
results from the paper.

### Train ASR systems

```sh
# HMM/GMM systems:
./run.sh

# LF-MMI (TDNN-F) systems:
./run_tdnnf.sh

# CE (TDNN-LSTM) systems:
./local/nnet3/run_tdnn_lstm.sh

# Show WER:
./local/get_wer.py exp/sgmm
```

### Corpus statistics

Torgo corpus statistics:

```sh
./local/corpus_statistics.sh
```

### Pronunciation similarity

How similar are the isolated words to each other? First retrieve the phonetic
representation for each word, then analyse the similarity of pronunciations:

```sh
./local/get_prons.sh > data/pronunciations_single
./local/compute_pron_similarity.py
```

### Phone duration
We analysed how mean phoneme duration and WER are correlated.

```sh
# Get phone alignments with duration information:
./local/get_phone_alignments.sh exp/sgmm

# Compute mean phoneme durations:
./local/analyze_phone_lengths.py
```

## Citation

Please cite the following paper if you use this code for your research.

```BibTeX
@inproceedings{hermann2020.asr,
author = "Hermann, Enno and Magimai.-Doss, Mathew",
title = "Dysarthric Speech Recognition with Lattice-Free {MMI}",
booktitle = "Proceedings International Conference on Acoustics, Speech, and Signal Processing (ICASSP)",
pages = "6109--6113",
year = "2020",
doi = "10.1109/ICASSP40776.2020.9053549"
}
```

The code is based on [an earlier recipe](https://github.com/cristinae/ASRdys) by
Cristina España-Bonet and José A. R. Fonollosa.

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/idiap/torgo_asr

Awesome Lists containing this project

README