https://github.com/flashlight/wav2letter

Facebook AI Research's Automatic Speech Recognition Toolkit

cpp deep-learning end-to-end speech-recognition wav2letter

Last synced: 7 months ago

Host: GitHub
URL: https://github.com/flashlight/wav2letter
Owner: flashlight
License: other
Created: 2017-11-20T17:39:41.000Z (about 8 years ago)
Default Branch: main
Last Pushed: 2024-08-07T18:01:52.000Z (over 1 year ago)
Last Synced: 2024-10-29T15:05:17.926Z (about 1 year ago)
Topics: cpp, deep-learning, end-to-end, speech-recognition, wav2letter
Language: C++
Homepage: https://github.com/facebookresearch/wav2letter/wiki
Size: 6.2 MB
Stars: 6,382
Watchers: 246
Forks: 1,012
Open Issues: 108
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md

Awesome Lists containing this project

awesome-list - wav2letter++ - Facebook AI Research's Automatic Speech Recognition Toolkit. (Natural Language Processing / Speech & Audio)
awesome-coding-by-voice - wav2letter++ - Facebook AI Research's Automatic Speech Recognition Toolkit

README

# wav2letter++

[![CircleCI](https://circleci.com/gh/flashlight/wav2letter.svg?style=svg)](https://app.circleci.com/pipelines/github/flashlight/wav2letter)

[![Join the chat at https://gitter.im/wav2letter/community](https://badges.gitter.im/wav2letter/community.svg)](https://gitter.im/wav2letter/community?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)

## Important Note:

### wav2letter has been moved and consolidated [into Flashlight](https://github.com/flashlight/flashlight) in the [ASR application](https://github.com/flashlight/flashlight/tree/master/flashlight/app/asr).

Future wav2letter development will occur in Flashlight.

*To build the old, pre-consolidation version of wav2letter*, check out the [wav2letter v0.2](https://github.com/flashlight/wav2letter/releases/tag/v0.2) release, which depends on the old [Flashlight v0.2](https://github.com/flashlight/flashlight/releases/tag/v0.2) release. The [`wav2letter-lua`](https://github.com/flashlight/wav2letter/tree/wav2letter-lua) project can be found on the [`wav2letter-lua` branch](https://github.com/flashlight/wav2letter/tree/wav2letter-lua).

For more information on wav2letter++, see or cite [this arXiv paper](https://arxiv.org/abs/1812.07625).

## Recipes

This repository includes recipes to reproduce the following research papers, as well as *pre-trained* models. **To reproduce results exactly, use Flashlight <= 0.3.2.** Papers contained here include:

- [Pratap et al. (2020): Scaling Online Speech Recognition Using ConvNets](recipes/streaming_convnets/)

- [Synnaeve et al. (2020): End-to-end ASR: from Supervised to Semi-Supervised Learning with Modern Architectures](recipes/sota/2019)

- [Kahn et al. (2020): Self-Training for End-to-End Speech Recognition](recipes/self_training)

- [Likhomanenko et al. (2019): Who Needs Words? Lexicon-free Speech Recognition](recipes/lexicon_free/)

- [Hannun et al. (2019): Sequence-to-Sequence Speech Recognition with Time-Depth Separable Convolutions](recipes/seq2seq_tds/)

Data preparation for training and evaluation can be found in the [data](data) directory.

### Building the Recipes

First, install [Flashlight](https://github.com/flashlight/flashlight/tree/0.3) **(using the [0.3 branch](https://github.com/flashlight/flashlight/tree/0.3) is required)** with the [ASR application](https://github.com/flashlight/flashlight/tree/master/flashlight/app/asr). Then, from the root of this repository:

```shell
mkdir build && cd build
cmake .. && make -j8
```

If Flashlight or ArrayFire are installed in nonstandard paths via a custom `CMAKE_INSTALL_PREFIX`, they can be found by passing

```shell
-Dflashlight_DIR=[PREFIX]/usr/share/flashlight/cmake/ -DArrayFire_DIR=[PREFIX]/usr/share/ArrayFire/cmake
```

when running `cmake`.
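
As a minimal sketch of how the `[PREFIX]` placeholder expands, the helper below prints both flags for a given install prefix. The function name `w2l_cmake_flags` and the example prefix `$HOME/opt` are hypothetical and not part of this repository; substitute whatever `CMAKE_INSTALL_PREFIX` you used when installing Flashlight and ArrayFire.

```shell
# Hypothetical helper (not part of wav2letter): prints the two -D flags
# for an install prefix passed as the first argument.
w2l_cmake_flags() {
  prefix="${1%/}"   # drop a single trailing slash, if present
  printf '%s %s\n' \
    "-Dflashlight_DIR=${prefix}/usr/share/flashlight/cmake/" \
    "-DArrayFire_DIR=${prefix}/usr/share/ArrayFire/cmake"
}

# Example prefix is an assumption; replace it with your own.
w2l_cmake_flags "$HOME/opt"
```

From the `build` directory, the printed flags can then be appended to the configure step, for example `cmake .. $(w2l_cmake_flags "$HOME/opt") && make -j8`. The unquoted command substitution relies on word splitting, so this only works for prefixes without spaces.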

## Join the wav2letter community

* Facebook page: https://www.facebook.com/groups/717232008481207/

* Google group: https://groups.google.com/forum/#!forum/wav2letter-users

* Contact: vineelkpratap@fb.com, awni@fb.com, qiantong@fb.com, jacobkahn@fb.com, antares@fb.com, avidov@fb.com, gab@fb.com, vitaliy888@fb.com, locronan@fb.com

## License

wav2letter++ is MIT-licensed, as found in the [LICENSE](LICENSE) file.
