https://github.com/balavenkatesh3322/audio-pretrained-model

A collection of Audio and Speech pre-trained models.



Host: GitHub
URL: https://github.com/balavenkatesh3322/audio-pretrained-model
Owner: balavenkatesh3322
License: mit
Created: 2020-07-18T11:06:43.000Z (almost 5 years ago)
Default Branch: master
Last Pushed: 2020-07-21T01:47:52.000Z (almost 5 years ago)
Last Synced: 2025-03-24T17:52:46.325Z (3 months ago)
Topics: audio, audio-processing, caffe, keras, keras-models, keras-tensorflow, machine-learning, mxnet, neural-network, pre-trained, pre-trained-model, pre-training, python3, pytorch, pytorch-models, speech-recognition, speech-to-text, tensorflow, tensorflow-models
Homepage:
Size: 134 KB
Stars: 187
Watchers: 4
Forks: 26
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE


README

![Maintenance](https://img.shields.io/badge/Maintained%3F-YES-green.svg)

![GitHub](https://img.shields.io/badge/Release-PROD-yellow.svg)

![GitHub](https://img.shields.io/badge/Languages-MULTI-blue.svg)

![GitHub](https://img.shields.io/badge/License-MIT-lightgrey.svg)

# Audio and Speech Pre-trained Models

![Audio pre-trained models logo](https://github.com/balavenkatesh3322/audio-pretrained-model/blob/master/logo.jpg)

## What is a pre-trained model?

A pre-trained model is a model that someone else has already trained to solve a similar problem. Instead of building a model from scratch, you can use a model trained on another problem as a starting point. A pre-trained model may not be 100% accurate in your application, so evaluate it on your own data and fine-tune it if needed.
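
As a rough illustration of the "starting point" idea, the sketch below reuses a frozen feature extractor and trains only a small new classification head on top of it. Everything in it is hypothetical: `PRETRAINED_W` stands in for weights that would normally be loaded from a published checkpoint, and the four toy samples stand in for your own data. It uses only the Python standard library and is not tied to any of the projects listed here.

```python
import math

# Hypothetical "pre-trained" weights. In practice these would be loaded from a
# checkpoint published by someone else, not written by hand.
PRETRAINED_W = [[0.8, -0.3], [0.1, 0.9]]


def frozen_features(x):
    """Frozen feature extractor: its weights are reused and never updated."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in PRETRAINED_W]


def train_head(samples, labels, epochs=200, lr=0.5):
    """Fit only a new logistic-regression head on top of the frozen features."""
    weights, bias = [0.0, 0.0], 0.0
    feats = [frozen_features(x) for x in samples]
    for _ in range(epochs):
        for f, y in zip(feats, labels):
            z = sum(w * fi for w, fi in zip(weights, f)) + bias
            p = 1.0 / (1.0 + math.exp(-z))  # sigmoid
            grad = p - y  # gradient of the log loss with respect to z
            weights = [w - lr * grad * fi for w, fi in zip(weights, f)]
            bias -= lr * grad
    return weights, bias


def predict(x, weights, bias):
    """Classify one sample with the frozen extractor plus the trained head."""
    f = frozen_features(x)
    z = sum(w * fi for w, fi in zip(weights, f)) + bias
    return 1 if z > 0 else 0


# Toy data for the new task (made up for this example).
samples = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = [1, 1, 0, 0]
weights, bias = train_head(samples, labels)
predictions = [predict(x, weights, bias) for x in samples]
```

With a real framework the same pattern usually means loading a checkpoint, freezing most layers, and training a new output layer on your own dataset.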

## Other Pre-trained Models

* [NLP Pre-trained Models](https://github.com/balavenkatesh3322/NLP-pretrained-model)

* [Computer Vision Pre-trained Models](https://github.com/balavenkatesh3322/CV-pretrained-model)

### Framework

* [TensorFlow](#tensorflow)

* [Keras](#keras)

* [PyTorch](#pytorch)

* [MXNet](#mxnet)

* [Caffe](#caffe)

### Model visualization

You can see visualizations of each model's network architecture by using [Netron](https://github.com/lutzroeder/Netron).

![Netron visualization example](https://github.com/balavenkatesh3322/NLP-pretrained-model/blob/master/netron.png)
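
Netron also ships as a Python package, so a downloaded model file can be opened from a script. The sketch below assumes the package is installed with `pip install netron` and that it exposes `netron.start(path)`, which serves the file for viewing in a browser. The helper name `open_in_netron` and the file name `model.onnx` in the usage note are hypothetical.

```python
from pathlib import Path


def open_in_netron(model_path):
    """Serve a saved model file with Netron and return the path that was opened."""
    path = Path(model_path)
    if not path.is_file():
        raise FileNotFoundError(f"No model file at {path}")
    # Imported lazily so this module still loads when Netron is not installed
    # (assumption: the package was installed with `pip install netron`).
    import netron

    netron.start(str(path))
    return path
```

Calling `open_in_netron("model.onnx")` on a file you have downloaded should start a local viewer. A missing file raises `FileNotFoundError` before Netron is imported.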

### TensorFlow

| Model Name | Description | Framework |
| :---: | :---: | :---: |
| [WaveNet](https://github.com/ibab/tensorflow-wavenet) | A TensorFlow implementation of the WaveNet generative neural network architecture for audio generation. | `TensorFlow` |
| [Lip Reading](https://github.com/astorfi/lip-reading-deeplearning) | Cross Audio-Visual Recognition using 3D Architectures in TensorFlow. | `TensorFlow` |
| [MusicGenreClassification](https://github.com/mlachmish/MusicGenreClassification) | Academic research in the field of Deep Learning (Deep Neural Networks) and Sound Processing, Tel Aviv University. | `TensorFlow` |
| [Audioset](https://github.com/tensorflow/models/tree/master/research/audioset) | Models and supporting code for use with AudioSet. | `TensorFlow` |
| [DeepSpeech](https://github.com/tensorflow/models/tree/master/research/deep_speech) | Automatic speech recognition. | `TensorFlow` |



[↥ Back To Top](#framework)



***

### Keras 

| Model Name | Description | Framework |
| :---: | :---: | :---: |
| [Ultrasound nerve segmentation](https://github.com/jocicmarko/ultrasound-nerve-segmentation) | This tutorial shows how to use the Keras library to build a deep neural network for ultrasound image nerve segmentation. | `Keras` |



[↥ Back To Top](#framework)



***

### PyTorch 

| Model Name | Description | Framework |
| :---: | :---: | :---: |
| [espnet](https://github.com/espnet/espnet) | End-to-end speech processing toolkit (espnet.github.io/espnet). | `PyTorch` |
| [TTS](https://github.com/mozilla/TTS) | Deep learning for Text2Speech. | `PyTorch` |
| [Neural Sequence labeling model](https://github.com/jiesutd/NCRFpp) | Sequence labeling models are quite popular in many NLP tasks, such as Named Entity Recognition (NER), part-of-speech (POS) tagging and word segmentation. | `PyTorch` |
| [waveglow](https://github.com/NVIDIA/waveglow) | A Flow-based Generative Network for Speech Synthesis. | `PyTorch` |
| [deepvoice3_pytorch](https://github.com/r9y9/deepvoice3_pytorch) | PyTorch implementation of convolutional networks-based text-to-speech synthesis models. | `PyTorch` |
| [deepspeech2](https://github.com/SeanNaren/deepspeech.pytorch) | Implementation of DeepSpeech2 using Baidu Warp-CTC. Creates a network based on the DeepSpeech2 architecture, trained with the CTC loss function. | `PyTorch` |
| [loop](https://github.com/facebookarchive/loop) | A method to generate speech across multiple speakers. | `PyTorch` |
| [audio](https://github.com/pytorch/audio) | Simple audio I/O for PyTorch. | `PyTorch` |
| [speech](https://github.com/awni/speech) | PyTorch ASR implementation. | `PyTorch` |
| [samplernn-pytorch](https://github.com/deepsound-project/samplernn-pytorch) | PyTorch implementation of SampleRNN: An Unconditional End-to-End Neural Audio Generation Model. | `PyTorch` |
| [torch_waveglow](https://github.com/npuichigo/waveglow) | A PyTorch implementation of WaveGlow: A Flow-based Generative Network for Speech Synthesis. | `PyTorch` |
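
The deepspeech2 entry in the table above mentions CTC (Connectionist Temporal Classification). For readers new to the term, here is a minimal sketch of greedy CTC decoding, the simplest way to turn per-frame predictions from a CTC-trained network into text: collapse consecutive repeats, then drop the blank symbol. The alphabet and frame ids are made up for illustration, and none of this code is taken from the listed repositories.

```python
from itertools import groupby


def ctc_greedy_decode(frame_ids, blank=0):
    """Collapse consecutive repeated ids, then remove the blank id."""
    collapsed = [label for label, _ in groupby(frame_ids)]
    return [label for label in collapsed if label != blank]


# Hypothetical alphabet: index 0 is the CTC blank symbol.
alphabet = ["_", "c", "a", "t"]

# Hypothetical per-frame argmax output of an acoustic model.
frames = [1, 1, 0, 2, 2, 2, 0, 0, 3]

decoded = "".join(alphabet[i] for i in ctc_greedy_decode(frames))
print(decoded)  # prints "cat"
```

A blank between two identical labels keeps them separate, so `[1, 0, 1]` decodes to two labels while `[1, 1]` decodes to one.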



[↥ Back To Top](#framework)



***

### MXNet 

| Model Name | Description | Framework |
| :---: | :---: | :---: |
| [deepspeech](https://github.com/samsungsds-rnd/deepspeech.mxnet) | This example based on DeepSpeech2 of Baidu helps you to build Speech-To-Text (STT) models at scale using | `MXNet` |
| [mxnet-audio](https://github.com/chen0040/mxnet-audio) | Implementation of music genre classification, audio-to-vec, song recommender, and music search in MXNet. | `MXNet` |



[↥ Back To Top](#framework)



***

### Caffe 

| Model Name | Description | Framework |
| :---: | :---: | :---: |
| [Speech Recognition](https://github.com/pannous/caffe-speech-recognition) | Speech recognition with the Caffe deep learning framework. | `Caffe` |



[↥ Back To Top](#framework)



***

## Contributions

Your contributions are always welcome!

Please have a look at `contributing.md`.

## License

[MIT License](LICENSE)
