https://github.com/espnet/notebook

Host: GitHub
URL: https://github.com/espnet/notebook
Owner: espnet
Created: 2019-07-30T08:58:49.000Z (almost 7 years ago)
Default Branch: master
Last Pushed: 2025-06-09T18:43:25.000Z (about 1 year ago)
Last Synced: 2025-06-23T00:39:19.283Z (12 months ago)
Language: Jupyter Notebook
Size: 17.5 MB
Stars: 67
Watchers: 6
Forks: 40
Open Issues: 7
Metadata Files:
- Readme: README.md

# ESPnet Notebooks

## Demo

### ASR (Speech recognition)

- [`asr_realtime_demo.ipynb`](ESPnet2/Demo/ASR/asr_realtime_demo.ipynb): ASR realtime inference with various pre-trained models.

- [`asr_transfer_learning_demo.ipynb`](ESPnet2/Demo/ASR/asr_transfer_learning_demo.ipynb): Demo on how to use pre-trained ASR models for fine-tuning.

- [`streaming_asr_demo.ipynb`](ESPnet2/Demo/ASR/streaming_asr_demo.ipynb): Streaming ASR realtime inference with pre-trained models.

### SE (Speech enhancement/separation)

- [`se_demo.ipynb`](ESPnet2/Demo/SE/se_demo.ipynb): Speech enhancement/separation inference with various pre-trained models.

- [`se_demo_for_waspaa_2021.ipynb`](ESPnet2/Demo/SE/se_demo_for_waspaa_2021.ipynb): WASPAA2021 version of ESPnet-SE demo.

### SLU (Spoken language understanding)

- [`2pass_slu_demo.ipynb`](ESPnet2/Demo/SLU/2pass_slu_demo.ipynb): Examples of pre-trained two-pass spoken language understanding models.

### TTS (Text-to-speech)

- [`tts_realtime_demo.ipynb`](ESPnet2/Demo/TTS/tts_realtime_demo.ipynb): TTS realtime inference with various pre-trained models.

### Other utilities

- [`onnx_conversion_demo.ipynb`](ESPnet2/Demo/Others/onnx_conversion_demo.ipynb): How to convert ESPnet models into ONNX format.

## ESPnet-EZ

### ASR (Speech recognition)

- [`train_from_scratch.ipynb`](ESPnetEZ/ASR/train_from_scratch.ipynb): Training an ASR model with ESPnet-EZ on LibriSpeech-100.

- [`ASR_finetune_owsm.ipynb`](ESPnetEZ/ASR/ASR_finetune_owsm.ipynb): Fine-tuning the weakly-supervised model (OWSM) with ESPnet-EZ on a custom dataset.

### ST (Speech-to-text translation)

- [`integrate_huggingface.ipynb`](ESPnetEZ/ST/integrate_huggingface.ipynb): Integrating the weakly-supervised model (OWSM) and Hugging Face's pre-trained language model with ESPnet-EZ on MuST-C-v2.

- [`ST_finetune_owsm.ipynb`](ESPnetEZ/ST/ST_finetune_owsm.ipynb): Fine-tuning the weakly-supervised model (OWSM) with ESPnet-EZ on MuST-C-v2.

### SLU (Spoken language understanding)

- [`SLU_finetune_owsm.ipynb`](ESPnetEZ/SLU/SLU_finetune_owsm.ipynb): Fine-tuning the weakly-supervised model (OWSM) with ESPnet-EZ on SLURP.

### TTS (Text-to-speech)

- [`TTS_finetune_vctk_dump.ipynb`](ESPnetEZ/TTS/TTS_finetune_vctk_dump.ipynb): Fine-tuning a pre-trained VITS model with ESPnet-EZ on the VCTK dataset.

### SVS (Singing voice synthesis)

- [`SVS_finetune_ace-kising.ipynb`](ESPnetEZ/SVS/SVS_finetune_ace-kising.ipynb): Fine-tuning a pre-trained VISinger 2 model with ESPnet-EZ on ACE-KiSing.

## Course

### CMU SpeechProcessing Spring2023

- [`assignment0_data-prep.ipynb`](ESPnet2/Course/CMU_SpeechProcessing_Spring2023/assignment0_data-prep.ipynb): Course assignment on how to prepare ESPnet-format data.

- [`assignment1_espnet-tutorial.ipynb`](ESPnet2/Course/CMU_SpeechProcessing_Spring2023/assignment1_espnet-tutorial.ipynb): A simplified version of the previous year's new task tutorial.

- [`assignment3_spk.ipynb`](ESPnet2/Course/CMU_SpeechProcessing_Spring2023/assignment3_spk.ipynb): Examples of using ESPnet to extract speaker embeddings and conduct speaker recognition.

- [`assignment4_ssl.ipynb`](ESPnet2/Course/CMU_SpeechProcessing_Spring2023/assignment4_ssl.ipynb): Exploring the use of self-supervised speech representations in ESPnet ASR training.

- [`assignment5_st.ipynb`](ESPnet2/Course/CMU_SpeechProcessing_Spring2023/assignment5_st.ipynb): Examples of state-of-the-art speech translation models in ESPnet.

- [`assignment6_slu.ipynb`](ESPnet2/Course/CMU_SpeechProcessing_Spring2023/assignment6_slu.ipynb): Examples of state-of-the-art spoken language understanding models in ESPnet.

- [`assignment7_se.ipynb`](ESPnet2/Course/CMU_SpeechProcessing_Spring2023/assignment7_se.ipynb): Examples of state-of-the-art speech enhancement/separation models in ESPnet.

- [`assignment8_tts.ipynb`](ESPnet2/Course/CMU_SpeechProcessing_Spring2023/assignment8_tts.ipynb): A student version of the espnet2-tts realtime demonstration.

- [`s2st_demo.ipynb`](ESPnet2/Course/CMU_SpeechProcessing_Spring2023/s2st_demo.ipynb): An example of an existing speech-to-speech translation model in ESPnet.

### CMU SpeechRecognition Fall2022

- [`recipe_tutorial.ipynb`](ESPnet2/Course/CMU_SpeechRecognition_Fall2022/recipe_tutorial.ipynb): A general tutorial that explains ESPnet2 recipes stage by stage (with new functions).

- [`new_task_tutorial.ipynb`](ESPnet2/Course/CMU_SpeechRecognition_Fall2022/new_task_tutorial.ipynb): A tutorial on how to add new models/tasks to the ESPnet framework.

### CMU SpeechRecognition Fall2021

- [`general_tutorial.ipynb`](ESPnet2/Course/CMU_SpeechRecognition_Fall2021/general_tutorial.ipynb): A general tutorial that explains ESPnet2 recipes stage by stage.

## ESPnet1 (Legacy)

- [`asr_library.ipynb`](ESPnet1/asr_library.ipynb): Speech recognition library explanation with network training.

- [`asr_recipe.ipynb`](ESPnet1/asr_recipe.ipynb): Speech recognition recipe explanation.

- [`pretrained.ipynb`](ESPnet1/pretrained.ipynb): Tutorial on how to use pre-trained models.

- [`st_demo.ipynb`](ESPnet1/st_demo.ipynb): Speech translation demonstration with a TTS model to achieve speech-to-speech translation.

- [`tts_realtime_demo.ipynb`](ESPnet1/tts_realtime_demo.ipynb): TTS demonstration with different pre-trained TTS models.

- [`tts_recipe.ipynb`](ESPnet1/tts_recipe.ipynb): Stage explanation for TTS recipes.
