https://github.com/ga642381/Speech-Prompts-Adapters

# Speech-Prompts-Adapters

This repository surveys papers on **Adapters** and **Prompting** methods for **Speech Processing**.

## Navigation
* [ICASSP 2023 Tutorial Information](https://github.com/ga642381/Speech-Prompts-Adapters/#icassp-2023-tutorial-information)
* [**Adapters** for Speech Processing](https://github.com/ga642381/Speech-Prompts-Adapters/#adapters-for-speech-processing)
* [**Prompting** for Speech Processing](https://github.com/ga642381/Speech-Prompts-Adapters/#prompting-for-speech-processing)
* [**Reprogramming** and Prompting](https://github.com/ga642381/Speech-Prompts-Adapters/#reprogramming-and-prompting)
* [**Parameter Efficient Learning** Methods](https://github.com/ga642381/Speech-Prompts-Adapters/#parameter-efficient-learning-methods)
* [Contact](https://github.com/ga642381/Speech-Prompts-Adapters/#contact)

## NEWS
* At [ICASSP 2023](https://2023.ieeeicassp.org/), we will give a tutorial on Parameter-Efficient Learning for speech processing and natural language processing. I ([Kai-Wei Chang](https://scholar.google.com.tw/citations?user=hE_Oq8cAAAAJ&hl=zh-TW)) will cover adapters and prompts for speech processing.

---

## ICASSP 2023 Tutorial Information
* Title: Parameter-Efficient Learning for Speech and Language Processing: Adapters, Prompts, and Reprogramming
* Conference: ICASSP 2023
* Website: [ICASSP 2023 - Tutorials](https://2023.ieeeicassp.org/tutorials/#1675893601715-597a5c9b-de65)
* Parameter-Efficient Learning for Speech Processing [Slides](https://github.com/ga642381/Kai-Wei-Chang-Talks/blob/main/Part%203b%20Parameter-Efficient%20Learning%20for%20Speech%20Processing.pdf)

### Presenters:
* Pin-Yu Chen (IBM Research)
* Hung-yi Lee (National Taiwan University)
* Chao-Han Huck Yang (Georgia Institute of Technology)
* Kai-Wei Chang (National Taiwan University)
* Cheng-Han Chiang (National Taiwan University)

---

## Adapters and Prompting for Speech Processing

### Adapters for Speech Processing
| Title | Authors | Modality | Task | Link |
| ----- | ------- | -------- | ---- | ---- |
| Differentially Private Adapters for Parameter Efficient Acoustic Modeling | Chun-Wei Ho et al. | Speech | Keyword Spotting | [Interspeech 2023](https://arxiv.org/abs/2305.11360) |
| Beyond Universal Transformer: block reusing with adaptor in Transformer for automatic speech recognition | Haoyu Tang et al. | Speech | ASR | [arXiv 2023](https://arxiv.org/abs/2303.13072) |
| A Parameter-Efficient Learning Approach to Arabic Dialect Identification with Pre-Trained General-Purpose Speech Model | Srijith Radhakrishnan et al. | Speech | Dialect Identification | [Interspeech 2023](https://arxiv.org/abs/2305.11244) |
| CHAPTER: Exploiting Convolutional Neural Network Adapters for Self-supervised Speech Models | Zih-Ching Chen et al. | Speech | [Multiple] | [arXiv 2022](https://arxiv.org/abs/2212.01282) |
| Parameter Efficient Transfer Learning for Various Speech Processing Tasks | Shinta Otake et al. | Speech | [Multiple] | [arXiv 2022](https://arxiv.org/abs/2212.02780) |
| Parameter-efficient transfer learning of pre-trained Transformer models for speaker verification using adapters | Junyi Peng et al. | Speech | Speaker Verification | [arXiv 2022](https://arxiv.org/abs/2210.16032) |
| Exploring Efficient-tuning Methods in Self-supervised Speech Models | Zih-Ching Chen et al. | Speech | [Multiple] | [SLT 2022](https://ieeexplore.ieee.org/document/10023274) |
| DRAFT: A Novel Framework to Reduce Domain Shifting in Self-supervised Learning and Its Application to Children’s ASR | Ruchao Fan, Abeer Alwan | Speech | ASR | [Interspeech 2022](https://www.isca-speech.org/archive/interspeech_2022/fan22d_interspeech.html) |
| Speaker adaptation for Wav2vec2 based dysarthric ASR | Murali Karthick Baskar et al. | Speech | ASR | [Interspeech 2022](https://www.isca-speech.org/archive/interspeech_2022/baskar22b_interspeech.html) |
| Adaptive multilingual speech recognition with pretrained models | Ngoc-Quan Pham et al. | Speech | ASR | [Interspeech 2022](https://www.isca-speech.org/archive/interspeech_2022/pham22_interspeech.html) |
| An Adapter Based Pre-Training for Efficient and Scalable Self-Supervised Speech Representation Learning | Samuel Kessler et al. | Speech | ASR | [ICASSP 2022](https://ieeexplore.ieee.org/document/9747374) |
| Efficient Adapter Transfer of Self-Supervised Speech Models for Automatic Speech Recognition | Bethan Thomas et al. | Speech | ASR | [ICASSP 2022](https://ieeexplore.ieee.org/document/9746223) |
| Scaling End-to-End Models for Large-Scale Multilingual ASR | Bo Li et al. | Speech | ASR | [ASRU 2021](https://ieeexplore.ieee.org/document/9687871) |
| Meta-Adapter: Efficient Cross-Lingual Adaptation With Meta-Learning | Wenxin Hou et al. | Speech | ASR | [ICASSP 2021](https://ieeexplore.ieee.org/document/9414959) |
| Exploiting Adapters for Cross-Lingual Low-Resource Speech Recognition | Wenxin Hou et al. | Speech | ASR | [TASLP 2021](https://dl.acm.org/doi/abs/10.1109/TASLP.2021.3138674) |
| Lightweight Adapter Tuning for Multilingual Speech Translation | Hang Le et al. | Speech | Speech Translation | [ACL-IJCNLP 2021](https://aclanthology.org/2021.acl-short.103/) |
| Residual Adapters for Parameter-Efficient ASR Adaptation to Atypical and Accented Speech | Katrin Tomanek et al. | Speech | ASR | [EMNLP 2021](https://aclanthology.org/2021.emnlp-main.541/) |
| Multilingual Speech Recognition with Self-Attention Structured Parameterization | Yun Zhu et al. | Speech | ASR | [Interspeech 2020](https://www.isca-speech.org/archive/interspeech_2020/zhu20d_interspeech.html) |
| Large-Scale Multilingual Speech Recognition with a Streaming End-to-End Model | Anjuli Kannan et al. | Speech | ASR | [Interspeech 2019](https://www.isca-speech.org/archive_v0/Interspeech_2019/abstracts/2858.html) |

### Prompting for Speech Processing

| Title | Authors | Modality | Task | Link |
| ----- | ------- | -------- | ---- | ---- |
| Prompting the Hidden Talent of Web-Scale Speech Models for Zero-Shot Task Generalization | Puyuan Peng et al. | Speech | [Multiple] | [Interspeech 2023](https://arxiv.org/abs/2305.11095) |
| From English to More Languages: Parameter-Efficient Model Reprogramming for Cross-Lingual Speech Recognition | Chao-Han Huck Yang et al. | Speech | ASR | [ICASSP 2023](https://arxiv.org/pdf/2301.07851.pdf) |
| SpeechPrompt v2: Prompt Tuning for Speech Classification Tasks | Kai-Wei Chang et al. | Speech | [Multiple] | [arXiv 2023](https://arxiv.org/abs/2303.00733) |
| Speak, Read and Prompt: High-Fidelity Text-to-Speech with Minimal Supervision | Eugene Kharitonov et al. | Text & Speech | TTS | [arXiv 2023](https://arxiv.org/abs/2302.03540) |
| Describing emotions with acoustic property prompts for speech emotion recognition | Hira Dhamyal et al. | Text & Speech | ER | [arXiv 2022](https://arxiv.org/abs/2211.07737) |
| PromptTTS: Controllable Text-to-Speech with Text Descriptions | Zhifang Guo et al. | Text & Speech | TTS | [arXiv 2022](https://arxiv.org/abs/2211.12171) |
| Neural Model Reprogramming with Similarity Based Mapping for Low-Resource Spoken Command Classification | Hao Yen et al. | Speech | Spoken Command Recognition | [arXiv 2022](https://arxiv.org/abs/2110.03894) |
| WAVPROMPT: Towards Few-Shot Spoken Language Understanding with Frozen Language Models | Heting Gao et al. | Text & Speech | SLU | [Interspeech 2022](https://www.isca-speech.org/archive/interspeech_2022/gao22e_interspeech.html) |
| An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks | Kai-Wei Chang et al. | Speech | [Multiple] | [Interspeech 2022](https://www.isca-speech.org/archive/interspeech_2022/chang22e_interspeech.html) |

---

## Reprogramming and Prompting
For more information about reprogramming and prompting for large pre-trained models, please refer to the "awesome-neural-reprogramming-acoustic-prompting" repository. This topic was also covered in the **ICASSP 2022** tutorial by Dr. Pin-Yu Chen and Dr. Huck Yang.

* GitHub Resource: [awesome-neural-reprogramming-acoustic-prompting](https://github.com/huckiyang/awesome-neural-reprogramming-prompting)
* Tutorial Video: [ICASSP 22 Tutorial, "Neural Model Reprogramming and Prompting for Speech Modeling," Huck Yang](https://www.youtube.com/watch?v=-iirkbYkyXI&ab_channel=Chao-HanHuckYang)

---

## Parameter Efficient Learning Methods
| Title | Authors | Link |
| ----- | ------- | ---- |
| BitFit: Simple Parameter-efficient Fine-tuning for Transformer-based Masked Language-models | Elad Ben Zaken et al. | [ACL 2022](https://aclanthology.org/2022.acl-short.1/) |
| Towards a Unified View of Parameter-Efficient Transfer Learning | Junxian He et al. | [ICLR 2022](https://iclr.cc/virtual/2022/poster/6524) |
| LoRA: Low-Rank Adaptation of Large Language Models | Edward J. Hu et al. | [ICLR 2022](https://iclr.cc/virtual/2022/poster/6319) |
| Parameter-Efficient Transfer Learning for NLP | Neil Houlsby et al. | [ICML 2019](https://proceedings.mlr.press/v97/houlsby19a.html) |

---

## Acknowledgment
We thank Kuang-Chen Peng, Tzu-Han Lin, and Fabian Ritter for their invaluable contributions to the initial collection.

## Contact
This repository is maintained by [Kai-Wei Chang](https://scholar.google.com.tw/citations?user=hE_Oq8cAAAAJ&hl=zh-TW) ([email protected]) and [Zih-Ching Chen](https://scholar.google.com.tw/citations?user=rjedYCoAAAAJ). Feel free to contact us or open a pull request.