- Host: GitHub
- URL: https://github.com/ga642381/Speech-Prompts-Adapters
- Owner: ga642381
- Created: 2023-03-20T13:07:48.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2023-08-04T14:32:45.000Z (over 1 year ago)
- Last Synced: 2024-12-02T02:03:36.571Z (19 days ago)
- Topics: adapter, awesome-list, papers, parameter-efficient-learning, prompt, reprogramming, speech
- Homepage:
- Size: 45.9 KB
- Stars: 104
- Watchers: 11
- Forks: 5
- Open Issues: 1
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- ultimate-awesome - Speech-Prompts-Adapters - This repository surveys papers on Prompting and Adapters for Speech Processing. (Other Lists / Monkey C Lists)
README
# Speech-Prompts-Adapters
This repository surveys papers on **Adapters** and **Prompting** methods for **Speech Processing**.
## Navigation
* [ICASSP 2023 Tutorial Information](https://github.com/ga642381/Speech-Prompts-Adapters/#icassp-2023-tutorial-information)
* [**Adapters** for Speech Processing](https://github.com/ga642381/Speech-Prompts-Adapters/#adapters-for-speech-processing)
* [**Prompting** for Speech Processing](https://github.com/ga642381/Speech-Prompts-Adapters/#prompting-for-speech-processing)
* [**Reprogramming** and Prompting](https://github.com/ga642381/Speech-Prompts-Adapters/#reprogramming-and-prompting)
* [**Parameter Efficient Learning** Methods](https://github.com/ga642381/Speech-Prompts-Adapters/#parameter-efficient-learning-methods)
* [Contact](https://github.com/ga642381/Speech-Prompts-Adapters/#contact)

## NEWS
* In [ICASSP 2023](https://2023.ieeeicassp.org/), we will give a tutorial on Parameter-Efficient Learning for speech processing and natural language processing. I ([Kai-Wei Chang](https://scholar.google.com.tw/citations?user=hE_Oq8cAAAAJ&hl=zh-TW)) will cover the topics of adapters and prompts for speech processing.

---
## ICASSP 2023 Tutorial Information
* Title: Parameter-Efficient Learning for Speech and Language Processing: Adapters, Prompts, and Reprogramming
* Conference: ICASSP 2023
* Website: [ICASSP 2023 - Tutorials](https://2023.ieeeicassp.org/tutorials/#1675893601715-597a5c9b-de65)
* Parameter-Efficient Learning for Speech Processing [Slides](https://github.com/ga642381/Kai-Wei-Chang-Talks/blob/main/Part%203b%20Parameter-Efficient%20Learning%20for%20Speech%20Processing.pdf)

### Presenters:
* Pin-Yu Chen (IBM Research)
* Hung-yi Lee (National Taiwan University)
* Chao-Han Huck Yang (Georgia Institute of Technology)
* Kai-Wei Chang (National Taiwan University)
* Cheng-Han Chiang (National Taiwan University)

---
## Adapters and Prompting for Speech Processing
### Adapters for Speech Processing
| Title | Authors | Modality | Task | Link |
| ----- | ------- | -------- | ---- | ---- |
| Differentially Private Adapters for Parameter Efficient Acoustic Modeling | Chun-Wei Ho et al. | Speech | Keyword Spotting | [Interspeech 2023](https://arxiv.org/abs/2305.11360)
| Beyond Universal Transformer: block reusing with adaptor in Transformer for automatic speech recognition | Haoyu Tang et al. | Speech | ASR | [arXiv 2023](https://arxiv.org/abs/2303.13072)
| A Parameter-Efficient Learning Approach to Arabic Dialect Identification with Pre-Trained General-Purpose Speech Model | Srijith Radhakrishnan et al. | Speech | Dialect Identification | [Interspeech 2023](https://arxiv.org/abs/2305.11244)
| CHAPTER: Exploiting Convolutional Neural Network Adapters for Self-supervised Speech Models | Zih-Ching Chen et al. | Speech | [Multiple] | [arXiv 2022](https://arxiv.org/abs/2212.01282)
| Parameter Efficient Transfer Learning for Various Speech Processing Tasks |Shinta Otake et al. | Speech | [Multiple] | [arXiv 2022](https://arxiv.org/abs/2212.02780)
| Parameter-efficient transfer learning of pre-trained Transformer models for speaker verification using adapters | Junyi Peng et al. | Speech | Speaker Verification | [arXiv 2022](https://arxiv.org/abs/2210.16032)
| Exploring Efficient-tuning Methods in Self-supervised Speech Models | Zih-Ching Chen et al. | Speech | [Multiple] | [SLT 2022](https://ieeexplore.ieee.org/document/10023274)
|DRAFT: A Novel Framework to Reduce Domain Shifting in Self-supervised Learning and Its Application to Children’s ASR| Ruchao Fan, Abeer Alwan | Speech | ASR | [Interspeech 2022](https://www.isca-speech.org/archive/interspeech_2022/fan22d_interspeech.html)
|Speaker adaptation for Wav2vec2 based dysarthric ASR| Murali Karthick Baskar et al. | Speech | ASR | [Interspeech 2022](https://www.isca-speech.org/archive/interspeech_2022/baskar22b_interspeech.html)
| Adaptive multilingual speech recognition with pretrained models | Ngoc-Quan Pham et al. | Speech | ASR | [Interspeech 2022](https://www.isca-speech.org/archive/interspeech_2022/pham22_interspeech.html)
|An Adapter Based Pre-Training for Efficient and Scalable Self-Supervised Speech Representation Learning| Samuel Kessler et al.| Speech | ASR | [ICASSP 2022](https://ieeexplore.ieee.org/document/9747374)
|Efficient Adapter Transfer of Self-Supervised Speech Models for Automatic Speech Recognition| Bethan Thomas et al. | Speech | ASR | [ICASSP 2022](https://ieeexplore.ieee.org/document/9746223)
|Scaling End-to-End Models for Large-Scale Multilingual ASR| Bo Li et al.| Speech | ASR | [ASRU 2021](https://ieeexplore.ieee.org/document/9687871)
|Meta-Adapter: Efficient Cross-Lingual Adaptation With Meta-Learning| Wenxin Hou et al. | Speech | ASR | [ICASSP 2021](https://ieeexplore.ieee.org/document/9414959)
|Exploiting Adapters for Cross-Lingual Low-Resource Speech Recognition| Wenxin Hou et al. | Speech | ASR | [TASLP 2021](https://dl.acm.org/doi/abs/10.1109/TASLP.2021.3138674)
|Lightweight Adapter Tuning for Multilingual Speech Translation | Hang Le et al.| Speech | Speech Translation | [ACL-IJCNLP 2021](https://aclanthology.org/2021.acl-short.103/)
|Residual Adapters for Parameter-Efficient ASR Adaptation to Atypical and Accented Speech | Katrin Tomanek et al. | Speech | ASR | [EMNLP 2021](https://aclanthology.org/2021.emnlp-main.541/)
|Multilingual Speech Recognition with Self-Attention Structured Parameterization | Yun Zhu et al. | Speech | ASR | [Interspeech 2020](https://www.isca-speech.org/archive/interspeech_2020/zhu20d_interspeech.html)
|Large-Scale Multilingual Speech Recognition with a Streaming End-to-End Model| Anjuli Kannan et al. | Speech | ASR | [Interspeech 2019](https://www.isca-speech.org/archive_v0/Interspeech_2019/abstracts/2858.html)

### Prompting for Speech Processing
| Title | Authors | Modality | Task | Link |
| ----- | ------- | -------- | ---- | ---- |
| Prompting the Hidden Talent of Web-Scale Speech Models for Zero-Shot Task Generalization | Puyuan Peng et al. | Speech | [Multiple] | [Interspeech 2023](https://arxiv.org/abs/2305.11095)
| From English to More Languages: Parameter-Efficient Model Reprogramming for Cross-Lingual Speech Recognition | Chao-Han Huck Yang et al. | Speech | ASR | [ICASSP 2023](https://arxiv.org/pdf/2301.07851.pdf)
| SpeechPrompt v2: Prompt Tuning for Speech Classification Tasks | Kai-Wei Chang et al. | Speech | [Multiple] | [arXiv 2023](https://arxiv.org/abs/2303.00733)
|Speak, Read and Prompt: High-Fidelity Text-to-Speech with Minimal Supervision| Eugene Kharitonov et al. | Text & Speech | TTS |[arXiv 2023](https://arxiv.org/abs/2302.03540)
| Describing emotions with acoustic property prompts for speech emotion recognition | Hira Dhamyal et al. | Text & Speech | ER | [arXiv 2022](https://arxiv.org/abs/2211.07737)
| PromptTTS: Controllable Text-to-Speech with Text Descriptions | Zhifang Guo et al. | Text & Speech | TTS | [arXiv 2022](https://arxiv.org/abs/2211.12171)
|Neural Model Reprogramming with Similarity Based Mapping for Low-Resource Spoken Command Classification| Hao Yen et al. | Speech | Spoken Command Recognition | [arXiv 2022](https://arxiv.org/abs/2110.03894)
|WAVPROMPT: Towards Few-Shot Spoken Language Understanding with Frozen Language Models| Heting Gao et al. |Text & Speech | SLU | [Interspeech 2022](https://www.isca-speech.org/archive/interspeech_2022/gao22e_interspeech.html)
| An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks | Kai-Wei Chang et al. | Speech | [Multiple] | [Interspeech 2022](https://www.isca-speech.org/archive/interspeech_2022/chang22e_interspeech.html) |

---
## Reprogramming and Prompting
For more information about reprogramming and prompting for large pre-trained models, please refer to the "awesome-neural-reprogramming-acoustic-prompting" repository. This topic was also covered in an **ICASSP 2022** tutorial by Dr. Pin-Yu Chen and Dr. Huck Yang.

* GitHub Resource: [awesome-neural-reprogramming-acoustic-prompting](https://github.com/huckiyang/awesome-neural-reprogramming-prompting)
* Tutorial Video: [ICASSP 22 Tutorial, "Neural Model Reprogramming and Prompting for Speech Modeling," Huck Yang](https://www.youtube.com/watch?v=-iirkbYkyXI&ab_channel=Chao-HanHuckYang)

---
## Parameter Efficient Learning Methods
| Title | Authors | Link |
| ----- | ------- | ---- |
|BitFit: Simple Parameter-efficient Fine-tuning for Transformer-based Masked Language-models| Elad Ben Zaken et al. | [ACL 2022](https://aclanthology.org/2022.acl-short.1/)
|Towards a Unified View of Parameter-Efficient Transfer Learning| Junxian He et al. | [ICLR 2022](https://iclr.cc/virtual/2022/poster/6524)
|LoRA: Low-Rank Adaptation of Large Language Models| Edward J. Hu et al. | [ICLR 2022](https://iclr.cc/virtual/2022/poster/6319)
|Parameter-Efficient Transfer Learning for NLP| Neil Houlsby et al. | [ICML 2019](https://proceedings.mlr.press/v97/houlsby19a.html)

---
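To make the table above concrete, here is a minimal NumPy sketch (not taken from any of the listed papers; all names and dimensions are illustrative assumptions) of two of the methods: a LoRA-style low-rank update and a Houlsby-style residual bottleneck adapter.

```python
import numpy as np

# Illustrative sketches of two parameter-efficient methods from the table
# above, in plain NumPy; names and dimensions are assumptions.
rng = np.random.default_rng(0)
d, r = 8, 2                      # hidden size d, low rank / bottleneck r << d

# --- LoRA (Hu et al., ICLR 2022): W is frozen, only A and B are trained. ---
W = rng.standard_normal((d, d))          # frozen pre-trained weight
A = rng.standard_normal((r, d)) * 0.01   # trainable down-factor
B = np.zeros((d, r))                     # zero-init, so training starts at W

def lora_forward(x, W, A, B, alpha=1.0):
    # y = W x + alpha * B A x; the update B A has rank at most r.
    return W @ x + alpha * (B @ (A @ x))

# --- Bottleneck adapter (Houlsby et al., ICML 2019): residual MLP insert. ---
W_down = rng.standard_normal((r, d)) * 0.01  # down-projection d -> r
W_up = np.zeros((d, r))                      # up-projection r -> d, zero-init

def adapter_forward(h, W_down, W_up):
    # h + up(ReLU(down(h))); only W_down and W_up are trained.
    return h + W_up @ np.maximum(0.0, W_down @ h)

x = rng.standard_normal(d)
# Zero-initialized B / W_up make both methods start as identity updates.
assert np.allclose(lora_forward(x, W, A, B), W @ x)
assert np.allclose(adapter_forward(x, W_down, W_up), x)

# Trainable parameters: 2*r*d per method vs d*d for full fine-tuning.
print(A.size + B.size, W_down.size + W_up.size, "vs", W.size)
```

With `d = 8` and `r = 2`, each method trains 32 parameters against 64 for full fine-tuning of `W`; in real models (`d` in the thousands, `r` around 4–64) the savings are far larger.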
## Acknowledgment
We thank Kuang-Chen Peng, Tzu-Han Lin, and Fabian Ritter for their invaluable contribution to the initial collection.

## Contact
This repository is maintained by [Kai-Wei Chang](https://scholar.google.com.tw/citations?user=hE_Oq8cAAAAJ&hl=zh-TW) ([email protected]) and [Zih-Ching Chen](https://scholar.google.com.tw/citations?user=rjedYCoAAAAJ). Feel free to contact us or open a pull request.