https://github.com/dmitryryumin/interspeech-2023-24-papers
INTERSPEECH 2023-2024 Papers: A complete collection of influential and exciting research papers from the INTERSPEECH 2023-24 conference. Explore the latest advances in speech and language processing. Code included. Star the repository to support the advancement of speech technology!
acoustic adaptation asr audio-signals interspeech interspeech2023 interspeech2024 language-modeling lexical-analysis linguistic-analysis machine-translation prosody self-supervised-learning signal-processing speech-analysis speech-production speech-recognition speech-synthesis speech-technology transmission
Last synced: 2 months ago
- Host: GitHub
- URL: https://github.com/dmitryryumin/interspeech-2023-24-papers
- Owner: DmitryRyumin
- License: MIT
- Created: 2023-06-26T10:10:43.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2024-12-25T12:44:51.000Z (4 months ago)
- Last Synced: 2025-02-19T17:11:32.765Z (2 months ago)
- Topics: acoustic, adaptation, asr, audio-signals, interspeech, interspeech2023, interspeech2024, language-modeling, lexical-analysis, linguistic-analysis, machine-translation, prosody, self-supervised-learning, signal-processing, speech-analysis, speech-production, speech-recognition, speech-synthesis, speech-technology, transmission
- Homepage:
- Size: 11.4 MB
- Stars: 657
- Watchers: 89
- Forks: 43
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
README
---
INTERSPEECH 2024 Papers: A complete collection of influential and exciting research papers from the [*INTERSPEECH 2024*](https://interspeech2024.org/) conference. Explore the latest advances in speech and language processing. Code included. :star: the repository to support the advancement of speech technology!
---
> [!TIP]
> [*The PDF version of the INTERSPEECH 2024 Conference Programme*](https://drive.google.com/file/d/1w_F9STjblCMANAZXO8l5Yy6vfDSNYFIw/view) comprises a list of all accepted full papers, their presentation order, and the designated presentation times.

---
Other collections of the best AI conferences

> [!IMPORTANT]
> The conference table is kept up to date.

| Area | Conferences (2023-2024) |
|---|---|
| Computer Vision (CV) | CVPR, ICCV, ECCV, WACV, FG |
| Speech/Signal Processing (SP/SigProc) | ICASSP, INTERSPEECH, ISMIR |
| Natural Language Processing (NLP) | EMNLP |
| Machine Learning (ML) | AAAI, ICLR, ICML, NeurIPS |
---
## Contributors
> [!NOTE]
> Contributions to improve the completeness of this list are greatly appreciated. If you come across any overlooked papers, please **feel free to [*create pull requests*](https://github.com/DmitryRyumin/INTERSPEECH-2023-24-Papers/pulls), [*open issues*](https://github.com/DmitryRyumin/INTERSPEECH-2023-24-Papers/issues) or contact me via [*email*](mailto:[email protected])**. Your participation is crucial to making this repository even better.

---
## [Papers-2024](https://www.isca-archive.org/interspeech_2024/)
(`In progress`)
Sections:

- L2 Speech, Bilingualism and Code-Switching
- Speaker Diarization
- Speech and Audio Analysis and Representations
- Acoustic Event Detection, Segmentation and Classification
- Detection and Classification of Bioacoustic Signals
---
## [Papers-2023](https://www.isca-archive.org/interspeech_2023/)
Sections:

- Resources for Spoken Language Processing
- Speech Synthesis: Prosody and Emotion
- Statistical Machine Translation
- Self-Supervised Learning in ASR
- Prosody
- Speech Production
- Dysarthric Speech Assessment
- Speech Coding: Transmission
- Speech Recognition: Signal Processing, Acoustic Modeling, Robustness, Adaptation
- Analysis of Speech and Audio Signals
- Speech Recognition: Architecture, Search, and Linguistic Components
- Speech Recognition: Technologies and Systems for New Applications
- Lexical and Language Modeling for ASR
- Language Identification and Diarization
- Speech Quality Assessment
- Feature Modeling for ASR
- Interfacing Speech Technology and Phonetics
- Speech Synthesis: Multilinguality
- Speech Emotion Recognition
- Spoken Dialog Systems and Conversational Analysis
- Speech Coding and Enhancement
- Paralinguistics
- Speech Enhancement and Denoising
- Speech Synthesis: Evaluation
- End-to-End Spoken Dialog Systems
- Biosignal-enabled Spoken Communication
- Neural-based Speech and Acoustic Analysis
- DiGo - Dialog for Good: Speech and Language Technology for Social Good
- Spoken Language Processing: Translation, Information Retrieval, Summarization, Resources, and Evaluation
- Speech, Voice, and Hearing Disorders
- Spoken Term Detection and Voice Search
- Models for Streaming ASR
- Source Separation
- Speech Perception
- Phonetics and Phonology: Languages and Varieties
- Speaker and Language Identification
- Speech Synthesis and Voice Conversion
- Speech and Language in Health: from Remote Monitoring to Medical Conversations
- Novel Transformer Models for ASR
- Speaker Recognition
- Cross-lingual and Multilingual ASR
- Voice Conversion
- Pathological Speech Analysis
- Multimodal Speech Emotion Recognition
- Phonetics, Phonology, and Prosody
- Speech Coding: Privacy
- Analysis of Neural Speech Representations
- End-to-end ASR
- Spoken Language Understanding, Summarization, and Information Retrieval
- Invariant and Robust Pre-trained Acoustic Models
- Speech Synthesis: Representation Learning
- Speech Perception, Production, and Acquisition
- Acoustic Model Adaptation for ASR
- Speech Synthesis: Expressivity
- Multi-modal Systems
- Question Answering from Speech
- Multi-talker Methods in Speech Processing
- Sociophonetics
- Speaker and Language Diarization
- Anti-Spoofing for Speaker Verification
- Speech Coding: Intelligibility
- New Computational Strategies for ASR Training and Inference
- MERLIon CCS Challenge: Multilingual Everyday Recordings - Language Identification On Code-Switched Child-Directed Speech
- Health-Related Speech Analysis
- Automatic Audio Classification and Audio Captioning
- Speech Synthesis
- Speech Synthesis: Controllability and Adaptation
- Search Methods and Decoding Algorithms for ASR
- Speech Signal Analysis
- Connecting Speech-science and Speech-technology for Children's Speech
- Dialog Management
- Speech Activity Detection and Modeling
- Multilingual Models for ASR
- Speech Enhancement and Bandwidth Expansion
- Articulation
- Neural Processing of Speech and Language: Encoding and Decoding the Diverse Auditory Brain
- Perception of Paralinguistics
- Technologies for Child Speech Processing
- Speech Synthesis: Multilinguality; Evaluation
- Show and Tell: Health Applications and Emotion Recognition
- Show and Tell: Speech Tools, Speech Enhancement, Speech Synthesis
- Show and Tell: Language Learning and Educational Resources
- Show and Tell: Media and Commercial Applications
---
## Key Terms
> To be added soon
---
## Star History