https://github.com/dmitryryumin/icassp-2023-24-papers

ICASSP 2023-2024 Papers: A complete collection of influential and exciting research papers from the ICASSP 2023-24 conferences. Explore the latest advancements in acoustics, speech and signal processing. Code included. Star the repository to support the advancement of audio and signal processing!

asr denoising domain-adaptation face-recognition generative-models icassp icassp2023 icassp2024 image-generation keyword-spotting language-modeling multimodal-learning music-generation self-supervised-learning semantic-segmentation signal-processing signal-restoration speech-recognition spoken-language-understanding vad


        


ICASSP-2024-Papers


License: MIT

GitHub Actions: Copy Parse Markdown and Generate JSON from Source Repo; Parse Markdown and Generate JSON; Sync Hugging Face App



---

ICASSP 2024 Papers: A complete collection of influential and exciting research papers from the [*ICASSP 2024*](https://2024.ieeeicassp.org/) conference. Explore the latest advancements in acoustics, speech and signal processing. Code included. :star: the repository to support the advancement of audio and signal processing!




---

> [!TIP]
> The [*online version of the ICASSP 2024 Conference Technical Program*](https://2024.ieeeicassp.org/program-schedule/) lists all accepted full papers along with their presentation mode and time.

---



Other collections of the best AI conferences









> [!IMPORTANT]
> The conference table is kept up to date.



| Conference | 2023 | 2024 |
|---|---|---|
| **Computer Vision (CV)** | | |
| CVPR | | |
| ICCV | | |
| ECCV | | |
| WACV | :heavy_minus_sign: | |
| FG | :heavy_minus_sign: | |
| **Speech/Signal Processing (SP/SigProc)** | | |
| ICASSP | | |
| INTERSPEECH | | |
| ISMIR | | :heavy_minus_sign: |
| **Natural Language Processing (NLP)** | | |
| EMNLP | | |
| **Machine Learning (ML)** | | |
| AAAI | :heavy_minus_sign: | |
| ICLR | :heavy_minus_sign: | |
| ICML | :heavy_minus_sign: | |
| NeurIPS | :heavy_minus_sign: | |

---

## Contributors






> [!NOTE]
> Contributions to improve the completeness of this list are greatly appreciated. If you come across any overlooked papers, please **feel free to [*create pull requests*](https://github.com/DmitryRyumin/ICASSP-2023-24-Papers/pulls), [*open issues*](https://github.com/DmitryRyumin/ICASSP-2023-24-Papers/issues) or contact me via [*email*](mailto:[email protected])**. Your participation is crucial to making this repository even better.

---

## Papers


Main

Each section below has Papers, Preprints, Open Code and Videos entries.

- Audio-Visual Speech Processing
- Vision and Language
- Acoustic Signal Processing
- Deep Learning Techniques
- Speech Enhancement and Separation - Diffusion and other Probabilistic Models
- ASPS Lecture
- Distributed and Federated Learning
- Transfer Learning
- Voice Conversion
- Graph Neural Networks
- Language Resources, Metrics and Systems
- Watermarking and Data Hiding
- Signal and Information Processing over Graphs
- Integrated Sensing and Communications
- Audio Events Detection and Classification; Music Information Retrieval
- Language Understanding and Computational Semantics - NLP Tasks
- Physiological and Wearable Signal Processing
- Speech Enhancement; Music Information Retrieval
- Multimodal Medical Image Fusion and Analysis
- Sparse/Low-Dimensional Signal Processing
- Robust and Sustainable Machine Learning
- Machine Learning for Image and Video Processing
- Deep Learning Generalization
- Distributed Processing and Federated Learning
- Biological Image Analysis
- Learning from Multimodal Data
- Biometrics
- Detection and Classification
- Multimedia Coding
- Anonymisation, Data Privacy and Hiding
- Quality Assessment and Anomaly Detection
- Signal Filtering, Reconstruction, Restoration and Enhancement
- Speech Emotion Recognition and Analysis
- Deep Generative Models
- Context and LLM Speech Recognition
- Music Information Retrieval
- Multimodal Processing: Vision + Language
- Environmental Sound Synthesis and Generation
- Biomedical and Biological Image Processing
- DoA Estimation
- Tracking
- Machine Learning for Communications
- Image and Video Processing for Watermarking and Security
- Self-Supervised Learning for Speech Processing
- Deep Learning for Image and Video Processing
- Image, Video, and 3D Content Generation
- Classification of Acoustic Scenes and Events
- Reinforcement Learning
- Subspace and Manifold Learning
- Active Noise Control and Echo Cancellation; Source Separation
- Machine Learning, Detection and Classification
- Machine Learning for Audio, Speech and Music Processing
- Multimedia Generation and Synthesis
- Medical Image Detection and Segmentation
- Multimedia Forensics and Cybersecurity
- Estimation Theory and Methods
- Emerging Methods for Biomedical Image and Signal Processing
- Text to Speech Generation
- Audio Classification, Detection and Localization
- Self-Supervised and Semi-Supervised Learning
- Multichannel/Multimodal Speech Recognition
- Speaker Verification
- Speaker Diarization
- Adversarial Machine Learning
- Machine Learning Methods for Language
- SPED: Signal Processing Education
- Multimedia Quality of Experience
- Domain-Enriched Learning for Medical Image Processing
- Speech Enhancement and Separation
- Image Denoising
- ASPS Poster
- ASR - New Algorithms and Approaches
- Data Mining and Big Data
- Language Understanding and Computational Semantics - Machine Learning
- Explainable and Interpretable Machine Learning
- Neuroimaging and Brain/Human-Computer Interfaces
- Localization, DOA Estimation, Spatial Audio Recording and Reproduction
- Perception and Processing for Autonomous Systems and Applications
- Computational Imaging
- Audio and Speech Quality and Intelligibility Measures; Music Analysis
- Medical Image Formation, Reconstruction and Restoration
- Audio and Speech Source Separation
- Text-based Customization for Speech-to-Text
- Deep Learning Models
- Next-Gen Communication Systems
- Image Restoration
- Robustness and Trustworthy Machine Learning
- Signal Processing over Networks
- 3D Understanding
- Compressed Sensing and Machine Learning for Multi-Sensor Systems
- LIMMITS: Multi-Speaker, Multi-Lingual Indic TTS with Voice Cloning
- Natural Language Processing for Speech-to-Text
- Resource Constrained Acoustic and Language Modeling
- Dereverberation and RIR Estimation; Speech Enhancement and Restoration
- Image/Video Super-Resolution
- Matrix Factorization and Source Separation
- Beamforming for Audio and Speech; Music Signal Analysis, Processing and Synthesis
- Summarization, Retrieval and Language Learning
- Sequential Learning and Sequential Decision Methods
- MIMO and Massive MIMO Communication Systems
- Multimodal Emotion/Sentiment Analysis
- Human Understanding
- Image and Video Synthesis
- MIMO and High-Frequency Communications
- Image and Video Super-Resolution
- Spatial Audio Recording and Reproduction
- Audio Signal Restoration and Speech Enhancement
- Discourse and Dialog
- Bayesian Signal Processing
- Pattern Recognition and Classification
- Key Word Spotting
- Speech Analysis - Pitch, Spectrum and Voice Disorders
- Grand Challenge on Hyperspectral Skin Vision
- Robust Speech Recognition and Adaptation
- Speech Analysis and Language Disorder Analysis
- Aspects in Image/Video Processing and Analysis
- DoA Estimation and Source Localization
- Multimodal Processing of Language
- Source separation; Music analysis
- Machine Learning for Time Series Analysis
- Multimedia Search and Retrieval
- Anomaly Detection; Sound Event Detection and Localization
- Acoustic Array and Signal Processing
- Music Signal Analysis and Processing
- Language Understanding and Computational Semantics - Language Models
- Deep Learning Theory
- Anti-Spoofing
- Pose, Gesture, and Action in Multimedia
- Sampling Theory, Compressed and Non-Uniform Sampling
- MIMO and Massive MIMO Systems
- Multimodal and Emerging Medical Signal Analysis
- The RF Signal Separation Challenge
- Signal Processing for Communications
- Audio and Speech Modeling, Coding and Transmission; Spatial Audio Recording and Reproduction
- Voice Conversion: Singing, Accent and Emotion
- Other Machine Learning Applications
- Speaker Recognition and Anonymization
- Feature Extraction Selection and Learning
- Music Information Retrieval; Quality and Intelligibility Measures
- Learning Theory and Performance Bound
- Human-Centric Multimedia
- Multilingual Speech Recognition and Identification
- Image Recognition and Detection
- Signal Processing over Graphs and Networks
- End-to-End Modeling for Automatic Speech Recognition
- Segmentation, Tagging, and Parsing of Language
- Detection
- Audio-Language Processing and Audio Captioning
- Action Recognition
- Image, Video and Other Applications
- Multimodal Information Based Speech Processing (MISP)
- Next-Gen Communications and PHY Security

Will soon be added:

- Network and System Security
- Target Source Extraction; Active Noise Control, Echo Reduction and Feedback Reduction
- Machine Translation for Spoken and Written Language
- Sound Events Detection, Description and Generation
- Applied Cryptography
- Machine/Deep Learning Methodologies for Multimedia
- Speech Separation and Extraction
- Signal Processing and Machine Learning for Communications
- Audio Coding
- Active Noise Control and Echo Cancellation
- Bayesian Machine Learning
- Advancing the Frontiers of Deep Learning for Low-Dose 3D Cone-Beam CT Reconstruction
- Bioacoustics and Medical Acoustics; Audio Security
- Acoustic Modeling for Automatic Speech Recognition
- Multimodal Processing of Speech
- IFS General
- 3D Image and Video Processing and Analysis
- Deep Learning Training Methods
- Key Word Spotting and Acoustic Event Detection
- Coding, Information Theory, and Applications of Signal Processing for Communications
- Speech Analysis
- Music Separation; Audio for Multimedia and Audio Processing Systems
- Machine Learning for Communications and Wireless Networks
- Image and Video Coding/Compression
- Bioinformatics and Biomedical Signal Processing
- Audio-Visual Speech/Intent Recognition
- Multimodal Clustering, Segmentation, and Summarization
- Learning Theory and Methods
- SP Cadenza Challenge: Music Demixing/Remixing for Hearing Aids
- Radar Signal Processing
- Biological and Medical Signal and Image Processing
- Anti-Spoofing and Speaker Embedding
- Speech Enhancement; Dereverberation and RIR Estimation
- Segmentation
- 3D Generation
- Multimedia Forensics
- Speech Signal Improvement Challenge
- Audio Deep Packet Loss Concealment Grand Challenge
- Signal Processing Theory and Methods Journal Papers
- Multi-Sensor and Multichannel Signal Processing
- Array Processing and Beamforming
- Sound Event Classification and Generation; Active Noise Control, Echo Reduction and Feedback Reduction
- Deep Learning Fairness and Privacy
- Sparsity and Low-Rank Models
- Optimization Methods for Signal Processing
- Multimodal Processing
- Show and Tell Demos



Special Session



Will soon be added:

- Model based Machine Learning for Wireless Communications and Sensing
- Exploiting Diversities in Advanced Array Systems: New Applications and Trends
- Generative Semantic Communication: How Generative Models Enhance Semantic Communications
- Quantum Machine Learning Algorithms and Applications on NISQ Devices
- Robust Reconstruction Methods in Computational Imaging
- Graphical Inference and Modeling in Dynamical Systems
- Advancements in Integrated Sensing and Communication for Next-Generation Wireless Networks
- Signal and Graph Processing for Autonomous Agents
- Next-Generation Wi-Fi Sensing
- Signal Processing Theory for Covert Communication and Cybersecurity
- In-Context Learning Methods for Speech and Spoken Language Processing
- Topological Signal Processing over Higher-Order Networks
- Deepfakes and AI-Generated Content (AIGC) Detection and Forensics: Recent Advances
- Recent Advances in AI-Powered Visual Computing and Multimodal Signal Processing for Metaverse Era
- Algorithm-Hardware Co-Design of Neuromorphic Solutions for Signal Processing Applications
- Automotive Radar Signal Processing for Autonomous Driving
- Learning with Incomplete Medical Data
- Signal Processing and Machine Learning for Collective Intelligence
- Variational Inference and Approximate Bayesian Techniques
- Efficient Modeling of Long Sequences with Applications to Speech and Audio
- Decentralized Learning with Resource-Constrained Communication
- Localization and Sensing based on Signals from Terrestrial and Non-Terrestrial Networks
- Signal Processing and Machine Learning for Understanding Brain Dynamics


---

## Key Terms


Key Terms

---

## Star History



Star History Chart