https://github.com/ultramarine-indigo/awesome-AI-for-emotion-recognition-papers


# awesome-AI-for-emotion-recognition-papers

A list of papers (with available code), tutorials, and surveys on recent AI for emotion recognition (AI4ER)

- [Emotion Recognition Surveys](#emotion-recognition-surveys)
  - [General Emotion Recognition Survey](#general-emotion-recognition-survey)
  - [Multimodal](#multimodal)
  - [Multimodal Speech](#multimodal-speech)
  - [Speech](#speech)
  - [Conversation-Text](#conversation-text)
- [2024 papers](#2024-papers)
  - [ICASSP 2024](#icassp-2024)
    - [speech](#speech-1)
    - [conversation](#conversation)
    - [Multimodal Speech(speech+text)](#multimodal-speechspeechtext)
    - [Multimodal](#multimodal-1)
    - [others](#others)
- [2023 papers](#2023-papers)
  - [EMNLP2023](#emnlp2023)
  - [NeurIPS 2023](#neurips-2023)
  - [interspeech 2023](#interspeech-2023)
    - [Multimodal Speech(speech+text)](#multimodal-speechspeechtext-1)
    - [Speech](#speech-2)
    - [Others](#others-1)
  - [ICASSP 2023](#icassp-2023)
    - [Multimodal Speech (speech+text)](#multimodal-speech-speechtext)
    - [Multimodal Speech-Visual](#multimodal-speech-visual)
    - [Speech](#speech-3)
    - [Conversation](#conversation-1)
    - [Other](#other)
  - [IJCAI 2023](#ijcai-2023)
  - [AAAI 2023](#aaai-2023)
    - [Conversation](#conversation-2)
    - [Other](#other-1)
  - [ACL 2023](#acl-2023)
    - [Multimodal](#multimodal-2)
    - [Conversation](#conversation-3)
    - [Other](#other-2)
- [2022 papers](#2022-papers)
  - [AAAI 2022](#aaai-2022)
    - [Multimodal](#multimodal-3)
    - [Conversation](#conversation-4)
  - [IJCAI 2022](#ijcai-2022)
    - [Speech](#speech-4)
    - [Conversation](#conversation-5)
    - [Other](#other-3)
  - [ACM MM 2022](#acm-mm-2022)
    - [Multimodal](#multimodal-4)
    - [Vision](#vision)
    - [Speech](#speech-5)
    - [Other](#other-4)
  - [NAACL 2022](#naacl-2022)
    - [Multimodal](#multimodal-5)
- [2020 papers](#2020-papers)
  - [AAAI 2020](#aaai-2020)
    - [Multimodal](#multimodal-6)

### Emotion Recognition Surveys

#### General Emotion Recognition Survey

- Emotion Recognition and Detection Methods: A Comprehensive Survey [paper](https://iecscience.org/jpapers/46)
- A systematic review on affective computing: emotion models, databases, and recent advances [paper](https://arxiv.org/abs/2203.06935)

#### Multimodal

- Multimodal Emotion Recognition using Deep Learning [paper](https://www.jastt.org/index.php/jasttpath/article/view/91)
- Emotion recognition using multi-modal data and machine learning techniques: A tutorial and review [paper](https://drive.google.com/file/d/1wGagPpwhGPKpOVpcVmMLIKyU2fUWDYI-/view)
- Deep learning-based multimodal emotion recognition from audio, visual, and text modalities: A systematic review of recent advancements and future prospects [paper](https://www.sciencedirect.com/science/article/abs/pii/S0957417423021942)
- Emotion recognition from unimodal to multimodal analysis: A review

#### Multimodal Speech

- Deep Multimodal Emotion Recognition on Human Speech: A Review [paper](https://www.mdpi.com/2076-3417/11/17/7962)

#### Speech

- Speech emotion recognition: Emotional models, databases, features, preprocessing methods, supporting modalities, and classifiers [paper](https://www.researchgate.net/profile/John-James-3/post/What-are-the-public-datasets-available-for-emotion-recognition/attachment/5f3771e2ce377e00016d4946/AS%3A924666805886978%401597469154666/download/Speech-emotion-recognition--Emotional-models--databases--feat_2020_Speech-Co.pdf)
- A Comprehensive Review of Speech Emotion Recognition Systems [paper](https://ieeexplore.ieee.org/iel7/6287639/9312710/09383000.pdf)
- Deep Learning Techniques for Speech Emotion Recognition, from Databases to Models [paper](https://www.mdpi.com/1424-8220/21/4/1249)
- A Review on Speech Emotion Recognition Using Deep Learning and Attention Mechanism [paper](https://www.mdpi.com/2079-9292/10/10/1163)

#### Conversation-Text

-

### 2024 papers

#### ICASSP 2024

##### speech

- TRUST-SER: On The Trustworthiness Of Fine-Tuning Pre-Trained Speech Embeddings For Speech Emotion Recognition [paper](https://arxiv.org/pdf/2305.11229) [code](https://github.com/usc-sail/trust-ser)
- EmoHRNet: High-Resolution Neural Network Based Speech Emotion Recognition [paper](https://ieeexplore.ieee.org/document/10446976)
- Parameter Efficient Finetuning for Speech Emotion Recognition and Domain Adaptation [paper](https://arxiv.org/pdf/2402.11747)
- Investigating Salient Representations and Label Variance in Dimensional Speech Emotion Analysis [paper](https://arxiv.org/html/2312.16180v1)
- Adaptive Speech Emotion Representation Learning Based On Dynamic Graph [paper](https://ieeexplore.ieee.org/document/10447829)
- Leveraging Speech PTM, Text LLM, And Emotional TTS For Speech Emotion Recognition [paper](https://arxiv.org/pdf/2309.10294)
- Enhancing Two-Stage Finetuning for Speech Emotion Recognition Using Adapters [paper](https://ieeexplore.ieee.org/document/10446645)
- Frame-Level Emotional State Alignment Method for Speech Emotion Recognition [paper](https://arxiv.org/pdf/2312.16383) [code](https://github.com/ASolitaryMan/HFLEA)
- Gradient-Based Dimensionality Reduction for Speech Emotion Recognition Using Deep Networks [paper](https://ieeexplore.ieee.org/document/10447616) [code](https://github.com/hxwangnus/Grad-based-Dim-Red-for-SER)
- Speech Swin-Transformer: Exploring a Hierarchical Transformer with Shifted Windows for Speech Emotion Recognition [paper](https://arxiv.org/html/2401.10536v1)
- Disentanglement Network: Disentangle the Emotional Features from Acoustic Features for Speech Emotion Recognition [paper](https://ieeexplore.ieee.org/document/10448044)
- Improving Speaker-Independent Speech Emotion Recognition using Dynamic Joint Distribution Adaptation [paper](https://arxiv.org/html/2401.09752v1)
- Comparing Data-Driven and Handcrafted Features for Dimensional Emotion Recognition [paper](https://publications.idiap.ch/attachments/papers/2024/Vlasenko_ICASSP_2024.pdf)
- Emotion-Aware Contrastive Adaptation Network for Source-Free Cross-Corpus Speech Emotion Recognition [paper](https://arxiv.org/html/2401.12925v1)
- Balancing Speaker-Rater Fairness for Gender-Neutral Speech Emotion Recognition [paper](https://www.researchgate.net/profile/Woan-Shiuan-Chien/publication/376799317_Balancing_Speaker-Rater_Fairness_For_Gender-Neutral_Speech_Emotion_Recognition/links/6589283d0bb2c7472b0d10ee/Balancing-Speaker-Rater-Fairness-For-Gender-Neutral-Speech-Emotion-Recognition.pdf)
- Prompting Audios Using Acoustic Properties for Emotion Representation [paper](https://arxiv.org/pdf/2310.02298)
- Speech Emotion Recognition with Distilled Prosodic and Linguistic Affect Representations [paper](https://arxiv.org/pdf/2309.04849)
- Generalization of Self-Supervised Learning-Based Representations for Cross-Domain Speech Emotion Recognition [paper](https://ecs.utdallas.edu/research/researchlabs/msp-lab/publications/Naini_2024.pdf)
- Dynamic Speech Emotion Recognition Using A Conditional Neural Process [paper](https://ecs.utdallas.edu/research/researchlabs/msp-lab/publications/Martinez-Lucas_2024.pdf)
- Revealing Emotional Clusters in Speaker Embeddings: A Contrastive Learning Strategy for Speech Emotion Recognition [paper](https://arxiv.org/pdf/2401.11017)
- Foundation Model Assisted Automatic Speech Emotion Recognition: Transcribing, Annotating, and Augmenting [paper](https://arxiv.org/pdf/2309.08108)
- MS-SENet: Enhancing Speech Emotion Recognition Through Multi-Scale Feature Fusion with Squeeze-and-Excitation Blocks [paper](https://arxiv.org/html/2312.11974v2)
- Cubic Knowledge Distillation for Speech Emotion Recognition [paper](https://ieeexplore.ieee.org/document/10447713) [code](https://github.com/Fly1toMoon/Cubic-Knowledge-Distillation)
- Improving Speech Emotion Recognition with Unsupervised Speaking Style Transfer [paper](https://arxiv.org/pdf/2211.08843)
- Emotion Neural Transducer for Fine-Grained Speech Emotion Recognition [paper](https://arxiv.org/pdf/2403.19224) [code](https://github.com/ECNU-Cross-Innovation-Lab/ENT)
- Multi-Source Unsupervised Transfer Components Learning for Cross-Domain Speech Emotion Recognition [paper](https://ieeexplore.ieee.org/document/10446499)
- Self-Supervised Domain Exploration with an Optimal Transport Regularization for Open Set Cross-Domain Speech Emotion Recognition [paper](https://ieeexplore.ieee.org/document/10447482)
- Towards Improving Speech Emotion Recognition Using Synthetic Data Augmentation from Emotion Conversion [paper](https://hal.science/hal-04364976/document)

##### conversation

- ESIHGNN: Event-State Interactions Infused Heterogeneous Graph Neural Network for Conversational Emotion Recognition [paper](https://ieeexplore.ieee.org/document/10447592)
- MCM-CSD: Multi-Granularity Context Modeling with Contrastive Speaker Detection for Emotion Recognition in Real-Time Conversation [paper](https://ieeexplore.ieee.org/document/10446410) [code](https://github.com/WHOISJENNY/MCM-CSD)
- SERC-GCN: Speech Emotion Recognition In Conversation Using Graph Convolutional Networks [paper](https://www.academia.edu/download/110735232/ICASSP_2024_1_.pdf)
- Conversation Clique-Based Model for Emotion Recognition In Conversation [paper](https://ieeexplore.ieee.org/document/10446226)
- Speaker-Centric Multimodal Fusion Networks for Emotion Recognition in Conversations [paper](https://ieeexplore.ieee.org/document/10447720)

##### Multimodal Speech(speech+text)

- Large Language Model-Based Emotional Speech Annotation Using Context and Acoustic Feature for Speech Emotion Recognition [paper](https://ieeexplore.ieee.org/document/10448316)
- MF-AED-AEC: Speech Emotion Recognition by Leveraging Multimodal Fusion, ASR Error Detection, and ASR Error Correction [paper](https://arxiv.org/html/2401.13260v1)
- GEmo-CLAP: Gender-Attribute-Enhanced Contrastive Language-Audio Pretraining for Accurate Speech Emotion Recognition [paper](https://arxiv.org/html/2306.07848v10)

##### Multimodal

- Fine-Grained Disentangled Representation Learning For Multimodal Emotion Recognition [paper](https://arxiv.org/html/2312.13567v1)
- Improving Multi-Modal Emotion Recognition Using Entropy-Based Fusion and Pruning-Based Network Architecture Optimization [paper](https://ieeexplore.ieee.org/document/10447231)
- Multi-Grained Multimodal Interaction Network for Sentiment Analysis [paper](https://ieeexplore.ieee.org/document/10446351)
- Fusing Modality-Specific Representations and Decisions for Multimodal Emotion Recognition [paper](https://ieeexplore.ieee.org/document/10447035)
- AttA-NET: Attention Aggregation Network for Audio-Visual Emotion Recognition [paper](https://ieeexplore.ieee.org/document/10447640) [code](https://github.com/NariFan2002/AttA-NET)
- MMRBN: Rule-Based Network for Multimodal Emotion Recognition [paper](https://ieeexplore.ieee.org/document/10447930)
- Inter-Modality and Intra-Sample Alignment for Multi-Modal Emotion Recognition [paper](https://ieeexplore.ieee.org/document/10446571)
- RL-EMO: A Reinforcement Learning Framework for Multimodal Emotion Recognition [paper](https://ieeexplore.ieee.org/document/10446459) [code](https://github.com/zyh9929/RL-EMO)
- Multi-Modal Emotion Recognition Using Multiple Acoustic Features and Dual Cross-Modal Transformer [paper](https://ieeexplore.ieee.org/document/10447830)

##### others

- AV-SUPERB: A Multi-Task Evaluation Benchmark for Audio-Visual Representation Models [paper](https://arxiv.org/pdf/2309.10787)

### 2023 papers

#### EMNLP2023

- Learning Language-guided Adaptive Hyper-modality Representation for Multimodal Sentiment Analysis [paper](https://arxiv.org/abs/2310.05804) [code](https://github.com/Haoyu-ha/ALMT)

#### NeurIPS 2023

- Incomplete Multimodality-Diffused Emotion Recognition [paper](https://proceedings.neurips.cc/paper_files/paper/2023/hash/372cb7805eaccb2b7eed641271a30eec-Abstract-Conference.html) [code](https://github.com/mdswyz/IMDer)

#### Interspeech 2023

##### Multimodal Speech(speech+text)

- LanSER: Language-Model Supported Speech Emotion Recognition [paper](https://arxiv.org/abs/2309.03978)
- Fine-tuned RoBERTa Model with a CNN-LSTM Network for Conversational Emotion Recognition [paper](https://www.isca-speech.org/archive/interspeech_2023/luo23_interspeech.html)
- Emotion Label Encoding Using Word Embeddings for Speech Emotion Recognition [paper](https://www.isca-speech.org/archive/interspeech_2023/stanley23_interspeech.html)
- Discrimination of the Different Intents Carried by the Same Text Through Integrating Multimodal Information [paper](https://www.isca-speech.org/archive/interspeech_2023/li23ia_interspeech.html)
- Meta-domain Adversarial Contrastive Learning for Alleviating Individual Bias in Self-sentiment Predictions [paper](https://www.isca-speech.org/archive/interspeech_2023/li23f_interspeech.html)
- SWRR: Feature Map Classifier Based on Sliding Window Attention and High-Response Feature Reuse for Multimodal Emotion Recognition [paper](https://www.isca-speech.org/archive/interspeech_2023/zhao23b_interspeech.html)
- Focus-attention-enhanced Crossmodal Transformer with Metric Learning for Multimodal Speech Emotion Recognition [paper](https://www.isca-speech.org/archive/interspeech_2023/kim23c_interspeech.html)
- Learning Emotional Representations from Imbalanced Speech Data for Speech Emotion Recognition and Emotional Text-to-Speech [paper](https://www.isca-speech.org/archive/interspeech_2023/wang23ka_interspeech.html)
- MMER: Multimodal Multi-task Learning for Speech Emotion Recognition [paper](https://www.isca-speech.org/archive/interspeech_2023/ghosh23b_interspeech.html)
- A Dual Attention-based Modality-Collaborative Fusion Network for Emotion Recognition [paper](https://www.isca-speech.org/archive/interspeech_2023/zhang23g_interspeech.html) [code](https://github.com/zxiaohen/Speech-emotion-recognition-MCFN)
- Speaker-aware Cross-modal Fusion Architecture for Conversational Emotion Recognition [paper](https://www.isca-speech.org/archive/interspeech_2023/zhao23e_interspeech.html)
- Emotion Prompting for Speech Emotion Recognition [paper](https://www.isca-speech.org/archive/interspeech_2023/zhou23f_interspeech.html)
- EmotionNAS: Two-stream Neural Architecture Search for Speech Emotion Recognition [paper](https://www.isca-speech.org/archive/interspeech_2023/sun23d_interspeech.html) *
- Leveraging Semantic Information for Efficient Self-Supervised Emotion Recognition with Audio-Textual Distilled Models [paper](https://www.isca-speech.org/archive/interspeech_2023/deoliveira23_interspeech.html)
- Leveraging Label Information for Multimodal Emotion Recognition [paper](https://www.isca-speech.org/archive/interspeech_2023/wang23ma_interspeech.html)
- Improving Joint Speech and Emotion Recognition Using Global Style Tokens [paper](https://www.isca-speech.org/archive/interspeech_2023/kyung23_interspeech.html)
- Dual Memory Fusion for Multimodal Speech Emotion Recognition [paper](https://www.isca-speech.org/archive/interspeech_2023/prisayad23_interspeech.html) *

##### Speech

- Multi-Scale Temporal Transformer For Speech Emotion Recognition [paper](https://www.isca-speech.org/archive/interspeech_2023/li23m_interspeech.html) *
- Speech Emotion Recognition by Estimating Emotional Label Sequences with Phoneme Class Attribute [paper](https://www.isca-speech.org/archive/interspeech_2023/nagase23_interspeech.html)
- Unsupervised Transfer Components Learning for Cross-Domain Speech Emotion Recognition [paper](https://www.isca-speech.org/archive/interspeech_2023/jiang23_interspeech.html)
- Speech Emotion Recognition using Decomposed Speech via Multi-task Learning [paper](https://www.isca-speech.org/archive/interspeech_2023/hsu23_interspeech.html)

##### Others

- Cross-Lingual Cross-Age Adaptation for Low-Resource Elderly Speech Emotion Recognition [paper](https://www.isca-speech.org/archive/interspeech_2023/cahyawijaya23_interspeech.html)
- MetricAug: A Distortion Metric-Lead Augmentation Strategy for Training Noise-Robust Speech Emotion Recognizer [paper](https://www.isca-speech.org/archive/interspeech_2023/wu23c_interspeech.html) [code](https://github.com/crowpeter/MetricAug) *
- Integrating Emotion Recognition with Speech Recognition and Speaker Diarisation for Conversations [paper](https://www.isca-speech.org/archive/interspeech_2023/wu23_interspeech.html)
- Two-stage Finetuning of Wav2vec 2.0 for Speech Emotion Recognition with ASR and Gender Pretraining [paper](https://www.isca-speech.org/archive/interspeech_2023/gao23d_interspeech.html)
- Diverse Feature Mapping and Fusion via Multitask Learning for Multilingual Speech Emotion Recognition [paper](https://www.isca-speech.org/archive/interspeech_2023/lee23g_interspeech.html)
- Hybrid Dataset for Speech Emotion Recognition in Russian Language [paper](https://www.isca-speech.org/archive/interspeech_2023/kondratenko23_interspeech.html)

#### ICASSP 2023

##### Multimodal Speech (speech+text)

- Exploring Complementary Features in Multi-Modal Speech Emotion Recognition [paper](https://ieeexplore.ieee.org/document/10096709)
- Cross-Modal Fusion Techniques for Utterance-Level Emotion Recognition from Text and Speech [paper](https://arxiv.org/abs/2302.02447)
- Using Auxiliary Tasks In Multimodal Fusion of Wav2vec 2.0 And Bert for Multimodal Emotion Recognition [paper](https://arxiv.org/abs/2302.13661)
- Robust multi-modal speech emotion recognition with ASR error adaptation [paper](https://ieeexplore.ieee.org/document/10094839)
- Multilevel Transformer for Multimodal Emotion Recognition [paper](https://arxiv.org/abs/2211.07711)
- MGAT: Multi-Granularity Attention Based Transformers for Multi-Modal Emotion Recognition [paper](https://ieeexplore.ieee.org/document/10095855)
- Knowledge-Aware Bayesian Co-Attention for Multimodal Emotion Recognition [paper](https://arxiv.org/abs/2302.09856)
- Exploiting Modality-Invariant Feature for Robust Multimodal Emotion Recognition with Missing Modalities [paper](https://arxiv.org/abs/2210.15359) [code](https://github.com/ZhuoYulang/IF-MMIN)
- Multimodal Emotion Recognition Based on Deep Temporal Features Using Cross-Modal Transformer and Self-Attention [paper](https://ieeexplore.ieee.org/document/10096937) [code](https://github.com/bubaimaji/cmt-mser)
- Exploring Attention Mechanisms for Multimodal Emotion Recognition in an Emergency Call Center Corpus [paper](https://arxiv.org/abs/2306.07115)

##### Multimodal Speech-Visual

- Recursive Joint Attention for Audio-Visual Fusion in Regression Based Emotion Recognition [paper](https://arxiv.org/abs/2304.07958) [code](https://github.com/praveena2j/RecurrentJointAttentionwithLSTMs)
- Learning Cross-Modal Audiovisual Representations with Ladder Networks for Emotion Recognition [paper](https://ecs.utdallas.edu/research/researchlabs/msp-lab/publications/Goncalves_2023.pdf)

##### Speech

- DST: Deformable Speech Transformer for Emotion Recognition [paper](https://arxiv.org/abs/2302.13729)
- Multiple Acoustic Features Speech Emotion Recognition Using Cross-Attention Transformer [paper](https://ieeexplore.ieee.org/document/10095777)
- Speech Emotion Recognition Via Two-Stream Pooling Attention With Discriminative Channel Weighting [paper](https://ieeexplore.ieee.org/document/10095588)
- Speech Emotion Recognition via Heterogeneous Feature Learning [paper](https://ieeexplore.ieee.org/document/10095566)
- Pre-Trained Model Representations and Their Robustness Against Noise for Speech Emotion Analysis [paper](https://arxiv.org/abs/2303.03177)
- Learning Robust Self-Attention Features for Speech Emotion Recognition with Label-Adaptive Mixup [paper](https://arxiv.org/abs/2305.06273) [code](https://github.com/leitro/LabelAdaptiveMixup-SER)
- Hierarchical Network with Decoupled Knowledge Distillation for Speech Emotion Recognition [paper](https://arxiv.org/abs/2303.05134)
- Adapting a Self-Supervised Speech Representation for Noisy Speech Emotion Recognition by Using Contrastive Teacher-Student Learning [paper](https://ecs.utdallas.edu/research/researchlabs/msp-lab/publications/Leem_2023.pdf)
- Fast Yet Effective Speech Emotion Recognition with Self-Distillation [paper](https://arxiv.org/abs/2210.14636) [code](https://github.com/leibniz-future-lab/SelfDistill-SER)
- General or Specific? Investigating Effective Privacy Protection in Federated Learning for Speech Emotion Recognition [paper](https://ieeexplore.ieee.org/document/10096844)
- Speech-Based Emotion Recognition with Self-Supervised Models Using Attentive Channel-Wise Correlations and Label Smoothing [paper](https://arxiv.org/abs/2211.01756) [code](https://github.com/skakouros/s3prl_attentive_correlation)
- Temporal Modeling Matters: A Novel Temporal Emotional Modeling Approach for Speech Emotion Recognition [paper](https://arxiv.org/abs/2211.08233) [code](https://github.com/Jiaxin-Ye/TIM-Net_SER)
- Phonetic Anchor-Based Transfer Learning to Facilitate Unsupervised Cross-Lingual Speech Emotion Recognition [paper](https://ecs.utdallas.edu/research/researchlabs/msp-lab/publications/Upadhyay_2023.pdf)
- Knowledge Transfer for on-Device Speech Emotion Recognition With Neural Structured Learning [paper](https://arxiv.org/abs/2210.14977) [code](https://github.com/glam-imperial/NSL-SER)
- Speech Emotion Recognition Based on Low-Level Auto-Extracted Time-Frequency Features [paper](https://ieeexplore.ieee.org/document/10095260)
- Role of Lexical Boundary Information in Chunk-Level Segmentation for Speech Emotion Recognition [paper](https://ieeexplore.ieee.org/document/10096861)
- Zero-Shot Speech Emotion Recognition Using Generative Learning with Reconstructed Prototypes [paper](https://ieeexplore.ieee.org/document/10094888)
- A Generalized Subspace Distribution Adaptation Framework for Cross-Corpus Speech Emotion Recognition [paper](https://ieeexplore.ieee.org/document/10097258)
- Exploring Wav2vec 2.0 Fine Tuning for Improved Speech Emotion Recognition [paper](https://arxiv.org/abs/2110.06309) [code](https://github.com/b04901014/FT-w2v2-ser)
- DWFormer: Dynamic Window Transformer for Speech Emotion Recognition [paper](https://arxiv.org/abs/2303.01694) [code](https://github.com/scutcsq/DWFormer)
- Deep Implicit Distribution Alignment Networks for Cross-Corpus Speech Emotion Recognition [paper](https://arxiv.org/abs/2302.08921)
- Mingling or Misalignment? Temporal Shift for Speech Emotion Recognition with Pre-Trained Representations [paper](https://arxiv.org/abs/2302.13277) [code](https://github.com/ECNU-Cross-Innovation-Lab/ShiftSER)
- Designing and Evaluating Speech Emotion Recognition Systems: A Reality Check Case Study with IEMOCAP [paper](https://arxiv.org/abs/2304.00860)
- EMIX: A Data Augmentation Method for Speech Emotion Recognition [paper](https://ieeexplore.ieee.org/document/10096789)
- Multi-View Learning for Speech Emotion Recognition with Categorical Emotion, Categorical Sentiment, and Dimensional Scores [paper](https://ieeexplore.ieee.org/document/10096700)
- An Empirical Study and Improvement for Speech Emotion Recognition [paper](https://arxiv.org/abs/2304.03899)
- Towards Learning Emotion Information from Short Segments of Speech [paper](https://ieeexplore.ieee.org/document/10095892)

##### Conversation

- Knowledge-Aware Graph Convolutional Network with Utterance-Specific Window Search for Emotion Recognition In Conversations [paper](https://ieeexplore.ieee.org/document/10095097)
- Multi-Scale Receptive Field Graph Model for Emotion Recognition in Conversations [paper](https://ieeexplore.ieee.org/document/10094596)
- SDTN: Speaker Dynamics Tracking Network for Emotion Recognition in Conversation [paper](https://ieeexplore.ieee.org/document/10094810)
- Emotion Recognition in Conversation from Variable-Length Context [paper](https://ieeexplore.ieee.org/document/10096161)

##### Other

- Ensemble Knowledge Distillation of Self-Supervised Speech Models [paper](https://arxiv.org/abs/2302.12757)
- Domain Adaptation without Catastrophic Forgetting on a Small-Scale Partially-Labeled Corpus for Speech Emotion Recognition [paper](https://ieeexplore.ieee.org/document/10096578)
- ShuffleAugment: A Data Augmentation Method Using Time Shuffling [paper](https://ieeexplore.ieee.org/document/10096927)
- Achieving Fair Speech Emotion Recognition via Perceptual Fairness [paper](https://www.researchgate.net/profile/Woan-Shiuan-Chien/publication/368719577_Achieving_Fair_Speech_Emotion_Recognition_via_Perceptual_Fairness/links/6455dcef97449a0e1a7dd51d/Achieving-Fair-Speech-Emotion-Recognition-via-Perceptual-Fairness.pdf)
- Unsupervised Domain Adaptation for Preference Learning Based Speech Emotion Recognition [paper](https://ieeexplore.ieee.org/document/10094301)
- Using Emotion Embeddings to Transfer Knowledge between Emotions, Languages, and Annotation Formats [paper](https://arxiv.org/abs/2211.00171) [code](https://github.com/gchochla/Demux-MEmo)
- QI-TTS: Questioning Intonation Control for Emotional Speech Synthesis [paper](https://ieeexplore.ieee.org/document/10095623)
- A Hierarchical Regression Chain Framework for Affective Vocal Burst Recognition [paper](https://arxiv.org/abs/2303.08027)

#### IJCAI 2023

- Mimicking the Thinking Process for Emotion Recognition in Conversation with Prompts and Paraphrasing [paper](https://arxiv.org/abs/2306.06601)

#### AAAI 2023

##### Conversation

- SKIER: A Symbolic Knowledge Integrated Model for Conversational Emotion Recognition [paper](https://ojs.aaai.org/index.php/AAAI/article/view/26541)
- BERT-ERC: Fine-Tuning BERT Is Enough for Emotion Recognition in Conversation [paper](https://ojs.aaai.org/index.php/AAAI/article/view/26582)

##### Other

- Feature Normalization and Cartography-Based Demonstrations for Prompt-Based Fine-Tuning on Emotion-Related Tasks [paper](https://ojs.aaai.org/index.php/AAAI/article/view/26514)

#### ACL 2023

##### Multimodal

- Layer-wise Fusion with Modality Independence Modeling for Multi-modal Emotion Recognition [paper](https://aclanthology.org/2023.acl-long.39/) [code](https://github.com/sunjunaimer/LFMIM)
- ConKI: Contrastive Knowledge Injection for Multimodal Sentiment Analysis [paper](https://aclanthology.org/2023.findings-acl.860/)
- ConFEDE: Contrastive Feature Decomposition for Multimodal Sentiment Analysis [paper](https://aclanthology.org/2023.acl-long.421/)
- Topic and Style-aware Transformer for Multimodal Emotion Recognition [paper](https://aclanthology.org/2023.findings-acl.130/)
- Self-adaptive Context and Modal-interaction Modeling For Multimodal Emotion Recognition [paper](https://aclanthology.org/2023.findings-acl.390/)
- QAP: A Quantum-Inspired Adaptive-Priority-Learning Model for Multimodal Emotion Recognition [paper](https://aclanthology.org/2023.findings-acl.772/)

##### Conversation

- Supervised Adversarial Contrastive Learning for Emotion Recognition in Conversations [paper](https://arxiv.org/abs/2306.01505) [code](https://github.com/zerohd4869/sacl)
- MultiEMO: An Attention-Based Correlation-Aware Multimodal Fusion Framework for Emotion Recognition in Conversations [paper](https://aclanthology.org/2023.acl-long.824/)
- DualGATs: Dual Graph Attention Networks for Emotion Recognition in Conversations [paper](https://aclanthology.org/2023.acl-long.408/)
- A Cross-Modality Context Fusion and Semantic Refinement Network for Emotion Recognition in Conversation [paper](https://aclanthology.org/2023.acl-long.732/)
- A Facial Expression-Aware Multimodal Multi-task Learning Framework for Emotion Recognition in Multi-party Conversations [paper](https://aclanthology.org/2023.acl-long.861/)

##### Other

- Label-Aware Hyperbolic Embeddings for Fine-grained Emotion Classification [paper](https://arxiv.org/abs/2306.14822) [code](https://github.com/dinobby/HypEmo)
- Estimating the Uncertainty in Emotion Attributes using Deep Evidential Regression [paper](https://arxiv.org/abs/2306.06760)

### 2022 papers

#### AAAI 2022

##### Multimodal

- Tailor Versatile Multi-Modal Learning for Multi-Label Emotion Recognition [paper](https://ojs.aaai.org/index.php/AAAI/article/view/20895)

##### Conversation

- Hybrid Curriculum Learning for Emotion Recognition in Conversation [paper](https://ojs.aaai.org/index.php/AAAI/article/view/21413)
- Contrast and Generation Make BART a Good Dialogue Emotion Recognizer [paper](https://ojs.aaai.org/index.php/AAAI/article/view/21348)
- Is Discourse Role Important for Emotion Recognition in Conversation? [paper](https://ojs.aaai.org/index.php/AAAI/article/view/21361)

#### IJCAI 2022

##### Speech

- CTL-MTNet: A Novel CapsNet and Transfer Learning-Based Mixed Task Net for Single-Corpus and Cross-Corpus Speech Emotion Recognition [paper](https://arxiv.org/abs/2207.10644) [code](https://github.com/MLDMXM2017/CTLMTNet)

##### Conversation

- Speaker-Guided Encoder-Decoder Framework for Emotion Recognition in Conversation [paper](https://arxiv.org/abs/2206.03173)
- CauAIN: Causal Aware Interaction Network for Emotion Recognition in Conversations [paper](https://www.ijcai.org/proceedings/2022/0628.pdf)

##### Other

- Online ECG Emotion Recognition for Unknown Subjects via Hypergraph-Based Transfer Learning [paper](https://www.ijcai.org/proceedings/2022/0509.pdf)

#### ACM MM 2022

##### Multimodal

- Counterfactual Reasoning for Out-of-distribution Multimodal Sentiment Analysis [paper](https://arxiv.org/abs/2207.11652)
- Unified Multi-modal Pre-training for Few-shot Sentiment Analysis with Prompt-based Learning [paper](https://dl.acm.org/doi/10.1145/3503161.3548306)

##### Vision

- Towards Unbiased Visual Emotion Recognition via Causal Intervention [paper](https://arxiv.org/abs/2107.12096)

##### Speech

- Unsupervised Domain Adaptation Integrating Transformer and Mutual Information for Cross-Corpus Speech Emotion Recognition [paper](https://dl.acm.org/doi/10.1145/3503161.3548328)

##### Other

- SER30K: A Large-Scale Dataset for Sticker Emotion Recognition [paper](https://dl.acm.org/doi/10.1145/3503161.3548407) [code](https://github.com/nku-shengzheliu/SER30K)

#### NAACL 2022

##### Multimodal

- COGMEN: COntextualized GNN based Multimodal Emotion recognitioN [paper](https://arxiv.org/abs/2205.02455) [code](https://github.com/exploration-lab/cogmen)

### 2020 papers

#### AAAI 2020

##### Multimodal

- M3ER: Multiplicative Multimodal Emotion Recognition using Facial, Textual, and Speech Cues [paper](https://ojs.aaai.org/index.php/AAAI/article/view/5492)