https://github.com/ultramarine-indigo/awesome-AI-for-emotion-recognition-papers

A list of papers (with available code), tutorials, and surveys on recent AI for emotion recognition (AI4ER)
Last synced: 6 months ago
Host: GitHub
URL: https://github.com/ultramarine-indigo/awesome-AI-for-emotion-recognition-papers
Owner: ultramarine-indigo
Created: 2023-10-24T12:26:01.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2024-04-02T14:22:13.000Z (about 1 year ago)
Last Synced: 2024-04-18T13:41:45.371Z (about 1 year ago)
Homepage:
Size: 41 KB
Stars: 5
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
Awesome Lists containing this project

ultimate-awesome - awesome-AI-for-emotion-recognition-papers - A list of papers (with available code), tutorials, and surveys on recent AI for emotion recognition (AI4ER). (Other Lists / Julia Lists)
README

# awesome-AI-for-emotion-recognition-papers

A list of papers (with available code), tutorials, and surveys on recent AI for emotion recognition (AI4ER).

- [Emotion Recognition Surveys](#emotion-recognition-surveys)

  - [General Emotion Recognition Survey](#general-emotion-recognition-survey)

  - [Multimodal](#multimodal)

  - [Multimodal Speech](#multimodal-speech)

  - [Speech](#speech)

  - [Conversation-Text](#conversation-text)

- [2024 papers](#2024-papers)

  - [ICASSP 2024](#icassp-2024)

    - [Speech](#speech-1)

    - [Conversation](#conversation)

    - [Multimodal Speech（speech+text）](#multimodal-speechspeechtext)

    - [Multimodal](#multimodal-1)

    - [Others](#others)

- [2023 papers](#2023-papers)

  - [EMNLP2023](#emnlp2023)

  - [NeurIPS 2023](#neurips-2023)

  - [Interspeech 2023](#interspeech-2023)

    - [Multimodal Speech（speech+text）](#multimodal-speechspeechtext-1)

    - [Speech](#speech-2)

    - [Others](#others-1)

  - [ICASSP 2023](#icassp-2023)

    - [Multimodal Speech （speech+text）](#multimodal-speech-speechtext)

    - [Multimodal Speech-Visual](#multimodal-speech-visual)

    - [Speech](#speech-3)

    - [Conversation](#conversation-1)

    - [Other](#other)

  - [IJCAI 2023](#ijcai-2023)

  - [AAAI 2023](#aaai-2023)

    - [Conversation](#conversation-2)

    - [Other](#other-1)

  - [ACL 2023](#acl-2023)

    - [Multimodal](#multimodal-2)

    - [Conversation](#conversation-3)

    - [Other](#other-2)

- [2022 papers](#2022-papers)

  - [AAAI 2022](#aaai-2022)

    - [Multimodal](#multimodal-3)

    - [Conversation](#conversation-4)

  - [IJCAI 2022](#ijcai-2022)

    - [Speech](#speech-4)

    - [Conversation](#conversation-5)

    - [Other](#other-3)

  - [ACM MM 2022](#acm-mm-2022)

    - [Multimodal](#multimodal-4)

    - [Vision](#vision)

    - [Speech](#speech-5)

    - [Other](#other-4)

  - [NAACL 2022](#naacl-2022)

    - [Multimodal](#multimodal-5)

- [2020 papers](#2020-papers)

  - [AAAI 2020](#aaai-2020)

    - [Multimodal](#multimodal-6)

### Emotion Recognition Surveys

#### General Emotion Recognition Survey

- Emotion Recognition and Detection Methods: A Comprehensive Survey [paper](https://iecscience.org/jpapers/46)

- A systematic review on affective computing: emotion models, databases, and recent advances [paper](https://arxiv.org/abs/2203.06935)

#### Multimodal 

- Multimodal Emotion Recognition using Deep Learning [paper](https://www.jastt.org/index.php/jasttpath/article/view/91)

- Emotion recognition using multi-modal data and machine learning techniques: A tutorial and review [paper](https://drive.google.com/file/d/1wGagPpwhGPKpOVpcVmMLIKyU2fUWDYI-/view)

- Deep learning-based multimodal emotion recognition from audio, visual, and text modalities: A systematic review of recent advancements and future prospects [paper](https://www.sciencedirect.com/science/article/abs/pii/S0957417423021942)

- Emotion recognition from unimodal to multimodal analysis: A review 

#### Multimodal Speech

- Deep Multimodal Emotion Recognition on Human Speech: A Review [paper](https://www.mdpi.com/2076-3417/11/17/7962)

#### Speech

- Speech emotion recognition: Emotional models, databases, features, preprocessing methods, supporting modalities, and classifiers [paper](https://www.researchgate.net/profile/John-James-3/post/What-are-the-public-datasets-available-for-emotion-recognition/attachment/5f3771e2ce377e00016d4946/AS%3A924666805886978%401597469154666/download/Speech-emotion-recognition--Emotional-models--databases--feat_2020_Speech-Co.pdf)

- A Comprehensive Review of Speech Emotion Recognition Systems [paper](https://ieeexplore.ieee.org/iel7/6287639/9312710/09383000.pdf)

- Deep Learning Techniques for Speech Emotion Recognition, from Databases to Models [paper](https://www.mdpi.com/1424-8220/21/4/1249)

- A Review on Speech Emotion Recognition Using Deep Learning and Attention Mechanism [paper](https://www.mdpi.com/2079-9292/10/10/1163)

#### Conversation-Text

- 

### 2024 papers

#### ICASSP 2024

##### Speech

- TRUST-SER: On The Trustworthiness Of Fine-Tuning Pre-Trained Speech Embeddings For Speech Emotion Recognition [paper](https://arxiv.org/pdf/2305.11229) [code](https://github.com/usc-sail/trust-ser)

- EmoHRNet: High-Resolution Neural Network Based Speech Emotion Recognition [paper](https://ieeexplore.ieee.org/document/10446976)

- Parameter Efficient Finetuning for Speech Emotion Recognition and Domain Adaptation [paper](https://arxiv.org/pdf/2402.11747)

- Investigating Salient Representations and Label Variance in Dimensional Speech Emotion Analysis [paper](https://arxiv.org/html/2312.16180v1)

- Adaptive Speech Emotion Representation Learning Based On Dynamic Graph [paper](https://ieeexplore.ieee.org/document/10447829)

- Leveraging Speech PTM, Text LLM, And Emotional TTS For Speech Emotion Recognition [paper](https://arxiv.org/pdf/2309.10294)

- Enhancing Two-Stage Finetuning for Speech Emotion Recognition Using Adapters [paper](https://ieeexplore.ieee.org/document/10446645)

- Frame-Level Emotional State Alignment Method for Speech Emotion Recognition [paper](https://arxiv.org/pdf/2312.16383) [code](https://github.com/ASolitaryMan/HFLEA) 

- Gradient-Based Dimensionality Reduction for Speech Emotion Recognition Using Deep Networks [paper](https://ieeexplore.ieee.org/document/10447616) [code](https://github.com/hxwangnus/Grad-based-Dim-Red-for-SER) 

- Speech Swin-Transformer: Exploring a Hierarchical Transformer with Shifted Windows for Speech Emotion Recognition [paper](https://arxiv.org/html/2401.10536v1)

- Disentanglement Network: Disentangle the Emotional Features from Acoustic Features for Speech Emotion Recognition [paper](https://ieeexplore.ieee.org/document/10448044)

- Improving Speaker-Independent Speech Emotion Recognition using Dynamic Joint Distribution Adaptation [paper](https://arxiv.org/html/2401.09752v1)

- Comparing Data-Driven and Handcrafted Features for Dimensional Emotion Recognition [paper](https://publications.idiap.ch/attachments/papers/2024/Vlasenko_ICASSP_2024.pdf)

- Emotion-Aware Contrastive Adaptation Network for Source-Free Cross-Corpus Speech Emotion Recognition [paper](https://arxiv.org/html/2401.12925v1)

- Balancing Speaker-Rater Fairness for Gender-Neutral Speech Emotion Recognition [paper](https://www.researchgate.net/profile/Woan-Shiuan-Chien/publication/376799317_Balancing_Speaker-Rater_Fairness_For_Gender-Neutral_Speech_Emotion_Recognition/links/6589283d0bb2c7472b0d10ee/Balancing-Speaker-Rater-Fairness-For-Gender-Neutral-Speech-Emotion-Recognition.pdf)

- Prompting Audios Using Acoustic Properties for Emotion Representation [paper](https://arxiv.org/pdf/2310.02298)

- Speech Emotion Recognition with Distilled Prosodic and Linguistic Affect Representations [paper](https://arxiv.org/pdf/2309.04849)

- Generalization of Self-Supervised Learning-Based Representations for Cross-Domain Speech Emotion Recognition [paper](https://ecs.utdallas.edu/research/researchlabs/msp-lab/publications/Naini_2024.pdf)

- Dynamic Speech Emotion Recognition Using A Conditional Neural Process [paper](https://ecs.utdallas.edu/research/researchlabs/msp-lab/publications/Martinez-Lucas_2024.pdf)

- Revealing Emotional Clusters in Speaker Embeddings: A Contrastive Learning Strategy for Speech Emotion Recognition [paper](https://arxiv.org/pdf/2401.11017)

- Foundation Model Assisted Automatic Speech Emotion Recognition: Transcribing, Annotating, and Augmenting [paper](https://arxiv.org/pdf/2309.08108)

- MS-SENet: Enhancing Speech Emotion Recognition Through Multi-Scale Feature Fusion with Squeeze-and-Excitation Blocks [paper](https://arxiv.org/html/2312.11974v2)

- Cubic Knowledge Distillation for Speech Emotion Recognition [paper](https://ieeexplore.ieee.org/document/10447713) [code](https://github.com/Fly1toMoon/Cubic-Knowledge-Distillation)

- Improving Speech Emotion Recognition with Unsupervised Speaking Style Transfer [paper](https://arxiv.org/pdf/2211.08843)

- Emotion Neural Transducer for Fine-Grained Speech Emotion Recognition [paper](https://arxiv.org/pdf/2403.19224) [code](https://github.com/ECNU-Cross-Innovation-Lab/ENT) 

- Multi-Source Unsupervised Transfer Components Learning for Cross-Domain Speech Emotion Recognition [paper](https://ieeexplore.ieee.org/document/10446499)

- Self-Supervised Domain Exploration with an Optimal Transport Regularization for Open Set Cross-Domain Speech Emotion Recognition [paper](https://ieeexplore.ieee.org/document/10447482) 

- Towards Improving Speech Emotion Recognition Using Synthetic Data Augmentation from Emotion Conversion [paper](https://hal.science/hal-04364976/document)

##### Conversation

- ESIHGNN: Event-State Interactions Infused Heterogeneous Graph Neural Network for Conversational Emotion Recognition [paper](https://ieeexplore.ieee.org/document/10447592)

- MCM-CSD: Multi-Granularity Context Modeling with Contrastive Speaker Detection for Emotion Recognition in Real-Time Conversation [paper](https://ieeexplore.ieee.org/document/10446410) [code](https://github.com/WHOISJENNY/MCM-CSD)

- SERC-GCN: Speech Emotion Recognition In Conversation Using Graph Convolutional Networks [paper](https://www.academia.edu/download/110735232/ICASSP_2024_1_.pdf)

- Conversation Clique-Based Model for Emotion Recognition In Conversation [paper](https://ieeexplore.ieee.org/document/10446226)

- Speaker-Centric Multimodal Fusion Networks for Emotion Recognition in Conversations [paper](https://ieeexplore.ieee.org/document/10447720)

##### Multimodal Speech（speech+text） 

- Large Language Model-Based Emotional Speech Annotation Using Context and Acoustic Feature for Speech Emotion Recognition [paper](https://ieeexplore.ieee.org/document/10448316)

- MF-AED-AEC: Speech Emotion Recognition by Leveraging Multimodal Fusion, ASR Error Detection, and ASR Error Correction [paper](https://arxiv.org/html/2401.13260v1)

- GEmo-CLAP: Gender-Attribute-Enhanced Contrastive Language-Audio Pretraining for Accurate Speech Emotion Recognition [paper](https://arxiv.org/html/2306.07848v10) 

##### Multimodal

- Fine-Grained Disentangled Representation Learning For Multimodal Emotion Recognition [paper](https://arxiv.org/html/2312.13567v1)

- Improving Multi-Modal Emotion Recognition Using Entropy-Based Fusion and Pruning-Based Network Architecture Optimization [paper](https://ieeexplore.ieee.org/document/10447231)

- Multi-Grained Multimodal Interaction Network for Sentiment Analysis [paper](https://ieeexplore.ieee.org/document/10446351)

- Fusing Modality-Specific Representations and Decisions for Multimodal Emotion Recognition [paper](https://ieeexplore.ieee.org/document/10447035)

- AttA-NET: Attention Aggregation Network for Audio-Visual Emotion Recognition [paper](https://ieeexplore.ieee.org/document/10447640) [code](https://github.com/NariFan2002/AttA-NET)

- MMRBN: Rule-Based Network for Multimodal Emotion Recognition [paper](https://ieeexplore.ieee.org/document/10447930) 

- Inter-Modality and Intra-Sample Alignment for Multi-Modal Emotion Recognition [paper](https://ieeexplore.ieee.org/document/10446571)

- RL-EMO: A Reinforcement Learning Framework for Multimodal Emotion Recognition [paper](https://ieeexplore.ieee.org/document/10446459) [code](https://github.com/zyh9929/RL-EMO) 

- Multi-Modal Emotion Recognition Using Multiple Acoustic Features and Dual Cross-Modal Transformer [paper](https://ieeexplore.ieee.org/document/10447830)

##### Others

- AV-SUPERB: A Multi-Task Evaluation Benchmark for Audio-Visual Representation Models [paper](https://arxiv.org/pdf/2309.10787)

### 2023 papers

#### EMNLP2023

- Learning Language-guided Adaptive Hyper-modality Representation for Multimodal Sentiment Analysis [paper](https://arxiv.org/abs/2310.05804) [code](https://github.com/Haoyu-ha/ALMT)

#### NeurIPS 2023 

- Incomplete Multimodality-Diffused Emotion Recognition [paper](https://proceedings.neurips.cc/paper_files/paper/2023/hash/372cb7805eaccb2b7eed641271a30eec-Abstract-Conference.html)  [code](https://github.com/mdswyz/IMDer)

#### Interspeech 2023

##### Multimodal Speech（speech+text）

- LanSER: Language-Model Supported Speech Emotion Recognition [paper](https://arxiv.org/abs/2309.03978)

- Fine-tuned RoBERTa Model with a CNN-LSTM Network for Conversational Emotion Recognition [paper](https://www.isca-speech.org/archive/interspeech_2023/luo23_interspeech.html)

- Emotion Label Encoding Using Word Embeddings for Speech Emotion Recognition [paper](https://www.isca-speech.org/archive/interspeech_2023/stanley23_interspeech.html)

- Discrimination of the Different Intents Carried by the Same Text Through Integrating Multimodal Information [paper](https://www.isca-speech.org/archive/interspeech_2023/li23ia_interspeech.html)

- Meta-domain Adversarial Contrastive Learning for Alleviating Individual Bias in Self-sentiment Predictions [paper](https://www.isca-speech.org/archive/interspeech_2023/li23f_interspeech.html)

- SWRR: Feature Map Classifier Based on Sliding Window Attention and High-Response Feature Reuse for Multimodal Emotion Recognition [paper](https://www.isca-speech.org/archive/interspeech_2023/zhao23b_interspeech.html)

- Focus-attention-enhanced Crossmodal Transformer with Metric Learning for Multimodal Speech Emotion Recognition [paper](https://www.isca-speech.org/archive/interspeech_2023/kim23c_interspeech.html)

- Learning Emotional Representations from Imbalanced Speech Data for Speech Emotion Recognition and Emotional Text-to-Speech [paper](https://www.isca-speech.org/archive/interspeech_2023/wang23ka_interspeech.html)

- MMER: Multimodal Multi-task Learning for Speech Emotion Recognition [paper](https://www.isca-speech.org/archive/interspeech_2023/ghosh23b_interspeech.html)

- A Dual Attention-based Modality-Collaborative Fusion Network for Emotion Recognition [paper](https://www.isca-speech.org/archive/interspeech_2023/zhang23g_interspeech.html) [code](https://github.com/zxiaohen/Speech-emotion-recognition-MCFN)

- Speaker-aware Cross-modal Fusion Architecture for Conversational Emotion Recognition [paper](https://www.isca-speech.org/archive/interspeech_2023/zhao23e_interspeech.html)

- Emotion Prompting for Speech Emotion Recognition [paper](https://www.isca-speech.org/archive/interspeech_2023/zhou23f_interspeech.html)

- EmotionNAS: Two-stream Neural Architecture Search for Speech Emotion Recognition [paper](https://www.isca-speech.org/archive/interspeech_2023/sun23d_interspeech.html) *

- Leveraging Semantic Information for Efficient Self-Supervised Emotion Recognition with Audio-Textual Distilled Models [paper](https://www.isca-speech.org/archive/interspeech_2023/deoliveira23_interspeech.html)

- Leveraging Label Information for Multimodal Emotion Recognition [paper](https://www.isca-speech.org/archive/interspeech_2023/wang23ma_interspeech.html)

- Improving Joint Speech and Emotion Recognition Using Global Style Tokens [paper](https://www.isca-speech.org/archive/interspeech_2023/kyung23_interspeech.html)

- Dual Memory Fusion for Multimodal Speech Emotion Recognition [paper](https://www.isca-speech.org/archive/interspeech_2023/prisayad23_interspeech.html) *

##### Speech

- Multi-Scale Temporal Transformer For Speech Emotion Recognition [paper](https://www.isca-speech.org/archive/interspeech_2023/li23m_interspeech.html) *

- Speech Emotion Recognition by Estimating Emotional Label Sequences with Phoneme Class Attribute [paper](https://www.isca-speech.org/archive/interspeech_2023/nagase23_interspeech.html)

- Unsupervised Transfer Components Learning for Cross-Domain Speech Emotion Recognition [paper](https://www.isca-speech.org/archive/interspeech_2023/jiang23_interspeech.html)

- Speech Emotion Recognition using Decomposed Speech via Multi-task Learning [paper](https://www.isca-speech.org/archive/interspeech_2023/hsu23_interspeech.html)

##### Others

- Cross-Lingual Cross-Age Adaptation for Low-Resource Elderly Speech Emotion Recognition [paper](https://www.isca-speech.org/archive/interspeech_2023/cahyawijaya23_interspeech.html)

- MetricAug: A Distortion Metric-Lead Augmentation Strategy for Training Noise-Robust Speech Emotion Recognition [paper](https://www.isca-speech.org/archive/interspeech_2023/wu23c_interspeech.html)  [code](https://github.com/crowpeter/MetricAug) *

- Integrating Emotion Recognition with Speech Recognition and Speaker Diarisation for Conversations [paper](https://www.isca-speech.org/archive/interspeech_2023/wu23_interspeech.html)

- Two-stage Finetuning of Wav2vec 2.0 for Speech Emotion Recognition with ASR and Gender Pretraining [paper](https://www.isca-speech.org/archive/interspeech_2023/gao23d_interspeech.html)

- Diverse Feature Mapping and Fusion via Multitask Learning for Multilingual Speech Emotion Recognition [paper](https://www.isca-speech.org/archive/interspeech_2023/lee23g_interspeech.html)

- Hybrid Dataset for Speech Emotion Recognition in Russian Language [paper](https://www.isca-speech.org/archive/interspeech_2023/kondratenko23_interspeech.html)

#### ICASSP 2023

##### Multimodal Speech （speech+text）

- Exploring Complementary Features in Multi-Modal Speech Emotion Recognition [paper](https://ieeexplore.ieee.org/document/10096709)

- Cross-Modal Fusion Techniques for Utterance-Level Emotion Recognition from Text and Speech [paper](https://arxiv.org/abs/2302.02447)

- Using Auxiliary Tasks in Multimodal Fusion of Wav2vec 2.0 and BERT for Multimodal Emotion Recognition [paper](https://arxiv.org/abs/2302.13661)

- Robust Multi-Modal Speech Emotion Recognition with ASR Error Adaptation [paper](https://ieeexplore.ieee.org/document/10094839)

- Multilevel Transformer for Multimodal Emotion Recognition [paper](https://arxiv.org/abs/2211.07711)

- MGAT: Multi-Granularity Attention Based Transformers for Multi-Modal Emotion Recognition [paper](https://ieeexplore.ieee.org/document/10095855)

- Knowledge-Aware Bayesian Co-Attention for Multimodal Emotion Recognition [paper](https://arxiv.org/abs/2302.09856) 

- Exploiting Modality-Invariant Feature for Robust Multimodal Emotion Recognition with Missing Modalities [paper](https://arxiv.org/abs/2210.15359) [code](https://github.com/ZhuoYulang/IF-MMIN)

- Multimodal Emotion Recognition Based on Deep Temporal Features Using Cross-Modal Transformer and Self-Attention [paper](https://ieeexplore.ieee.org/document/10096937) [code](https://github.com/bubaimaji/cmt-mser)

- Exploring Attention Mechanisms for Multimodal Emotion Recognition in an Emergency Call Center Corpus [paper](https://arxiv.org/abs/2306.07115) 

##### Multimodal Speech-Visual

- Recursive Joint Attention for Audio-Visual Fusion in Regression Based Emotion Recognition [paper](https://arxiv.org/abs/2304.07958) [code](https://github.com/praveena2j/RecurrentJointAttentionwithLSTMs)

- Learning Cross-Modal Audiovisual Representations with Ladder Networks for Emotion Recognition [paper](https://ecs.utdallas.edu/research/researchlabs/msp-lab/publications/Goncalves_2023.pdf)

##### Speech

- DST: Deformable Speech Transformer for Emotion Recognition [paper](https://arxiv.org/abs/2302.13729)

- Multiple Acoustic Features Speech Emotion Recognition Using Cross-Attention Transformer [paper](https://ieeexplore.ieee.org/document/10095777)

- Speech Emotion Recognition Via Two-Stream Pooling Attention With Discriminative Channel Weighting [paper](https://ieeexplore.ieee.org/document/10095588)

- Speech Emotion Recognition via Heterogeneous Feature Learning [paper](https://ieeexplore.ieee.org/document/10095566)

- Pre-Trained Model Representations and Their Robustness Against Noise for Speech Emotion Analysis [paper](https://arxiv.org/abs/2303.03177)

- Learning Robust Self-Attention Features for Speech Emotion Recognition with Label-Adaptive Mixup [paper](https://arxiv.org/abs/2305.06273) [code](https://github.com/leitro/LabelAdaptiveMixup-SER)

- Hierarchical Network with Decoupled Knowledge Distillation for Speech Emotion Recognition [paper](https://arxiv.org/abs/2303.05134)

- Adapting a Self-Supervised Speech Representation for Noisy Speech Emotion Recognition by Using Contrastive Teacher-Student Learning [paper](https://ecs.utdallas.edu/research/researchlabs/msp-lab/publications/Leem_2023.pdf)

- Fast Yet Effective Speech Emotion Recognition with Self-Distillation [paper](https://arxiv.org/abs/2210.14636) [code](https://github.com/leibniz-future-lab/SelfDistill-SER)

- General or Specific? Investigating Effective Privacy Protection in Federated Learning for Speech Emotion Recognition [paper](https://ieeexplore.ieee.org/document/10096844)

- Speech-Based Emotion Recognition with Self-Supervised Models Using Attentive Channel-Wise Correlations and Label Smoothing [paper](https://arxiv.org/abs/2211.01756) [code](https://github.com/skakouros/s3prl_attentive_correlation)

- Temporal Modeling Matters: A Novel Temporal Emotional Modeling Approach for Speech Emotion Recognition [paper](https://arxiv.org/abs/2211.08233) [code](https://github.com/Jiaxin-Ye/TIM-Net_SER)

- Phonetic Anchor-Based Transfer Learning to Facilitate Unsupervised Cross-Lingual Speech Emotion Recognition [paper](https://ecs.utdallas.edu/research/researchlabs/msp-lab/publications/Upadhyay_2023.pdf)

- Knowledge Transfer for On-Device Speech Emotion Recognition With Neural Structured Learning [paper](https://arxiv.org/abs/2210.14977) [code](https://github.com/glam-imperial/NSL-SER)

- Speech Emotion Recognition Based on Low-Level Auto-Extracted Time-Frequency Features [paper](https://ieeexplore.ieee.org/document/10095260)

- Role of Lexical Boundary Information in Chunk-Level Segmentation for Speech Emotion Recognition [paper](https://ieeexplore.ieee.org/document/10096861)

- Zero-Shot Speech Emotion Recognition Using Generative Learning with Reconstructed Prototypes [paper](https://ieeexplore.ieee.org/document/10094888)

- A Generalized Subspace Distribution Adaptation Framework for Cross-Corpus Speech Emotion Recognition [paper](https://ieeexplore.ieee.org/document/10097258)

- Exploring Wav2vec 2.0 Fine Tuning for Improved Speech Emotion Recognition [paper](https://arxiv.org/abs/2110.06309) [code](https://github.com/b04901014/FT-w2v2-ser)

- DWFormer: Dynamic Window Transformer for Speech Emotion Recognition [paper](https://arxiv.org/abs/2303.01694) [code](https://github.com/scutcsq/DWFormer)

- Deep Implicit Distribution Alignment Networks for Cross-Corpus Speech Emotion Recognition [paper](https://arxiv.org/abs/2302.08921)

- Mingling or Misalignment? Temporal Shift for Speech Emotion Recognition with Pre-Trained Representations [paper](https://arxiv.org/abs/2302.13277) [code](https://github.com/ECNU-Cross-Innovation-Lab/ShiftSER)

- Designing and Evaluating Speech Emotion Recognition Systems: A Reality Check Case Study with IEMOCAP [paper](https://arxiv.org/abs/2304.00860)

- EMIX: A Data Augmentation Method for Speech Emotion Recognition [paper](https://ieeexplore.ieee.org/document/10096789)

- Multi-View Learning for Speech Emotion Recognition with Categorical Emotion, Categorical Sentiment, and Dimensional Scores [paper](https://ieeexplore.ieee.org/document/10096700)

- An Empirical Study and Improvement for Speech Emotion Recognition [paper](https://arxiv.org/abs/2304.03899)

- Towards Learning Emotion Information from Short Segments of Speech [paper](https://ieeexplore.ieee.org/document/10095892)

##### Conversation

- Knowledge-Aware Graph Convolutional Network with Utterance-Specific Window Search for Emotion Recognition In Conversations [paper](https://ieeexplore.ieee.org/document/10095097)

- Multi-Scale Receptive Field Graph Model for Emotion Recognition in Conversations [paper](https://ieeexplore.ieee.org/document/10094596)

- SDTN: Speaker Dynamics Tracking Network for Emotion Recognition in Conversation [paper](https://ieeexplore.ieee.org/document/10094810)

- Emotion Recognition in Conversation from Variable-Length Context [paper](https://ieeexplore.ieee.org/document/10096161)

##### Other

- Ensemble Knowledge Distillation of Self-Supervised Speech Models [paper](https://arxiv.org/abs/2302.12757)

- Domain Adaptation without Catastrophic Forgetting on a Small-Scale Partially-Labeled Corpus for Speech Emotion Recognition [paper](https://ieeexplore.ieee.org/document/10096578)

- ShuffleAugment: A Data Augmentation Method Using Time Shuffling [paper](https://ieeexplore.ieee.org/document/10096927)

- Achieving Fair Speech Emotion Recognition via Perceptual Fairness [paper](https://www.researchgate.net/profile/Woan-Shiuan-Chien/publication/368719577_Achieving_Fair_Speech_Emotion_Recognition_via_Perceptual_Fairness/links/6455dcef97449a0e1a7dd51d/Achieving-Fair-Speech-Emotion-Recognition-via-Perceptual-Fairness.pdf)

- Unsupervised Domain Adaptation for Preference Learning Based Speech Emotion Recognition [paper](https://ieeexplore.ieee.org/document/10094301) 

- Using Emotion Embeddings to Transfer Knowledge between Emotions, Languages, and Annotation Formats [paper](https://arxiv.org/abs/2211.00171) [code](https://github.com/gchochla/Demux-MEmo) 

- QI-TTS: Questioning Intonation Control for Emotional Speech Synthesis [paper](https://ieeexplore.ieee.org/document/10095623)

- A Hierarchical Regression Chain Framework for Affective Vocal Burst Recognition [paper](https://arxiv.org/abs/2303.08027)

#### IJCAI 2023

- Mimicking the Thinking Process for Emotion Recognition in Conversation with Prompts and Paraphrasing [paper](https://arxiv.org/abs/2306.06601)

#### AAAI 2023

##### Conversation

- SKIER: A Symbolic Knowledge Integrated Model for Conversational Emotion Recognition [paper](https://ojs.aaai.org/index.php/AAAI/article/view/26541)

- BERT-ERC: Fine-Tuning BERT Is Enough for Emotion Recognition in Conversation [paper](https://ojs.aaai.org/index.php/AAAI/article/view/26582)

##### Other

- Feature Normalization and Cartography-Based Demonstrations for Prompt-Based Fine-Tuning on Emotion-Related Tasks [paper](https://ojs.aaai.org/index.php/AAAI/article/view/26514)

#### ACL 2023

##### Multimodal

- Layer-wise Fusion with Modality Independence Modeling for Multi-modal Emotion Recognition [paper](https://aclanthology.org/2023.acl-long.39/) [code](https://github.com/sunjunaimer/LFMIM)

- ConKI: Contrastive Knowledge Injection for Multimodal Sentiment Analysis [paper](https://aclanthology.org/2023.findings-acl.860/)

- ConFEDE: Contrastive Feature Decomposition for Multimodal Sentiment Analysis [paper](https://aclanthology.org/2023.acl-long.421/)

- Topic and Style-aware Transformer for Multimodal Emotion Recognition [paper](https://aclanthology.org/2023.findings-acl.130/)

- Self-adaptive Context and Modal-interaction Modeling For Multimodal Emotion Recognition [paper](https://aclanthology.org/2023.findings-acl.390/)

- QAP: A Quantum-Inspired Adaptive-Priority-Learning Model for Multimodal Emotion Recognition [paper](https://aclanthology.org/2023.findings-acl.772/)

##### Conversation

- Supervised Adversarial Contrastive Learning for Emotion Recognition in Conversations [paper](https://arxiv.org/abs/2306.01505) [code](https://github.com/zerohd4869/sacl)

- MultiEMO: An Attention-Based Correlation-Aware Multimodal Fusion Framework for Emotion Recognition in Conversations [paper](https://aclanthology.org/2023.acl-long.824/)

- DualGATs: Dual Graph Attention Networks for Emotion Recognition in Conversations [paper](https://aclanthology.org/2023.acl-long.408/)

- A Cross-Modality Context Fusion and Semantic Refinement Network for Emotion Recognition in Conversation [paper](https://aclanthology.org/2023.acl-long.732/)

- A Facial Expression-Aware Multimodal Multi-task Learning Framework for Emotion Recognition in Multi-party Conversations [paper](https://aclanthology.org/2023.acl-long.861/)

##### Other

- Label-Aware Hyperbolic Embeddings for Fine-grained Emotion Classification [paper](https://arxiv.org/abs/2306.14822) [code](https://github.com/dinobby/HypEmo)

- Estimating the Uncertainty in Emotion Attributes using Deep Evidential Regression [paper](https://arxiv.org/abs/2306.06760)

### 2022 papers

#### AAAI 2022

##### Multimodal

- Tailor Versatile Multi-Modal Learning for Multi-Label Emotion Recognition [paper](https://ojs.aaai.org/index.php/AAAI/article/view/20895)

##### Conversation

- Hybrid Curriculum Learning for Emotion Recognition in Conversation [paper](https://ojs.aaai.org/index.php/AAAI/article/view/21413)

- Contrast and Generation Make BART a Good Dialogue Emotion Recognizer [paper](https://ojs.aaai.org/index.php/AAAI/article/view/21348)

- Is Discourse Role Important for Emotion Recognition in Conversation?  [paper](https://ojs.aaai.org/index.php/AAAI/article/view/21361)

#### IJCAI 2022

##### Speech

- CTL-MTNet: A Novel CapsNet and Transfer Learning-Based Mixed Task Net for Single-Corpus and Cross-Corpus Speech Emotion Recognition [paper](https://arxiv.org/abs/2207.10644) [code](https://github.com/MLDMXM2017/CTLMTNet)

##### Conversation

- Speaker-Guided Encoder-Decoder Framework for Emotion Recognition in Conversation [paper](https://arxiv.org/abs/2206.03173)

- CauAIN: Causal Aware Interaction Network for Emotion Recognition in Conversations [paper](https://www.ijcai.org/proceedings/2022/0628.pdf)

##### Other

- Online ECG Emotion Recognition for Unknown Subjects via Hypergraph-Based Transfer Learning [paper](https://www.ijcai.org/proceedings/2022/0509.pdf)

#### ACM MM 2022

##### Multimodal

- Counterfactual Reasoning for Out-of-distribution Multimodal Sentiment Analysis [paper](https://arxiv.org/abs/2207.11652)

- Unified Multi-modal Pre-training for Few-shot Sentiment Analysis with Prompt-based Learning [paper](https://dl.acm.org/doi/10.1145/3503161.3548306)

##### Vision

- Towards Unbiased Visual Emotion Recognition via Causal Intervention [paper](https://arxiv.org/abs/2107.12096)

##### Speech

- Unsupervised Domain Adaptation Integrating Transformer and Mutual Information for Cross-Corpus Speech Emotion Recognition [paper](https://dl.acm.org/doi/10.1145/3503161.3548328)

##### Other

- SER30K: A Large-Scale Dataset for Sticker Emotion Recognition [paper](https://dl.acm.org/doi/10.1145/3503161.3548407) [code](https://github.com/nku-shengzheliu/SER30K)

#### NAACL 2022

##### Multimodal

- COGMEN: COntextualized GNN based Multimodal Emotion recognitioN [paper](https://arxiv.org/abs/2205.02455) [code](https://github.com/exploration-lab/cogmen)

### 2020 papers

#### AAAI 2020

##### Multimodal

- M3ER: Multiplicative Multimodal Emotion Recognition using Facial, Textual, and Speech Cues [paper](https://ojs.aaai.org/index.php/AAAI/article/view/5492)