Abstract:
Emotion recognition in conversations (ERC) plays a critical role in building emotionally
intelligent systems for mental health monitoring, virtual assistants, and human-computer
interaction (HCI). However, the majority of existing ERC models struggle to detect emotion
shifts — the subtle, dynamic changes in emotional states that occur naturally in multi-party
conversations (MPCs). Recognizing these shifts is challenging because they are highly context-dependent and because dataset imbalance allows dominant emotions such as "neutral" to overshadow minority classes, ultimately limiting overall system performance.
This study proposes EmoShiftNet, a multi-task learning (MTL) framework designed to jointly perform emotion classification and emotion shift detection. The framework integrates multimodal features: contextualized text embeddings from a BERT transformer; acoustic features such as MFCCs, pitch, and loudness; and temporal features such as utterance duration and pause intervals. Attention-based fusion and a customized multi-task loss function enhance the model's sensitivity to both static emotional states and dynamic emotional transitions across conversation turns.
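The abstract does not give the exact form of the customized multi-task loss. A common formulation for jointly training a classification task and an auxiliary binary task is a weighted sum of the per-task losses; the sketch below illustrates that idea with NumPy. The weights `alpha` and `beta` and both loss terms are illustrative assumptions, not the paper's actual objective.

```python
import numpy as np

def cross_entropy(probs, label):
    # Negative log-likelihood of the true emotion class,
    # given a softmax probability vector over emotion labels.
    return -np.log(probs[label])

def binary_cross_entropy(p, y):
    # Standard binary cross-entropy for the shift / no-shift decision.
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

def multitask_loss(emotion_probs, emotion_label, shift_prob, shift_label,
                   alpha=0.7, beta=0.3):
    # Weighted sum of the two task losses; alpha and beta are
    # hypothetical task weights, not values reported in the study.
    return (alpha * cross_entropy(emotion_probs, emotion_label)
            + beta * binary_cross_entropy(shift_prob, shift_label))
```

Training on such a joint objective is what lets the auxiliary shift-detection signal shape the shared representation used for emotion classification.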
In experiments on the Multimodal EmotionLines Dataset (MELD),
EmoShiftNet achieves 70.03% emotion recognition accuracy while demonstrating
significant improvements in F1-score and minority class detection compared to traditional
single-task baselines. Ablation studies further validate that incorporating emotion shift
detection as an auxiliary task improves the contextual understanding and robustness of ERC
systems, highlighting the critical role of modeling emotional dynamics in conversational AI.