Moodium: From Words to Feelings, An LLM-Based Culturally Aware Framework for Emotion Classification and Prediction
Parsa Besharat & Volker Göhler
August 2025
Abstract
Emotion recognition is a critical component of affective computing, enabling systems to respond with empathy and adapt to human states. However, existing approaches struggle with cultural bias, incomplete modality fusion, and a lack of interpretability, especially in real-world, multilingual environments. This paper addresses these challenges by reviewing recent advances in multimodal emotion classification and highlighting the emerging role of Large Language Models (LLMs) in this domain. Building on these insights, we propose a culturally aware, LLM-integrated framework that fuses audio, visual, and textual data using a staged attention mechanism with adaptive gating. While full end-to-end training is still in progress, we present a detailed architecture and demonstrate the system’s potential through simulated outputs interpreted by a pretrained LLM. Our work lays the foundation for more robust, explainable, and culturally sensitive emotion-aware systems.