Objective: Previous studies have explored the use of multimodality for accurate emotion prediction. However, limited research has addressed real-time implementation because of the challenges of recognizing emotions from multiple modalities simultaneously. To tackle this issue, we propose a real-time multimodal emotion recognition system based on multithreaded weighted average fusion. Background: Emotion recognition is a crucial component of human-machine interaction. It is challenging because emotions are expressed in diverse forms, including visual cues, auditory signals, text, and physiological responses. Recent advances in the field show that combining multimodal inputs, such as voice, speech, and EEG signals, yields superior results compared with unimodal approaches. Method: We constructed a multithreaded system that allows diverse modalities to be processed simultaneously while remaining continuously synchronized. Building on previous work, we enhanced our approach by incorporating weighted average fusion into the multithreaded system, so that the predicted emotion is the one with the highest fused probability score. Results: Our implementation demonstrated that the proposed model can recognize and predict user emotions in real time with improved accuracy. Conclusion: This technology has the potential to enrich user experiences and applications by enabling real-time understanding of, and response to, human emotions. Application: The proposed real-time multimodal emotion recognition system holds promise in various domains, including human-computer interaction, healthcare, and entertainment.
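
The multithreaded weighted-average-fusion idea described in the Method can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the modality names, the weights, and the per-modality probability vectors are hypothetical placeholders standing in for real per-modality classifiers.

```python
import threading
import queue

# Hypothetical modality weights and emotion labels for illustration only.
MODALITY_WEIGHTS = {"face": 0.5, "speech": 0.3, "eeg": 0.2}
EMOTIONS = ["happy", "sad", "angry", "neutral"]

def capture_and_predict(modality, out_q):
    """Runs on its own thread; a real system would invoke a
    per-modality model here. Dummy probabilities stand in."""
    dummy_probs = {
        "face":   [0.7, 0.1, 0.1, 0.1],
        "speech": [0.6, 0.2, 0.1, 0.1],
        "eeg":    [0.4, 0.3, 0.2, 0.1],
    }[modality]
    out_q.put((modality, dummy_probs))  # thread-safe hand-off

def fuse(out_q, n_modalities):
    """Weighted average fusion: combine per-modality probability
    vectors, then predict the emotion with the highest fused score."""
    fused = [0.0] * len(EMOTIONS)
    for _ in range(n_modalities):
        modality, probs = out_q.get()  # blocks until a thread reports
        w = MODALITY_WEIGHTS[modality]
        fused = [f + w * p for f, p in zip(fused, probs)]
    return EMOTIONS[max(range(len(fused)), key=fused.__getitem__)]

if __name__ == "__main__":
    q = queue.Queue()
    threads = [threading.Thread(target=capture_and_predict, args=(m, q))
               for m in MODALITY_WEIGHTS]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(fuse(q, len(threads)))  # -> happy
```

One thread per modality lets slow sensors (e.g. EEG) run concurrently with fast ones, while the shared queue keeps the fusion step synchronized with all producers, mirroring the continuous synchronization the Method describes.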