Sparse Coding–Driven Deep Learning for Robust Emotional State Recognition from Multichannel Speech

Authors

  • Ch. Suneetha, Assistant Professor, Department of IT, Anil Neerukonda Institute of Technology & Science (A), Visakhapatnam, Andhra Pradesh, India
  • Vijay Keerthika, Assistant Professor, Department of CSE (AI & ML), MLR Institute of Technology, Dundigal, Hyderabad, India
  • M. Harshini, Assistant Professor, Department of IT, MLR Institute of Technology, Dundigal, Hyderabad, India

Keywords:

Sparse coding, Deep learning, Neural network, CNN, KT, Emotion recognition

Abstract

Human emotions can be inferred from a person's face, words, actions (gesture or posture), or even heart rate. Recent advances in machine learning and data fusion make it possible to equip computers with the ability to comprehend, identify, and evaluate human sentiment. Emotional state recognition and stress disorder diagnosis from speech signals have both been active concerns over the past decade. Emotion recognition based on multichannel neurophysiological inputs is a difficult pattern recognition problem, and it is increasingly useful as a computer-aided method for identifying emotional disorders. Conventional fusion techniques underuse the correlation information between channels and frequency components. This paper shows that deep neural networks trained on emotion data can align with prior domain knowledge and learn representations that are more accurate than those obtained with hand-crafted techniques. The work focuses on emotional state identification and develops the proposed Sparse Coding Technique-Deep Learning (SCT-DL) network models, which rest on two methods. The first is the Convolutional-Recurrent Neural Network (CR-NN), a deep learning model that extracts task-related characteristics, captures correlated data between channels, and incorporates the contextual information gained from this analysis. Because of the complexity of deep belief networks, limited data sets such as the voice database are not well suited to this type of model. The second method, Knowledge Transmission (KT), is therefore introduced to deal with the issue of limited data; its purpose is to enhance learning by drawing information from multiple source tasks and applying it to a single target task. Statistical and experimental results indicate that the proposed models are more effective than most current state-of-the-art techniques for recognizing emotional states.
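
The abstract does not say how the sparse coding step is formulated. Purely as an illustration of what "sparse coding" commonly means, the sketch below encodes a feature vector x as a sparse combination of dictionary atoms by approximately minimising 0.5*||x - D a||^2 + lam*||a||_1 with the iterative shrinkage-thresholding algorithm (ISTA). The function names, the dictionary D, the penalty lam, and the step-size rule are all assumptions made for this example and are not taken from the paper.

```python
import math


def soft_threshold(v, t):
    """Shrink each entry of v toward zero by t (proximal operator of the L1 norm)."""
    out = []
    for x in v:
        mag = abs(x) - t
        out.append(math.copysign(mag, x) if mag > 0 else 0.0)
    return out


def matvec(D, a):
    """Multiply matrix D (a list of rows) by vector a."""
    return [sum(d * c for d, c in zip(row, a)) for row in D]


def mat_t_vec(D, r):
    """Multiply the transpose of D by vector r."""
    return [sum(D[i][j] * r[i] for i in range(len(D))) for j in range(len(D[0]))]


def ista(D, x, lam=0.1, n_iter=200):
    """Hypothetical helper: sparse code of x over dictionary D via ISTA.

    D has one row per signal dimension and one column per atom.
    """
    # The squared Frobenius norm is an upper bound on the Lipschitz constant
    # of the gradient, so its inverse is a safe (if conservative) step size.
    step = 1.0 / sum(v * v for row in D for v in row)
    a = [0.0] * len(D[0])
    for _ in range(n_iter):
        residual = [p - q for p, q in zip(matvec(D, a), x)]
        grad = mat_t_vec(D, residual)
        a = soft_threshold([c - step * g for c, g in zip(a, grad)], step * lam)
    return a


# Toy usage: with an identity dictionary the sparse code is simply the
# soft-thresholded input, so the small middle entry is driven to zero.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
code = ista(identity, [1.0, 0.05, -2.0], lam=0.1)
```

In a pipeline of the kind the abstract outlines, codes like these could serve as inputs to a downstream network, but that wiring is an assumption here and is not described on this page.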

Published

2025-12-31

How to Cite

Ch.Suneetha, Vijay Keerthika, & M. Harshini. (2025). Sparse Coding–Driven Deep Learning for Robust Emotional State Recognition from Multichannel Speech. Synthesis: A Multidisciplinary Research Journal, 3(4), 9-20. https://www.macawpublications.com/Journals/index.php/SMRJ/article/view/212
