TY - JOUR
T1 - Mobile Virtual Assistant for Multi-Modal Depression-Level Stratification
AU - Wu, Eric Hsiao Kuang
AU - Gao, Ting Yu
AU - Chung, Chia Ru
AU - Chen, Chun Chuan
AU - Tsai, Chia Fen
AU - Yeh, Shih Ching
N1 - Publisher Copyright:
IEEE
PY - 2024
Y1 - 2024
N2 - Depression not only afflicts hundreds of millions of people but also contributes to the global disability and healthcare burden. The primary method of diagnosing depression relies on the judgment of medical professionals in clinical interviews with patients, which is subjective and time-consuming. Recent studies have demonstrated that text, audio, facial attributes, heart rate, and eye movement can be utilized for depression-level stratification. In this paper, we construct a virtual assistant for automatic depression-level stratification on mobile devices that actively guides users through voice dialogue and adapts conversation content using emotion perception. During the conversation, features from text, audio, facial attributes, heart rate, and eye movement are extracted for multi-modal depression-level stratification. We utilize a feature-level fusion framework to integrate the five modalities and a deep neural network to classify the levels of depression, which include healthy, mild, moderate, and severe depression, as well as bipolar disorder (formerly called manic depression). With data from 168 subjects, experimental results reveal that feature-level fusion of all five modal features achieves the highest overall accuracy of 90.26 percent.
AB - Depression not only afflicts hundreds of millions of people but also contributes to the global disability and healthcare burden. The primary method of diagnosing depression relies on the judgment of medical professionals in clinical interviews with patients, which is subjective and time-consuming. Recent studies have demonstrated that text, audio, facial attributes, heart rate, and eye movement can be utilized for depression-level stratification. In this paper, we construct a virtual assistant for automatic depression-level stratification on mobile devices that actively guides users through voice dialogue and adapts conversation content using emotion perception. During the conversation, features from text, audio, facial attributes, heart rate, and eye movement are extracted for multi-modal depression-level stratification. We utilize a feature-level fusion framework to integrate the five modalities and a deep neural network to classify the levels of depression, which include healthy, mild, moderate, and severe depression, as well as bipolar disorder (formerly called manic depression). With data from 168 subjects, experimental results reveal that feature-level fusion of all five modal features achieves the highest overall accuracy of 90.26 percent.
KW - Biomedical monitoring
KW - Convolutional neural networks
KW - Depression
KW - Face recognition
KW - Feature extraction
KW - Virtual assistants
KW - Visualization
KW - depression recognition
KW - multi-modal fusion
KW - virtual human
UR - http://www.scopus.com/inward/record.url?scp=85202780284&partnerID=8YFLogxK
U2 - 10.1109/TAFFC.2024.3451114
DO - 10.1109/TAFFC.2024.3451114
M3 - Journal article
AN - SCOPUS:85202780284
SN - 1949-3045
SP - 1
EP - 14
JO - IEEE Transactions on Affective Computing
JF - IEEE Transactions on Affective Computing
ER -