
Abstract

Depression afflicts hundreds of millions of people and contributes substantially to the global disability and healthcare burden. Diagnosis still relies primarily on the judgment of medical professionals during clinical interviews, which is subjective and time-consuming. Recent studies have demonstrated that text, audio, facial attributes, heart rate, and eye movement can be used for depression-level stratification. In this paper, we construct a virtual assistant for automatic depression-level stratification on mobile devices that actively guides users through voice dialogue and adapts the conversation content based on emotion perception. During the conversation, features from text, audio, facial attributes, heart rate, and eye movement are extracted for multi-modal depression-level stratification. We use a feature-level fusion framework to integrate the five modalities and a deep neural network to classify depression level into five categories: healthy, mild, moderate, or severe depression, and bipolar disorder (formerly called manic depression). Experimental results on data from 168 subjects show that feature-level fusion of all five modalities achieves the highest accuracy, 90.26 percent.
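The following is a minimal illustrative sketch of the feature-level fusion idea described in the abstract: per-modality feature vectors are concatenated and passed to a small neural network that predicts one of the five classes. The feature dimensions, layer sizes, and use of PyTorch are assumptions for illustration only and are not taken from the paper.

```python
import torch
import torch.nn as nn

# Hypothetical per-modality feature dimensions; the paper does not specify them here.
MODALITY_DIMS = {"text": 128, "audio": 128, "face": 128, "heart_rate": 32, "eye": 32}
NUM_CLASSES = 5  # healthy, mild, moderate, severe depression, bipolar disorder


class FeatureLevelFusionClassifier(nn.Module):
    """Concatenates per-modality features and classifies the depression level."""

    def __init__(self, modality_dims=MODALITY_DIMS, num_classes=NUM_CLASSES):
        super().__init__()
        fused_dim = sum(modality_dims.values())
        # A small fully connected network stands in for the paper's deep neural network.
        self.classifier = nn.Sequential(
            nn.Linear(fused_dim, 256),
            nn.ReLU(),
            nn.Dropout(0.3),
            nn.Linear(256, num_classes),
        )

    def forward(self, features: dict) -> torch.Tensor:
        # Feature-level fusion: concatenate modality vectors along the feature axis.
        fused = torch.cat([features[name] for name in MODALITY_DIMS], dim=-1)
        return self.classifier(fused)


# Example with random features for a batch of 4 subjects.
batch = {name: torch.randn(4, dim) for name, dim in MODALITY_DIMS.items()}
logits = FeatureLevelFusionClassifier()(batch)
print(logits.argmax(dim=-1))  # predicted depression-level class per subject
```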

Original language: English
Pages (from-to): 1-14
Number of pages: 14
Journal: IEEE Transactions on Affective Computing
DOIs
State: Accepted/In press - 2024

Keywords

  • Biomedical monitoring
  • Convolutional neural networks
  • Depression
  • Face recognition
  • Feature extraction
  • Virtual assistants
  • Visualization
  • depression recognition
  • multi-modal fusion
  • virtual human

