A new research collaboration is seeking to break down the electronic divide in telemental health with artificial intelligence (AI)-based tools. It brings together computer scientists from the University of Maryland, College Park and behavioral health clinical experts from the University of Maryland School of Medicine at the University of Maryland, Baltimore.
It has been estimated that around 15.5% of the global population suffers from mental illness, and this number is rising continuously. There is, however, a worldwide shortage of mental health providers. This shortage, combined with issues of affordability and reachability, has left more than 50% of mental health patients untreated. The mental health landscape became even bleaker during the COVID-19 pandemic. However, the rapid expansion of telemental health services, especially during the pandemic, has increased access to clinical care options and introduced the opportunity to use AI-based strategies to improve the quality of human-delivered mental health services.
Telemental health is the process of providing psychotherapy remotely, typically using HIPAA-compliant video conferencing. Because it relies heavily on technology, even experienced human therapists face challenges engaging with patients, owing to unfamiliarity with the setup as well as other factors. For instance, in a telemental health session the therapist has limited visual data (e.g., the therapist can see only the patient's face rather than their full body language), so the therapist has fewer non-verbal cues to guide their responses. It is also more difficult for the therapist to estimate attentiveness, since the eye contact of in-person sessions is replaced by the patient looking at a camera or screen. Video conferencing discussions may also feel more stilted, especially when the internet connection is poor or other technical problems arise.
Such challenges make it difficult for a therapist to perceive several of the patient's mental health indicators, such as engagement level, valence, and arousal. Our collaboration has developed a multimodal AI-based framework for modeling patient and caregiver engagement and affect. This automated engagement tool will give feedback to the provider in real time, ultimately enhancing provider engagement training and improving quality of care.


  1. We developed a novel framework that models telemental health session videos. Our algorithm takes into account the components of engagement defined in the psychology literature, namely affective and cognitive engagement. These components are incorporated as the modalities in our multimodal framework.
  2. We developed a novel regression-based framework that captures psychology-inspired cues in order to estimate the psychological indicators most useful to a psychotherapist, namely patient engagement, valence, and arousal. This work focuses only on the patient's mental health state, not the therapist's. The input to the proposed framework is the patient's visual, audio, and text data, and the output is the desired psychological indicator (engagement or valence-arousal).
  3. We released a new dataset, MEDICA (Multimodal Engagement Detection In Clinical Analysis), to advance mental health research, specifically towards understanding the engagement levels of patients attending therapy sessions. To the best of our knowledge, there is no other multimodal dataset that caters specifically to the needs of mental health research. Additionally, while there are some image-based or sensory information-based datasets, no existing dataset supports engagement detection using visual, audio, and text modalities together. MEDICA is a collection of around 1,299 short video clips obtained from mock mental health therapy sessions conducted between an actor (playing a patient) and a real therapist. Such mock sessions are used by medical schools in their psychiatry curriculum.
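
The input/output contract described in item 2 (per-clip visual, audio, and text features in, one continuous indicator out) can be illustrated with a minimal sketch. This is not the DeepTMH architecture, which is not specified on this page: it assumes simple feature concatenation followed by closed-form ridge regression, and the feature sizes, the synthetic data, and the function names (`fuse_modalities`, `fit_ridge`, `predict`) are all hypothetical.

```python
import numpy as np


def fuse_modalities(visual, audio, text):
    """Concatenate per-clip feature vectors from the three modalities."""
    return np.concatenate([visual, audio, text], axis=1)


def add_bias(features):
    """Append a constant column so the model can learn an offset."""
    return np.hstack([features, np.ones((features.shape[0], 1))])


def fit_ridge(features, targets, l2=1e-2):
    """Closed-form ridge regression; the last weight is the bias."""
    X = add_bias(features)
    penalty = l2 * np.eye(X.shape[1])
    penalty[-1, -1] = 0.0  # leave the bias term unpenalized
    return np.linalg.solve(X.T @ X + penalty, X.T @ targets)


def predict(features, weights):
    """Predict one continuous indicator (e.g. engagement) per clip."""
    return add_bias(features) @ weights


# Synthetic stand-in data: NOT drawn from MEDICA; sizes are arbitrary.
rng = np.random.default_rng(0)
n_clips = 200
visual = rng.normal(size=(n_clips, 8))  # assumed visual feature size
audio = rng.normal(size=(n_clips, 4))   # assumed audio feature size
text = rng.normal(size=(n_clips, 6))    # assumed text feature size

fused = fuse_modalities(visual, audio, text)
true_w = rng.normal(size=fused.shape[1])
engagement = fused @ true_w + 0.05 * rng.normal(size=n_clips)

# Fit on the first 150 clips, evaluate on the remaining 50.
weights = fit_ridge(fused[:150], engagement[:150])
pred = predict(fused[150:], weights)
mse = float(np.mean((pred - engagement[150:]) ** 2))
print(f"held-out MSE: {mse:.4f}")
```

Predicting valence or arousal instead of engagement would, under the same assumptions, only change the target vector passed to `fit_ridge`.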

Media Coverage

Technical Report

title={DeepTMH: Multimodal Semi-supervised framework leveraging Affective and Cognitive engagement for Telemental Health},
author={Guhan, Pooja and Awasthi, Naman and Das, Ritwika and Agarwal, Manas and McDonald, Kathryn and Bussell, Kristin and Manocha, Dinesh and Reeves, Gloria and Bera, Aniket},


  • State of Maryland: MPower Grant 2020 (Developing an Artificial Intelligence Tool to improve Caregiver Engagement for Rural Child Behavioral Health Services)
  • Maryland Department of Health (The Resilience Project: Embodied Virtual Reality (VR) Agent Research to measure Adaptive Stress Response for individuals in a high-risk occupation)