Challenges and Opportunities for Machine Learning Classification of Behavior and Mental State from Images
This work identifies key problems for researchers and practitioners in psychiatry and the behavioral sciences who aim to use imaging for diagnostics. Its contribution is incremental, however, because it reviews existing challenges without presenting new solutions.
The paper addresses the challenges of developing computer vision classifiers that detect nonverbal social behavior and mental states from images, highlighting issues such as data heterogeneity, bias, and privacy, and discusses the opportunities offered by current research methods such as data augmentation and federated learning.
Computer Vision (CV) classifiers that distinguish and detect nonverbal human social behavior and mental state can aid digital diagnostics and therapeutics for psychiatry and the behavioral sciences. While CV classifiers for traditional, structured classification tasks can be developed with standard supervised machine learning pipelines consisting of data labeling, preprocessing, and training a convolutional neural network, several pain points arise when attempting this process for behavioral phenotyping. Here, we discuss the challenges and corresponding opportunities in this space, including handling heterogeneous data, avoiding biased models, labeling massive and repetitive data sets, working with ambiguous or compound class labels, managing privacy concerns, creating appropriate representations, and personalizing models. We discuss current state-of-the-art research endeavors in CV such as data curation, data augmentation, crowdsourced labeling, active learning, reinforcement learning, generative models, representation learning, federated learning, and meta-learning. We highlight at least some of the machine learning advancements needed for imaging classifiers to detect human social cues successfully and reliably.
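To make the federated learning opportunity mentioned above more concrete, the following is a minimal sketch of the aggregation step commonly used in federated averaging: each site trains on its own images and shares only model parameters, which a server combines as an average weighted by local sample counts. This is not the paper's own method. The function name `federated_average`, the flat-list parameter representation, and the example numbers are all hypothetical and chosen for illustration only.

```python
from typing import List, Sequence


def federated_average(
    client_weights: Sequence[Sequence[float]],
    client_sizes: Sequence[int],
) -> List[float]:
    """Combine per-client parameter vectors into one global vector.

    Hypothetical helper: each client contributes a flat list of parameters
    and the number of local training examples it used. Raw images never
    leave the client; only these parameters are shared.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client, and at least one client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    dim = len(client_weights[0])
    averaged = [0.0] * dim
    for weights, n in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all clients must share the same parameter shape")
        # Each client's contribution is proportional to its share of the data.
        for i, w in enumerate(weights):
            averaged[i] += w * n / total
    return averaged


# Illustrative (made-up) values: two clinics with 1 and 3 local examples.
global_params = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

In a real system the aggregation would run over full network tensors across many training rounds, and would typically be paired with further privacy safeguards; the sketch shows only the weighting logic.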