Shaun Canavan

h-index16

13papers

26citations

Novelty27%

AI Score34

Ranked #111,397 of 194,257 authors (top 57%)#37,214 in CV (top 63%)

13 Papers

5.1CVApr 2

Beyond the Fold: Quantifying Split-Level Noise and the Case for Leave-One-Dataset-Out AU Evaluation

Saurabh Hinduja, Gurmeet Kaur, Maneesh Bilalpur et al.

Subject-exclusive cross-validation is the standard evaluation protocol for facial Action Unit (AU) detection, yet reported improvements are often small. We show that cross-validation itself introduces measurable stochastic variance. On BP4D+, repeated 3-fold subject-exclusive splits produce an empirical noise floor of $\pm 0.065$ in average F1, with substantially larger variation for low-prevalence AUs. Operating-point metrics such as F1 fluctuate more than threshold-independent measures such as AUC, and model ranking can change under different fold assignments. We further evaluate cross-dataset robustness using a Leave-One-Dataset-Out (LODO) protocol across five AU datasets. LODO removes partition randomness and exposes domain-level instability that is not visible under single-dataset cross-validation. Together, these results suggest that gains often reported in cross-fold validation may fall within protocol variance. Leave-one-dataset-out cross-validation yields more stable and interpretable findings

5.2CVJul 3, 2024Code

FlowCon: Out-of-Distribution Detection using Flow-Based Contrastive Learning

Saandeep Aathreya, Shaun Canavan

Identifying Out-of-distribution (OOD) data is becoming increasingly critical as the real-world applications of deep learning methods expand. Post-hoc methods modify softmax scores fine-tuned on outlier data or leverage intermediate feature layers to identify distinctive patterns between In-Distribution (ID) and OOD samples. Other methods focus on employing diverse OOD samples to learn discrepancies between ID and OOD. These techniques, however, are typically dependent on the quality of the outlier samples assumed. Density-based methods explicitly model class-conditioned distributions but this requires long training time or retraining the classifier. To tackle these issues, we introduce \textit{FlowCon}, a new density-based OOD detection technique. Our main innovation lies in efficiently combining the properties of normalizing flow with supervised contrastive learning, ensuring robust representation learning with tractable density estimation. Empirical evaluation shows the enhanced performance of our method across common vision datasets such as CIFAR-10 and CIFAR-100 pretrained on ResNet18 and WideResNet classifiers. We also perform quantitative analysis using likelihood plots and qualitative visualization using UMAP embeddings and demonstrate the robustness of the proposed method under various OOD contexts. Code will be open-sourced post decision.

1.4CVMar 24, 2022

Random Forest Regression for continuous affect using Facial Action Units

Saurabh Hinduja, Shaun Canavan, Liza Jivnani et al.

In this paper we describe our approach to the arousal and valence track of the 3rd Workshop and Competition on Affective Behavior Analysis in-the-wild (ABAW). We extracted facial features using OpenFace and used them to train a multiple output random forest regressor. Our approach performed comparable to the baseline approach.

1.4CVOct 5, 2021

Quantified Facial Expressiveness for Affective Behavior Analytics

Md Taufeeq Uddin, Shaun Canavan

The quantified measurement of facial expressiveness is crucial to analyze human affective behavior at scale. Unfortunately, methods for expressiveness quantification at the video frame-level are largely unexplored, unlike the study of discrete expression. In this work, we propose an algorithm that quantifies facial expressiveness using a bounded, continuous expressiveness score using multimodal facial features, such as action units (AUs), landmarks, head pose, and gaze. The proposed algorithm more heavily weights AUs with high intensities and large temporal changes. The proposed algorithm can compute the expressiveness in terms of discrete expression, and can be used to perform tasks including facial behavior tracking and subjectivity quantification in context. Our results on benchmark datasets show the proposed algorithm is effective in terms of capturing temporal changes and expressiveness, measuring subjective differences in context, and extracting useful insight.

10.4HCJul 19, 2021Code

AffectiveTDA: Using Topological Data Analysis to Improve Analysis and Explainability in Affective Computing

Hamza Elhamdadi, Shaun Canavan, Paul Rosen

We present an approach utilizing Topological Data Analysis to study the structure of face poses used in affective computing, i.e., the process of recognizing human emotion. The approach uses a conditional comparison of different emotions, both respective and irrespective of time, with multiple topological distance metrics, dimension reduction techniques, and face subsections (e.g., eyes, nose, mouth, etc.). The results confirm that our topology-based approach captures known patterns, distinctions between emotions, and distinctions between individuals, which is an important step towards more robust and explainable emotion recognition by machines.

1.2CVNov 15, 2020

Accounting for Affect in Pain Level Recognition

Md Taufeeq Uddin, Shaun Canavan, Ghada Zamzmi

In this work, we address the importance of affect in automated pain assessment and the implications in real-world settings. To achieve this, we curate a new physiological dataset by merging the publicly available bioVid pain and emotion datasets. We then investigate pain level recognition on this dataset simulating participants' naturalistic affective behaviors. Our findings demonstrate that acknowledging affect in pain assessment is essential. We observe degradation in recognition performance when simulating the existence of affect to validate pain assessment models that do not account for it. Conversely, we observe a performance boost in recognition when we account for affect.

3.3CVOct 28, 2020

Quantified Facial Temporal-Expressiveness Dynamics for Affect Analysis

Md Taufeeq Uddin, Shaun Canavan

The quantification of visual affect data (e.g. face images) is essential to build and monitor automated affect modeling systems efficiently. Considering this, this work proposes quantified facial Temporal-expressiveness Dynamics (TED) to quantify the expressiveness of human faces. The proposed algorithm leverages multimodal facial features by incorporating static and dynamic information to enable accurate measurements of facial expressiveness. We show that TED can be used for high-level tasks such as summarization of unstructured visual data, and expectation from and interpretation of automated affect recognition models. To evaluate the positive impact of using TED, a case study was conducted on spontaneous pain using the UNBC-McMaster spontaneous shoulder pain dataset. Experimental results show the efficacy of using TED for quantified affect analysis.

2.3CVOct 15, 2020

Impact of Action Unit Occurrence Patterns on Detection

Saurabh Hinduja, Shaun Canavan, Saandeep Aathreya

Detecting action units is an important task in face analysis, especially in facial expression recognition. This is due, in part, to the idea that expressions can be decomposed into multiple action units. In this paper we investigate the impact of action unit occurrence patterns on detection of action units. To facilitate this investigation, we review state of the art literature, for AU detection, on 2 state-of-the-art face databases that are commonly used for this task, namely DISFA, and BP4D. Our findings, from this literature review, suggest that action unit occurrence patterns strongly impact evaluation metrics (e.g. F1-binary). Along with the literature review, we also conduct multi and single action unit detection, as well as propose a new approach to explicitly train deep neural networks using the occurrence patterns to boost the accuracy of action unit detection. These experiments validate that action unit patterns directly impact the evaluation metrics.

2.3CVMay 17, 2020

Detecting Forged Facial Videos using convolutional neural network

Neilesh Sambhu, Shaun Canavan

In this paper, we propose to detect forged videos, of faces, in online videos. To facilitate this detection, we propose to use smaller (fewer parameters to learn) convolutional neural networks (CNN), for a data-driven approach to forged video detection. To validate our approach, we investigate the FaceForensics public dataset detailing both frame-based and video-based results. The proposed method is shown to outperform current state of the art. We also perform an ablation study, analyzing the impact of batch size, number of filters, and number of network layers on the accuracy of detecting forged videos.

4.2CVMay 17, 2020

Facial Action Unit Detection using 3D Facial Landmarks

Saurabh Hinduja, Shaun Canavan

In this paper, we propose to detect facial action units (AU) using 3D facial landmarks. Specifically, we train a 2D convolutional neural network (CNN) on 3D facial landmarks, tracked using a shape index-based statistical shape model, for binary and multi-class AU detection. We show that the proposed approach is able to accurately model AU occurrences, as the movement of the facial landmarks corresponds directly to the movement of the AUs. By training a CNN on 3D landmarks, we can achieve accurate AU detection on two state-of-the-art emotion datasets, namely BP4D and BP4D+. Using the proposed method, we detect multiple AUs on over 330,000 frames, reporting improved results over state-of-the-art methods.

2.3CVMay 17, 2020

Impact of multiple modalities on emotion recognition: investigation into 3d facial landmarks, action units, and physiological data

Diego Fabiano, Manikandan Jaishanker, Shaun Canavan

To fully understand the complexities of human emotion, the integration of multiple physical features from different modalities can be advantageous. Considering this, we present an analysis of 3D facial data, action units, and physiological data as it relates to their impact on emotion recognition. We analyze each modality independently, as well as the fusion of each for recognizing human emotion. This analysis includes which features are most important for specific emotions (e.g. happy). Our analysis indicates that both 3D facial landmarks and physiological data are encouraging for expression/emotion recognition. On the other hand, while action units can positively impact emotion recognition when fused with other modalities, the results suggest it is difficult to detect emotion using them in a unimodal fashion.

3.3CVMay 17, 2020

Subject Identification Across Large Expression Variations Using 3D Facial Landmarks

Sk Rahatul Jannat, Diego Fabiano, Shaun Canavan et al.

Landmark localization is an important first step towards geometric based vision research including subject identification. Considering this, we propose to use 3D facial landmarks for the task of subject identification, over a range of expressed emotion. Landmarks are detected, using a Temporal Deformable Shape Model and used to train a Support Vector Machine (SVM), Random Forest (RF), and Long Short-term Memory (LSTM) neural network for subject identification. As we are interested in subject identification with large variations in expression, we conducted experiments on 3 emotion-based databases, namely the BU-4DFE, BP4D, and BP4D+ 3D/4D face databases. We show that our proposed method outperforms current state of the art methods for subject identification on BU-4DFE and BP4D. To the best of our knowledge, this is the first work to investigate subject identification on the BP4D+, resulting in a baseline for the community.

3.1HCJun 27, 2019

Studying the Impact of Mood on Identifying Smartphone Users

Khadija Zanna, Sayde King, Tempestt Neal et al.

This paper explores the identification of smartphone users when certain samples collected while the subject felt happy, upset or stressed were absent or present. We employ data from 19 subjects using the StudentLife dataset, a dataset collected by researchers at Dartmouth College that was originally collected to correlate behaviors characterized by smartphone usage patterns with changes in stress and academic performance. Although many previous works on behavioral biometrics have implied that mood is a source of intra-person variation which may impact biometric performance, our results contradict this assumption. Our findings show that performance worsens when removing samples that were generated when subjects may be happy, upset, or stressed. Thus, there is no indication that mood negatively impacts performance. However, we do find that changes existing in smartphone usage patterns may correlate with mood, including changes in locking, audio, location, calling, homescreen, and e-mail habits. Thus, we show that while mood is a source of intra-person variation, it may be an inaccurate assumption that biometric systems (particularly, mobile biometrics) are likely influenced by mood.