A Full Transformer-based Framework for Automatic Pain Estimation using Videos
This work targets reliable automatic pain assessment in support of patient pain management, though it appears incremental because it builds on existing transformer architectures.
The authors tackled the problem of automatic pain estimation from videos by proposing a full transformer-based framework, achieving state-of-the-art performance across primary pain estimation tasks as demonstrated on the BioVid database.
The automatic estimation of pain is essential in designing an optimal pain management system that offers reliable assessment and reduces the suffering of patients. In this study, we present a novel full transformer-based framework consisting of a Transformer in Transformer (TNT) model and a Transformer leveraging cross-attention and self-attention blocks. Using videos from the BioVid database, we demonstrate state-of-the-art performance, showing the efficacy, efficiency, and generalization capability of the framework across all the primary pain estimation tasks.
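To make the pairing of cross-attention and self-attention blocks concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes that a small set of latent vectors acts as queries over per-frame embeddings (cross-attention) and that the updated latents then attend to each other (self-attention). The names `attention`, `cross_then_self`, `latents`, and `frames` are hypothetical, and learned projections, residual connections, and normalization are omitted for brevity.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over plain lists of vectors."""
    d = len(keys[0])
    out = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out


def cross_then_self(latents, frame_embeddings):
    """Assumed ordering: one cross-attention step, then one self-attention step."""
    # Cross-attention: latent queries attend to the per-frame embeddings.
    x = attention(latents, frame_embeddings, frame_embeddings)
    # Self-attention: the updated latents attend to each other.
    return attention(x, x, x)


# Hypothetical toy input: 4 frame embeddings of dimension 3, 2 latent vectors.
frames = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 1.0, 0.0]]
latents = [[0.5, 0.5, 0.0], [0.0, 0.0, 1.0]]
video_repr = cross_then_self(latents, frames)
```

In this sketch the output has one vector per latent, so its size is independent of the number of frames; in a full model the frame embeddings would come from a spatial encoder and a task head would follow.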