Valentina Bono

CV | Jun 26

HumanMoveVQA: Can Video MLLMs reason about human movement in videos?

Pulkit Gera, Faegheh Sardari, Asmar Nadeem et al.

Despite the rapid advance of Multimodal Large Language Models (MLLMs) in high-level video understanding, a fundamental bottleneck remains: these models collapse complex human motion into coarse semantic labels. Existing benchmarks mostly focus on scene-centric events or local joint articulations, failing to probe global human motion in space over time (trajectory and orientation changes). We introduce HumanMoveVQA, the first comprehensive benchmark designed to evaluate global trajectory and orientation reasoning from an exocentric perspective. Our benchmark utilizes a first-frame anchored world coordinate system, preserving translation and rotation relative to a fixed starting point. We propose a scalable, multi-stage pipeline that lifts 2D video observations into world-consistent 3D motion tracks to generate over 10K structured question-answer pairs across seven reasoning categories, including motion aggregation, sequential ordering, and trajectory-level inference. Our extensive evaluation reveals a critical capability gap in state-of-the-art proprietary models on deep human motion understanding. However, we demonstrate that this is a learnable problem; by fine-tuning an open-source baseline with our targeted, world-consistent supervision, we achieve a significant improvement. HumanMoveVQA establishes a rigorous geometric foundation for developing next-generation, movement-aware video understanding models.
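
As a rough illustration of what a first-frame anchored coordinate system can mean, the sketch below re-expresses a ground-plane track relative to its first pose, so that later translation and rotation are measured from a fixed starting point. This is not the authors' pipeline: the function name `anchor_to_first_frame`, the 2D `(x, y, heading)` pose format, and the convention that the anchored x-axis points along the initial heading are all assumptions made for this example.

```python
import math

def anchor_to_first_frame(track):
    """Re-express a ground-plane track relative to its first pose.

    track: list of (x, y, heading) tuples, heading in radians.
    Hypothetical format, chosen only for this illustration.
    Returns a list in which the first pose becomes (0, 0, 0).
    """
    x0, y0, h0 = track[0]
    # Rotating by -h0 aligns the anchored x-axis with the initial heading.
    cos0, sin0 = math.cos(-h0), math.sin(-h0)
    anchored = []
    for x, y, h in track:
        dx, dy = x - x0, y - y0          # translation relative to the start
        ax = cos0 * dx - sin0 * dy       # displacement along initial heading
        ay = sin0 * dx + cos0 * dy       # displacement to the left of it
        # Relative heading, wrapped into [-pi, pi].
        rel = math.atan2(math.sin(h - h0), math.cos(h - h0))
        anchored.append((ax, ay, rel))
    return anchored

# A person starts facing +y, walks 2 units forward, then turns left
# and walks 2 more units.
demo = anchor_to_first_frame([
    (2.0, 3.0, math.pi / 2),
    (2.0, 5.0, math.pi / 2),
    (0.0, 5.0, math.pi),
])
```

In the anchored frame the second pose sits 2 units straight ahead of the start, and the third pose is 2 units ahead and 2 units to the left with a relative heading of a quarter turn, which is the kind of start-relative quantity that trajectory and orientation questions can be posed against.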

1.2 | MED-PH | Oct 20, 2014

Artifact reduction in multichannel pervasive EEG using hybrid WPT-ICA and WPT-EMD signal decomposition techniques

Valentina Bono, Wasifa Jamal, Saptarshi Das et al.

To reduce muscle artifacts in multi-channel pervasive electroencephalogram (EEG) signals, we propose and compare two hybrid algorithms that combine the wavelet packet transform (WPT) with either empirical mode decomposition (EMD) or independent component analysis (ICA). The signal-cleaning performance of the WPT-EMD and WPT-ICA algorithms is compared using a signal-to-noise ratio (SNR)-like criterion for artifacts. The algorithms are tested on multiple trials of four artifact cases: eye-blinking, and muscle artifacts comprising left-hand movement, right-hand movement, and head-shaking.

2 Papers