DC Apr 28
Performance and Energy Trade-Off Analysis of Hierarchical Federated Learning for Plant Disease Classification
Athanasios Papanikolaou, Athanasios Tziouvaras, Pavlos Stoikos et al.
Early detection of plant diseases is critical for improving crop productivity and underpins precision agriculture. Recent advances in distributed deep learning have enabled plant disease classification models to be trained across geographically distributed agricultural sensing infrastructures. However, deploying such systems in large-scale Internet of Things (IoT) environments introduces significant challenges related to computational cost, energy consumption, and system efficiency. In this paper, we present a design-space exploration of hierarchical federated learning architectures for plant disease classification, with a particular focus on the trade-offs between predictive performance and energy efficiency. We further introduce a power- and energy-aware optimization framework that enables the systematic evaluation and selection of model-aggregator configurations under varying deployment constraints. The hierarchical federated architecture organizes distributed clients through intermediate aggregation layers, reducing communication and computational overhead. We evaluate multiple convolutional neural network architectures, including EfficientNet-B0, ResNet-50, and MobileNetV3-Large, in combination with different federated aggregation strategies such as FedAvg, FedProx, and FedAvgM. Experimental results demonstrate that different model-aggregator combinations exhibit distinct performance-energy trade-offs. Consequently, we highlight configurations that achieve competitive diagnostic accuracy while significantly reducing system resource requirements.
RO May 8, 2024
GISR: Geometric Initialization and Silhouette-based Refinement for Single-View Robot Pose and Configuration Estimation
Ivan Bilić, Filip Marić, Fabio Bonsignorio et al.
In autonomous robotics, measurement of the robot's internal state and perception of its environment, including interaction with other agents such as collaborative robots, are essential. Estimating the pose of the robot arm from a single view has the potential to replace classical eye-to-hand calibration approaches and is particularly attractive for online estimation and dynamic environments. In addition to its pose, recovering the robot configuration provides a complete spatial understanding of the observed robot that can be used to anticipate the actions of other agents in advanced robotics use cases. Furthermore, this additional redundancy enables the planning and execution of recovery protocols in case of sensor failures or external disturbances. We introduce GISR, a deep configuration and robot-to-camera pose estimation method that prioritizes real-time execution. GISR consists of two modules: (i) a geometric initialization module that efficiently computes an approximate robot pose and configuration, and (ii) a deep iterative silhouette-based refinement module that arrives at a final solution in just a few iterations. We evaluate GISR on publicly available data and show that it outperforms existing methods of the same class in terms of both speed and accuracy, and can compete with approaches that rely on ground-truth proprioception and recover only the pose.
CV May 6, 2025
An Active Inference Model of Covert and Overt Visual Attention
Tin Mišić, Karlo Koledić, Fabio Bonsignorio et al.
The ability to selectively attend to relevant stimuli while filtering out distractions is essential for agents that process complex, high-dimensional sensory input. This paper introduces a model of covert and overt visual attention through the framework of active inference, utilizing dynamic optimization of sensory precisions to minimize free energy. The model determines visual sensory precisions based on both current environmental beliefs and sensory input, influencing attentional allocation in both covert and overt modalities. To test the effectiveness of the model, we analyze its behavior in the Posner cueing task and a simple target focus task using two-dimensional (2D) visual data. Reaction times are measured to investigate the interplay between exogenous and endogenous attention, as well as valid and invalid cueing. The results show that exogenous and valid cues generally lead to faster reaction times than endogenous and invalid cues. Furthermore, the model exhibits behavior similar to inhibition of return, where previously attended locations become suppressed after a specific cue-target onset asynchrony interval. Lastly, we investigate different aspects of overt attention and show that involuntary, reflexive saccades occur faster than intentional ones, but at the expense of adaptability.