CVFeb 4
SalFormer360: a transformer-based saliency estimation model for 360-degree videosMahmoud Z. A. Wahba, Francesco Barbato, Sara Baldoni et al.
Saliency estimation has received growing attention in recent years due to its importance in a wide range of applications. In the context of 360-degree video, it has been particularly valuable for tasks such as viewport prediction and immersive content optimization. In this paper, we propose SalFormer360, a novel saliency estimation model for 360-degree videos built on a transformer-based architecture. Our approach is based on the combination of an existing encoder architecture, SegFormer, and a custom decoder. The SegFormer model was originally developed for 2D segmentation tasks, and it has been fine-tuned to adapt it to 360-degree content. To further enhance prediction accuracy in our model, we incorporated Viewing Center Bias to reflect user attention in 360-degree environments. Extensive experiments on the three largest benchmark datasets for saliency estimation demonstrate that SalFormer360 outperforms existing state-of-the-art methods. In terms of Pearson Correlation Coefficient, our model achieves 8.4% higher performance on Sport360, 2.5% on PVS-HM, and 18.6% on VR-EyeTracking compared to previous state-of-the-art.
CVMay 22, 2025Code
Unsupervised Network Anomaly Detection with Autoencoders and Traffic ImagesMichael Neri, Sara Baldoni
Due to the recent increase in the number of connected devices, the need to promptly detect security issues is emerging. Moreover, the high number of communication flows creates the necessity of processing huge amounts of data. Furthermore, the connected devices are heterogeneous in nature, having different computational capacities. For this reason, in this work we propose an image-based representation of network traffic which allows to realize a compact summary of the current network conditions with 1-second time windows. The proposed representation highlights the presence of anomalies thus reducing the need for complex processing architectures. Finally, we present an unsupervised learning approach which effectively detects the presence of anomalies. The code and the dataset are available at https://github.com/michaelneri/image-based-network-traffic-anomaly-detection.
CVSep 15, 2025
Sphere-GAN: a GAN-based Approach for Saliency Estimation in 360° VideosMahmoud Z. A. Wahba, Sara Baldoni, Federica Battisti
The recent success of immersive applications is pushing the research community to define new approaches to process 360° images and videos and optimize their transmission. Among these, saliency estimation provides a powerful tool that can be used to identify visually relevant areas and, consequently, adapt processing algorithms. Although saliency estimation has been widely investigated for 2D content, very few algorithms have been proposed for 360° saliency estimation. Towards this goal, we introduce Sphere-GAN, a saliency detection model for 360° videos that leverages a Generative Adversarial Network with spherical convolutions. Extensive experiments were conducted using a public 360° video saliency dataset, and the results demonstrate that Sphere-GAN outperforms state-of-the-art models in accurately predicting saliency maps.