LGApr 16, 2022
Approaching sales forecasting using recurrent neural networks and transformersIván Vallés-Pérez, Emilio Soria-Olivas, Marcelino Martínez-Sober et al.
Accurate and fast demand forecast is one of the hot topics in supply chain for enabling the precise execution of the corresponding downstream processes (inbound and outbound planning, inventory placement, network planning, etc). We develop three alternatives to tackle the problem of forecasting the customer sales at day/store/item level using deep learning techniques and the Corporación Favorita data set, published as part of a Kaggle competition. Our empirical results show how good performance can be achieved by using a simple sequence to sequence architecture with minimal data preprocessing effort. Additionally, we describe a training trick for making the model more time independent and hence improving generalization over time. The proposed solution achieves a RMSLE of around 0.54, which is competitive with other more specific solutions to the problem proposed in the Kaggle competition.
CVDec 8, 2023
PixLore: A Dataset-driven Approach to Rich Image CaptioningDiego Bonilla-Salvador, Marcelino Martínez-Sober, Joan Vila-Francés et al.
In the domain of vision-language integration, generating detailed image captions poses a significant challenge due to the lack of curated and rich datasets. This study introduces PixLore, a novel method that leverages Querying Transformers through the fine-tuning of the BLIP-2 model using the LoRa method on a standard commercial GPU. The followed approach, which involves training on a carefully assembled dataset from state-of-the-art Computer Vision models combined and augmented by ChatGPT, addresses the question of whether intricate image understanding can be achieved with an ensemble of smaller-scale models, referred to as Knowledge Stitching. Comparative evaluations against major models such as GPT-4 and Google Bard demonstrate that PixLore-2.7B, despite having considerably fewer parameters, is rated higher than the existing State-of-the-Art models in over half of the assessments. Precisely, PixLore outperform Bard and BLIP-2, which score approximately 35.18% and 27.98% lower than PixLore in the task of image captioning. This research not only presents a groundbreaking approach but also highlights the importance of well-curated datasets in enhancing the performance of smaller models.
CVMay 21, 2024
Multimodal video analysis for crowd anomaly detection using open access tourism camerasAlejandro Dionis-Ros, Joan Vila-Francés, Rafael Magdalena-Benedicto et al.
In this article, we propose the detection of crowd anomalies through the extraction of information in the form of time series from video format using a multimodal approach. Through pattern recognition algorithms and segmentation, informative measures of the number of people and image occupancy are extracted at regular intervals, which are then analyzed to obtain trends and anomalous behaviors. Specifically, through temporal decomposition and residual analysis, intervals or specific situations of unusual behaviors are identified, which can be used in decision-making and improvement of actions in sectors related to human movement such as tourism or security. The application of this methodology on the webcam of Turisme Comunitat Valenciana in the town of Morella (Comunitat Valenciana, Spain) has provided excellent results. It is shown to correctly detect specific anomalous situations and unusual overall increases during the previous weekend and during the festivities in October 2023. These results have been obtained while preserving the confidentiality of individuals at all times by using measures that maximize anonymity, without trajectory recording or person recognition.
GEO-PHDec 22, 2020
Learning Structures in Earth Observation Data with Gaussian ProcessesFernando Mateo, Jordi Munoz-Mari, Valero Laparra et al.
Gaussian Processes (GPs) has experienced tremendous success in geoscience in general and for bio-geophysical parameter retrieval in the last years. GPs constitute a solid Bayesian framework to formulate many function approximation problems consistently. This paper reviews the main theoretical GP developments in the field. We review new algorithms that respect the signal and noise characteristics, that provide feature rankings automatically, and that allow applicability of associated uncertainty intervals to transport GP models in space and time. All these developments are illustrated in the field of geoscience and remote sensing at a local and global scales through a set of illustrative examples.