32.8LGMar 10
Reconstructing Movement from Sparse Samples: Enhanced Spatio-Temporal Matching Strategies for Low-Frequency DataAli Yousefian, Arianna Burzacchi, Simone Vantini
This paper explores potential improvements to the Spatial-Temporal Matching algorithm for aligning the GPS trajectories to road networks. While this algorithm is effective, it presents some limitations in computational efficiency and the accuracy of the results, especially in dense environments with relatively high sampling intervals. To address this, the paper proposes four modifications to the original algorithm: a dynamic buffer, an adaptive observation probability, a redesigned temporal scoring function, and a behavioral analysis to account for the historical mobility patterns. The enhancements are assessed using real-world data from the urban area of Milan, and through newly defined evaluation metrics to be applied in the absence of ground truth. The results of the experiment show significant improvements in performance efficiency and path quality across various metrics.
MEJan 29, 2025
Noise-Adaptive Conformal Classification with Marginal CoverageTeresa Bortolotti, Y. X. Rachel Wang, Xin Tong et al.
Conformal inference provides a rigorous statistical framework for uncertainty quantification in machine learning, enabling well-calibrated prediction sets with precise coverage guarantees for any classification model. However, its reliance on the idealized assumption of perfect data exchangeability limits its effectiveness in the presence of real-world complications, such as low-quality labels -- a widespread issue in modern large-scale data sets. This work tackles this open problem by introducing an adaptive conformal inference method capable of efficiently handling deviations from exchangeability caused by random label noise, leading to informative prediction sets with tight marginal coverage guarantees even in those challenging scenarios. We validate our method through extensive numerical experiments demonstrating its effectiveness on synthetic and real data sets, including CIFAR-10H and BigEarthNet.
IMJul 18, 2025
Hyper-spectral Unmixing algorithms for remote compositional surface mapping: a review of the state of the artAlfredo Gimenez Zapiola, Andrea Boselli, Alessandra Menafoglio et al.
This work concerns a detailed review of data analysis methods used for remotely sensed images of large areas of the Earth and of other solid astronomical objects. In detail, it focuses on the problem of inferring the materials that cover the surfaces captured by hyper-spectral images and estimating their abundances and spatial distributions within the region. The most successful and relevant hyper-spectral unmixing methods are reported as well as compared, as an addition to analysing the most recent methodologies. The most important public data-sets in this setting, which are vastly used in the testing and validation of the former, are also systematically explored. Finally, open problems are spotlighted and concrete recommendations for future research are provided.
MENov 2, 2020
Ridge regression with adaptive additive rectangles and other piecewise functional templatesEdoardo Belli, Simone Vantini
We propose an $L_{2}$-based penalization algorithm for functional linear regression models, where the coefficient function is shrunk towards a data-driven shape template $γ$, which is constrained to belong to a class of piecewise functions by restricting its basis expansion. In particular, we focus on the case where $γ$ can be expressed as a sum of $q$ rectangles that are adaptively positioned with respect to the regression error. As the problem of finding the optimal knot placement of a piecewise function is nonconvex, the proposed parametrization allows to reduce the number of variables in the global optimization scheme, resulting in a fitting algorithm that alternates between approximating a suitable template and solving a convex ridge-like problem. The predictive power and interpretability of our method is shown on multiple simulations and two real world case studies.
MLOct 30, 2020
Measure Inducing Classification and Regression Trees for Functional DataEdoardo Belli, Simone Vantini
We propose a tree-based algorithm for classification and regression problems in the context of functional data analysis, which allows to leverage representation learning and multiple splitting rules at the node level, reducing generalization error while retaining the interpretability of a tree. This is achieved by learning a weighted functional $L^{2}$ space by means of constrained convex optimization, which is then used to extract multiple weighted integral features from the input functions, in order to determine the binary split for each internal node of the tree. The approach is designed to manage multiple functional inputs and/or outputs, by defining suitable splitting rules and loss functions that can depend on the specific problem and can also be combined with scalar and categorical data, as the tree is grown with the original greedy CART algorithm. We focus on the case of scalar-valued functional inputs defined on unidimensional domains and illustrate the effectiveness of our method in both classification and regression tasks, through a simulation study and four real world applications.
LGMay 16, 2020
Conformal Prediction: a Unified Review of Theory and New ChallengesMatteo Fontana, Gianluca Zeni, Simone Vantini
In this work we provide a review of basic ideas and novel developments about Conformal Prediction -- an innovative distribution-free, non-parametric forecasting method, based on minimal assumptions -- that is able to yield in a very straightforward way predictions sets that are valid in a statistical sense also in in the finite sample case. The in-depth discussion provided in the paper covers the theoretical underpinnings of Conformal Prediction, and then proceeds to list the more advanced developments and adaptations of the original idea.