CVDec 3, 2025Code
How (Mis)calibrated is Your Federated CLIP and What To Do About It?Mainak Singha, Masih Aminbeidokhti, Paolo Casari et al.
While vision-language models like CLIP have been extensively studied, their calibration, crucial for reliable predictions, has received limited attention. Although a few prior works have examined CLIP calibration in offline settings, the impact of fine-tuning CLIP in a federated learning (FL) setup remains unexplored. In this work, we investigate how FL affects CLIP calibration and propose strategies to improve reliability in this distributed setting. We first analyze Textual Prompt Tuning approaches and show that they degrade calibration metrics when operating under FL. We also evaluate existing in-training calibration techniques across four global aggregation methods, finding that they provide limited improvements. Our results suggest that the key challenge lies not only in how we aggregate or calibrate, but in which components we choose to fine-tune. Motivated by this insight, we propose $\text{FL}^2\text{oRA}$, a straightforward LoRA-based approach that naturally improves calibration in FL, and we analyze the factors behind its effectiveness. Experiments on multiple benchmarks demonstrate that $\text{FL}^2\text{oRA}$ consistently produces well-calibrated models, reducing the need for explicit calibration procedures. Codes are available at https://github.com/mainaksingha01/FL2oRA.
SPNov 30, 2023
Indoor Millimeter Wave Localization using Multiple Self-Supervised Tiny Neural NetworksAnish Shastri, Andres Garcia-Saavedra, Paolo Casari
We consider the localization of a mobile millimeter-wave client in a large indoor environment using multilayer perceptron neural networks (NNs). Instead of training and deploying a single deep model, we proceed by choosing among multiple tiny NNs trained in a self-supervised manner. The main challenge then becomes to determine and switch to the best NN among the available ones, as an incorrect NN will fail to localize the client. In order to upkeep the localization accuracy, we propose two switching schemes: one based on a Kalman filter, and one based on the statistical distribution of the training data. We analyze the proposed schemes via simulations, showing that our approach outperforms both geometric localization schemes and the use of a single NN.
CVFeb 23Code
CLIPoint3D: Language-Grounded Few-Shot Unsupervised 3D Point Cloud Domain AdaptationMainak Singha, Sarthak Mehrotra, Paolo Casari et al.
Recent vision-language models (VLMs) such as CLIP demonstrate impressive cross-modal reasoning, extending beyond images to 3D perception. Yet, these models remain fragile under domain shifts, especially when adapting from synthetic to real-world point clouds. Conventional 3D domain adaptation approaches rely on heavy trainable encoders, yielding strong accuracy but at the cost of efficiency. We introduce CLIPoint3D, the first framework for few-shot unsupervised 3D point cloud domain adaptation built upon CLIP. Our approach projects 3D samples into multiple depth maps and exploits the frozen CLIP backbone, refined through a knowledge-driven prompt tuning scheme that integrates high-level language priors with geometric cues from a lightweight 3D encoder. To adapt task-specific features effectively, we apply parameter-efficient fine-tuning to CLIP's encoders and design an entropy-guided view sampling strategy for selecting confident projections. Furthermore, an optimal transport-based alignment loss and an uncertainty-aware prototype alignment loss collaboratively bridge source-target distribution gaps while maintaining class separability. Extensive experiments on PointDA-10 and GraspNetPC-10 benchmarks show that CLIPoint3D achieves consistent 3-16% accuracy gains over both CLIP-based and conventional encoder-based baselines. Codes are available at https://github.com/SarthakM320/CLIPoint3D.
NIDec 10, 2021
A Review of Indoor Millimeter Wave Device-based Localization and Device-free Sensing Technologies and ApplicationsAnish Shastri, Neharika Valecha, Enver Bashirov et al.
The commercial availability of low-cost millimeter wave (mmWave) communication and radar devices is starting to improve the penetration of such technologies in consumer markets, paving the way for large-scale and dense deployments in fifth-generation (5G)-and-beyond as well as 6G networks. At the same time, pervasive mmWave access will enable device localization and device-free sensing with unprecedented accuracy, especially with respect to sub-6 GHz commercial-grade devices. This paper surveys the state of the art in device-based localization and device-free sensing using mmWave communication and radar devices, with a focus on indoor deployments. We first overview key concepts about mmWave signal propagation and system design. Then, we provide a detailed account of approaches and algorithms for localization and sensing enabled by mmWaves. We consider several dimensions in our analysis, including the main objectives, techniques, and performance of each work, whether each research reached some degree of implementation, and which hardware platforms were used for this purpose. We conclude by discussing that better algorithms for consumer-grade devices, data fusion methods for dense deployments, as well as an educated application of machine learning methods are promising, relevant and timely research directions.
NIDec 9, 2021
Millimeter Wave Localization with Imperfect Training Data using Shallow Neural NetworksAnish Shastri, Joan Palacios, Paolo Casari
Millimeter wave (mmWave) localization algorithms exploit the quasi-optical propagation of mmWave signals, which yields sparse angular spectra at the receiver. Geometric approaches to angle-based localization typically require to know the map of the environment and the location of the access points. Thus, several works have resorted to automated learning in order to infer a device's location from the properties of the received mmWave signals. However, collecting training data for such models is a significant burden. In this work, we propose a shallow neural network model to localize mmWave devices indoors. This model requires significantly fewer weights than those proposed in the literature. Therefore, it is amenable for implementation in resource-constrained hardware, and needs fewer training samples to converge. We also propose to relieve training data collection efforts by retrieving (inherently imperfect) location estimates from geometry-based mmWave localization algorithms. Even in this case, our results show that the proposed neural networks perform as good as or better than state-of-the-art algorithms.
NISep 12, 2019
SQLR: Short-Term Memory Q-Learning for Elastic ProvisioningConstantine Ayimba, Paolo Casari, Vincenzo Mancuso
As more and more application providers transition to the cloud and deliver their services on a Software as a Service (SaaS) basis, cloud providers need to make their provisioning systems agile enough to meet Service Level Agreements. At the same time they should guard against over-provisioning which limits their capacity to accommodate more tenants. To this end we propose SQLR, a dynamic provisioning system employing a customized model-free reinforcement learning algorithm that is capable of reusing contextual knowledge learned from one workload to optimize resource provisioning for other workload patterns. SQLR achieves results comparable to those where resources are unconstrained, with minimal overhead. Our experiments show that we can reduce the amount of provisioned resources by almost 25% with less than 1% overall service unavailability (due to blocking) while delivering similar response times as those of an over-provisioned system.
LGJun 4, 2019
A Novel Hyperparameter-free Approach to Decision Tree Construction that Avoids Overfitting by DesignRafael Garcia Leiva, Antonio Fernandez Anta, Vincenzo Mancuso et al.
Decision trees are an extremely popular machine learning technique. Unfortunately, overfitting in decision trees still remains an open issue that sometimes prevents achieving good performance. In this work, we present a novel approach for the construction of decision trees that avoids the overfitting by design, without losing accuracy. A distinctive feature of our algorithm is that it requires neither the optimization of any hyperparameters, nor the use of regularization techniques, thus significantly reducing the decision tree training time. Moreover, our algorithm produces much smaller and shallower trees than traditional algorithms, facilitating the interpretability of the resulting models.