CVFeb 4, 2024
Uncertainty-Aware PerceiverEuiYul Song
The Perceiver makes few architectural assumptions about the relationship among its inputs with quadratic scalability on its memory and computation time. Indeed, the Perceiver model outpaces or is competitive with ResNet-50 and ViT in terms of accuracy to some degree. However, the Perceiver does not take predictive uncertainty and calibration into account. The Perceiver also generalizes its performance on three datasets, three models, one evaluation metric, and one hyper-parameter setting. Worst of all, the Perceiver's relative performance improvement against other models is marginal. Furthermore, its reduction of architectural prior is not substantial; is not equivalent to its quality. Thereby, I invented five mutations of the Perceiver, the Uncertainty-Aware Perceivers, that obtain uncertainty estimates and measured their performance on three metrics. Experimented with CIFAR-10 and CIFAR-100, the Uncertainty-Aware Perceivers make considerable performance enhancement compared to the Perceiver.
IRFeb 4, 2024
eXplainable Bayesian Multi-Perspective Generative RetrievalEuiYul Song, Philhoon Oh, Sangryul Kim et al.
Modern deterministic retrieval pipelines prioritize achieving state-of-the-art performance but often lack interpretability in decision-making. These models face challenges in assessing uncertainty, leading to overconfident predictions. To overcome these limitations, we integrate uncertainty calibration and interpretability into a retrieval pipeline. Specifically, we introduce Bayesian methodologies and multi-perspective retrieval to calibrate uncertainty within a retrieval pipeline. We incorporate techniques such as LIME and SHAP to analyze the behavior of a black-box reranker model. The importance scores derived from these explanation methodologies serve as supplementary relevance scores to enhance the base reranker model. We evaluate the resulting performance enhancements achieved through uncertainty calibration and interpretable reranking on Question Answering and Fact Checking tasks. Our methods demonstrate substantial performance improvements across three KILT datasets.