CV, Mar 15, 2024

Evaluating Perceptual Distance Models by Fitting Binomial Distributions to Two-Alternative Forced Choice Data

arXiv:2403.10390v4
h-index: 3
Originality: Incremental advance
AI Analysis

The paper addresses a methodological bottleneck for researchers in psychophysics and computer vision by providing a more robust evaluation framework for perceptual distance models. It is an incremental advance, since it builds on existing probabilistic concepts.

The paper tackles the problem of evaluating perceptual distance models with Two-Alternative Forced Choice (2AFC) data. This is difficult in large datasets such as BAPPS, where the distorted images in each comparison are independent, so the comparisons cannot be converted into a quality ranking. The paper introduces a probabilistic method based on maximum likelihood estimation of a binomial decision model, which the authors present as simpler, more interpretable, and more computationally efficient than existing ad-hoc neural network methods.

The Two-Alternative Forced Choice (2AFC) paradigm offers advantages over the Mean Opinion Score (MOS) paradigm in psychophysics (PF), such as simplicity and robustness. However, when evaluating perceptual distance models, MOS enables direct correlation between model predictions and PF data. In contrast, 2AFC only allows pairwise comparisons to be converted into a quality ranking similar to MOS when the comparisons include shared images. In large datasets like BAPPS, where image patches and distortions are combined randomly, deriving rankings from 2AFC PF data becomes infeasible, as the distorted images included in each comparison are independent. To address this, instead of relying on MOS correlation, researchers have trained ad-hoc neural networks to reproduce 2AFC PF data based on pairs of model distances, a black-box approach with conceptual and operational limitations. This paper introduces a more robust distance-model evaluation method using a purely probabilistic approach, applying maximum likelihood estimation to a binomial decision model. Our method demonstrates superior simplicity, interpretability, flexibility, and computational efficiency, as shown through evaluations of various visual distance models on two 2AFC PF datasets.
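To make the idea of fitting a binomial decision model by maximum likelihood concrete, here is a minimal sketch. It is not the authors' exact formulation: it assumes that the probability of an observer picking image 1 as closer to the reference is a logistic function of the difference between the two model distances, with a single free slope parameter, and it uses made-up toy data. The names `choice_probability`, `neg_log_likelihood`, `fit_slope`, and `trials` are hypothetical.

```python
import math

def choice_probability(d0, d1, slope):
    """P(observer picks image 1 as closer to the reference).

    Assumption: logistic link on the distance difference d0 - d1.
    """
    z = slope * (d0 - d1)
    if z >= 0:  # numerically stable sigmoid
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def neg_log_likelihood(slope, trials):
    """Binomial negative log-likelihood over all comparisons.

    trials: list of (d0, d1, k, n) with model distances d0 and d1,
    k votes for image 1 out of n votes. The binomial coefficient is
    dropped because it does not depend on the slope.
    """
    eps = 1e-12
    nll = 0.0
    for d0, d1, k, n in trials:
        p = min(max(choice_probability(d0, d1, slope), eps), 1.0 - eps)
        nll -= k * math.log(p) + (n - k) * math.log(1.0 - p)
    return nll

def fit_slope(trials, lo=0.0, hi=50.0, iters=100):
    """Golden-section search for the slope minimising the NLL on [lo, hi]."""
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - phi * (b - a)
    d = a + phi * (b - a)
    for _ in range(iters):
        if neg_log_likelihood(c, trials) < neg_log_likelihood(d, trials):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
        c = b - phi * (b - a)
        d = a + phi * (b - a)
    return (a + b) / 2.0

# Toy data (invented for illustration): (d0, d1, votes for image 1, total votes)
trials = [(0.8, 0.2, 4, 5), (0.3, 0.6, 1, 5), (0.5, 0.4, 3, 5), (0.1, 0.9, 0, 5)]
slope_hat = fit_slope(trials)
best_nll = neg_log_likelihood(slope_hat, trials)
```

Under these assumptions, the minimised negative log-likelihood gives one score per distance model on the same 2AFC data, so models can be compared by how well their distances explain the observed votes, with lower values indicating a better fit. No shared images across comparisons are needed, because each comparison contributes its own independent term to the sum.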

