LG | CL | ML | Sep 13, 2023

Unsupervised Contrast-Consistent Ranking with Language Models

Niklas Stoehr, Pengxiang Cheng, Jing Wang, Daniel Preotiuc-Pietro, Rajarshi Bhowmik

ETH Zurich

arXiv:2309.06991v235.7111 citations | h-index: 35 | Has Code

Originality: Incremental advance

AI Analysis

The paper addresses unreliable ranking outputs from language models, a concern for NLP applications that depend on them. The contribution is incremental: it builds on existing probing and ranking techniques.

The paper tackles the problem that language models produce inconsistent rankings when prompted. It finds that the unsupervised Contrast-Consistent Ranking (CCR) method performs as well as or better than prompting across models and datasets.

Language models contain ranking-based knowledge and are powerful solvers of in-context ranking tasks. For instance, they may have parametric knowledge about the ordering of countries by size or may be able to rank product reviews by sentiment. We compare pairwise, pointwise and listwise prompting techniques to elicit a language model's ranking knowledge. However, we find that even with careful calibration and constrained decoding, prompting-based techniques may not always be self-consistent in the rankings they produce. This motivates us to explore an alternative approach that is inspired by an unsupervised probing method called Contrast-Consistent Search (CCS). The idea is to train a probe guided by a logical constraint: a language model's representation of a statement and its negation must be mapped to contrastive true-false poles consistently across multiple statements. We hypothesize that similar constraints apply to ranking tasks where all items are related via consistent, pairwise or listwise comparisons. To this end, we extend the binary CCS method to Contrast-Consistent Ranking (CCR) by adapting existing ranking methods such as the Max-Margin Loss, Triplet Loss and an Ordinal Regression objective. Across different models and datasets, our results confirm that CCR probing performs better than or, at least, on a par with prompting.
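The logical constraint behind CCS can be made concrete with a small sketch. The code below is an illustration only, assuming a linear probe with a sigmoid output and the commonly cited form of the CCS objective (a consistency term plus a confidence term); it is not the paper's implementation, and it does not reproduce the CCR losses. The names `probe_scores` and `ccs_loss`, and the random arrays standing in for language model representations, are hypothetical.

```python
import numpy as np


def sigmoid(z):
    """Logistic function mapping real values into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def probe_scores(hidden, weights, bias):
    """Hypothetical linear probe: one score in (0, 1) per representation.

    hidden:  array of shape (n_statements, hidden_dim)
    weights: array of shape (hidden_dim,)
    bias:    scalar
    """
    return sigmoid(hidden @ weights + bias)


def ccs_loss(p_pos, p_neg):
    """CCS-style objective on probe outputs for statements and negations.

    Consistency: p(statement) should equal 1 - p(negation).
    Confidence:  discourages the degenerate solution p = 0.5 for both.
    """
    consistency = (p_pos - (1.0 - p_neg)) ** 2
    confidence = np.minimum(p_pos, p_neg) ** 2
    return float(np.mean(consistency + confidence))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Random stand-ins for model representations (assumption, not real data).
    h_pos = rng.normal(size=(8, 16))   # statements
    h_neg = rng.normal(size=(8, 16))   # their negations
    w = rng.normal(size=16)
    loss = ccs_loss(probe_scores(h_pos, w, 0.0), probe_scores(h_neg, w, 0.0))
    print(f"CCS loss for an untrained probe: {loss:.3f}")
```

A probe that assigns 1 to a statement and 0 to its negation incurs zero loss, while the uninformative answer of 0.5 for both is penalized by the confidence term. No labels enter the objective, which is what makes the approach unsupervised.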
