IR · May 24

Multilingual Humour-Aware Retrieval with Dense and Re-Ranking Models

arXiv:2605.2516542.3

Predicted impact: top 63% in IR · last 90 days · Originality: Synthesis-oriented

AI Analysis

For researchers in humour-aware IR, this work provides baselines and reveals cross-lingual performance gaps, but the findings are incremental as they confirm known limitations of dense retrieval for non-semantic cues.

The paper investigates multilingual humour-aware information retrieval using dense and re-ranking models on the CLEF 2025 JOKER Task 1 benchmark. Results show strong performance for Portuguese but significantly worse performance for English, highlighting the limitations of semantic dense representations for humour retrieval.

Humour-aware information retrieval poses unique challenges beyond standard semantic retrieval, as systems must account not only for topical relevance but also for humour-specific linguistic phenomena such as wordplay, phonetic ambiguity, and polysemy. In this paper, Team DUTH studies multilingual humour-aware information retrieval using the CLEF 2025 JOKER Task 1 benchmark, which evaluates humour retrieval in English and Portuguese. Our approach combines multilingual XLM-RoBERTa-based dense retrieval with additional system variants, including neural re-ranking, in order to assess the extent to which general-purpose Transformer models can capture humour-specific relevance. The results reveal substantial cross-lingual variation. While the Portuguese runs demonstrate comparatively strong performance across MAP, MRR, and early precision metrics, the English runs perform significantly worse, with relevant humorous documents frequently appearing at lower ranks. These findings highlight the limitations of purely semantic dense representations for humour retrieval, particularly when humour depends on surface-level cues that are not explicitly modelled by multilingual encoders. We further analyse contributing factors to this discrepancy, including dataset characteristics, query-document alignment, and variation in humour mechanisms. Overall, the Team DUTH experiments establish multilingual dense-retrieval and re-ranking baselines and provide insights into the challenges of modelling humour-aware relevance within the JOKER framework.
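
The abstract reports results in terms of MAP, MRR, and early precision. As a rough illustration of what those metrics measure, the sketch below computes them from ranked lists using the standard textbook definitions. It is not the JOKER evaluation script: the function names, the `runs`/`qrels` dictionaries, and the cutoff `k` are all assumptions made for this example, and the document IDs are invented toy data.

```python
from typing import Dict, List, Set


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Mean of precision values taken at each rank where a relevant doc appears."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank
    # Relevant documents that were never retrieved contribute zero.
    return total / len(relevant)


def reciprocal_rank(ranked: List[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant document, or 0.0 if none is retrieved."""
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def precision_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k results that are relevant (early precision)."""
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(1 for doc_id in ranked[:k] if doc_id in relevant) / k


def evaluate(
    runs: Dict[str, List[str]],   # hypothetical: query id -> ranked doc ids
    qrels: Dict[str, Set[str]],   # hypothetical: query id -> relevant doc ids
    k: int = 5,
) -> Dict[str, float]:
    """Average each metric over all judged queries; missing runs score zero."""
    n = len(qrels)
    if n == 0:
        return {"MAP": 0.0, "MRR": 0.0, f"P@{k}": 0.0}
    ap = rr = pk = 0.0
    for query_id, relevant in qrels.items():
        ranked = runs.get(query_id, [])
        ap += average_precision(ranked, relevant)
        rr += reciprocal_rank(ranked, relevant)
        pk += precision_at_k(ranked, relevant, k)
    return {"MAP": ap / n, "MRR": rr / n, f"P@{k}": pk / n}


if __name__ == "__main__":
    # Toy data only; these IDs do not come from the benchmark.
    toy_runs = {"q1": ["d3", "d1", "d7", "d2"], "q2": ["a", "b"]}
    toy_qrels = {"q1": {"d1", "d2"}, "q2": {"a"}}
    print(evaluate(toy_runs, toy_qrels, k=2))
```

Under these definitions, a system that retrieves relevant documents only at lower ranks, as the abstract describes for the English runs, is penalised on all three metrics, with MRR and early precision reacting most strongly to what appears in the top few positions.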
