CL | AI | Oct 23, 2024

Key Algorithms for Keyphrase Generation: Instruction-Based LLMs for Russian Scientific Keyphrases

arXiv:2410.18040v1 | 2 citations | h-index: 4 | AIST
Originality: Synthesis-oriented
AI Analysis

This work addresses the challenge of adapting keyphrase generation to Russian. The contribution is incremental, since it applies existing LLM methods to a new language.

The study tackled keyphrase generation for Russian scientific abstracts by evaluating prompt-based LLMs against traditional methods, finding that even simple prompts can outperform common baselines.

Keyphrase selection is a challenging task in natural language processing that has a wide range of applications. Adapting existing supervised and unsupervised solutions to the Russian language faces several limitations due to the rich morphology of Russian and the limited number of training datasets available. Recent studies conducted on English texts show that large language models (LLMs) successfully address the task of generating keyphrases. LLMs can achieve impressive results without task-specific fine-tuning, using text prompts instead. In this work, we assess the performance of prompt-based methods for generating keyphrases for Russian scientific abstracts. First, we compare the performance of zero-shot and few-shot prompt-based methods, fine-tuned models, and unsupervised methods. Then we assess strategies for selecting keyphrase examples in a few-shot setting. We present the outcomes of human evaluation of the generated keyphrases and analyze the strengths and weaknesses of the models through expert assessment. Our results suggest that prompt-based methods can outperform common baselines even using simple text prompts.
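To make the distinction between zero-shot and few-shot prompting concrete, here is a minimal sketch of how such prompts might be assembled and how a model's reply might be parsed. The instruction wording, the `Abstract:`/`Keyphrases:` layout, and the helper names `build_prompt` and `parse_keyphrases` are all assumptions for illustration; the abstract does not state the prompts or output format actually used. No model call is included.

```python
from typing import List, Sequence, Tuple

# Hypothetical instruction text; the actual prompts are not given in the abstract.
INSTRUCTION = (
    "Generate keyphrases for the following scientific abstract. "
    "Return them as a comma-separated list."
)


def build_prompt(
    abstract: str,
    examples: Sequence[Tuple[str, List[str]]] = (),
) -> str:
    """Return a zero-shot prompt (no examples) or a few-shot prompt (with examples)."""
    parts = [INSTRUCTION, ""]
    # Each example is an (abstract, gold keyphrases) pair shown before the target.
    for example_abstract, example_keyphrases in examples:
        parts.append(f"Abstract: {example_abstract}")
        parts.append(f"Keyphrases: {', '.join(example_keyphrases)}")
        parts.append("")
    parts.append(f"Abstract: {abstract}")
    parts.append("Keyphrases:")
    return "\n".join(parts)


def parse_keyphrases(model_output: str) -> List[str]:
    """Split a comma-separated reply into unique, lowercased keyphrases, keeping order."""
    seen = set()
    result = []
    for raw in model_output.split(","):
        phrase = raw.strip().lower()
        if phrase and phrase not in seen:
            seen.add(phrase)
            result.append(phrase)
    return result


if __name__ == "__main__":
    # Placeholder texts only; these are not taken from any dataset.
    zero_shot = build_prompt("Текст аннотации ...")
    few_shot = build_prompt(
        "Текст аннотации ...",
        examples=[("Пример аннотации ...", ["ключевые слова", "языковые модели"])],
    )
    print(zero_shot)
    print(few_shot)
```

In this sketch the only difference between the two settings is the `examples` argument, so a strategy for selecting few-shot examples would amount to choosing which pairs to pass in.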

