29.3IRApr 23
Pre-trained LLMs Meet Sequential Recommenders: Efficient User-Centric Knowledge DistillationNikita Severin, Danil Kartushov, Vladislav Urzhumov et al.
Sequential recommender systems have achieved significant success in modeling temporal user behavior but remain limited in capturing rich user semantics beyond interaction patterns. Large Language Models (LLMs) present opportunities to enhance user understanding with their reasoning capabilities, yet existing integration approaches create prohibitive inference costs in real time. To address these limitations, we present a novel knowledge distillation method that utilizes textual user profile generated by pre-trained LLMs into sequential recommenders without requiring LLM inference at serving time. The resulting approach maintains the inference efficiency of traditional sequential models while requiring neither architectural modifications nor LLM fine-tuning.
17.7LGApr 10
Beyond Isolated Clients: Integrating Graph-Based Embeddings into Event Sequence ModelsHarry Proshian, Nikita Severin, Sergey Nikolenko et al.
Large-scale digital platforms generate billions of timestamped user-item interactions (events) that are crucial for predicting user attributes in, e.g., fraud prevention and recommendations. While self-supervised learning (SSL) effectively models the temporal order of events, it typically overlooks the global structure of the user-item interaction graph. To bridge this gap, we propose three model-agnostic strategies for integrating this structural information into contrastive SSL: enriching event embeddings, aligning client representations with graph embeddings, and adding a structural pretext task. Experiments on four financial and e-commerce datasets demonstrate that our approach consistently improves the accuracy (up to a 2.3% AUC) and reveals that graph density is a key factor in selecting the optimal integration strategy.
IRNov 1, 2024
LLM-KT: A Versatile Framework for Knowledge Transfer from Large Language Models to Collaborative FilteringNikita Severin, Aleksei Ziablitsev, Yulia Savelyeva et al.
We present LLM-KT, a flexible framework designed to enhance collaborative filtering (CF) models by seamlessly integrating LLM (Large Language Model)-generated features. Unlike existing methods that rely on passing LLM-generated features as direct inputs, our framework injects these features into an intermediate layer of any CF model, allowing the model to reconstruct and leverage the embeddings internally. This model-agnostic approach works with a wide range of CF models without requiring architectural changes, making it adaptable to various recommendation scenarios. Our framework is built for easy integration and modification, providing researchers and developers with a powerful tool for extending CF model capabilities through efficient knowledge transfer. We demonstrate its effectiveness through experiments on the MovieLens and Amazon datasets, where it consistently improves baseline CF models. Experimental studies showed that LLM-KT is competitive with the state-of-the-art methods in context-aware settings but can be applied to a broader range of CF models than current approaches.
CLJun 29, 2025
ATGen: A Framework for Active Text GenerationAkim Tsvigun, Daniil Vasilev, Ivan Tsvigun et al.
Active learning (AL) has demonstrated remarkable potential in reducing the annotation effort required for training machine learning models. However, despite the surging popularity of natural language generation (NLG) tasks in recent years, the application of AL to NLG has been limited. In this paper, we introduce Active Text Generation (ATGen) - a comprehensive framework that bridges AL with text generation tasks, enabling the application of state-of-the-art AL strategies to NLG. Our framework simplifies AL-empowered annotation in NLG tasks using both human annotators and automatic annotation agents based on large language models (LLMs). The framework supports LLMs deployed as services, such as ChatGPT and Claude, or operated on-premises. Furthermore, ATGen provides a unified platform for smooth implementation and benchmarking of novel AL strategies tailored to NLG tasks. Finally, we present evaluation results for state-of-the-art AL strategies across diverse settings and multiple text generation tasks. We show that ATGen reduces both the effort of human annotators and costs associated with API calls to LLM-based annotation agents. The code of the framework is available on GitHub under the MIT license. The video presentation is available at http://atgen-video.nlpresearch.group
AIJun 5, 2024
The Good, the Bad, and the Hulk-like GPT: Analyzing Emotional Decisions of Large Language Models in Cooperation and Bargaining GamesMikhail Mozikov, Nikita Severin, Valeria Bodishtianu et al.
Behavior study experiments are an important part of society modeling and understanding human interactions. In practice, many behavioral experiments encounter challenges related to internal and external validity, reproducibility, and social bias due to the complexity of social interactions and cooperation in human user studies. Recent advances in Large Language Models (LLMs) have provided researchers with a new promising tool for the simulation of human behavior. However, existing LLM-based simulations operate under the unproven hypothesis that LLM agents behave similarly to humans as well as ignore a crucial factor in human decision-making: emotions. In this paper, we introduce a novel methodology and the framework to study both, the decision-making of LLMs and their alignment with human behavior under emotional states. Experiments with GPT-3.5 and GPT-4 on four games from two different classes of behavioral game theory showed that emotions profoundly impact the performance of LLMs, leading to the development of more optimal strategies. While there is a strong alignment between the behavioral responses of GPT-3.5 and human participants, particularly evident in bargaining games, GPT-4 exhibits consistent behavior, ignoring induced emotions for rationality decisions. Surprisingly, emotional prompting, particularly with `anger' emotion, can disrupt the "superhuman" alignment of GPT-4, resembling human emotional responses.
LGAug 19, 2021
Temporal Graph Network Embedding with Causal Anonymous Walks RepresentationsIlya Makarov, Andrey Savchenko, Arseny Korovko et al.
Many tasks in graph machine learning, such as link prediction and node classification, are typically solved by using representation learning, in which each node or edge in the network is encoded via an embedding. Though there exists a lot of network embeddings for static graphs, the task becomes much more complicated when the dynamic (i.e. temporal) network is analyzed. In this paper, we propose a novel approach for dynamic network representation learning based on Temporal Graph Network by using a highly custom message generating function by extracting Causal Anonymous Walks. For evaluation, we provide a benchmark pipeline for the evaluation of temporal network embeddings. This work provides the first comprehensive comparison framework for temporal network representation learning in every available setting for graph machine learning problems involving node classification and link prediction. The proposed model outperforms state-of-the-art baseline models. The work also justifies the difference between them based on evaluation in various transductive/inductive edge/node classification tasks. In addition, we show the applicability and superior performance of our model in the real-world downstream graph machine learning task provided by one of the top European banks, involving credit scoring based on transaction data.