IRMay 9, 2022
Are Quantum Computers Practical Yet? A Case for Feature Selection in Recommender Systems using Tensor NetworksArtyom Nikitin, Andrei Chertkov, Rafael Ballester-Ripoll et al.
Collaborative filtering models generally perform better than content-based filtering models and do not require careful feature engineering. However, in the cold-start scenario collaborative information may be scarce or even unavailable, whereas the content information may be abundant, but also noisy and expensive to acquire. Thus, selection of particular features that improve cold-start recommendations becomes an important and non-trivial task. In the recent approach by Nembrini et al., the feature selection is driven by the correlational compatibility between collaborative and content-based models. The problem is formulated as a Quadratic Unconstrained Binary Optimization (QUBO) which, due to its NP-hard complexity, is solved using Quantum Annealing on a quantum computer provided by D-Wave. Inspired by the reported results, we contend the idea that current quantum annealers are superior for this problem and instead focus on classical algorithms. In particular, we tackle QUBO via TTOpt, a recently proposed black-box optimizer based on tensor networks and multilinear algebra. We show the computational feasibility of this method for large problems with thousands of features, and empirically demonstrate that the solutions found are comparable to the ones obtained with D-Wave across all examined datasets.
LGDec 12, 2022
Tensor-based Sequential Learning via Hankel Matrix Representation for Next Item RecommendationsEvgeny Frolov, Ivan Oseledets
Self-attentive transformer models have recently been shown to solve the next item recommendation task very efficiently. The learned attention weights capture sequential dynamics in user behavior and generalize well. Motivated by the special structure of learned parameter space, we question if it is possible to mimic it with an alternative and more lightweight approach. We develop a new tensor factorization-based model that ingrains the structural knowledge about sequential data within the learning process. We demonstrate how certain properties of a self-attention network can be reproduced with our approach based on special Hankel matrix representation. The resulting model has a shallow linear architecture and compares competitively to its neural counterpart.
CLApr 22, 2022
MEKER: Memory Efficient Knowledge Embedding Representation for Link Prediction and Question AnsweringViktoriia Chekalina, Anton Razzhigaev, Albert Sayapin et al.
Knowledge Graphs (KGs) are symbolically structured storages of facts. The KG embedding contains concise data used in NLP tasks requiring implicit information about the real world. Furthermore, the size of KGs that may be useful in actual NLP assignments is enormous, and creating embedding over it has memory cost issues. We represent KG as a 3rd-order binary tensor and move beyond the standard CP decomposition by using a data-specific generalized version of it. The generalization of the standard CP-ALS algorithm allows obtaining optimization gradients without a backpropagation mechanism. It reduces the memory needed in training while providing computational benefits. We propose a MEKER, a memory-efficient KG embedding model, which yields SOTA-comparable performance on link prediction tasks and KG-based Question Answering.
IRMay 10, 2022
Tensor-based Collaborative Filtering With Smooth Ratings ScaleNikita Marin, Elizaveta Makhneva, Maria Lysyuk et al.
Conventional collaborative filtering techniques don't take into consideration the effect of discrepancy in users' rating perception. Some users may rarely give 5 stars to items while others almost always assign 5 stars to the chosen item. Even if they had experience with the same items this systematic discrepancy in their evaluation style will lead to the systematic errors in the ability of recommender system to effectively extract right patterns from data. To mitigate this problem we introduce the ratings' similarity matrix which represents the dependency between different values of ratings on the population level. Hence, if on average the correlations between ratings exist, it is possible to improve the quality of proposed recommendations by off-setting the effect of either shifted down or shifted up users' rates.
IRFeb 5, 2023
Federated Privacy-preserving Collaborative Filtering for On-Device Next App PredictionAlbert Sayapin, Gleb Balitskiy, Daniel Bershatsky et al.
In this study, we propose a novel SeqMF model to solve the problem of predicting the next app launch during mobile device usage. Although this problem can be represented as a classical collaborative filtering problem, it requires proper modification since the data are sequential, the user feedback is distributed among devices and the transmission of users' data to aggregate common patterns must be protected against leakage. According to such requirements, we modify the structure of the classical matrix factorization model and update the training procedure to sequential learning. Since the data about user experience are distributed among devices, the federated learning setup is used to train the proposed sequential matrix factorization model. One more ingredient of the proposed approach is a new privacy mechanism that guarantees the protection of the sent data from the users to the remote server. To demonstrate the efficiency of the proposed model we use publicly available mobile user behavior data. We compare our model with sequential rules and models based on the frequency of app launches. The comparison is conducted in static and dynamic environments. The static environment evaluates how our model processes sequential data compared to competitors. Therefore, the standard train-validation-test evaluation procedure is used. The dynamic environment emulates the real-world scenario, where users generate new data by running apps on devices, and evaluates our model in this case. Our experiments show that the proposed model provides comparable quality with other methods in the static environment. However, more importantly, our method achieves a better privacy-utility trade-off than competitors in the dynamic environment, which provides more accurate simulations of real-world usage.
AIJan 8, 2023
Mitigating Human and Computer Opinion Fraud via Contrastive LearningYuliya Tukmacheva, Ivan Oseledets, Evgeny Frolov
We introduce the novel approach towards fake text reviews detection in collaborative filtering recommender systems. The existing algorithms concentrate on detecting the fake reviews, generated by language models and ignore the texts, written by dishonest users, mostly for monetary gains. We propose the contrastive learning-based architecture, which utilizes the user demographic characteristics, along with the text reviews, as the additional evidence against fakes. This way, we are able to account for two different types of fake reviews spamming and make the recommendation system more robust to biased reviews.
IRFeb 22Code
SplitLight: An Exploratory Toolkit for Recommender Systems Datasets and SplitsAnna Volodkevich, Dmitry Anikin, Danil Gusak et al.
Offline evaluation of recommender systems is often affected by hidden, under-documented choices in data preparation. Seemingly minor decisions in filtering, handling repeats, cold-start treatment, and splitting strategy design can substantially reorder model rankings and undermine reproducibility and cross-paper comparability. In this paper, we introduce SplitLight, an open-source exploratory toolkit that enables researchers and practitioners designing preprocessing and splitting pipelines or reviewing external artifacts to make these decisions measurable, comparable, and reportable. Given an interaction log and derived split subsets, SplitLight analyzes core and temporal dataset statistics, characterizes repeat consumption patterns and timestamp anomalies, and diagnoses split validity, including temporal leakage, cold-user/item exposure, and distribution shifts. SplitLight further allows side-by-side comparison of alternative splitting strategies through comprehensive aggregated summaries and interactive visualizations. Delivered as both a Python toolkit and an interactive no-code interface, SplitLight produces audit summaries that justify evaluation protocols and support transparent, reliable, and comparable experimentation in recommender systems research and industry.
IRSep 27, 2024
Scalable Cross-Entropy Loss for Sequential Recommendations with Large Item CatalogsGleb Mezentsev, Danil Gusak, Ivan Oseledets et al.
Scalability issue plays a crucial role in productionizing modern recommender systems. Even lightweight architectures may suffer from high computational overload due to intermediate calculations, limiting their practicality in real-world applications. Specifically, applying full Cross-Entropy (CE) loss often yields state-of-the-art performance in terms of recommendations quality. Still, it suffers from excessive GPU memory utilization when dealing with large item catalogs. This paper introduces a novel Scalable Cross-Entropy (SCE) loss function in the sequential learning setup. It approximates the CE loss for datasets with large-size catalogs, enhancing both time efficiency and memory usage without compromising recommendations quality. Unlike traditional negative sampling methods, our approach utilizes a selective GPU-efficient computation strategy, focusing on the most informative elements of the catalog, particularly those most likely to be false positives. This is achieved by approximating the softmax distribution over a subset of the model outputs through the maximum inner product search. Experimental results on multiple datasets demonstrate the effectiveness of SCE in reducing peak memory usage by a factor of up to 100 compared to the alternatives, retaining or even exceeding their metrics values. The proposed approach also opens new perspectives for large-scale developments in different domains, such as large language models.
IRAug 5, 2024
RECE: Reduced Cross-Entropy Loss for Large-Catalogue Sequential RecommendersDanil Gusak, Gleb Mezentsev, Ivan Oseledets et al.
Scalability is a major challenge in modern recommender systems. In sequential recommendations, full Cross-Entropy (CE) loss achieves state-of-the-art recommendation quality but consumes excessive GPU memory with large item catalogs, limiting its practicality. Using a GPU-efficient locality-sensitive hashing-like algorithm for approximating large tensor of logits, this paper introduces a novel RECE (REduced Cross-Entropy) loss. RECE significantly reduces memory consumption while allowing one to enjoy the state-of-the-art performance of full CE loss. Experimental results on various datasets show that RECE cuts training peak memory usage by up to 12 times compared to existing methods while retaining or exceeding performance metrics of CE loss. The approach also opens up new possibilities for large-scale applications in other domains.
IRJul 22, 2025Code
Time to Split: Exploring Data Splitting Strategies for Offline Evaluation of Sequential RecommendersDanil Gusak, Anna Volodkevich, Anton Klenitskiy et al.
Modern sequential recommender systems, ranging from lightweight transformer-based variants to large language models, have become increasingly prominent in academia and industry due to their strong performance in the next-item prediction task. Yet common evaluation protocols for sequential recommendations remain insufficiently developed: they often fail to reflect the corresponding recommendation task accurately, or are not aligned with real-world scenarios. Although the widely used leave-one-out split matches next-item prediction, it permits the overlap between training and test periods, which leads to temporal leakage and unrealistically long test horizon, limiting real-world relevance. Global temporal splitting addresses these issues by evaluating on distinct future periods. However, its applications to sequential recommendations remain loosely defined, particularly in terms of selecting target interactions and constructing a validation subset that provides necessary consistency between validation and test metrics. In this paper, we demonstrate that evaluation outcomes can vary significantly across splitting strategies, influencing model rankings and practical deployment decisions. To improve reproducibility in both academic and industrial settings, we systematically compare different splitting strategies for sequential recommendations across multiple datasets and established baselines. Our findings show that prevalent splits, such as leave-one-out, may be insufficiently aligned with more realistic evaluation strategies. Code: https://github.com/monkey0head/time-to-split
IRFeb 26
Cross-Representation Knowledge Transfer for Improved Sequential RecommendationsArtur Gimranov, Viacheslav Yusupov, Elfat Sabitov et al.
Transformer architectures, capable of capturing sequential dependencies in the history of user interactions, have become the dominant approach in sequential recommender systems. Despite their success, such models consider sequence elements in isolation, implicitly accounting for the complex relationships between them. Graph neural networks, in contrast, explicitly model these relationships through higher order interactions but are often unable to adequately capture their evolution over time, limiting their use for predicting the next interaction. To fill this gap, we present a new framework that combines transformers and graph neural networks and aligns different representations for solving next-item prediction task. Our solution simultaneously encodes structural dependencies in the interaction graph and tracks their dynamic change. Experimental results on a number of open datasets demonstrate that the proposed framework consistently outperforms both pure sequential and graph approaches in terms of recommendation quality, as well as recent methods that combine both types of signals.
IRFeb 24
Position-Aware Sequential Attention for Accurate Next Item RecommendationsTimur Nabiev, Evgeny Frolov
Sequential self-attention models usually rely on additive positional embeddings, which inject positional information into item representations at the input. In the absence of positional signals, the attention block is permutation-equivariant over sequence positions and thus has no intrinsic notion of temporal order beyond causal masking. We argue that additive positional embeddings make the attention mechanism only superficially sensitive to sequence order: positional information is entangled with item embedding semantics, propagates weakly in deep architectures, and limits the ability to capture rich sequential patterns. To address these limitations, we introduce a kernelized self-attention mechanism, where a learnable positional kernel operates purely in the position space, disentangled from semantic similarity, and directly modulates attention weights. When applied per attention block, this kernel enables adaptive multi-scale sequential modeling. Experiments on standard next-item prediction benchmarks show that our positional kernel attention consistently improves over strong competing baselines.
IRMar 1, 2024
End-to-End Graph-Sequential Representation Learning for Accurate RecommendationsVladimir Baikalov, Evgeny Frolov
Recent recommender system advancements have focused on developing sequence-based and graph-based approaches. Both approaches proved useful in modeling intricate relationships within behavioral data, leading to promising outcomes in personalized ranking and next-item recommendation tasks while maintaining good scalability. However, they capture very different signals from data. While the former approach represents users directly through ordered interactions with recent items, the latter aims to capture indirect dependencies across the interactions graph. This paper presents a novel multi-representational learning framework exploiting these two paradigms' synergies. Our empirical evaluation on several datasets demonstrates that mutual training of sequential and graph components with the proposed framework significantly improves recommendations performance.
LGAug 28, 2025
Dynamic Low-rank Approximation of Full-Matrix Preconditioner for Training Generalized Linear ModelsTatyana Matveeva, Aleksandr Katrutsa, Evgeny Frolov
Adaptive gradient methods like Adagrad and its variants are widespread in large-scale optimization. However, their use of diagonal preconditioning matrices limits the ability to capture parameter correlations. Full-matrix adaptive methods, approximating the exact Hessian, can model these correlations and may enable faster convergence. At the same time, their computational and memory costs are often prohibitive for large-scale models. To address this limitation, we propose AdaGram, an optimizer that enables efficient full-matrix adaptive gradient updates. To reduce memory and computational overhead, we utilize fast symmetric factorization for computing the preconditioned update direction at each iteration. Additionally, we maintain the low-rank structure of a preconditioner along the optimization trajectory using matrix integrator methods. Numerical experiments on standard machine learning tasks show that AdaGram converges faster or matches the performance of diagonal adaptive optimizers when using rank five and smaller rank approximations. This demonstrates AdaGram's potential as a scalable solution for adaptive optimization in large models.
IRAug 8, 2025
Maximum Impact with Fewer Features: Efficient Feature Selection for Cold-Start Recommenders through Collaborative Importance WeightingNikita Sukhorukov, Danil Gusak, Evgeny Frolov
Cold-start challenges in recommender systems necessitate leveraging auxiliary features beyond user-item interactions. However, the presence of irrelevant or noisy features can degrade predictive performance, whereas an excessive number of features increases computational demands, leading to higher memory consumption and prolonged training times. To address this, we propose a feature selection strategy that prioritizes the user behavioral information. Our method enhances the feature representation by incorporating correlations from collaborative behavior data using a hybrid matrix factorization technique and then ranks features using a mechanism based on the maximum volume algorithm. This approach identifies the most influential features, striking a balance between recommendation accuracy and computational efficiency. We conduct an extensive evaluation across various datasets and hybrid recommendation models, demonstrating that our method excels in cold-start scenarios by selecting minimal yet highly effective feature subsets. Even under strict feature reduction, our approach surpasses existing feature selection techniques while maintaining superior efficiency.
LGApr 3, 2025
Knowledge Graph Completion with Mixed Geometry Tensor FactorizationViacheslav Yusupov, Maxim Rakhuba, Evgeny Frolov
In this paper, we propose a new geometric approach for knowledge graph completion via low rank tensor approximation. We augment a pretrained and well-established Euclidean model based on a Tucker tensor decomposition with a novel hyperbolic interaction term. This correction enables more nuanced capturing of distributional properties in data better aligned with real-world knowledge graphs. By combining two geometries together, our approach improves expressivity of the resulting model achieving new state-of-the-art link prediction accuracy with a significantly lower number of parameters compared to the previous Euclidean and hyperbolic models.
CLOct 22, 2025
LLavaCode: Compressed Code Representations for Retrieval-Augmented Code GenerationDaria Cherniuk, Nikita Sukhorukov, Nikita Sushko et al.
Retrieval-augmented generation has emerged as one of the most effective approaches for code completion, particularly when context from a surrounding repository is essential. However, incorporating context significantly extends sequence length, leading to slower inference - a critical limitation for interactive settings such as IDEs. In this work, we introduce LlavaCode, a framework that compresses code into compact, semantically rich representations interpretable by code LLM, enhancing generation quality while reducing the retrieved context to only a few compressed single-token vectors. Using a small projector module we can significantly increase the EM and ES metrics of coding model with negligible latency increase. Our experiments demonstrate that compressed context enables 20-38% reduction in Time-to-First-Token (TTFT) on line completion tasks compared to full-RAG pipelines.
LGOct 22, 2025
Scalable LinUCB: Low-Rank Design Matrix Updates for Recommenders with Large Action SpacesEvgenia Shustova, Marina Sheshukova, Sergey Samsonov et al.
Linear contextual bandits, especially LinUCB, are widely used in recommender systems. However, its training, inference, and memory costs grow with feature dimensionality and the size of the action space. The key bottleneck becomes the need to update, invert and store a design matrix that absorbs contextual information from interaction history. In this paper, we introduce Scalable LinUCB, the algorithm that enables fast and memory efficient operations with the inverse regularized design matrix. We achieve this through a dynamical low-rank parametrization of its inverse Cholesky-style factors. We derive numerically stable rank-1 and batched updates that maintain the inverse without directly forming the entire matrix. To control memory growth, we employ a projector-splitting integrator for dynamical low-rank approximation, yielding average per-step update cost $O(dr)$ and memory $O(dr)$ for approximation rank $r$. Inference complexity of the suggested algorithm is $O(dr)$ per action evaluation. Experiments on recommender system datasets demonstrate the effectiveness of our algorithm.
LGSep 18, 2025
Low-rank surrogate modeling and stochastic zero-order optimization for training of neural networks with black-box layersAndrei Chertkov, Artem Basharin, Mikhail Saygin et al.
The growing demand for energy-efficient, high-performance AI systems has led to increased attention on alternative computing platforms (e.g., photonic, neuromorphic) due to their potential to accelerate learning and inference. However, integrating such physical components into deep learning pipelines remains challenging, as physical devices often offer limited expressiveness, and their non-differentiable nature renders on-device backpropagation difficult or infeasible. This motivates the development of hybrid architectures that combine digital neural networks with reconfigurable physical layers, which effectively behave as black boxes. In this work, we present a framework for the end-to-end training of such hybrid networks. This framework integrates stochastic zeroth-order optimization for updating the physical layer's internal parameters with a dynamic low-rank surrogate model that enables gradient propagation through the physical layer. A key component of our approach is the implicit projector-splitting integrator algorithm, which updates the lightweight surrogate model after each forward pass with minimal hardware queries, thereby avoiding costly full matrix reconstruction. We demonstrate our method across diverse deep learning tasks, including: computer vision, audio classification, and language modeling. Notably, across all modalities, the proposed approach achieves near-digital baseline accuracy and consistently enables effective end-to-end training of hybrid models incorporating various non-differentiable physical components (spatial light modulators, microring resonators, and Mach-Zehnder interferometers). This work bridges hardware-aware deep learning and gradient-free optimization, thereby offering a practical pathway for integrating non-differentiable physical components into scalable, end-to-end trainable AI systems.
IRSep 1, 2025
Ultra Fast Warm Start Solution for Graph RecommendationsViacheslav Yusupov, Maxim Rakhuba, Evgeny Frolov
In this work, we present a fast and effective Linear approach for updating recommendations in a scalable graph-based recommender system UltraGCN. Solving this task is extremely important to maintain the relevance of the recommendations under the conditions of a large amount of new data and changing user preferences. To address this issue, we adapt the simple yet effective low-rank approximation approach to the graph-based model. Our method delivers instantaneous recommendations that are up to 30 times faster than conventional methods, with gains in recommendation quality, and demonstrates high scalability even on the large catalogue datasets.
IRAug 16, 2025
Leveraging Geometric Insights in Hyperbolic Triplet Loss for Improved RecommendationsViacheslav Yusupov, Maxim Rakhuba, Evgeny Frolov
Recent studies have demonstrated the potential of hyperbolic geometry for capturing complex patterns from interaction data in recommender systems. In this work, we introduce a novel hyperbolic recommendation model that uses geometrical insights to improve representation learning and increase computational stability at the same time. We reformulate the notion of hyperbolic distances to unlock additional representation capacity over conventional Euclidean space and learn more expressive user and item representations. To better capture user-items interactions, we construct a triplet loss that models ternary relations between users and their corresponding preferred and nonpreferred choices through a mix of pairwise interaction terms driven by the geometry of data. Our hyperbolic approach not only outperforms existing Euclidean and hyperbolic models but also reduces popularity bias, leading to more diverse and personalized recommendations.
IRAug 11, 2025
Recommendation Is a Dish Better Served WarmDanil Gusak, Nikita Sukhorukov, Evgeny Frolov
In modern recommender systems, experimental settings typically include filtering out cold users and items based on a minimum interaction threshold. However, these thresholds are often chosen arbitrarily and vary widely across studies, leading to inconsistencies that can significantly affect the comparability and reliability of evaluation results. In this paper, we systematically explore the cold-start boundary by examining the criteria used to determine whether a user or an item should be considered cold. Our experiments incrementally vary the number of interactions for different items during training, and gradually update the length of user interaction histories during inference. We investigate the thresholds across several widely used datasets, commonly represented in recent papers from top-tier conferences, and on multiple established recommender baselines. Our findings show that inconsistent selection of cold-start thresholds can either result in the unnecessary removal of valuable data or lead to the misclassification of cold instances as warm, introducing more noise into the system.
LGAug 6, 2025
Matrix-Free Two-to-Infinity and One-to-Two Norms EstimationAskar Tsyganov, Evgeny Frolov, Sergey Samsonov et al.
In this paper, we propose new randomized algorithms for estimating the two-to-infinity and one-to-two norms in a matrix-free setting, using only matrix-vector multiplications. Our methods are based on appropriate modifications of Hutchinson's diagonal estimator and its Hutch++ version. We provide oracle complexity bounds for both modifications. We further illustrate the practical utility of our algorithms for Jacobian-based regularization in deep neural network training on image classification tasks. We also demonstrate that our methodology can be applied to mitigate the effect of adversarial attacks in the domain of recommender systems.
LGMay 23, 2025
Generalized Fisher-Weighted SVD: Scalable Kronecker-Factored Fisher Approximation for Compressing Large Language ModelsViktoriia Chekalina, Daniil Moskovskiy, Daria Cherniuk et al.
The Fisher information is a fundamental concept for characterizing the sensitivity of parameters in neural networks. However, leveraging the full observed Fisher information is too expensive for large models, so most methods rely on simple diagonal approximations. While efficient, this approach ignores parameter correlations, often resulting in reduced performance on downstream tasks. In this work, we mitigate these limitations and propose Generalized Fisher-Weighted SVD (GFWSVD), a post-training LLM compression technique that accounts for both diagonal and off-diagonal elements of the Fisher information matrix, providing a more accurate reflection of parameter importance. To make the method tractable, we introduce a scalable adaptation of the Kronecker-factored approximation algorithm for the observed Fisher information. We demonstrate the effectiveness of our method on LLM compression, showing improvements over existing compression baselines. For example, at a 20 compression rate on the MMLU benchmark, our method outperforms FWSVD, which is based on a diagonal approximation of the Fisher information, by 5 percent, SVD-LLM by 3 percent, and ASVD by 6 percent compression rate.
IRDec 4, 2023
Dynamic Collaborative Filtering for Matrix- and Tensor-based Recommender SystemsAlbert Saiapin, Ivan Oseledets, Evgeny Frolov
In production applications of recommender systems, a continuous data flow is employed to update models in real-time. Many recommender models often require complete retraining to adapt to new data. In this work, we introduce a novel collaborative filtering model for sequential problems known as Tucker Integrator Recommender - TIRecA. TIRecA efficiently updates its parameters using only the new data segment, allowing incremental addition of new users and items to the recommender system. To demonstrate the effectiveness of the proposed model, we conducted experiments on four publicly available datasets: MovieLens 20M, Amazon Beauty, Amazon Toys and Games, and Steam. Our comparison with general matrix and tensor-based baselines in terms of prediction quality and computational time reveals that TIRecA achieves comparable quality to the baseline methods, while being 10-20 times faster in training time.
IRApr 11, 2021
Dynamic Modeling of User Preferences for Stable RecommendationsOluwafemi Olaleke, Ivan Oseledets, Evgeny Frolov
In domains where users tend to develop long-term preferences that do not change too frequently, the stability of recommendations is an important factor of the perceived quality of a recommender system. In such cases, unstable recommendations may lead to poor personalization experience and distrust, driving users away from a recommendation service. We propose an incremental learning scheme that mitigates such problems through the dynamic modeling approach. It incorporates a generalized matrix form of a partial differential equation integrator that yields a dynamic low-rank approximation of time-dependent matrices representing user preferences. The scheme allows extending the famous PureSVD approach to time-aware settings and significantly improves its stability without sacrificing the accuracy in standard top-$n$ recommendations tasks.
IRAug 15, 2020
Performance of Hyperbolic Geometry Models on Top-N Recommendation TasksLeyla Mirvakhabova, Evgeny Frolov, Valentin Khrulkov et al.
We introduce a simple autoencoder based on hyperbolic geometry for solving standard collaborative filtering problem. In contrast to many modern deep learning techniques, we build our solution using only a single hidden layer. Remarkably, even with such a minimalistic approach, we not only outperform the Euclidean counterpart but also achieve a competitive performance with respect to the current state-of-the-art. We additionally explore the effects of space curvature on the quality of hyperbolic models and propose an efficient data-driven method for estimating its optimal value.
IRJul 27, 2018
Revealing the Unobserved by Linking Collaborative Behavior and Side KnowledgeEvgeny Frolov, Ivan Oseledets
We propose a tensor-based model that fuses a more granular representation of user preferences with the ability to take additional side information into account. The model relies on the concept of ordinal nature of utility, which better corresponds to actual user perception. In addition to that, unlike the majority of hybrid recommenders, the model ties side information directly to collaborative data, which not only addresses the problem of extreme data sparsity, but also allows to naturally exploit patterns in the observed behavior for a more meaningful representation of user intents. We demonstrate the effectiveness of the proposed model on several standard benchmark datasets. The general formulation of the approach imposes no restrictions on the type of observed interactions and makes it potentially applicable for joint modelling of context information along with side data.
LGFeb 18, 2018
HybridSVD: When Collaborative Information is Not EnoughEvgeny Frolov, Ivan Oseledets
We propose a new hybrid algorithm that allows incorporating both user and item side information within the standard collaborative filtering technique. One of its key features is that it naturally extends a simple PureSVD approach and inherits its unique advantages, such as highly efficient Lanczos-based optimization procedure, simplified hyper-parameter tuning and a quick folding-in computation for generating recommendations instantly even in highly dynamic online environments. The algorithm utilizes a generalized formulation of the singular value decomposition, which adds flexibility to the solution and allows imposing the desired structure on its latent space. Conveniently, the resulting model also admits an efficient and straightforward solution for the cold start scenario. We evaluate our approach on a diverse set of datasets and show its superiority over similar classes of hybrid models.
LGJul 14, 2016
Fifty Shades of Ratings: How to Benefit from a Negative Feedback in Top-N Recommendations TasksEvgeny Frolov, Ivan Oseledets
Conventional collaborative filtering techniques treat a top-n recommendations problem as a task of generating a list of the most relevant items. This formulation, however, disregards an opposite - avoiding recommendations with completely irrelevant items. Due to that bias, standard algorithms, as well as commonly used evaluation metrics, become insensitive to negative feedback. In order to resolve this problem we propose to treat user feedback as a categorical variable and model it with users and items in a ternary way. We employ a third-order tensor factorization technique and implement a higher order folding-in method to support online recommendations. The method is equally sensitive to entire spectrum of user ratings and is able to accurately predict relevant items even from a negative only feedback. Our method may partially eliminate the need for complicated rating elicitation process as it provides means for personalized recommendations from the very beginning of an interaction with a recommender system. We also propose a modification of standard metrics which helps to reveal unwanted biases and account for sensitivity to a negative feedback. Our model achieves state-of-the-art quality in standard recommendation tasks while significantly outperforming other methods in the cold-start "no-positive-feedback" scenarios.
LGMar 19, 2016
Tensor Methods and Recommender SystemsEvgeny Frolov, Ivan Oseledets
A substantial progress in development of new and efficient tensor factorization techniques has led to an extensive research of their applicability in recommender systems field. Tensor-based recommender models push the boundaries of traditional collaborative filtering techniques by taking into account a multifaceted nature of real environments, which allows to produce more accurate, situational (e.g. context-aware, criteria-driven) recommendations. Despite the promising results, tensor-based methods are poorly covered in existing recommender systems surveys. This survey aims to complement previous works and provide a comprehensive overview on the subject. To the best of our knowledge, this is the first attempt to consolidate studies from various application domains in an easily readable, digestible format, which helps to get a notion of the current state of the field. We also provide a high level discussion of the future perspectives and directions for further improvement of tensor-based recommendation systems.