IRSep 12, 2023
SAGE: Structured Attribute Value Generation for Billion-Scale Product CatalogsAthanasios N. Nikolakopoulos, Swati Kaul, Siva Karthik Gade et al.
We introduce SAGE; a Generative LLM for inferring attribute values for products across world-wide e-Commerce catalogs. We introduce a novel formulation of the attribute-value prediction problem as a Seq2Seq summarization task, across languages, product types and target attributes. Our novel modeling approach lifts the restriction of predicting attribute values within a pre-specified set of choices, as well as, the requirement that the sought attribute values need to be explicitly mentioned in the text. SAGE can infer attribute values even when such values are mentioned implicitly using periphrastic language, or not-at-all-as is the case for common-sense defaults. Additionally, SAGE is capable of predicting whether an attribute is inapplicable for the product at hand, or non-obtainable from the available information. SAGE is the first method able to tackle all aspects of the attribute-value-prediction task as they arise in practical settings in e-Commerce catalogs. A comprehensive set of experiments demonstrates the effectiveness of the proposed approach, as well as, its superiority against state-of-the-art competing alternatives. Moreover, our experiments highlight SAGE's ability to tackle the task of predicting attribute values in zero-shot setting; thereby, opening up opportunities for significantly reducing the overall number of labeled examples required for training.
IRSep 9, 2021
Trust your neighbors: A comprehensive survey of neighborhood-based methods for recommender systemsAthanasios N. Nikolakopoulos, Xia Ning, Christian Desrosiers et al.
Collaborative recommendation approaches based on nearest-neighbors are still highly popular today due to their simplicity, their efficiency, and their ability to produce accurate and personalized recommendations. This chapter offers a comprehensive survey of neighborhood-based methods for the item recommendation problem. It presents the main characteristics and benefits of such methods, describes key design choices for implementing a neighborhood-based recommender system, and gives practical information on how to make these choices. A broad range of methods is covered in the chapter, including traditional algorithms like k-nearest neighbors as well as advanced approaches based on matrix factorization, sparse coding and random walks.
NAOct 13, 2020
Projection techniques to update the truncated SVD of evolving matricesVassilis Kalantzis, Georgios Kollias, Shashanka Ubaru et al.
This paper considers the problem of updating the rank-k truncated Singular Value Decomposition (SVD) of matrices subject to the addition of new rows and/or columns over time. Such matrix problems represent an important computational kernel in applications such as Latent Semantic Indexing and Recommender Systems. Nonetheless, the proposed framework is purely algebraic and targets general updating problems. The algorithm presented in this paper undertakes a projection view-point and focuses on building a pair of subspaces which approximate the linear span of the sought singular vectors of the updated matrix. We discuss and analyze two different choices to form the projection subspaces. Results on matrices from real applications suggest that the proposed algorithm can lead to higher accuracy, especially for the singular triplets associated with the largest modulus singular values. Several practical details and key differences with other approaches are also discussed.
SIAug 29, 2020
Random Surfing Revisited: Generalizing PageRank's Teleportation ModelAthanasios N. Nikolakopoulos
We revisit the Random Surfer model, focusing on its--often overlooked--Teleportation component, and we introduce NCDawareRank; a novel ranking framework designed to exploit network meta-information as well as aspects of its higher-order structural organization in a way that preserves the mathematical structure and the attractive computational characteristics of PageRank. A rigorous theoretical exploration of the proposed model reveals a wealth of mathematical properties that entail tangible benefits in terms of robustness, computability, as well as modeling flexibility and expressiveness. A set of experiments on real-work networks verify the theoretically predicted properties of NCDawareRank, and showcase its effectiveness as a network centrality measure.
IRSep 9, 2019
Boosting Item-based Collaborative Filtering via Nearly Uncoupled Random WalksAthanasios N. Nikolakopoulos, George Karypis
Item-based models are among the most popular collaborative filtering approaches for building recommender systems. Random walks can provide a powerful tool for harvesting the rich network of interactions captured within these models. They can exploit indirect relations between the items, mitigate the effects of sparsity, ensure wider itemspace coverage, as well as increase the diversity of recommendation lists. Their potential, however, can be hindered by the tendency of the walks to rapidly concentrate towards the central nodes of the graph, thereby significantly restricting the range of K-step distributions that can be exploited for personalized recommendations. In this work we introduce RecWalk; a novel random walk-based method that leverages the spectral properties of nearly uncoupled Markov chains to provably lift this limitation and prolong the influence of users' past preferences on the successive steps of the walk---allowing the walker to explore the underlying network more fruitfully. A comprehensive set of experiments on real-world datasets verify the theoretically predicted properties of the proposed approach and indicate that they are directly linked to significant improvements in top-n recommendation accuracy. They also highlight RecWalk's potential in providing a framework for boosting the performance of item-based models. RecWalk achieves state-of-the-art top-n recommendation quality outperforming several competing approaches, including recently proposed methods that rely on deep neural networks.
MLApr 5, 2018
Adaptive Diffusions for Scalable Learning over GraphsDimitris Berberidis, Athanasios N. Nikolakopoulos, Georgios B. Giannakis
Diffusion-based classifiers such as those relying on the Personalized PageRank and the Heat kernel, enjoy remarkable classification accuracy at modest computational requirements. Their performance however is affected by the extent to which the chosen diffusion captures a typically unknown label propagation mechanism, that can be specific to the underlying graph, and potentially different for each class. The present work introduces a disciplined, data-efficient approach to learning class-specific diffusion functions adapted to the underlying network topology. The novel learning approach leverages the notion of "landing probabilities" of class-specific random walks, which can be computed efficiently, thereby ensuring scalability to large graphs. This is supported by rigorous analysis of the properties of the model as well as the proposed algorithms. Furthermore, a robust version of the classifier facilitates learning even in noisy environments. Classification tests on real networks demonstrate that adapting the diffusion function to the given graph and observed labels, significantly improves the performance over fixed diffusions; reaching -- and many times surpassing -- the classification accuracy of computationally heavier state-of-the-art competing methods, that rely on node embeddings and deep neural networks.
MLNov 28, 2017
Kernel-based Inference of Functions over GraphsVassilis N. Ioannidis, Meng Ma, Athanasios N. Nikolakopoulos et al.
The study of networks has witnessed an explosive growth over the past decades with several ground-breaking methods introduced. A particularly interesting -- and prevalent in several fields of study -- problem is that of inferring a function defined over the nodes of a network. This work presents a versatile kernel-based framework for tackling this inference problem that naturally subsumes and generalizes the reconstruction approaches put forth recently by the signal processing on graphs community. Both the static and the dynamic settings are considered along with effective modeling approaches for addressing real-world problems. The herein analytical discussion is complemented by a set of numerical examples, which showcase the effectiveness of the presented techniques, as well as their merits related to state-of-the-art methods.
IRNov 19, 2015
EigenRec: Generalizing PureSVD for Effective and Efficient Top-N RecommendationsAthanasios N. Nikolakopoulos, Vassilis Kalantzis, Efstratios Gallopoulos et al.
We introduce EigenRec; a versatile and efficient Latent-Factor framework for Top-N Recommendations that includes the well-known PureSVD algorithm as a special case. EigenRec builds a low dimensional model of an inter-item proximity matrix that combines a similarity component, with a scaling operator, designed to control the influence of the prior item popularity on the final model. Seeing PureSVD within our framework provides intuition about its inner workings, exposes its inherent limitations, and also, paves the path towards painlessly improving its recommendation performance. A comprehensive set of experiments on the MovieLens and the Yahoo datasets based on widely applied performance metrics, indicate that EigenRec outperforms several state-of-the-art algorithms, in terms of Standard and Long-Tail recommendation accuracy, exhibiting low susceptibility to sparsity, even in its most extreme manifestations -- the Cold-Start problems. At the same time EigenRec has an attractive computational profile and it can apply readily in large-scale recommendation settings.
IRJun 30, 2015
Top-N recommendations in the presence of sparsity: An NCD-based approachAthanasios N. Nikolakopoulos, John D. Garofalakis
Making recommendations in the presence of sparsity is known to present one of the most challenging problems faced by collaborative filtering methods. In this work we tackle this problem by exploiting the innately hierarchical structure of the item space following an approach inspired by the theory of Decomposability. We view the itemspace as a Nearly Decomposable system and we define blocks of closely related elements and corresponding indirect proximity components. We study the theoretical properties of the decomposition and we derive sufficient conditions that guarantee full item space coverage even in cold-start recommendation scenarios. A comprehensive set of experiments on the MovieLens and the Yahoo!R2Music datasets, using several widely applied performance metrics, support our model's theoretically predicted properties and verify that NCDREC outperforms several state-of-the-art algorithms, in terms of recommendation accuracy, diversity and sparseness insensitivity.
SIMay 30, 2015
Random Surfing Without TeleportationAthanasios N. Nikolakopoulos, John D. Garofalakis
In the standard Random Surfer Model, the teleportation matrix is necessary to ensure that the final PageRank vector is well-defined. The introduction of this matrix, however, results in serious problems and imposes fundamental limitations to the quality of the ranking vectors. In this work, building on the recently proposed NCDawareRank framework, we exploit the decomposition of the underlying space into blocks, and we derive easy to check necessary and sufficient conditions for random surfing without teleportation.