Dwaipayan Roy

h-index: 13

4 papers

170 citations

Novelty: 38%

AI Score: 22

Ranked #180,027 of 194,257 authors (top 93%); #1,938 in IR (top 89%)

4 Papers

1.2 · DL · Jun 8, 2021

ConSTR: A Contextual Search Term Recommender

Thomas Krämer, Zeljko Carevic, Dwaipayan Roy et al.

In this demo paper, we present ConSTR, a novel Contextual Search Term Recommender that utilises the user's interaction context for search term recommendation and literature retrieval. ConSTR integrates a two-layered recommendation interface: the first layer suggests terms with respect to a user's current search term, and the second layer suggests terms based on the user's previous search activities (interaction context). For the demonstration, ConSTR is built on arXiv, an academic repository consisting of 1.8 million documents.

4.3 · IR · Apr 14, 2020

Tag Embedding Based Personalized Point Of Interest Recommendation System

Suraj Agrawal, Dwaipayan Roy, Mandar Mitra

Personalized Point of Interest recommendation is very helpful for satisfying users' needs at new places. In this article, we propose a tag embedding based method for personalized Point of Interest recommendation. We model the relationship between the tags associated with Points of Interest. The model provides a representative embedding for each tag such that related tags lie closer together. We model each Point of Interest based on its tag embeddings and model each user (user profile) based on the Points of Interest they have rated. Finally, we rank a user's candidate Points of Interest by the cosine similarity between the user's embedding and each Point of Interest's embedding. Further, we find the parameters required to model a user by discrete optimization over different measures (such as nDCG@5, MRR, ...). We also analyze the results when using the same parameters for all users versus individual parameters for each user. In addition, we analyze the effect on the results of changing the dataset used to model the relationship between tags. Our method also minimizes the privacy leak issue. We used the TREC Contextual Suggestion 2016 Phase 2 dataset and obtained significant improvement on all measures over the state-of-the-art method. It improves nDCG@5 by 12.8%, P@5 by 4.3%, and MRR by 7.8%, which shows the effectiveness of the method.
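The ranking step described in this abstract can be illustrated with a minimal sketch. The abstract does not say how tag embeddings are aggregated into a Point of Interest embedding or how ratings shape the user profile, so the plain average and the rating-weighted average used here are assumptions, as are all function names, tags, places, and vectors (toy data, not the TREC dataset).

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def poi_embedding(tags, tag_vectors):
    """Assumed aggregation: plain average of the embeddings of a place's tags."""
    vectors = [tag_vectors[t] for t in tags]
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def user_embedding(ratings, poi_tags, tag_vectors):
    """Assumed user profile: rating-weighted average of rated place embeddings."""
    total = sum(ratings.values())
    weighted = [
        [rating * x for x in poi_embedding(poi_tags[poi], tag_vectors)]
        for poi, rating in ratings.items()
    ]
    return [sum(col) / total for col in zip(*weighted)]

def rank_candidates(user_vec, candidates, poi_tags, tag_vectors):
    """Sort candidate places by cosine similarity to the user embedding."""
    scored = [
        (poi, cosine(user_vec, poi_embedding(poi_tags[poi], tag_vectors)))
        for poi in candidates
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical two-dimensional tag embeddings; related tags point the same way.
tag_vectors = {
    "museum": [1.0, 0.0], "art": [0.9, 0.1],
    "bar": [0.0, 1.0], "nightlife": [0.1, 0.9],
}
poi_tags = {
    "gallery": ["museum", "art"], "pub": ["bar", "nightlife"],
    "history_museum": ["museum"], "club": ["nightlife"],
}
ratings = {"gallery": 4, "pub": 1}  # this user clearly prefers the gallery

user_vec = user_embedding(ratings, poi_tags, tag_vectors)
ranking = rank_candidates(user_vec, ["club", "history_museum"], poi_tags, tag_vectors)
print(ranking)
```

With these toy vectors the museum-like candidate is ranked above the nightlife one, because the user profile is dominated by the highly rated gallery.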

12.9 · IR · Jun 25, 2016

Representing Documents and Queries as Sets of Word Embedded Vectors for Information Retrieval

Dwaipayan Roy, Debasis Ganguly, Mandar Mitra et al.

A major difficulty in applying word vector embeddings in IR is in devising an effective and efficient strategy for obtaining representations of compound units of text, such as whole documents (in comparison to atomic words), for the purpose of indexing and scoring documents. Instead of striving for a suitable method for obtaining a single vector representation of a large text document, we aim to develop a similarity metric that makes use of the similarities between the individual embedded word vectors in a document and a query. More specifically, we represent a document and a query as sets of word vectors, and use a standard notion of similarity between these sets, computed as a function of the similarities between each constituent word pair from these sets. We then use this similarity measure in combination with standard IR based similarities for document ranking. The results of our initial experimental investigations show that our proposed method improves MAP by up to $5.77\%$, in comparison to standard text-based language model similarity, on the TREC ad-hoc dataset.
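A small sketch may help make the set-based similarity concrete. The abstract only says the score is "a function of the similarities between each constituent word pair"; the mean of pairwise cosine similarities used below is one possible choice and an assumption, as are the linear interpolation weight `alpha`, the function names, and the toy vectors.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def set_similarity(query_vecs, doc_vecs):
    """Assumed set similarity: mean cosine over all (query word, document word) pairs."""
    pairs = [(q, d) for q in query_vecs for d in doc_vecs]
    if not pairs:
        return 0.0
    return sum(cosine(q, d) for q, d in pairs) / len(pairs)

def combined_score(text_score, embedding_score, alpha=0.5):
    """Assumed combination: linear interpolation with a text-based retrieval score."""
    return alpha * text_score + (1 - alpha) * embedding_score

# Hypothetical word vectors for a one-word query and two short documents.
query_vecs = [[1.0, 0.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0]]  # one matching word, one unrelated word
doc_b = [[0.0, 1.0]]              # only an unrelated word

sim_a = set_similarity(query_vecs, doc_a)
sim_b = set_similarity(query_vecs, doc_b)
print(sim_a, sim_b)
print(combined_score(0.2, sim_a))
```

Averaging over every pair costs time proportional to the product of the two set sizes, which is one reason a practical system might restrict the document side to fewer vectors; that trade-off is a general observation, not a claim about the paper's implementation.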

27.3 · IR · Jun 24, 2016

Using Word Embeddings for Automatic Query Expansion

Dwaipayan Roy, Debjyoti Paul, Mandar Mitra et al.

In this paper, a framework for Automatic Query Expansion (AQE) is proposed using the distributed neural language model word2vec. Using semantic and contextual relations in a distributed and unsupervised framework, word2vec learns a low-dimensional embedding for each vocabulary entry. Using such a framework, we devise a query expansion technique in which terms related to a query are obtained by a K-nearest neighbor approach. We explore the performance of the AQE methods, with and without feedback query expansion, and a variant of simple K-nearest neighbor in the proposed framework. Experiments on standard TREC ad-hoc data (Disks 4 and 5 with query sets 301-450 and 601-700) and web data (WT10G with query set 451-550) show significant improvement over standard term-overlap based retrieval methods. However, the proposed method fails to achieve performance comparable to statistical co-occurrence based feedback methods such as RM3. We have also found that the word2vec based query expansion methods perform similarly with and without any feedback information.
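The K-nearest neighbor expansion idea in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the paper's method: the vectors are hand-made toy values rather than trained word2vec embeddings, neighbors are found per query term by cosine similarity, and the function names and the rule for merging neighbors into the query are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def nearest_terms(term, embeddings, k):
    """Return the k vocabulary terms most similar to `term`, excluding itself."""
    target = embeddings[term]
    scored = [
        (other, cosine(target, vec))
        for other, vec in embeddings.items()
        if other != term
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

def expand_query(query_terms, embeddings, k=2):
    """Append each query term's k nearest neighbors, skipping duplicates."""
    expanded = list(query_terms)
    for term in query_terms:
        if term not in embeddings:
            continue  # out-of-vocabulary terms are left unexpanded
        for neighbor, _ in nearest_terms(term, embeddings, k):
            if neighbor not in expanded:
                expanded.append(neighbor)
    return expanded

# Hypothetical embeddings; a real system would load trained word2vec vectors.
embeddings = {
    "car": [1.0, 0.1],
    "automobile": [0.9, 0.2],
    "vehicle": [0.8, 0.3],
    "banana": [0.0, 1.0],
}

expanded = expand_query(["car"], embeddings, k=2)
print(expanded)
```

In a retrieval pipeline the expanded terms would typically be given lower weights than the original query terms; how the weights are set is left open here.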