Zeon Trevor Fernando

5 papers

105 citations

Novelty: 22%

AI Score: 17

Ranked #200,614 of 205,806 authors (top 97%); #2,120 in IR (top 95%)

5 Papers

IR · Mar 30, 2021

An In-depth Analysis of Passage-Level Label Transfer for Contextual Document Ranking

Koustav Rudra, Zeon Trevor Fernando, Avishek Anand

Pre-trained contextual language models such as BERT, GPT, and XLNet work quite well for document retrieval tasks. Such models are fine-tuned on query-document or query-passage relevance labels to capture ranking signals. However, documents are longer than passages, and document ranking models therefore run into BERT's 512-token limit. Researchers have proposed ranking strategies that either truncate documents beyond the token limit or chunk them into units that fit into BERT. In the latter case, the relevance labels are either transferred directly from the original query-document pair or learned through some external model. In this paper, we conduct a detailed study of how design decisions about splitting and label transfer affect retrieval effectiveness and efficiency. We find that direct transfer of relevance labels from documents to passages introduces label noise that strongly affects retrieval effectiveness for large training datasets. We also find that query processing times are adversely affected by fine-grained splitting schemes. As a remedy, we propose a careful passage-level labelling scheme using weak supervision that delivers improved performance (3-14% in terms of nDCG score) over most of the recently proposed models for ad-hoc retrieval while maintaining manageable computational complexity on four diverse document retrieval datasets.
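The chunking and direct label transfer described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the whitespace-style token list, and the non-overlapping window are all assumptions made for illustration, and in practice the passage budget would be smaller than 512 because the query and special tokens share the same input window.

```python
from typing import List, Tuple


def split_into_passages(tokens: List[str], max_len: int = 512) -> List[List[str]]:
    """Chunk a token list into consecutive, non-overlapping windows of at most max_len tokens."""
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    return [tokens[i:i + max_len] for i in range(0, len(tokens), max_len)]


def transfer_labels(
    query: str, doc_tokens: List[str], doc_label: int, max_len: int = 512
) -> List[Tuple[str, List[str], int]]:
    """Direct label transfer: every passage inherits the document-level label.

    This is the noisy baseline the abstract discusses: a passage from a
    relevant document is marked relevant even if it never mentions the topic.
    """
    return [(query, passage, doc_label) for passage in split_into_passages(doc_tokens, max_len)]


# Hypothetical 1200-token document judged relevant (label 1) for the query.
doc = [f"tok{i}" for i in range(1200)]
examples = transfer_labels("passage ranking", doc, doc_label=1)
print([len(passage) for _, passage, _ in examples])
```

Each passage carries the same label regardless of its content, which is where the label noise comes from; a weakly supervised scheme would instead assign labels per passage.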

IR · Jul 15, 2019

A study on the Interpretability of Neural Retrieval Models using DeepSHAP

Zeon Trevor Fernando, Jaspreet Singh, Avishek Anand

A recent trend in IR has been the use of neural networks to learn retrieval models for text-based ad-hoc search. While various approaches and architectures have yielded significantly better performance than traditional retrieval models such as BM25, it is still difficult to understand exactly why a document is relevant to a query. In the ML community, several approaches for explaining decisions made by deep neural networks have been proposed, including DeepSHAP, which modifies the DeepLIFT algorithm to estimate the relative importance (Shapley values) of input features for a given decision by comparing the activations in the network for a given input against the activations caused by a reference input. In image classification, the reference input tends to be a plain black image. While DeepSHAP has been well studied for image classification tasks, it remains to be seen how we can adapt it to explain the output of Neural Retrieval Models (NRMs). In particular, what is a good "black" image in the context of IR? In this paper we explore various techniques for constructing reference input documents. Additionally, we compare the explanations generated by DeepSHAP to those of LIME (a model-agnostic approach) and find that the explanations differ considerably. Our study raises concerns regarding the robustness and accuracy of explanations produced for NRMs. With this paper we aim to shed light on interesting problems surrounding interpretability in NRMs and highlight areas of future work.
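The role of the reference input can be made concrete with a small sketch. It computes exact Shapley values by enumerating token subsets, which is not DeepSHAP (DeepSHAP approximates these values through DeepLIFT-style backpropagation); the toy term-overlap scorer, the `[PAD]` reference document, and all names are assumptions chosen only for illustration.

```python
from itertools import combinations
from math import factorial
from typing import Callable, List, Sequence, Set


def shapley_values(
    score: Callable[[Sequence[str]], float],
    doc: Sequence[str],
    reference: Sequence[str],
) -> List[float]:
    """Exact Shapley value of each document position relative to a reference document.

    Positions outside the coalition are filled with the reference token, so
    the choice of reference directly shapes the explanation.
    """
    n = len(doc)
    if len(reference) != n:
        raise ValueError("doc and reference must have the same length")

    def value(coalition: frozenset) -> float:
        mixed = [doc[i] if i in coalition else reference[i] for i in range(n)]
        return score(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def overlap_score(query_terms: Set[str], tokens: Sequence[str]) -> float:
    """Toy stand-in for a retrieval model: counts tokens that match a query term."""
    return float(sum(1 for t in tokens if t in query_terms))


query = {"neural", "ranking"}
doc = ["neural", "models", "for", "ranking"]
reference = ["[PAD]"] * len(doc)  # assumed reference document, analogous to a black image
phi = shapley_values(lambda toks: overlap_score(query, toks), doc, reference)
print(phi)
```

With this additive scorer the attributions land on the two query terms, and the values sum to the score difference between the document and the reference. Swapping in a different reference document changes that difference and therefore the attributions, which is the design question the abstract raises. The enumeration is exponential in document length, so it only suits toy inputs.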

HC · Aug 15, 2018

LogCanvas: Visualizing Search History Using Knowledge Graphs

Luyan Xu, Zeon Trevor Fernando, Xuan Zhou et al.

In this demo paper, we introduce LogCanvas, a platform for user search history visualisation. Unlike existing visualisation tools, LogCanvas focuses on helping users reconstruct the semantic relationships among their search activities. LogCanvas segments a user's search history into different sessions and generates a knowledge graph to represent the information exploration process in each session. A knowledge graph is composed of the most important concepts or entities discovered by each search query as well as their relationships. It thus captures the semantic relationships among the queries. LogCanvas offers a session timeline viewer and a snippets viewer to enable users to re-find their previous search results efficiently. LogCanvas also provides a collaborative perspective to support a group of users in sharing search results and experience.
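The abstract does not say how LogCanvas segments a search history into sessions. One common heuristic, shown here purely as an assumption, is to start a new session whenever the gap between consecutive queries exceeds an inactivity threshold; the 30-minute value and all names below are hypothetical.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

LogEntry = Tuple[datetime, str]  # (timestamp, query text)


def segment_sessions(
    log: List[LogEntry], gap: timedelta = timedelta(minutes=30)
) -> List[List[LogEntry]]:
    """Group queries into sessions; a pause longer than `gap` starts a new session."""
    sessions: List[List[LogEntry]] = []
    for entry in sorted(log, key=lambda e: e[0]):
        if sessions and entry[0] - sessions[-1][-1][0] <= gap:
            sessions[-1].append(entry)
        else:
            sessions.append([entry])
    return sessions


history = [
    (datetime(2018, 8, 15, 10, 0), "knowledge graph"),
    (datetime(2018, 8, 15, 10, 10), "knowledge graph visualisation"),
    (datetime(2018, 8, 15, 11, 30), "search history tools"),
]
for session in segment_sessions(history):
    print([query for _, query in session])
```

A per-session knowledge graph would then be built from the entities found in each group of queries; that step depends on an entity extractor and is left out of this sketch.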

HC · Sep 9, 2015

LearnWeb-OER: Improving Accessibility of Open Educational Resources

Jaspreet Singh, Zeon Trevor Fernando, Saniya Chawla

In addition to user-generated content, Open Educational Resources are increasingly made available on the Web by several institutions and organizations with the aim of being re-used. Nevertheless, it is still difficult for users to find appropriate resources for specific learning scenarios among the vast amount offered on the Web. Our goal is to give users the opportunity to search for authentic resources from the Web and reuse them in a learning context. The LearnWeb-OER platform enhances collaborative searching and sharing of educational resources, providing specific means and facilities for education. In the following, we describe the functionalities that support users in collaboratively collecting, selecting, annotating and discussing search results and learning resources.

IR · Mar 12, 2015

Capturing, Documenting and Visualizing Search Contexts for building Multimedia Corpora

Zeon Trevor Fernando

In Social Science research, multimedia documents are often collected to answer particular research questions like "Which of the aesthetic properties of a photo are considered important on the web?" or "How has Street Art developed over the past 50 years?". To do so, a researcher generally issues multiple queries to a number of search engines. This activity may span long time intervals and results in a collection which can be further analyzed. Documenting the collection building process, including the context of the searches carried out, is imperative for social scientists to reproduce their research. Such context documentation consists of several user actions and search attributes: the issued queries; the results clicked and saved; the duration for which a particular result was viewed; the set of results that was displayed but neither clicked nor saved; as well as user annotations like comments or tags. In this work we describe a search process tracking module and a search history visualization module. These modules can be integrated into keyword-based search systems through a REST API which was developed to help capture, document and revisit past search contexts while building a web corpus. Finally, we detail how the module was integrated into the LearnWeb2.0 platform, a multimedia Web 2.0 search and sharing application which can obtain resources from various Web 2.0 tools such as YouTube, Bing, Flickr, etc., using keyword search.
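The search context attributes listed in this abstract map naturally onto a small record type. The sketch below is one possible shape, not the schema of the module or its REST API: the class name, field names, and the use of result identifiers as plain strings are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SearchContext:
    """One issued query plus the user actions recorded for its result list."""

    query: str
    displayed: List[str]  # result identifiers shown to the user
    clicked: List[str] = field(default_factory=list)
    saved: List[str] = field(default_factory=list)
    view_seconds: Dict[str, float] = field(default_factory=dict)  # result id -> viewing time
    annotations: Dict[str, List[str]] = field(default_factory=dict)  # result id -> comments or tags

    def ignored(self) -> List[str]:
        """Results that were displayed but neither clicked nor saved."""
        touched = set(self.clicked) | set(self.saved)
        return [r for r in self.displayed if r not in touched]


ctx = SearchContext(
    query="street art",
    displayed=["r1", "r2", "r3"],
    clicked=["r1"],
    saved=["r2"],
    view_seconds={"r1": 42.0},
    annotations={"r2": ["mural", "1990s"]},
)
print(ctx.ignored())
```

Deriving the ignored set from the displayed list, rather than storing it separately, keeps the record consistent when clicks or saves are logged later.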