Abhijit Mahalunkar

6 papers

1,112 citations

Novelty 45%

AI Score 26

Ranked #169,201 of 205,806 authors (top 82%) · #36,792 in LG (top 87%)

6 Papers

LG · Dec 8, 2020

Mutual Information Decay Curves and Hyper-Parameter Grid Search Design for Recurrent Neural Architectures

Abhijit Mahalunkar, John D. Kelleher

We present an approach to designing grid searches for hyper-parameter optimization of recurrent neural architectures. The basis for this approach is the use of mutual information to analyze long distance dependencies (LDDs) within a dataset. We also report a set of experiments that demonstrate how, using this approach, we obtain state-of-the-art results for DilatedRNNs across a range of benchmark datasets.
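As a rough illustration of the kind of analysis this abstract refers to, the sketch below estimates the mutual information between symbols that are d positions apart in a sequence and collects the values into a decay curve. It is a minimal plug-in (frequency-count) estimator written for this listing, not the authors' implementation; the function names `mutual_information_at_distance` and `decay_curve` are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information_at_distance(seq, d):
    """Plug-in estimate, in bits, of I(X_t; X_{t+d}) for a symbol sequence.

    Assumption: probabilities are estimated from raw pair counts, with no
    bias correction. This is an illustrative choice, not the paper's method.
    """
    pairs = list(zip(seq, seq[d:]))  # all (symbol, symbol d steps later) pairs
    n = len(pairs)
    if n == 0:
        return 0.0
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)   # marginal counts of the first symbol
    right = Counter(b for _, b in pairs)  # marginal counts of the second symbol
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # p_ab / (p_a * p_b), with p_a = left[a] / n and p_b = right[b] / n
        mi += p_ab * log2(p_ab * n * n / (left[a] * right[b]))
    return mi


def decay_curve(seq, max_d):
    """Mutual information at distances 1..max_d."""
    return [mutual_information_at_distance(seq, d) for d in range(1, max_d + 1)]
```

For example, `decay_curve("ab" * 50, 4)` stays close to 1 bit at every distance because the sequence is perfectly periodic, whereas a constant sequence gives 0 at every distance.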

LG · Jul 13, 2019

Multi-Element Long Distance Dependencies: Using SPk Languages to Explore the Characteristics of Long-Distance Dependencies

Abhijit Mahalunkar, John D. Kelleher

In order to successfully model Long Distance Dependencies (LDDs) it is necessary to understand the full range of the characteristics of the LDDs exhibited in a target dataset. In this paper, we use Strictly k-Piecewise languages to generate datasets with various properties. We then compute the characteristics of the LDDs in these datasets using mutual information and analyze the impact of factors such as (i) k, (ii) length of LDDs, (iii) vocabulary size, (iv) forbidden subsequences, and (v) dataset size. This analysis reveals that the number of interacting elements in a dependency is an important characteristic of LDDs. This leads us to the challenge of modelling multi-element long-distance dependencies. Our results suggest that attention mechanisms in neural networks may aid in modeling datasets with multi-element long-distance dependencies. However, we conclude that there is a need to develop more efficient attention mechanisms to address this issue.

CV · Dec 19, 2018

Generating Diverse and Meaningful Captions

Annika Lindh, Robert J. Ross, Abhijit Mahalunkar et al.

Image Captioning is a task that requires models to acquire a multi-modal understanding of the world and to express this understanding in natural language text. While the state-of-the-art for this task has rapidly improved in terms of n-gram metrics, these models tend to output the same generic captions for similar images. In this work, we address this limitation and train a model that generates more diverse and specific captions through an unsupervised training approach that incorporates a learning signal from an Image Retrieval model. We summarize previous results and improve the state-of-the-art on caption diversity and novelty. We make our source code publicly available online.

LG · Oct 6, 2018

Understanding Recurrent Neural Architectures by Analyzing and Synthesizing Long Distance Dependencies in Benchmark Sequential Datasets

Abhijit Mahalunkar, John D. Kelleher

In order to build efficient deep recurrent neural architectures, it is essential to analyze the complexity of long distance dependencies (LDDs) of the dataset being modeled. In this paper, we present a detailed analysis of the dependency decay curve exhibited by various datasets. The datasets sampled from a similar process (e.g. natural language, sequential MNIST, Strictly k-Piecewise languages, etc.) display variations in the properties of the dependency decay curve. Our analysis reveals the factors resulting in these variations, such as (i) number of unique symbols in a dataset, (ii) size of the dataset, (iii) number of interacting symbols within a given LDD, and (iv) the distance between the interacting symbols. We test these factors by generating synthesized datasets of the Strictly k-Piecewise languages. Another advantage of these synthesized datasets is that they enable targeted testing of deep recurrent neural architectures in terms of their ability to model LDDs with different characteristics. We also demonstrate that analysing dependency decay curves can inform the selection of optimal hyper-parameters for SOTA deep recurrent neural architectures. This analysis can directly contribute to the development of more accurate and efficient sequential models.

LG · Aug 15, 2018

Using Regular Languages to Explore the Representational Capacity of Recurrent Neural Architectures

Abhijit Mahalunkar, John D. Kelleher

The presence of Long Distance Dependencies (LDDs) in sequential data poses significant challenges for computational models. Various recurrent neural architectures have been designed to mitigate this issue. In order to test these state-of-the-art architectures, there is a growing need for rich benchmarking datasets. However, one of the drawbacks of existing datasets is the lack of experimental control with regard to the presence and/or degree of LDDs. This lack of control limits the analysis of model performance in relation to the specific challenge posed by LDDs. One way to address this is to use synthetic data having the properties of subregular languages. The degree of LDDs within the generated data can be controlled through the k parameter, length of the generated strings, and by choosing appropriate forbidden strings. In this paper, we explore the capacity of different RNN extensions to model LDDs, by evaluating these models on a sequence of SPk synthesized datasets, where each subsequent dataset exhibits a longer degree of LDD. Even though SPk are simple languages, the presence of LDDs does have significant impact on the performance of recurrent neural architectures, thus making them prime candidates for benchmarking tasks.
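To make the role of forbidden strings and string length concrete, here is a minimal sketch of an SPk-style dataset. A string is treated as grammatical when none of the forbidden strings occurs in it as a (not necessarily contiguous) subsequence, and k is the length of the longest forbidden string. The rejection-sampling generator and all function names are assumptions made for illustration; the abstract does not describe the authors' own generation procedure.

```python
import random


def contains_subsequence(string, pattern):
    """True if the symbols of `pattern` occur in `string` in order, gaps allowed."""
    remaining = iter(string)
    # `symbol in remaining` advances the iterator, so order is respected.
    return all(symbol in remaining for symbol in pattern)


def in_sp_language(string, forbidden):
    """True if `string` contains none of the forbidden subsequences."""
    return not any(contains_subsequence(string, f) for f in forbidden)


def sample_sp_strings(alphabet, length, forbidden, count, seed=0, max_tries=100_000):
    """Draw up to `count` grammatical strings by rejection sampling.

    Assumption: uniform random candidates filtered by membership. This is a
    simple illustrative generator, not the procedure used in the paper.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(max_tries):
        if len(samples) == count:
            break
        candidate = "".join(rng.choice(alphabet) for _ in range(length))
        if in_sp_language(candidate, forbidden):
            samples.append(candidate)
    return samples
```

With `forbidden = {"ab"}` (k = 2), the string "acca" is grammatical while "acb" is not, because an "a" is eventually followed by a "b". Increasing `length` stretches the distance over which that constraint must be tracked.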

CY · Mar 12, 2018

Addressing the Free-Rider Problem in Public Transport Systems

Vaibhav Kulkarni, Bertil Chapuis, Benoît Garbinato et al.

Public transport networks constitute an indispensable part of a city by providing mobility services to the general masses. To improve ease of access and reduce infrastructural investments, public transport authorities often adopt a proof-of-payment system. Such a system operates by eliminating ticket controls when boarding the vehicle and subjecting the travelers to random ticket checks by affiliated personnel (controllers). Although cost efficient, such a system promotes free-riders, who deliberately decide to evade fares for the transport service. A recent survey by the association of European transport estimates hefty income losses due to fare evasion, highlighting that free-riding is a serious problem that needs immediate attention. To this end, we highlight the attack vectors which can be exploited by free-riders by analyzing the crowdsourced data about the control-locations. Next, we propose a framework to generate randomized control-location traces by using generative adversarial networks (GANs) in order to minimize the attack vectors. Finally, we propose metrics to evaluate such a system, quantified in terms of increased risk and higher probability of being subjected to control checks across the city.