CV Feb 5, 2023
Towards Precision in Appearance-based Gaze Estimation in the Wild
Murthy L. R. D., Abhishek Mukhopadhyay, Shambhavi Aggarwal et al.
Appearance-based gaze estimation systems have shown great progress recently, yet the performance of these techniques depends on the datasets used for training. Most existing gaze estimation datasets set up in interactive settings were recorded in laboratory conditions, and those recorded in the wild display limited head pose and illumination variations. Further, we observed that little attention has been paid so far to precision evaluations of existing gaze estimation approaches. In this work, we present a large gaze estimation dataset, PARKS-Gaze, with wider head pose and illumination variation and with multiple samples for a single Point of Gaze (PoG). The dataset contains 974 minutes of data from 28 participants with a head pose range of 60 degrees in both yaw and pitch directions. Our within-dataset, cross-dataset, and precision evaluations indicate that the proposed dataset is more challenging and enables models to generalize to unseen participants better than the existing in-the-wild datasets. The project page can be accessed here: https://github.com/lrdmurthy/PARKS-Gaze
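Because the dataset records multiple samples for a single PoG, accuracy and precision can be separated. The sketch below is one common way to do that and is not necessarily the metric used in the paper: accuracy is taken as the mean angular error against the target, and precision as the RMS angular deviation of the predictions from their own mean direction. The function names and the pitch/yaw axis convention are assumptions made for illustration.

```python
import math

def gaze_vector(pitch_deg, yaw_deg):
    """Convert pitch/yaw (degrees) to a 3D unit gaze vector.
    The axis convention is an assumption, not taken from the paper."""
    p, y = math.radians(pitch_deg), math.radians(yaw_deg)
    return (-math.cos(p) * math.sin(y), -math.sin(p), -math.cos(p) * math.cos(y))

def angle_between(u, v):
    """Angle in degrees between two unit vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    dot = max(-1.0, min(1.0, dot))  # guard against rounding just outside [-1, 1]
    return math.degrees(math.acos(dot))

def accuracy_and_precision(predictions, target):
    """predictions: list of (pitch, yaw) estimates for ONE point of gaze.
    target: ground-truth (pitch, yaw) for that point of gaze.
    Returns (mean angular error, RMS deviation from the mean direction)."""
    vectors = [gaze_vector(p, y) for p, y in predictions]
    target_vec = gaze_vector(*target)

    # Accuracy: how far the estimates are from the true direction on average.
    accuracy = sum(angle_between(v, target_vec) for v in vectors) / len(vectors)

    # Precision: how tightly the estimates cluster, regardless of the target.
    # Assumes the predictions do not cancel out to a zero mean vector.
    mean = [sum(c) / len(vectors) for c in zip(*vectors)]
    norm = math.sqrt(sum(c * c for c in mean))
    mean_dir = tuple(c / norm for c in mean)
    precision = math.sqrt(
        sum(angle_between(v, mean_dir) ** 2 for v in vectors) / len(vectors)
    )
    return accuracy, precision
```

With this split, a model can have a large mean error yet a small spread (consistently biased), or the reverse, which is why the two numbers are worth reporting separately.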
CV Aug 19, 2022
To show or not to show: Redacting sensitive text from videos of electronic displays
Abhishek Mukhopadhyay, Shubham Agarwal, Patrick Dylan Zwick et al.
With the increasing prevalence of video recordings, there is a growing need for tools that can maintain the privacy of those recorded. In this paper, we define an approach for redacting personally identifiable text from videos using a combination of optical character recognition (OCR) and natural language processing (NLP) techniques. We examine the relative performance of this approach when used with different OCR models, specifically Tesseract and the OCR system from Google Cloud Vision (GCV). For the proposed approach, the performance of GCV, in both accuracy and speed, is significantly higher than that of Tesseract. Finally, we explore the advantages and disadvantages of both models in real-world applications.
HC Jan 4, 2021
Eye Tracking to Understand Impact of Aging on Mobile Phone Applications
Antony William Joseph, Jeevitha Shree DV, Kamal Preet Singh Saluja et al.
Usage of smartphones and tablets has been increasing rapidly with multi-touch interaction and powerful configurations. Performing tasks on mobile phones becomes more complex as people age, thereby increasing their cognitive workload. In this context, we conducted an eye tracking study with 50 participants aged from 20 to over 60 years, living in Bangalore, India. This paper focuses on the visual nature of interaction with mobile user interfaces. The study aims to investigate how aging affects user experience on mobile phones while performing complex tasks, and to estimate cognitive workload using eye tracking metrics. The study consisted of five tasks that were performed on an Android mobile phone under naturalistic scenarios using eye tracking glasses. For each participant, we recorded ocular parameters such as fixation rate, saccadic rate, average fixation duration, maximum fixation duration, and standard deviation of pupil dilation for the left and right eyes. Results from our study show that aging has a substantial effect on performance when using mobile phones, irrespective of the complex task given. We noted that participants aged 50 to 60+ years had difficulties in completing tasks and showed increased cognitive workload. They showed longer fixation durations when completing tasks that involved copy-paste operations. Further, we identified design implications and provided design recommendations for designers and manufacturers.
LG Sep 30, 2020
Bridging the gap between Markowitz planning and deep reinforcement learning
Eric Benhamou, David Saltiel, Sandrine Ungari et al.
Researchers in the asset management industry have mostly focused on financial and risk planning techniques such as the Markowitz efficient frontier, minimum variance, maximum diversification, or equal risk parity. In parallel, another community in machine learning has started working on reinforcement learning, and more particularly deep reinforcement learning, to solve other decision-making problems for challenging tasks such as autonomous driving, robot learning, and, on a more conceptual side, game solving such as Go. This paper aims to bridge the gap between these two approaches by showing that Deep Reinforcement Learning (DRL) techniques can shed new light on portfolio allocation thanks to a more general optimization setting that casts portfolio allocation as an optimal control problem that is not just a one-step optimization, but rather a continuous control optimization with a delayed reward. The advantages are numerous: (i) DRL directly maps market conditions to actions by design and hence should adapt to a changing environment, (ii) DRL does not rely on any traditional financial risk assumptions, such as that risk is represented by variance, (iii) DRL can incorporate additional data and be a multi-input method, as opposed to more traditional optimization methods. We present encouraging experimental results using convolutional networks.
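To make the "one-step optimization" contrast concrete, here is a minimal sketch of the classical minimum-variance allocation for the special case of two assets, where the fully invested weights have a closed form. It is an illustration of the traditional baseline only, not code from the paper; the function names and the example numbers are assumptions.

```python
def min_variance_weights(var_a, var_b, cov_ab):
    """Fully invested two-asset minimum-variance weights (w_a, w_b).

    Closed form: w_a = (var_b - cov_ab) / (var_a + var_b - 2 * cov_ab).
    Short positions are allowed; no other constraints are applied.
    """
    denom = var_a + var_b - 2.0 * cov_ab  # equals Var(A - B)
    if denom <= 0.0:
        # Happens when the two assets are perfectly correlated with equal variance.
        raise ValueError("minimum-variance weights are not unique for these inputs")
    w_a = (var_b - cov_ab) / denom
    return w_a, 1.0 - w_a

def portfolio_variance(w_a, w_b, var_a, var_b, cov_ab):
    """Variance of the portfolio w_a * A + w_b * B."""
    return w_a ** 2 * var_a + w_b ** 2 * var_b + 2.0 * w_a * w_b * cov_ab

# Hypothetical inputs: variances 0.04 and 0.09, zero covariance.
w_a, w_b = min_variance_weights(0.04, 0.09, 0.0)
print(w_a, w_b, portfolio_variance(w_a, w_b, 0.04, 0.09, 0.0))
```

Note that the only inputs are variances and covariances, which is exactly the assumption in point (ii) that a DRL agent does not need to make.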
LG Sep 30, 2020
AAMDRL: Augmented Asset Management with Deep Reinforcement Learning
Eric Benhamou, David Saltiel, Sandrine Ungari et al.
Can an agent learn efficiently in a noisy and self-adapting environment with sequential, non-stationary and non-homogeneous observations? Through trading bots, we illustrate how Deep Reinforcement Learning (DRL) can tackle this challenge. Our contributions are threefold: (i) the use of contextual information, also referred to as augmented state, in DRL, (ii) the impact of a one-period lag between observations and actions, which is more realistic for an asset management environment, (iii) the implementation of a new repetitive train-test method called walk forward analysis, similar in spirit to cross validation for time series. Although our experiment is on trading bots, it can easily be translated to other bot environments that operate in sequential environments with regime changes and noisy data. Our experiment for an augmented asset manager interested in finding the best portfolio for hedging strategies shows that AAMDRL achieves superior returns and lower risk.
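The one-period lag in contribution (ii) can be illustrated with a small sketch. The convention used here is an assumption for illustration, not the paper's implementation: `decisions[t]` is the weight the agent computes from the observation at period `t`, and with a lag of `k` that weight only takes effect `k` periods later, so it earns `asset_returns[t + k]`.

```python
def lagged_strategy_returns(decisions, asset_returns, lag=1):
    """Per-period strategy returns when decisions take effect `lag` periods late.

    decisions[t]      hypothetical weight computed from the observation at period t
    asset_returns[t]  return of the traded asset over period t
    The first `lag` periods have no position in force and are dropped.
    """
    if lag < 0:
        raise ValueError("lag must be non-negative")
    if len(decisions) != len(asset_returns):
        raise ValueError("decisions and asset_returns must have the same length")
    return [
        decisions[t - lag] * asset_returns[t]
        for t in range(lag, len(asset_returns))
    ]

# Toy numbers, purely illustrative.
decisions = [1.0, 0.0, 1.0]
asset_returns = [0.1, 0.2, 0.3]
print(lagged_strategy_returns(decisions, asset_returns, lag=0))
print(lagged_strategy_returns(decisions, asset_returns, lag=1))
```

Comparing `lag=0` with `lag=1` on the same decisions shows how much of a backtest's performance depends on acting instantly on an observation, which is the effect the lag is meant to remove.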
PM Sep 16, 2020
Time your hedge with Deep Reinforcement Learning
Eric Benhamou, David Saltiel, Sandrine Ungari et al.
Can an asset manager plan the optimal timing for her/his hedging strategies given market conditions? The standard approach, based on Markowitz or other more or less sophisticated financial rules, aims to find the best portfolio allocation from forecasted expected returns and risk, but fails to fully relate market conditions to hedging strategy decisions. In contrast, Deep Reinforcement Learning (DRL) can tackle this challenge by creating a dynamic dependency between market information and hedging strategy allocation decisions. In this paper, we present a realistic and augmented DRL framework that: (i) uses additional contextual information to decide an action, (ii) has a one-period lag between observations and actions to account for the one-day turnover lag that asset managers commonly face when rebalancing their hedge, (iii) is fully tested in terms of stability and robustness thanks to a repetitive train-test method called anchored walk forward training, similar in spirit to k-fold cross validation for time series, and (iv) allows managing the leverage of our hedging strategy. Our experiment for an augmented asset manager interested in sizing and timing her/his hedges shows that our approach achieves superior returns and lower risk.
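Anchored walk forward training, as the term is generally used, keeps the start of the training window fixed and grows it forward, testing each time on the block that immediately follows. The sketch below generates such splits as index ranges; the function name, parameters, and block sizes are illustrative assumptions and are not taken from the paper.

```python
def anchored_walk_forward(n_samples, initial_train, test_size):
    """Return a list of (train_indices, test_indices) range pairs.

    Every training window starts at index 0 (the anchor) and ends where the
    test block begins, so the model is never evaluated on data that precedes
    its training data. A trailing partial block is dropped.
    """
    if initial_train <= 0 or test_size <= 0:
        raise ValueError("initial_train and test_size must be positive")
    splits = []
    train_end = initial_train
    while train_end + test_size <= n_samples:
        train = range(0, train_end)
        test = range(train_end, train_end + test_size)
        splits.append((train, test))
        train_end += test_size  # the tested block joins the next training set
    return splits

for train, test in anchored_walk_forward(n_samples=10, initial_train=4, test_size=2):
    print(list(train), list(test))
```

Unlike shuffled k-fold cross validation, each test block here lies strictly after its training data in time, which is what makes the scheme suitable for time series.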
CV Jul 15, 2020
Decoding CNN based Object Classifier Using Visualization
Abhishek Mukhopadhyay, Imon Mukherjee, Pradipta Biswas
This paper investigates how the workings of a Convolutional Neural Network (CNN) can be explained through visualization in the context of machine perception for autonomous vehicles. We visualize what types of features are extracted in different convolution layers of a CNN, which helps us understand how the CNN gradually increases spatial information in every layer and thus concentrates on regions of interest in every transformation. Visualizing a heat map of activations helps us understand how the CNN classifies and localizes different objects in an image. This study also helps us reason about the low accuracy of a model, which helps increase trust in the object detection module.
HC May 27, 2020
Eye Gaze Controlled Interfaces for Head Mounted and Multi-Functional Displays in Military Aviation Environment
LRD Murthy, Abhishek Mukhopadhyay, Varshit Yellheti et al.
Eye gaze controlled interfaces allow us to directly manipulate a graphical user interface just by looking at it. This technology has great potential in military aviation, in particular for operating different displays in situations where pilots' hands are occupied with flying the aircraft. This paper reports studies analyzing the accuracy of an eye gaze controlled interface inside aircraft undertaking representative flying missions. We report that pilots can undertake representative pointing and selection tasks in less than 2 seconds on average. Further, we evaluated the accuracy of eye gaze tracking glasses under various G-conditions and analyzed their failure modes. We observed that the accuracy of the eye tracker is within 5 degrees of visual angle up to +3G, although it is less accurate at -1G and +5G. We observed that the eye tracker may fail to track under higher external illumination. We also infer that an eye tracker to be used in military aviation needs to have a larger vertical field of view than presently available systems. We used this analysis to develop eye gaze trackers for Multi-Functional Displays and a Head Mounted Display System (HMDS). We obtained a significant reduction in pointing and selection times using our proposed HMDS compared to traditional TDS.