Pierre Thodoroff

h-index: 6

5 papers

377 citations

Novelty: 47%

AI Score: 32

Ranked #125,266 of 194,257 authors (top 64%) · #27,546 in LG (top 69%)

5 Papers

7.5 · SE · Feb 9, 2023 · Code

Machine Learning Systems: A Survey from a Data-Oriented Perspective

Christian Cabrera, Andrei Paleyes, Pierre Thodoroff et al. · cambridge

Engineers are deploying ML models as parts of real-world systems with the upsurge of AI technologies. Real-world environments challenge the deployment of such systems because these environments produce large amounts of heterogeneous data, and users require increasingly efficient responses. These requirements push prevalent software architectures to the limit when deploying ML-based systems. Data-oriented Architecture (DOA) is an emerging style that equips systems better for integrating ML models. Even though papers on deployed ML systems do not mention DOA, their authors made design decisions that implicitly follow DOA. Implicit decisions create a knowledge gap, limiting the practitioners' ability to implement ML-based systems. This paper surveys why, how, and to what extent practitioners have adopted DOA to implement and deploy ML-based systems. We overcome the knowledge gap by answering these questions and explicitly showing the design decisions and practices behind these systems. The survey follows a well-known systematic and semi-automated methodology for reviewing papers in software engineering. The majority of reviewed works partially adopt DOA. Such an adoption enables systems to address requirements such as Big Data management, low latency processing, resource management, security and privacy. Based on these findings, we formulate practical advice to facilitate the deployment of ML-based systems.

3.4 · LG · May 23, 2019

Recurrent Value Functions

Pierre Thodoroff, Nishanth Anand, Lucas Caccia et al.

Despite recent successes in Reinforcement Learning, value-based methods often suffer from high variance hindering performance. In this paper, we illustrate this in a continuous control setting where state of the art methods perform poorly whenever sensor noise is introduced. To overcome this issue, we introduce Recurrent Value Functions (RVFs) as an alternative to estimate the value function of a state. We propose to estimate the value function of the current state using the value function of past states visited along the trajectory. Due to the nature of their formulation, RVFs have a natural way of learning an emphasis function that selectively emphasizes important states. First, we establish RVF's asymptotic convergence properties in tabular settings. We then demonstrate their robustness on a partially observable domain and continuous control tasks. Finally, we provide a qualitative interpretation of the learned emphasis function.
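The idea of estimating the current state's value from the values of past states on the trajectory can be illustrated with a small sketch. This is not the authors' implementation: the blending rule below (a convex combination of the current estimate and the running estimate, weighted by a per-state emphasis) is an assumption chosen to match the abstract's description, and the function name, inputs, and fixed emphasis weights are hypothetical. In the paper the emphasis is learned, which this sketch does not attempt.

```python
def recurrent_values(values, emphasis):
    """Blend each state's value estimate with the running estimate so far.

    values[t]   -- plain value estimate for the state at step t (hypothetical input)
    emphasis[t] -- weight in [0, 1] placed on the current state at step t;
                   1.0 trusts the current estimate, 0.0 reuses the past one
    """
    if len(values) != len(emphasis):
        raise ValueError("values and emphasis must have the same length")

    smoothed = []
    running = None  # recurrent estimate carried along the trajectory
    for value, beta in zip(values, emphasis):
        if not 0.0 <= beta <= 1.0:
            raise ValueError("emphasis weights must lie in [0, 1]")
        if running is None:
            # First state of the trajectory: nothing to blend with yet.
            running = value
        else:
            # Assumed recursion: convex mix of current and past estimates.
            running = beta * value + (1.0 - beta) * running
        smoothed.append(running)
    return smoothed


if __name__ == "__main__":
    # A noisy middle estimate is damped by a low emphasis weight.
    print(recurrent_values([1.0, 3.0, 2.0], [1.0, 0.5, 0.0]))
```

With an emphasis of 1.0 everywhere the sketch returns the plain estimates unchanged, so any smoothing comes only from states given low emphasis.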

4.1 · LG · Nov 1, 2018 · Code

Temporal Regularization in Markov Decision Process

Pierre Thodoroff, Audrey Durand, Joelle Pineau et al.

Several applications of Reinforcement Learning suffer from instability due to high variance. This is especially prevalent in high dimensional domains. Regularization is a commonly used technique in machine learning to reduce variance, at the cost of introducing some bias. Most existing regularization techniques focus on spatial (perceptual) regularization. Yet in reinforcement learning, due to the nature of the Bellman equation, there is an opportunity to also exploit temporal regularization based on smoothness in value estimates over trajectories. This paper explores a class of methods for temporal regularization. We formally characterize the bias induced by this technique using Markov chain concepts. We illustrate the various characteristics of temporal regularization via a sequence of simple discrete and continuous MDPs, and show that the technique provides improvement even in high-dimensional Atari games.
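As a rough illustration of temporal regularization, the sketch below computes a one-step bootstrapped target that mixes the next state's value with the value of the previously visited state. The exact mixing rule, the function name, and the default values of `gamma` and `beta` are assumptions made for illustration; the abstract does not specify them, and the paper studies a broader class of methods.

```python
def temporally_regularized_target(reward, next_value, prev_value,
                                  gamma=0.99, beta=0.2):
    """One-step target whose bootstrap term is smoothed along the trajectory.

    reward     -- reward observed at the current step
    next_value -- value estimate of the next state
    prev_value -- value estimate of the previously visited state
    gamma      -- discount factor (assumed default)
    beta       -- regularization strength in [0, 1]; 0.0 recovers the
                  usual one-step target (assumed default)
    """
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    # Assumed form: convex mix of next-state and previous-state values.
    bootstrap = (1.0 - beta) * next_value + beta * prev_value
    return reward + gamma * bootstrap


if __name__ == "__main__":
    print(temporally_regularized_target(1.0, 2.0, 4.0, gamma=0.5, beta=0.25))
```

Larger `beta` leans more on the past estimate, which is the bias-for-variance trade the abstract describes.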

7.9 · LG · Oct 17, 2018 · Code

Adversarial Balancing for Causal Inference

Michal Ozery-Flato, Pierre Thodoroff, Matan Ninio et al.

Biases in observational data of treatments pose a major challenge to estimating expected treatment outcomes in different populations. An important technique that accounts for these biases is reweighting samples to minimize the discrepancy between treatment groups. We present a novel reweighting approach that uses bi-level optimization to alternately train a discriminator to minimize classification error, and a balancing weights generator that uses exponentiated gradient descent to maximize this error. This approach borrows principles from generative adversarial networks (GANs) to exploit the power of classifiers for measuring two-sample divergence. We provide theoretical results for conditions in which the estimation error is bounded by two factors: (i) the discrepancy measure induced by the discriminator; and (ii) the weights variability. Experimental results on several benchmarks comparing to previous state-of-the-art reweighting methods demonstrate the effectiveness of this approach in estimating causal effects.

17.3 · LG · Jul 31, 2016

Learning Robust Features using Deep Learning for Automatic Seizure Detection

Pierre Thodoroff, Joelle Pineau, Andrew Lim

We present and evaluate the capacity of a deep neural network to learn robust features from EEG to automatically detect seizures. This is a challenging problem because seizure manifestations on EEG are extremely variable both inter- and intra-patient. By simultaneously capturing spectral, temporal and spatial information, our recurrent convolutional neural network learns a general spatially invariant representation of a seizure. The proposed approach significantly exceeds previous results obtained on cross-patient classifiers both in terms of sensitivity and false positive rate. Furthermore, our model proves to be robust to missing channels and variable electrode montages.