MEMay 27, 2022
Inference and Sampling for Archimax CopulasYuting Ng, Ali Hasan, Vahid Tarokh
Understanding multivariate dependencies in both the bulk and the tails of a distribution is an important problem for many applications, such as ensuring algorithms are robust to observations that are infrequent but have devastating effects. Archimax copulas are a family of distributions endowed with a precise representation that allows simultaneous modeling of the bulk and the tails of a distribution. Rather than separating the two as is typically done in practice, incorporating additional information from the bulk may improve inference of the tails, where observations are limited. Building on the stochastic representation of Archimax copulas, we develop a non-parametric inference method and sampling algorithm. Our proposed methods, to the best of our knowledge, are the first that allow for highly flexible and scalable inference and sampling algorithms, enabling the increased use of Archimax copulas in practical settings. We experimentally compare to state-of-the-art density modeling techniques, and the results suggest that the proposed method effectively extrapolates to the tails while scaling to higher dimensional data. Our findings suggest that the proposed algorithms can be used in a variety of applications where understanding the interplay between the bulk and the tails of a distribution is necessary, such as healthcare and safety.
LGOct 3, 2023
Perceiver-based CDF Modeling for Time Series ForecastingCat P. Le, Chris Cannella, Ali Hasan et al.
Transformers have demonstrated remarkable efficacy in forecasting time series data. However, their extensive dependence on self-attention mechanisms demands significant computational resources, thereby limiting their practical applicability across diverse tasks, especially in multimodal problems. In this work, we propose a new architecture, called perceiver-CDF, for modeling cumulative distribution functions (CDF) of time series data. Our approach combines the perceiver architecture with a copula-based attention mechanism tailored for multimodal time series prediction. By leveraging the perceiver, our model efficiently transforms high-dimensional and multimodal data into a compact latent space, thereby significantly reducing computational demands. Subsequently, we implement a copula-based attention mechanism to construct the joint distribution of missing data for prediction. Further, we propose an output variance testing mechanism to effectively mitigate error propagation during prediction. To enhance efficiency and reduce complexity, we introduce midpoint inference for the local attention mechanism. This enables the model to efficiently capture dependencies within nearby imputed samples without considering all previous samples. The experiments on the unimodal and multimodal benchmarks consistently demonstrate a 20% improvement over state-of-the-art methods while utilizing less than half of the computational resources.
MEJun 20, 2023
Treatment Effects in Extreme RegimesAhmed Aloui, Ali Hasan, Yuting Ng et al.
Understanding treatment effects in extreme regimes is important for characterizing risks associated with different interventions. This is hindered by the unavailability of counterfactual outcomes and the rarity and difficulty of collecting extreme data in practice. To address this issue, we propose a new framework based on extreme value theory for estimating treatment effects in extreme regimes. We quantify these effects using variations in tail decay rates of potential outcomes in the presence and absence of treatments. We establish algorithms for calculating these quantities and develop related theoretical results. We demonstrate the efficacy of our approach on various standard synthetic and semi-synthetic datasets.
MLJul 31, 2024
Distributionally Robust Optimization as a Scalable Framework to Characterize Extreme Value DistributionsPatrick Kuiper, Ali Hasan, Wenhao Yang et al.
The goal of this paper is to develop distributionally robust optimization (DRO) estimators, specifically for multidimensional Extreme Value Theory (EVT) statistics. EVT supports using semi-parametric models called max-stable distributions built from spatial Poisson point processes. While powerful, these models are only asymptotically valid for large samples. However, since extreme data is by definition scarce, the potential for model misspecification error is inherent to these applications, thus DRO estimators are natural. In order to mitigate over-conservative estimates while enhancing out-of-sample performance, we study DRO estimators informed by semi-parametric max-stable constraints in the space of point processes. We study both tractable convex formulations for some problems of interest (e.g. CVaR) and more general neural network based estimators. Both approaches are validated using synthetically generated data, recovering prescribed characteristics, and verifying the efficacy of the proposed techniques. Additionally, the proposed method is applied to a real data set of financial returns for comparison to a previous analysis. We established the proposed model as a novel formulation in the multivariate EVT domain, and innovative with respect to performance when compared to relevant alternate proposals.
LGApr 15, 2024
Neural McKean-Vlasov Processes: Distributional Dependence in Diffusion ProcessesHaoming Yang, Ali Hasan, Yuting Ng et al.
McKean-Vlasov stochastic differential equations (MV-SDEs) provide a mathematical description of the behavior of an infinite number of interacting particles by imposing a dependence on the particle density. As such, we study the influence of explicitly including distributional information in the parameterization of the SDE. We propose a series of semi-parametric methods for representing MV-SDEs, and corresponding estimators for inferring parameters from data based on the properties of the MV-SDE. We analyze the characteristics of the different architectures and estimators, and consider their applicability in relevant machine learning problems. We empirically compare the performance of the different architectures and estimators on real and synthetic datasets for time series and probabilistic modeling. The results suggest that explicitly including distributional dependence in the parameterization of the SDE is effective in modeling temporal data with interaction under an exchangeability assumption while maintaining strong performance for standard Itô-SDEs due to the richer class of probability flows associated with MV-SDEs.
LGMar 4, 2025
Elliptic Loss RegularizationAli Hasan, Haoming Yang, Yuting Ng et al.
Regularizing neural networks is important for anticipating model behavior in regions of the data space that are not well represented. In this work, we propose a regularization technique for enforcing a level of smoothness in the mapping between the data input space and the loss value. We specify the level of regularity by requiring that the loss of the network satisfies an elliptic operator over the data domain. To do this, we modify the usual empirical risk minimization objective such that we instead minimize a new objective that satisfies an elliptic operator over points within the domain. This allows us to use existing theory on elliptic operators to anticipate the behavior of the error for points outside the training set. We propose a tractable computational method that approximates the behavior of the elliptic operator while being computationally efficient. Finally, we analyze the properties of the proposed regularization to understand the performance on common problems of distribution shift and group imbalance. Numerical experiments confirm the utility of the proposed regularization technique.
LGFeb 22, 2021
Generative Archimedean CopulasYuting Ng, Ali Hasan, Khalil Elkhalil et al.
We propose a new generative modeling technique for learning multidimensional cumulative distribution functions (CDFs) in the form of copulas. Specifically, we consider certain classes of copulas known as Archimedean and hierarchical Archimedean copulas, popular for their parsimonious representation and ability to model different tail dependencies. We consider their representation as mixture models with Laplace transforms of latent random variables from generative neural networks. This alternative representation allows for computational efficiencies and easy sampling, especially in high dimensions. We describe multiple methods for optimizing the network parameters. Finally, we present empirical results that demonstrate the efficacy of our proposed method in learning multidimensional CDFs and its computational efficiency compared to existing methods.
MLFeb 17, 2021
Modeling Extremes with d-max-decreasing Neural NetworksAli Hasan, Khalil Elkhalil, Yuting Ng et al.
We propose a novel neural network architecture that enables non-parametric calibration and generation of multivariate extreme value distributions (MEVs). MEVs arise from Extreme Value Theory (EVT) as the necessary class of models when extrapolating a distributional fit over large spatial and temporal scales based on data observed in intermediate scales. In turn, EVT dictates that $d$-max-decreasing, a stronger form of convexity, is an essential shape constraint in the characterization of MEVs. As far as we know, our proposed architecture provides the first class of non-parametric estimators for MEVs that preserve these essential shape constraints. We show that our architecture approximates the dependence structure encoded by MEVs at parametric rate. Moreover, we present a new method for sampling high-dimensional MEVs using a generative model. We demonstrate our methodology on a wide range of experimental settings, ranging from environmental sciences to financial mathematics and verify that the structural properties of MEVs are retained compared to existing methods.
LGJan 2, 2020
Robust Marine Buoy Placement for Ship Detection Using Dropout K-MeansYuting Ng, João M. Pereira, Denis Garagic et al.
Marine buoys aid in the battle against Illegal, Unreported and Unregulated (IUU) fishing by detecting fishing vessels in their vicinity. Marine buoys, however, may be disrupted by natural causes and buoy vandalism. In this paper, we formulate marine buoy placement as a clustering problem, and propose dropout k-means and dropout k-median to improve placement robustness to buoy disruption. We simulated the passage of ships in the Gabonese waters near West Africa using historical Automatic Identification System (AIS) data, then compared the ship detection probability of dropout k-means to classic k-means and dropout k-median to classic k-median. With 5 buoys, the buoy arrangement computed by classic k-means, dropout k-means, classic k-median and dropout k-median have ship detection probabilities of 38%, 45%, 48% and 52%.