60.8 | SY | Jun 2
Dynamics of the Thermomagnetic Pendulum
Ryan Thompson, Ethan Wang, Nilay Kant
A thermomagnetic pendulum is introduced as a coupled thermo-magnetic-mechanical system consisting of a ferromagnetic bob under gravity and an offset permanent magnet. Heating drives the bob temperature above and below the Curie point, causing magnetic attraction to vanish and recover as the bob moves and cools. A multiphysics model is developed in which the magnetic torque depends nonlinearly on the bob temperature field and pendulum configuration. The formulation couples transient three-dimensional heat transfer, a temperature-dependent magnetization law, and pendulum dynamics. Simulations show angular torque asymmetry, rapid force reduction near the Curie point, and sustained oscillations.
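The coupling described in this abstract can be illustrated with a much simpler lumped-parameter toy model. The sketch below is not the paper's formulation: it replaces the three-dimensional temperature field with a single bob temperature, and the magnetization law, heater placement, and every numerical constant are assumptions chosen only for illustration.

```python
import math


def magnetization(temp, curie_temp=600.0, m_sat=1.0):
    """Assumed mean-field-style law: magnetization vanishes at the Curie point."""
    if temp >= curie_temp:
        return 0.0
    return m_sat * math.sqrt(1.0 - temp / curie_temp)


def simulate(steps=20000, dt=1e-3):
    """Integrate angle, angular velocity, and a single lumped bob temperature."""
    g, length, damping = 9.81, 0.5, 0.2   # gravity, rod length, viscous damping
    theta_mag = 0.3                       # angular position of the offset magnet (rad)
    k_mag = 8.0                           # magnetic torque scale, hypothetical
    t_amb, t_hot = 300.0, 900.0           # ambient and heater temperatures (K)
    heat_rate, cool_rate = 2.0, 0.5       # lumped heating and cooling rates (1/s)

    theta, omega, temp = 0.0, 0.0, t_amb
    history = []
    for _ in range(steps):
        offset = theta_mag - theta
        # Attraction toward the magnet, scaled by the temperature-dependent
        # magnetization and decaying with angular distance from the magnet.
        torque = k_mag * magnetization(temp) * offset * math.exp(-(offset / 0.2) ** 2)
        alpha = -(g / length) * math.sin(theta) - damping * omega + torque
        # The heater is assumed to sit next to the magnet, so the bob heats
        # only when it is close; it cools toward ambient everywhere.
        near = math.exp(-(offset / 0.1) ** 2)
        dtemp = heat_rate * near * (t_hot - temp) - cool_rate * (temp - t_amb)
        # Semi-implicit Euler step: update velocity first, then position.
        omega += alpha * dt
        theta += omega * dt
        temp += dtemp * dt
        history.append((theta, omega, temp))
    return history
```

Whether this toy model sustains oscillations depends entirely on the assumed constants; it only shows how the torque, the temperature, and the pendulum state feed into one another.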
ML | Feb 2, 2023
The Contextual Lasso: Sparse Linear Models via Deep Neural Networks
Ryan Thompson, Amir Dezfouli, Robert Kohn
Sparse linear models are one of several core tools for interpretable machine learning, a field of emerging importance as predictive models permeate decision-making in many domains. Unfortunately, sparse linear models are far less flexible as functions of their input features than black-box models like deep neural networks. With this capability gap in mind, we study a not-uncommon situation where the input features dichotomize into two groups: explanatory features, which are candidates for inclusion as variables in an interpretable model, and contextual features, which select from the candidate variables and determine their effects. This dichotomy leads us to the contextual lasso, a new statistical estimator that fits a sparse linear model to the explanatory features such that the sparsity pattern and coefficients vary as a function of the contextual features. The fitting process learns this function nonparametrically via a deep neural network. To attain sparse coefficients, we train the network with a novel lasso regularizer in the form of a projection layer that maps the network's output onto the space of $\ell_1$-constrained linear models. An extensive suite of experiments on real and synthetic data suggests that the learned models, which remain highly transparent, can be sparser than the regular lasso without sacrificing the predictive power of a standard deep neural network.
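To see why a projection onto an $\ell_1$-constrained set produces sparse coefficients, consider the standard sort-based Euclidean projection of a single coefficient vector onto an $\ell_1$ ball. This is a minimal sketch, not the paper's projection layer (the abstract does not specify how that layer is constructed); the function name and the `radius` parameter are assumptions.

```python
import math


def project_l1_ball(v, radius=1.0):
    """Euclidean projection of v onto {x : sum(|x_i|) <= radius}."""
    if radius <= 0:
        raise ValueError("radius must be positive")
    if sum(abs(x) for x in v) <= radius:
        return list(v)  # already feasible, nothing to do
    # Find the soft-threshold level theta from the sorted magnitudes.
    magnitudes = sorted((abs(x) for x in v), reverse=True)
    cumulative, theta = 0.0, 0.0
    for k, u_k in enumerate(magnitudes, start=1):
        cumulative += u_k
        candidate = (cumulative - radius) / k
        if u_k > candidate:
            theta = candidate  # the last k that passes gives the threshold
    # Soft-threshold every entry; small entries become exactly zero.
    return [math.copysign(max(abs(x) - theta, 0.0), x) for x in v]
```

For example, projecting `[3.0, -1.0, 0.5]` onto the ball of radius 2 zeroes out the two smaller coefficients and shrinks the largest one to 2, which is the mechanism by which a tighter constraint yields a sparser model.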
ML | Oct 24, 2023
Contextual Directed Acyclic Graphs
Ryan Thompson, Edwin V. Bonilla, Robert Kohn
Estimating the structure of directed acyclic graphs (DAGs) from observational data remains a significant challenge in machine learning. Most research in this area concentrates on learning a single DAG for the entire population. This paper considers an alternative setting where the graph structure varies across individuals based on available "contextual" features. We tackle this contextual DAG problem via a neural network that maps the contextual features to a DAG, represented as a weighted adjacency matrix. The neural network is equipped with a novel projection layer that ensures the output matrices are sparse and satisfy a recently developed characterization of acyclicity. We devise a scalable computational framework for learning contextual DAGs and provide a convergence guarantee and an analytical gradient for backpropagating through the projection layer. Our experiments suggest that the new approach can recover the true context-specific graph where existing approaches fail.
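The abstract does not name the acyclicity characterization it relies on. As a stand-in, the sketch below uses a well-known polynomial trace characterization, under which a weighted adjacency matrix $W$ with $d$ nodes is acyclic exactly when $\operatorname{tr}\big((I + (W \circ W)/d)^d\big) - d = 0$. The function names are hypothetical, and the characterization used in the paper may differ.

```python
def matmul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def acyclicity_penalty(w):
    """Return tr((I + (W*W)/d)^d) - d, which is zero iff W encodes a DAG."""
    d = len(w)
    # Squaring the weights elementwise makes every entry nonnegative, so
    # contributions from different cycles cannot cancel each other out.
    m = [[(1.0 if i == j else 0.0) + w[i][j] ** 2 / d for j in range(d)]
         for i in range(d)]
    power = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
    for _ in range(d):
        power = matmul(power, m)
    return sum(power[i][i] for i in range(d)) - d
```

A chain such as `[[0, 1, 0], [0, 0, 2], [0, 0, 0]]` gives a penalty of zero, while the two-node cycle `[[0, 1], [1, 0]]` gives a strictly positive value. Because the penalty is differentiable in the weights, functions of this kind can be used inside gradient-based training.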
71.3 | LG | May 8
Arrow: A Foundation Model for Causal Discovery
Ryan Thompson, He Zhao, Daniel M. Steinberg et al.
We introduce Arrow, a foundation model for zero-shot causal discovery on observational tabular data. Arrow factorizes a directed acyclic graph into an undirected skeleton and a topological order, guaranteeing acyclicity by construction. Given a new dataset, it uses a transformer-based architecture to contextualize variables within and across observations, then predicts skeleton edge probabilities and node order scores that together define a graph. Arrow is trained in a supervised fashion on synthetic datasets with ground-truth graphs, using an end-to-end differentiable directed edge composite likelihood induced by the skeleton-order factorization. The training distribution spans diverse graph families, functional forms, noise models, and dataset shapes. Across in- and out-of-distribution synthetic, semi-synthetic, and real datasets, Arrow matches or outperforms existing causal discovery methods at substantially lower inference cost than competitive alternatives. Our results demonstrate that large-scale pretraining on diverse synthetic data can yield zero-shot causal discovery models that are fast, accurate, and reusable on new datasets.
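The skeleton-plus-order factorization guarantees acyclicity because every retained edge points from an earlier node to a later node in a single total order. The sketch below shows one simple hard-decoding rule under that idea; the 0.5 threshold, the tie-breaking by index, and the function name are assumptions, and Arrow's actual decoding and likelihood are not reproduced here.

```python
def compose_dag(skeleton_prob, order_score, threshold=0.5):
    """Orient thresholded skeleton edges from lower to higher order score.

    skeleton_prob: symmetric matrix of undirected edge probabilities.
    order_score:   one score per node; lower scores come earlier in the order.
    Returns a 0/1 adjacency matrix where adj[i][j] == 1 means i -> j.
    """
    d = len(order_score)
    # Rank nodes by score, breaking ties by index so the order is total.
    ordered = sorted(range(d), key=lambda i: (order_score[i], i))
    rank = {node: position for position, node in enumerate(ordered)}
    adj = [[0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1, d):
            if skeleton_prob[i][j] > threshold:
                if rank[i] < rank[j]:
                    adj[i][j] = 1
                else:
                    adj[j][i] = 1
    return adj
```

Even when the skeleton contains a triangle, the output cannot contain a directed cycle, since following any edge strictly increases the rank of the current node.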
ML | Jun 25, 2025
Scalable Subset Selection in Linear Mixed Models
Ryan Thompson, Matt P. Wand, Joanna J. J. Wang
Linear mixed models (LMMs), which incorporate fixed and random effects, are key tools for analyzing heterogeneous data, such as in personalized medicine. Nowadays, this type of data is increasingly wide, sometimes containing thousands of candidate predictors, necessitating sparsity for prediction and interpretation. However, existing sparse learning methods for LMMs do not scale well beyond tens or hundreds of predictors, leaving a large gap compared with sparse methods for linear models, which ignore random effects. This paper closes the gap with a new $\ell_0$ regularized method for LMM subset selection that can run on datasets containing thousands of predictors in seconds to minutes. On the computational front, we develop a coordinate descent algorithm as our main workhorse and provide a guarantee of its convergence. We also develop a local search algorithm to help traverse the nonconvex optimization surface. Both algorithms readily extend to subset selection in generalized LMMs via a penalized quasi-likelihood approximation. On the statistical front, we provide a finite-sample bound on the Kullback-Leibler divergence of the new method. We then demonstrate its excellent performance in experiments involving synthetic and real datasets.
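The flavor of an $\ell_0$ coordinate descent update is easiest to see in an ordinary linear model. The sketch below minimizes $\frac{1}{2n}\|y - X\beta\|_2^2 + \lambda\|\beta\|_0$ one coordinate at a time and deliberately ignores random effects, so it is not the paper's LMM algorithm; the function name, arguments, and stopping rule are assumptions.

```python
def l0_coordinate_descent(X, y, lam, sweeps=100, tol=1e-10):
    """Cyclic coordinate descent for least squares with an L0 penalty.

    X is a list of rows, y a list of responses, lam the L0 penalty weight.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X @ beta, kept up to date incrementally
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(sweeps):
        max_change = 0.0
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Inner product of column j with the partial residual, i.e. the
            # residual with coordinate j's own contribution added back.
            rho = sum(X[i][j] * resid[i] for i in range(n)) + col_sq[j] * beta[j]
            # Keep the coordinate only if the drop in loss exceeds the penalty.
            gain = rho * rho / (2.0 * n * col_sq[j])
            new = rho / col_sq[j] if gain > lam else 0.0
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
                max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return beta
```

Each update is a hard threshold: a coefficient is either set to its unpenalized least-squares value or to exactly zero. Because the objective is nonconvex, a scheme like this can stall at a poor solution, which is the kind of situation the local search mentioned in the abstract is meant to help with.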
ME | May 25, 2021
Group selection and shrinkage: Structured sparsity for semiparametric additive models
Ryan Thompson, Farshid Vahid
Sparse regression and classification estimators that respect group structures have application to an assortment of statistical and machine learning problems, from multitask learning to sparse additive modeling to hierarchical selection. This work introduces structured sparse estimators that combine group subset selection with shrinkage. To accommodate sophisticated structures, our estimators allow for arbitrary overlap between groups. We develop an optimization framework for fitting the nonconvex regularization surface and present finite-sample error bounds for estimation of the regression function. As an application requiring structure, we study sparse semiparametric additive modeling, a procedure that allows the effect of each predictor to be zero, linear, or nonlinear. For this task, the new estimators improve across several metrics on synthetic data compared to alternatives. Finally, we demonstrate their efficacy in modeling supermarket foot traffic and economic recessions using many predictors. These demonstrations suggest sparse semiparametric additive models, fit using the new estimators, are an excellent compromise between fully linear and fully nonparametric alternatives. All of our algorithms are made available in the scalable implementation grpsel.
CR | Jul 29, 2018
Mobile Technology in Healthcare Environment: Security Vulnerabilities and Countermeasures
Sajedul Talukder, Shalisha Witherspoon, Kanishk Srivastava et al.
Mobile devices and technologies offer users tremendous benefits, but they also introduce a set of challenges for security, compliance, and risk. More and more healthcare organizations have been seeking to update their outdated technology and have considered adopting mobile devices to meet these needs. However, introducing mobile devices and technology also introduces new risks and threats to the organization. As a test case, we examine Epic Rover, a mobile application that has been identified as a viable solution for managing the electronic medical system. In this paper, we study what the security team needs to investigate before adopting this mobile technology, provide a thorough examination of the vulnerabilities and threats that mobile devices bring to the healthcare environment, and introduce countermeasures and mitigations to reduce the risk while maintaining regulatory compliance.