Zhizhen Zhao

LG
h-index8
42papers
1,361citations
Novelty46%
AI Score44

42 Papers

CYSep 30, 2022
FAIR for AI: An interdisciplinary and international community building perspective

E. A. Huerta, Ben Blaiszik, L. Catherine Brinson et al.

A foundational set of findable, accessible, interoperable, and reusable (FAIR) principles were proposed in 2016 as prerequisites for proper data management and stewardship, with the goal of enabling the reusability of scholarly data. The principles were also meant to apply to other digital assets, at a high level, and over time, the FAIR guiding principles have been re-interpreted or extended to include the software, tools, algorithms, and workflows that produce data. FAIR principles are now being adapted in the context of AI models and datasets. Here, we present the perspectives, vision, and experiences of researchers from different countries, disciplines, and backgrounds who are leading the definition and adoption of FAIR principles in their communities of practice, and discuss outcomes that may result from pursuing and incentivizing FAIR AI research. The material for this report builds on the FAIR for AI Workshop held at Argonne National Laboratory on June 7, 2022.

HEP-EXDec 9, 2022
FAIR AI Models in High Energy Physics

Javier Duarte, Haoyang Li, Avik Roy et al.

The findable, accessible, interoperable, and reusable (FAIR) data principles provide a framework for examining, evaluating, and improving how data is shared to facilitate scientific discovery. Generalizing these principles to research software and other digital products is an active area of research. Machine learning (ML) models -- algorithms that have been trained on data without being explicitly programmed -- and more generally, artificial intelligence (AI) models, are an important target for this because of the ever-increasing pace with which AI is transforming scientific domains, such as experimental high energy physics (HEP). In this paper, we propose a practical definition of FAIR principles for AI models in HEP and describe a template for the application of these principles. We demonstrate the template's use with an example AI model applied to HEP, in which a graph neural network is used to identify Higgs bosons decaying to two bottom quarks. We report on the robustness of this FAIR AI model, its portability across hardware architectures and software frameworks, and its interpretability.

CVJul 28, 2022
Initialization and Alignment for Adversarial Texture Optimization

Xiaoming Zhao, Zhizhen Zhao, Alexander G. Schwing

While recovery of geometry from image and video data has received a lot of attention in computer vision, methods to capture the texture for a given geometry are less mature. Specifically, classical methods for texture generation often assume clean geometry and reasonably well-aligned image data. While very recent methods, e.g., adversarial texture optimization, better handle lower-quality data obtained from hand-held devices, we find them to still struggle frequently. To improve robustness, particularly of recent adversarial texture optimization, we develop an explicit initialization and an alignment procedure. It deals with complex geometry due to a robust mapping of the geometry to the texture map and a hard-assignment-based initialization. It deals with misalignment of geometry and images by integrating fast image-alignment into the texture refinement optimization. We demonstrate efficacy of our texture generation on a dataset of 11 scenes with a total of 2807 frames, observing 7.8% and 11.1% relative improvements regarding perceptual and sharpness measurements.

OCJul 6, 2022
Orthogonal Matrix Retrieval with Spatial Consensus for 3D Unknown-View Tomography

Shuai Huang, Mona Zehni, Ivan Dokmanić et al.

Unknown-view tomography (UVT) reconstructs a 3D density map from its 2D projections at unknown, random orientations. A line of work starting with Kam (1980) employs the method of moments (MoM) with rotation-invariant Fourier features to solve UVT in the frequency domain, assuming that the orientations are uniformly distributed. This line of work includes the recent orthogonal matrix retrieval (OMR) approaches based on matrix factorization, which, while elegant, either require side information about the density that is not available, or fail to be sufficiently robust. For OMR to break free from those restrictions, we propose to jointly recover the density map and the orthogonal matrices by requiring that they be mutually consistent. We regularize the resulting non-convex optimization problem by a denoised reference projection and a nonnegativity constraint. This is enabled by the new closed-form expressions for spatial autocorrelation features. Further, we design an easy-to-compute initial density map which effectively mitigates the non-convexity of the reconstruction problem. Experimental results show that the proposed OMR with spatial consensus is more robust and performs significantly better than the previous state-of-the-art OMR approach in the typical low-SNR scenario of 3D UVT.

AO-PHJun 18, 2023
Convolutional GRU Network for Seasonal Prediction of the El Niño-Southern Oscillation

Lingda Wang, Savana Ammons, Vera Mikyoung Hur et al.

Predicting sea surface temperature (SST) within the El Niño-Southern Oscillation (ENSO) region has been extensively studied due to its significant influence on global temperature and precipitation patterns. Statistical models such as linear inverse model (LIM), analog forecasting (AF), and recurrent neural network (RNN) have been widely used for ENSO prediction, offering flexibility and relatively low computational expense compared to large dynamic models. However, these models have limitations in capturing spatial patterns in SST variability or relying on linear dynamics. Here we present a modified Convolutional Gated Recurrent Unit (ConvGRU) network for the ENSO region spatio-temporal sequence prediction problem, along with the Niño 3.4 index prediction as a down stream task. The proposed ConvGRU network, with an encoder-decoder sequence-to-sequence structure, takes historical SST maps of the Pacific region as input and generates future SST maps for subsequent months within the ENSO region. To evaluate the performance of the ConvGRU network, we trained and tested it using data from multiple large climate models. The results demonstrate that the ConvGRU network significantly improves the predictability of the Niño 3.4 index compared to LIM, AF, and RNN. This improvement is evidenced by extended useful prediction range, higher Pearson correlation, and lower root-mean-square error. The proposed model holds promise for improving our understanding and predicting capabilities of the ENSO phenomenon and can be broadly applicable to other weather and climate prediction scenarios with spatial patterns and teleconnections.

SIJun 16, 2022
Multi-Frequency Joint Community Detection and Phase Synchronization

Lingda Wang, Zhizhen Zhao

This paper studies the joint community detection and phase synchronization problem on the \textit{stochastic block model with relative phase}, where each node is associated with an unknown phase angle. This problem, with a variety of real-world applications, aims to recover the cluster structure and associated phase angles simultaneously. We show this problem exhibits a \textit{``multi-frequency''} structure by closely examining its maximum likelihood estimation (MLE) formulation, whereas existing methods are not originated from this perspective. To this end, two simple yet efficient algorithms that leverage the MLE formulation and benefit from the information across multiple frequencies are proposed. The former is a spectral method based on the novel multi-frequency column-pivoted QR factorization. The factorization applied to the top eigenvectors of the observation matrix provides key information about the cluster structure and associated phase angles. The second approach is an iterative multi-frequency generalized power method, where each iteration updates the estimation in a matrix-multiplication-then-projection manner. Numerical experiments show that our proposed algorithms significantly improve the ability of exactly recovering the cluster structure and the accuracy of the estimated phase angles, compared to state-of-the-art algorithms.

LGFeb 24, 2025Code
Towards Hierarchical Rectified Flow

Yichi Zhang, Yici Yan, Alex Schwing et al.

We formulate a hierarchical rectified flow to model data distributions. It hierarchically couples multiple ordinary differential equations (ODEs) and defines a time-differentiable stochastic process that generates a data distribution from a known source distribution. Each ODE resembles the ODE that is solved in a classic rectified flow, but differs in its domain, i.e., location, velocity, acceleration, etc. Unlike the classic rectified flow formulation, which formulates a single ODE in the location domain and only captures the expected velocity field (sufficient to capture a multi-modal data distribution), the hierarchical rectified flow formulation models the multi-modal random velocity field, acceleration field, etc., in their entirety. This more faithful modeling of the random velocity field enables integration paths to intersect when the underlying ODE is solved during data generation. Intersecting paths in turn lead to integration trajectories that are more straight than those obtained in the classic rectified flow formulation, where integration paths cannot intersect. This leads to modeling of data distributions with fewer neural function evaluations. We empirically verify this on synthetic 1D and 2D data as well as MNIST, CIFAR-10, and ImageNet-32 data. Our code is available at: https://riccizz.github.io/HRF/.

CVJul 17, 2025Code
Hierarchical Rectified Flow Matching with Mini-Batch Couplings

Yichi Zhang, Yici Yan, Alex Schwing et al.

Flow matching has emerged as a compelling generative modeling approach that is widely used across domains. To generate data via a flow matching model, an ordinary differential equation (ODE) is numerically solved via forward integration of the modeled velocity field. To better capture the multi-modality that is inherent in typical velocity fields, hierarchical flow matching was recently introduced. It uses a hierarchy of ODEs that are numerically integrated when generating data. This hierarchy of ODEs captures the multi-modal velocity distribution just like vanilla flow matching is capable of modeling a multi-modal data distribution. While this hierarchy enables to model multi-modal velocity distributions, the complexity of the modeled distribution remains identical across levels of the hierarchy. In this paper, we study how to gradually adjust the complexity of the distributions across different levels of the hierarchy via mini-batch couplings. We show the benefits of mini-batch couplings in hierarchical rectified flow matching via compelling results on synthetic and imaging data. Code is available at https://riccizz.github.io/HRF_coupling.

LGOct 27, 2025Code
Variational Masked Diffusion Models

Yichi Zhang, Alex Schwing, Zhizhen Zhao

Masked diffusion models have recently emerged as a flexible framework for discrete generative modeling. However, a key limitation of standard masked diffusion is its inability to effectively capture dependencies among tokens that are predicted concurrently, leading to degraded generation quality when dependencies among tokens are important. To explicitly model dependencies among tokens, we propose Variational Masked Diffusion (VMD), a framework that introduces latent variables into the masked diffusion process. Through controlled experiments on synthetic datasets, we demonstrate that VMD successfully learns dependencies that conventional masked diffusion fails to capture. We further validate the effectiveness of our approach on Sudoku puzzles and text datasets, where learning of dependencies among tokens improves global consistency. Across these domains, VMD enhances both generation quality and dependency awareness, highlighting the value of integrating variational inference into masked diffusion. Our code is available at: https://riccizz.github.io/VMD.

IMMay 13, 2021Code
Advances in Machine and Deep Learning for Modeling and Real-time Detection of Multi-Messenger Sources

E. A. Huerta, Zhizhen Zhao

We live in momentous times. The science community is empowered with an arsenal of cosmic messengers to study the Universe in unprecedented detail. Gravitational waves, electromagnetic waves, neutrinos and cosmic rays cover a wide range of wavelengths and time scales. Combining and processing these datasets that vary in volume, speed and dimensionality requires new modes of instrument coordination, funding and international collaboration with a specialized human and technological infrastructure. In tandem with the advent of large-scale scientific facilities, the last decade has experienced an unprecedented transformation in computing and signal processing algorithms. The combination of graphics processing units, deep learning, and the availability of open source, high-quality datasets, have powered the rise of artificial intelligence. This digital revolution now powers a multi-billion dollar industry, with far-reaching implications in technology and society. In this chapter we describe pioneering efforts to adapt artificial intelligence algorithms to address computational grand challenges in Multi-Messenger Astrophysics. We review the rapid evolution of these disruptive algorithms, from the first class of algorithms introduced in early 2017, to the sophisticated algorithms that now incorporate domain expertise in their architectural design and optimization schemes. We discuss the importance of scientific visualization and extreme-scale computing in reducing time-to-insight and obtaining new knowledge from the interplay between models and data.

LGJan 6, 2019Code
LanczosNet: Multi-Scale Deep Graph Convolutional Networks

Renjie Liao, Zhizhen Zhao, Raquel Urtasun et al.

We propose the Lanczos network (LanczosNet), which uses the Lanczos algorithm to construct low rank approximations of the graph Laplacian for graph convolution. Relying on the tridiagonal decomposition of the Lanczos algorithm, we not only efficiently exploit multi-scale information via fast approximated computation of matrix power but also design learnable spectral filters. Being fully differentiable, LanczosNet facilitates both graph kernel learning as well as learning node embeddings. We show the connection between our LanczosNet and graph based manifold learning methods, especially the diffusion maps. We benchmark our model against several recent deep graph networks on citation networks and QM8 quantum chemistry dataset. Experimental results show that our model achieves the state-of-the-art performance in most tasks. Code is released at: \url{https://github.com/lrjconan/LanczosNetwork}.

LGDec 17, 2024
Boosting Test Performance with Importance Sampling--a Subpopulation Perspective

Hongyu Shen, Zhizhen Zhao

Despite empirical risk minimization (ERM) is widely applied in the machine learning community, its performance is limited on data with spurious correlation or subpopulation that is introduced by hidden attributes. Existing literature proposed techniques to maximize group-balanced or worst-group accuracy when such correlation presents, yet, at the cost of lower average accuracy. In addition, many existing works conduct surveys on different subpopulation methods without revealing the inherent connection between these methods, which could hinder the technology advancement in this area. In this paper, we identify important sampling as a simple yet powerful tool for solving the subpopulation problem. On the theory side, we provide a new systematic formulation of the subpopulation problem and explicitly identify the assumptions that are not clearly stated in the existing works. This helps to uncover the cause of the dropped average accuracy. We provide the first theoretical discussion on the connections of existing methods, revealing the core components that make them different. On the application side, we demonstrate a single estimator is enough to solve the subpopulation problem. In particular, we introduce the estimator in both attribute-known and -unknown scenarios in the subpopulation setup, offering flexibility in practical use cases. And empirically, we achieve state-of-the-art performance on commonly used benchmark datasets.

LGFeb 27, 2024
DeepDRK: Deep Dependency Regularized Knockoff for Feature Selection

Hongyu Shen, Yici Yan, Zhizhen Zhao

Model-X knockoff has garnered significant attention among various feature selection methods due to its guarantees for controlling the false discovery rate (FDR). Since its introduction in parametric design, knockoff techniques have evolved to handle arbitrary data distributions using deep learning-based generative models. However, we have observed limitations in the current implementations of the deep Model-X knockoff framework. Notably, the "swap property" that knockoffs require often faces challenges at the sample level, resulting in diminished selection power. To address these issues, we develop "Deep Dependency Regularized Knockoff (DeepDRK)," a distribution-free deep learning method that effectively balances FDR and power. In DeepDRK, we introduce a novel formulation of the knockoff model as a learning problem under multi-source adversarial attacks. By employing an innovative perturbation technique, we achieve lower FDR and higher power. Our model outperforms existing benchmarks across synthetic, semi-synthetic, and real-world datasets, particularly when sample sizes are small and data distributions are non-Gaussian.

MLDec 25, 2021
A Spectral Method for Joint Community Detection and Orthogonal Group Synchronization

Yifeng Fan, Yuehaw Khoo, Zhizhen Zhao

Community detection and orthogonal group synchronization are both fundamental problems with a variety of important applications in science and engineering. In this work, we consider the joint problem of community detection and orthogonal group synchronization which aims to recover the communities and perform synchronization simultaneously. To this end, we propose a simple algorithm that consists of a spectral decomposition step followed by a blockwise column pivoted QR factorization (CPQR). The proposed algorithm is efficient and scales linearly with the number of edges in the graph. We also leverage the recently developed `leave-one-out' technique to establish a near-optimal guarantee for exact recovery of the cluster memberships and stable recovery of the orthogonal transforms. Numerical experiments demonstrate the efficiency and efficacy of our algorithm and confirm our theoretical characterization of it.

IVAug 23, 2021
An Adversarial Learning Based Approach for Unknown View Tomographic Reconstruction

Mona Zehni, Zhizhen Zhao

The goal of 2D tomographic reconstruction is to recover an image given its projections from various views. It is often presumed that projection angles associated with the projections are known in advance. Under certain situations, however, these angles are known only approximately or are completely unknown. It becomes more challenging to reconstruct the image from a collection of random projections. We propose an adversarial learning based approach to recover the image and the projection angle distribution by matching the empirical distribution of the measurements with the generated data. Fitting the distributions is achieved through solving a min-max game between a generator and a critic based on Wasserstein generative adversarial network structure. To accommodate the update of the projection angle distribution through gradient back propagation, we approximate the loss using the Gumbel-Softmax reparameterization of samples from discrete distributions. Our theoretical analysis verifies the unique recovery of the image and the projection distribution up to a rotation and reflection upon convergence. Our extensive numerical experiments showcase the potential of our method to accurately recover the image and the projection angle distribution under noise contamination.

HEP-EXAug 4, 2021
A FAIR and AI-ready Higgs boson decay dataset

Yifan Chen, E. A. Huerta, Javier Duarte et al.

To enable the reusability of massive scientific datasets by humans and machines, researchers aim to adhere to the principles of findability, accessibility, interoperability, and reusability (FAIR) for data and artificial intelligence (AI) models. This article provides a domain-agnostic, step-by-step assessment guide to evaluate whether or not a given dataset meets these principles. We demonstrate how to use this guide to evaluate the FAIRness of an open simulated dataset produced by the CMS Collaboration at the CERN Large Hadron Collider. This dataset consists of Higgs boson decays and quark and gluon background, and is available through the CERN Open Data Portal. We use additional available tools to assess the FAIRness of this dataset, and incorporate feedback from members of the FAIR community to validate our results. This article is accompanied by a Jupyter notebook to visualize and explore this dataset. This study marks the first in a planned series of articles that will guide scientists in the creation of FAIR AI models and datasets in high energy particle physics.

DIS-NNAug 4, 2021
Spacetime Neural Network for High Dimensional Quantum Dynamics

Jiangran Wang, Zhuo Chen, Di Luo et al.

We develop a spacetime neural network method with second order optimization for solving quantum dynamics from the high dimensional Schrödinger equation. In contrast to the standard iterative first order optimization and the time-dependent variational principle, our approach utilizes the implicit mid-point method and generates the solution for all spatial and temporal values simultaneously after optimization. We demonstrate the method in the Schrödinger equation with a self-normalized autoregressive spacetime neural network construction. Future explorations for solving different high dimensional differential equations are discussed.

CVMay 13, 2021
DeepQAMVS: Query-Aware Hierarchical Pointer Networks for Multi-Video Summarization

Safa Messaoud, Ismini Lourentzou, Assma Boughoula et al.

The recent growth of web video sharing platforms has increased the demand for systems that can efficiently browse, retrieve and summarize video content. Query-aware multi-video summarization is a promising technique that caters to this demand. In this work, we introduce a novel Query-Aware Hierarchical Pointer Network for Multi-Video Summarization, termed DeepQAMVS, that jointly optimizes multiple criteria: (1) conciseness, (2) representativeness of important query-relevant events and (3) chronological soundness. We design a hierarchical attention model that factorizes over three distributions, each collecting evidence from a different modality, followed by a pointer network that selects frames to include in the summary. DeepQAMVS is trained with reinforcement learning, incorporating rewards that capture representativeness, diversity, query-adaptability and temporal coherence. We achieve state-of-the-art results on the MVS1K dataset, with inference time scaling linearly with the number of input video frames.

MLMay 13, 2021
Joint Community Detection and Rotational Synchronization via Semidefinite Programming

Yifeng Fan, Yuehaw Khoo, Zhizhen Zhao

In the presence of heterogeneous data, where randomly rotated objects fall into multiple underlying categories, it is challenging to simultaneously classify them into clusters and synchronize them based on pairwise relations. This gives rise to the joint problem of community detection and synchronization. We propose a series of semidefinite relaxations, and prove their exact recovery when extending the celebrated stochastic block model to this new setting where both rotations and cluster identities are to be determined. Numerical experiments demonstrate the efficacy of our proposed algorithms and confirm our theoretical result which indicates a sharp phase transition for exact recovery.

SYFeb 19, 2021
Spatial-temporal switching estimators for imaging locally concentrated dynamics

Parisa Karimi, Mark Butala, Zhizhen Zhao et al.

The evolution of images with physics-based dynamics is often spatially localized and nonlinear. A switching linear dynamic system (SLDS) is a natural model under which to pose such problems when the system's evolution randomly switches over the observation interval. Because of the high parameter space dimensionality, efficient and accurate recovery of the underlying state is challenging. The work presented in this paper focuses on the common cases where the dynamic evolution may be adequately modeled as a collection of decoupled, locally concentrated dynamic operators. Patch-based hybrid estimators are proposed for real-time reconstruction of images from noisy measurements given perfect or partial information about the underlying system dynamics. Numerical results demonstrate the effectiveness of the proposed approach for denoising in a realistic data-driven simulation of remotely sensed cloud dynamics.

CVFeb 9, 2021
UVTomo-GAN: An adversarial learning based approach for unknown view X-ray tomographic reconstruction

Mona Zehni, Zhizhen Zhao

Tomographic reconstruction recovers an unknown image given its projections from different angles. State-of-the-art methods addressing this problem assume the angles associated with the projections are known a-priori. Given this knowledge, the reconstruction process is straightforward as it can be formulated as a convex problem. Here, we tackle a more challenging setting: 1) the projection angles are unknown, 2) they are drawn from an unknown probability distribution. In this set-up our goal is to recover the image and the projection angle distribution using an unsupervised adversarial learning approach. For this purpose, we formulate the problem as a distribution matching between the real projection lines and the generated ones from the estimated image and projection distribution. This is then solved by reaching the equilibrium in a min-max game between a generator and a discriminator. Our novel contribution is to recover the unknown projection distribution and the image simultaneously using adversarial learning. To accommodate this, we use Gumbel-softmax approximation of samples from categorical distribution to approximate the generator's loss as a function of the unknown image and the projection distribution. Our approach can be generalized to different inverse problems. Our simulation results reveal the ability of our method in successfully recovering the image and the projection distribution in various settings.

STR-ELJan 18, 2021
Gauge Invariant and Anyonic Symmetric Transformer and RNN Quantum States for Quantum Lattice Models

Di Luo, Zhuo Chen, Kaiwen Hu et al.

Symmetries such as gauge invariance and anyonic symmetry play a crucial role in quantum many-body physics. We develop a general approach to constructing gauge invariant or anyonic symmetric autoregressive neural network quantum states, including a wide range of architectures such as Transformer and recurrent neural network (RNN), for quantum lattice models. These networks can be efficiently sampled and explicitly obey gauge symmetries or anyonic constraint. We prove that our methods can provide exact representation for the ground and excited states of the 2D and 3D toric codes, and the X-cube fracton model. We variationally optimize our symmetry incorporated autoregressive neural networks for ground states as well as real-time dynamics for a variety of models. We simulate the dynamics and the ground states of the quantum link model of $\text{U(1)}$ lattice gauge theory, obtain the phase diagram for the 2D $\mathbb{Z}_2$ gauge theory, determine the phase transition and the central charge of the $\text{SU(2)}_3$ anyonic chain, and also compute the ground state energy of the SU(2) invariant Heisenberg spin chain. Our approach provides powerful tools for exploring condensed matter physics, high energy physics and quantum information science.

LGDec 10, 2020
Adversarial Linear Contextual Bandits with Graph-Structured Side Observations

Lingda Wang, Bingcong Li, Huozhi Zhou et al.

This paper studies the adversarial graphical contextual bandits, a variant of adversarial multi-armed bandits that leverage two categories of the most common side information: \emph{contexts} and \emph{side observations}. In this setting, a learning agent repeatedly chooses from a set of $K$ actions after being presented with a $d$-dimensional context vector. The agent not only incurs and observes the loss of the chosen action, but also observes the losses of its neighboring actions in the observation structures, which are encoded as a series of feedback graphs. This setting models a variety of applications in social networks, where both contexts and graph-structured side observations are available. Two efficient algorithms are developed based on \texttt{EXP3}. Under mild conditions, our analysis shows that for undirected feedback graphs the first algorithm, \texttt{EXP3-LGC-U}, achieves the regret of order $\mathcal{O}(\sqrt{(K+α(G)d)T\log{K}})$ over the time horizon $T$, where $α(G)$ is the average \emph{independence number} of the feedback graphs. A slightly weaker result is presented for the directed graph setting as well. The second algorithm, \texttt{EXP3-LGC-IX}, is developed for a special class of problems, for which the regret is reduced to $\mathcal{O}(\sqrt{α(G)dT\log{K}\log(KT)})$ for both directed as well as undirected feedback graphs. Numerical tests corroborate the efficiency of proposed algorithms.

OCDec 9, 2020
Enhancing Parameter-Free Frank Wolfe with an Extra Subproblem

Bingcong Li, Lingda Wang, Georgios B. Giannakis et al.

Aiming at convex optimization under structural constraints, this work introduces and analyzes a variant of the Frank Wolfe (FW) algorithm termed ExtraFW. The distinct feature of ExtraFW is the pair of gradients leveraged per iteration, thanks to which the decision variable is updated in a prediction-correction (PC) format. Relying on no problem dependent parameters in the step sizes, the convergence rate of ExtraFW for general convex problems is shown to be ${\cal O}(\frac{1}{k})$, which is optimal in the sense of matching the lower bound on the number of solved FW subproblems. However, the merit of ExtraFW is its faster rate ${\cal O}\big(\frac{1}{k^2} \big)$ on a class of machine learning problems. Compared with other parameter-free FW variants that have faster rates on the same problems, ExtraFW has improved rates and fine-grained analysis thanks to its PC update. Numerical tests on binary classification with different sparsity-promoting constraints demonstrate that the empirical performance of ExtraFW is significantly better than FW, and even faster than Nesterov's accelerated gradient on certain datasets. For matrix completion, ExtraFW enjoys smaller optimality gap, and lower rank than FW.

LGDec 16, 2019
Deep Learning for Cardiologist-level Myocardial Infarction Detection in Electrocardiograms

Arjun Gupta, E. A. Huerta, Zhizhen Zhao et al.

Myocardial infarction is the leading cause of death worldwide. In this paper, we design domain-inspired neural network models to detect myocardial infarction. First, we study the contribution of various leads. This systematic analysis, first of its kind in the literature, indicates that out of 15 ECG leads, data from the v6, vz, and ii leads are critical to correctly identify myocardial infarction. Second, we use this finding and adapt the ConvNetQuake neural network model--originally designed to identify earthquakes--to attain state-of-the-art classification results for myocardial infarction, achieving $99.43\%$ classification accuracy on a record-wise split, and $97.83\%$ classification accuracy on a patient-wise split. These two results represent cardiologist-level performance level for myocardial infarction detection after feeding only 10 seconds of raw ECG data into our model. Third, we show that our multi-ECG-channel neural network achieves cardiologist-level performance without the need of any kind of manual feature extraction or data pre-processing.

GR-QCNov 26, 2019
Enabling real-time multi-messenger astrophysics discoveries with deep learning

E. A. Huerta, Gabrielle Allen, Igor Andreoni et al.

Multi-messenger astrophysics is a fast-growing, interdisciplinary field that combines data, which vary in volume and speed of data processing, from many different instruments that probe the Universe using different cosmic messengers: electromagnetic waves, cosmic rays, gravitational waves and neutrinos. In this Expert Recommendation, we review the key challenges of real-time observations of gravitational wave sources and their electromagnetic and astroparticle counterparts, and make a number of recommendations to maximize their potential for scientific discovery. These recommendations refer to the design of scalable and computationally efficient machine learning algorithms; the cyber-infrastructure to numerically simulate astrophysical sources, and to process and interpret multi-messenger astrophysics data; the management of gravitational wave detections to trigger real-time alerts for electromagnetic and astroparticle follow-ups; a vision to harness future developments of machine learning and cyber-infrastructure resources to cope with the big-data requirements; and the need to build a community of experts to realize the goals of multi-messenger astrophysics.

LGSep 12, 2019
Nearly Optimal Algorithms for Piecewise-Stationary Cascading Bandits

Lingda Wang, Huozhi Zhou, Bingcong Li et al.

Cascading bandit (CB) is a popular model for web search and online advertising, where an agent aims to learn the $K$ most attractive items out of a ground set of size $L$ during the interaction with a user. However, the stationary CB model may be too simple to apply to real-world problems, where user preferences may change over time. Considering piecewise-stationary environments, two efficient algorithms, \texttt{GLRT-CascadeUCB} and \texttt{GLRT-CascadeKL-UCB}, are developed and shown to ensure regret upper bounds on the order of $\mathcal{O}(\sqrt{NLT\log{T}})$, where $N$ is the number of piecewise-stationary segments, and $T$ is the number of time slots. At the crux of the proposed algorithms is an almost parameter-free change-point detector, the generalized likelihood ratio test (GLRT). Comparing with existing works, the GLRT-based algorithms: i) are free of change-point-dependent information for choosing parameters; ii) have fewer tuning parameters; iii) improve at least the $L$ dependence in regret upper bounds. In addition, we show that the proposed algorithms are optimal (up to a logarithm factor) in terms of regret by deriving a minimax lower bound on the order of $Ω(\sqrt{NLT})$ for piecewise-stationary CB. The efficiency of the proposed algorithms relative to state-of-the-art approaches is validated through numerical experiments on both synthetic and real-world datasets.

LGJun 6, 2019
Unsupervised Co-Learning on $\mathcal{G}$-Manifolds Across Irreducible Representations

Yifeng Fan, Tingran Gao, Zhizhen Zhao

We introduce a novel co-learning paradigm for manifolds naturally equipped with a group action, motivated by recent developments on learning a manifold from attached fibre bundle structures. We utilize a representation theoretic mechanism that canonically associates multiple independent vector bundles over a common base manifold, which provides multiple views for the geometry of the underlying manifold. The consistency across these fibre bundles provide a common base for performing unsupervised manifold co-learning through the redundancy created artificially across irreducible representations of the transformation group. We demonstrate the efficacy of the proposed algorithmic paradigm through drastically improved robust nearest neighbor search and community detection on rotation-invariant cryo-electron microscopy image analysis.

LGJun 6, 2019
Multi-Frequency Vector Diffusion Maps

Yifeng Fan, Zhizhen Zhao

We introduce multi-frequency vector diffusion maps (MFVDM), a new framework for organizing and analyzing high dimensional datasets. The new method is a mathematical and algorithmic generalization of vector diffusion maps (VDM) and other non-linear dimensionality reduction methods. MFVDM combines different nonlinear embeddings of the data points defined with multiple unitary irreducible representations of the alignment group that connect two nodes in the graph. We illustrate the efficacy of MFVDM on synthetic data generated according to a random graph model and cryo-electron microscopy image dataset. The new method achieves better nearest neighbor search and alignment estimation than the state-of-the-arts VDM and diffusion maps (DM) on extremely noisy data.

IVMay 31, 2019
Representation Theoretic Patterns in Multi-Frequency Class Averaging for Three-Dimensional Cryo-Electron Microscopy

Yifeng Fan, Tingran Gao, Zhizhen Zhao

We develop in this paper a novel intrinsic classification algorithm -- multi-frequency class averaging (MFCA) -- for classifying noisy projection images obtained from three-dimensional cryo-electron microscopy (cryo-EM) by the similarity among their viewing directions. This new algorithm leverages multiple irreducible representations of the unitary group to introduce additional redundancy into the representation of the optimal in-plane rotational alignment, extending and outperforming the existing class averaging algorithm that uses only a single representation. The formal algebraic model and representation theoretic patterns of the proposed MFCA algorithm extend the framework of Hadani and Singer to arbitrary irreducible representations of the unitary group. We conceptually establish the consistency and stability of MFCA by inspecting the spectral properties of a generalized local parallel transport operator through the lens of Wigner $D$-matrices. We demonstrate the efficacy of the proposed algorithm with numerical experiments.

CVApr 16, 2019
Cryo-Electron Microscopy Image Analysis Using Multi-Frequency Vector Diffusion Maps

Yifeng Fan, Zhizhen Zhao

Cryo-electron microscopy (EM) single particle reconstruction is an entirely general technique for 3D structure determination of macromolecular complexes. However, because the images are taken at low electron dose, it is extremely hard to visualize the individual particle with low contrast and high noise level. In this paper, we propose a novel approach called multi-frequency vector diffusion maps (MFVDM) to improve the efficiency and accuracy of cryo-EM 2D image classification and denoising. This framework incorporates different irreducible representations of the estimated alignment between similar images. In addition, we propose a graph filtering scheme to denoise the images using the eigenvalues and eigenvectors of the MFVDM matrices. Through both simulated and publicly available real data, we demonstrate that our proposed method is efficient and robust to noise compared with the state-of-the-art cryo-EM 2D class averaging and image restoration algorithms.

LGApr 11, 2019
Max-Sliced Wasserstein Distance and its use for GANs

Ishan Deshpande, Yuan-Ting Hu, Ruoyu Sun et al.

Generative adversarial nets (GANs) and variational auto-encoders have significantly improved our distribution modeling capabilities, showing promise for dataset augmentation, image-to-image translation and feature learning. However, to model high-dimensional distributions, sequential training and stacked architectures are common, increasing the number of tunable hyper-parameters as well as the training time. Nonetheless, the sample complexity of the distance metrics remains one of the factors affecting GAN training. We first show that the recently proposed sliced Wasserstein distance has compelling sample complexity properties when compared to the Wasserstein distance. To further improve the sliced Wasserstein distance we then analyze its `projection complexity' and develop the max-sliced Wasserstein distance which enjoys compelling sample complexity while reducing projection complexity, albeit necessitating a max estimation. We finally illustrate that the proposed distance trains GANs on high-dimensional images up to a resolution of 256x256 easily.

COMar 6, 2019
Denoising Gravitational Waves with Enhanced Deep Recurrent Denoising Auto-Encoders

Hongyu Shen, Daniel George, E. A. Huerta et al.

Denoising of time domain data is a crucial task for many applications such as communication, translation, virtual assistants etc. For this task, a combination of a recurrent neural net (RNNs) with a Denoising Auto-Encoder (DAEs) has shown promising results. However, this combined model is challenged when operating with low signal-to-noise ratio (SNR) data embedded in non-Gaussian and non-stationary noise. To address this issue, we design a novel model, referred to as 'Enhanced Deep Recurrent Denoising Auto-Encoder' (EDRDAE), that incorporates a signal amplifier layer, and applies curriculum learning by first denoising high SNR signals, before gradually decreasing the SNR until the signals become noise dominated. We showcase the performance of EDRDAE using time-series data that describes gravitational waves embedded in very noisy backgrounds. In addition, we show that EDRDAE can accurately denoise signals whose topology is significantly more complex than those used for training, demonstrating that our model generalizes to new classes of gravitational waves that are beyond the scope of established denoising algorithms.

GR-QCMar 5, 2019
Statistically-informed deep learning for gravitational wave parameter estimation

Hongyu Shen, E. A. Huerta, Eamonn O'Shea et al.

We introduce deep learning models to estimate the masses of the binary components of black hole mergers, $(m_1,m_2)$, and three astrophysical properties of the post-merger compact remnant, namely, the final spin, $a_f$, and the frequency and damping time of the ringdown oscillations of the fundamental $\ell=m=2$ bar mode, $(ω_R, ω_I)$. Our neural networks combine a modified $\texttt{WaveNet}$ architecture with contrastive learning and normalizing flow. We validate these models against a Gaussian conjugate prior family whose posterior distribution is described by a closed analytical expression. Upon confirming that our models produce statistically consistent results, we used them to estimate the astrophysical parameters $(m_1,m_2, a_f, ω_R, ω_I)$ of five binary black holes: $\texttt{GW150914}, \texttt{GW170104}, \texttt{GW170814}, \texttt{GW190521}$ and $\texttt{GW190630}$. We use $\texttt{PyCBC Inference}$ to directly compare traditional Bayesian methodologies for parameter estimation with our deep-learning-based posterior distributions. Our results show that our neural network models predict posterior distributions that encode physical correlations, and that our data-driven median results and 90$\%$ confidence intervals are similar to those produced with gravitational wave Bayesian analyses. This methodology requires a single V100 $\texttt{NVIDIA}$ GPU to produce median values and posterior distributions within two milliseconds for each event. This neural network, and a tutorial for its use, are available at the $\texttt{Data and Learning Hub for Science}$.

IMFeb 1, 2019
Deep Learning for Multi-Messenger Astrophysics: A Gateway for Discovery in the Big Data Era

Gabrielle Allen, Igor Andreoni, Etienne Bachelet et al.

This report provides an overview of recent work that harnesses the Big Data Revolution and Large Scale Computing to address grand computational challenges in Multi-Messenger Astrophysics, with a particular emphasis on real-time discovery campaigns. Acknowledging the transdisciplinary nature of Multi-Messenger Astrophysics, this document has been prepared by members of the physics, astronomy, computer science, data science, software and cyberinfrastructure communities who attended the NSF-, DOE- and NVIDIA-funded "Deep Learning for Multi-Messenger Astrophysics: Real-time Discovery at Scale" workshop, hosted at the National Center for Supercomputing Applications, October 17-19, 2018. Highlights of this report include unanimous agreement that it is critical to accelerate the development and deployment of novel, signal-processing algorithms that use the synergy between artificial intelligence (AI) and high performance computing to maximize the potential for scientific discovery with Multi-Messenger Astrophysics. We discuss key aspects to realize this endeavor, namely (i) the design and exploitation of scalable and computationally efficient AI algorithms for Multi-Messenger Astrophysics; (ii) cyberinfrastructure requirements to numerically simulate astrophysical sources, and to process and interpret Multi-Messenger Astrophysics data; (iii) management of gravitational wave detections and triggers to enable electromagnetic and astro-particle follow-ups; (iv) a vision to harness future developments of machine and deep learning and cyberinfrastructure resources to cope with the scale of discovery in the Big Data Era; (v) and the need to build a community that brings domain experts together with data scientists on equal footing to maximize and accelerate discovery in the nascent field of Multi-Messenger Astrophysics.

CVDec 20, 2018
Steerable $e$PCA: Rotationally Invariant Exponential Family PCA

Zhizhen Zhao, Lydia T. Liu, Amit Singer

In photon-limited imaging, the pixel intensities are affected by photon count noise. Many applications, such as 3-D reconstruction using correlation analysis in X-ray free electron laser (XFEL) single molecule imaging, require an accurate estimation of the covariance of the underlying 2-D clean images. Accurate estimation of the covariance from low-photon count images must take into account that pixel intensities are Poisson distributed, hence the classical sample covariance estimator is sub-optimal. Moreover, in single molecule imaging, including in-plane rotated copies of all images could further improve the accuracy of covariance estimation. In this paper we introduce an efficient and accurate algorithm for covariance matrix estimation of count noise 2-D images, including their uniform planar rotations and possibly reflections. Our procedure, steerable $e$PCA, combines in a novel way two recently introduced innovations. The first is a methodology for principal component analysis (PCA) for Poisson distributions, and more generally, exponential family distributions, called $e$PCA. The second is steerable PCA, a fast and accurate procedure for including all planar rotations for PCA. The resulting principal components are invariant to the rotation and reflection of the input images. We demonstrate the efficiency and accuracy of steerable $e$PCA in numerical experiments involving simulated XFEL datasets and rotated Yale B face data.

LGMay 7, 2018
Real-time regression analysis with deep convolutional neural networks

E. A. Huerta, Daniel George, Zhizhen Zhao et al.

We discuss the development of novel deep learning algorithms to enable real-time regression analysis for time series data. We showcase the application of this new method with a timely case study, and then discuss the applicability of this approach to tackle similar challenges across science domains.

GR-QCNov 27, 2017
Denoising Gravitational Waves using Deep Learning with Recurrent Denoising Autoencoders

Hongyu Shen, Daniel George, E. A. Huerta et al.

Gravitational wave astronomy is a rapidly growing field of modern astrophysics, with observations being made frequently by the LIGO detectors. Gravitational wave signals are often extremely weak and the data from the detectors, such as LIGO, is contaminated with non-Gaussian and non-stationary noise, often containing transient disturbances which can obscure real signals. Traditional denoising methods, such as principal component analysis and dictionary learning, are not optimal for dealing with this non-Gaussian noise, especially for low signal-to-noise ratio gravitational wave signals. Furthermore, these methods are computationally expensive on large datasets. To overcome these issues, we apply state-of-the-art signal processing techniques, based on recent groundbreaking advancements in deep learning, to denoise gravitational wave signals embedded either in Gaussian noise or in real LIGO noise. We introduce SMTDAE, a Staired Multi-Timestep Denoising Autoencoder, based on sequence-to-sequence bi-directional Long-Short-Term-Memory recurrent neural networks. We demonstrate the advantages of using our unsupervised deep learning approach and show that, after training only using simulated Gaussian noise, SMTDAE achieves superior recovery performance for gravitational wave signals embedded in real non-Gaussian LIGO noise.

APNov 10, 2016
Mahalanobis Distance for Class Averaging of Cryo-EM Images

Tejal Bhamre, Zhizhen Zhao, Amit Singer

Single particle reconstruction (SPR) from cryo-electron microscopy (EM) is a technique in which the 3D structure of a molecule needs to be determined from its contrast transfer function (CTF) affected, noisy 2D projection images taken at unknown viewing directions. One of the main challenges in cryo-EM is the typically low signal to noise ratio (SNR) of the acquired images. 2D classification of images, followed by class averaging, improves the SNR of the resulting averages, and is used for selecting particles from micrographs and for inspecting the particle images. We introduce a new affinity measure, akin to the Mahalanobis distance, to compare cryo-EM images belonging to different defocus groups. The new similarity measure is employed to detect similar images, thereby leading to an improved algorithm for class averaging. We evaluate the performance of the proposed class averaging procedure on synthetic datasets, obtaining state of the art classification.

NASep 17, 2015
Data-driven prediction strategies for low-frequency patterns of North Pacific climate variability

Darin Comeau, Zhizhen Zhao, Dimitrios Giannakis et al.

The North Pacific exhibits patterns of low-frequency variability on the intra-annual to decadal time scales, which manifest themselves in both model data and the observational record, and prediction of such low-frequency modes of variability is of great interest to the community. While parametric models, such as stationary and non-stationary autoregressive models, possibly including external factors, may perform well in a data-fitting setting, they may perform poorly in a prediction setting. Ensemble analog forecasting, which relies on the historical record to provide estimates of the future based on past trajectories of those states similar to the initial state of interest, provides a promising, nonparametric approach to forecasting that makes no assumptions on the underlying dynamics or its statistics. We apply such forecasting to low-frequency modes of variability for the North Pacific sea surface temperature and sea ice concentration fields extracted through Nonlinear Laplacian Spectral Analysis. We find such methods may outperform parametric methods and simple persistence with increased predictive skill.

CVDec 2, 2014
Fast Steerable Principal Component Analysis

Zhizhen Zhao, Yoel Shkolnisky, Amit Singer

Cryo-electron microscopy nowadays often requires the analysis of hundreds of thousands of 2D images as large as a few hundred pixels in each direction. Here we introduce an algorithm that efficiently and accurately performs principal component analysis (PCA) for a large set of two-dimensional images, and, for each image, the set of its uniform rotations in the plane and their reflections. For a dataset consisting of $n$ images of size $L \times L$ pixels, the computational complexity of our algorithm is $O(nL^3 + L^4)$, while existing algorithms take $O(nL^4)$. The new algorithm computes the expansion coefficients of the images in a Fourier-Bessel basis efficiently using the non-uniform fast Fourier transform. We compare the accuracy and efficiency of the new algorithm with traditional PCA and existing algorithms for steerable PCA.

BMSep 29, 2013
Rotationally Invariant Image Representation for Viewing Direction Classification in Cryo-EM

Zhizhen Zhao, Amit Singer

We introduce a new rotationally invariant viewing angle classification method for identifying, among a large number of Cryo-EM projection images, similar views without prior knowledge of the molecule. Our rotationally invariant features are based on the bispectrum. Each image is denoised and compressed using steerable principal component analysis (PCA) such that rotating an image is equivalent to phase shifting the expansion coefficients. Thus we are able to extend the theory of bispectrum of 1D periodic signals to 2D images. The randomized PCA algorithm is then used to efficiently reduce the dimensionality of the bispectrum coefficients, enabling fast computation of the similarity between any pair of images. The nearest neighbors provide an initial classification of similar viewing angles. In this way, rotational alignment is only performed for images with their nearest neighbors. The initial nearest neighbor classification and alignment are further improved by a new classification method called vector diffusion maps. Our pipeline for viewing angle classification and alignment is experimentally shown to be faster and more accurate than reference-free alignment with rotationally invariant K-means clustering, MSA/MRA 2D classification, and their modern approximations.