LGJul 28, 2022
Learning Personalized Representations using Graph Convolutional NetworkHongyu Shen, Jinoh Oh, Shuai Zhao et al.
Generating representations that precisely reflect customers' behavior is an important task for providing personalized skill routing experience in Alexa. Currently, Dynamic Routing (DR) team, which is responsible for routing Alexa traffic to providers or skills, relies on two features to be served as personal signals: absolute traffic count and normalized traffic count of every skill usage per customer. Neither of them considers the network based structure for interactions between customers and skills, which contain richer information for customer preferences. In this work, we first build a heterogeneous edge attributed graph based customers' past interactions with the invoked skills, in which the user requests (utterances) are modeled as edges. Then we propose a graph convolutional network(GCN) based model, namely Personalized Dynamic Routing Feature Encoder(PDRFE), that generates personalized customer representations learned from the built graph. Compared with existing models, PDRFE is able to further capture contextual information in the graph convolutional function. The performance of our proposed model is evaluated by a downstream task, defect prediction, that predicts the defect label from the learned embeddings of customers and their triggered skills. We observe up to 41% improvements on the cross entropy metric for our proposed models compared to the baselines.
88.8CVApr 2
Lifting Unlabeled Internet-level Data for 3D Scene UnderstandingYixin Chen, Yaowei Zhang, Huangyue Yu et al.
Annotated 3D scene data is scarce and expensive to acquire, while abundant unlabeled videos are readily available on the internet. In this paper, we demonstrate that carefully designed data engines can leverage web-curated, unlabeled videos to automatically generate training data, to facilitate end-to-end models in 3D scene understanding alongside human-annotated datasets. We identify and analyze bottlenecks in automated data generation, revealing critical factors that determine the efficiency and effectiveness of learning from unlabeled data. To validate our approach across different perception granularities, we evaluate on three tasks spanning low-level perception, i.e., 3D object detection and instance segmentation, to high-evel reasoning, i.e., 3D spatial Visual Question Answering (VQA) and Vision-Lanugage Navigation (VLN). Models trained on our generated data demonstrate strong zero-shot performance and show further improvement after finetuning. This demonstrates the viability of leveraging readily available web data as a path toward more capable scene understanding systems.
CVAug 5, 2025
Trace3D: Consistent Segmentation Lifting via Gaussian Instance TracingHongyu Shen, Junfeng Ni, Yixin Chen et al.
We address the challenge of lifting 2D visual segmentation to 3D in Gaussian Splatting. Existing methods often suffer from inconsistent 2D masks across viewpoints and produce noisy segmentation boundaries as they neglect these semantic cues to refine the learned Gaussians. To overcome this, we introduce Gaussian Instance Tracing (GIT), which augments the standard Gaussian representation with an instance weight matrix across input views. Leveraging the inherent consistency of Gaussians in 3D, we use this matrix to identify and correct 2D segmentation inconsistencies. Furthermore, since each Gaussian ideally corresponds to a single object, we propose a GIT-guided adaptive density control mechanism to split and prune ambiguous Gaussians during training, resulting in sharper and more coherent 2D and 3D segmentation boundaries. Experimental results show that our method extracts clean 3D assets and consistently improves 3D segmentation in both online (e.g., self-prompting) and offline (e.g., contrastive lifting) settings, enabling applications such as hierarchical segmentation, object extraction, and scene editing.
LGDec 17, 2024
Boosting Test Performance with Importance Sampling--a Subpopulation PerspectiveHongyu Shen, Zhizhen Zhao
Despite empirical risk minimization (ERM) is widely applied in the machine learning community, its performance is limited on data with spurious correlation or subpopulation that is introduced by hidden attributes. Existing literature proposed techniques to maximize group-balanced or worst-group accuracy when such correlation presents, yet, at the cost of lower average accuracy. In addition, many existing works conduct surveys on different subpopulation methods without revealing the inherent connection between these methods, which could hinder the technology advancement in this area. In this paper, we identify important sampling as a simple yet powerful tool for solving the subpopulation problem. On the theory side, we provide a new systematic formulation of the subpopulation problem and explicitly identify the assumptions that are not clearly stated in the existing works. This helps to uncover the cause of the dropped average accuracy. We provide the first theoretical discussion on the connections of existing methods, revealing the core components that make them different. On the application side, we demonstrate a single estimator is enough to solve the subpopulation problem. In particular, we introduce the estimator in both attribute-known and -unknown scenarios in the subpopulation setup, offering flexibility in practical use cases. And empirically, we achieve state-of-the-art performance on commonly used benchmark datasets.
LGFeb 27, 2024
DeepDRK: Deep Dependency Regularized Knockoff for Feature SelectionHongyu Shen, Yici Yan, Zhizhen Zhao
Model-X knockoff has garnered significant attention among various feature selection methods due to its guarantees for controlling the false discovery rate (FDR). Since its introduction in parametric design, knockoff techniques have evolved to handle arbitrary data distributions using deep learning-based generative models. However, we have observed limitations in the current implementations of the deep Model-X knockoff framework. Notably, the "swap property" that knockoffs require often faces challenges at the sample level, resulting in diminished selection power. To address these issues, we develop "Deep Dependency Regularized Knockoff (DeepDRK)," a distribution-free deep learning method that effectively balances FDR and power. In DeepDRK, we introduce a novel formulation of the knockoff model as a learning problem under multi-source adversarial attacks. By employing an innovative perturbation technique, we achieve lower FDR and higher power. Our model outperforms existing benchmarks across synthetic, semi-synthetic, and real-world datasets, particularly when sample sizes are small and data distributions are non-Gaussian.
CVNov 24, 2020
Do You Live a Healthy Life? Analyzing Lifestyle by Visual Life LoggingQing Gao, Mingtao Pei, Hongyu Shen
A healthy lifestyle is the key to better health and happiness and has a considerable effect on quality of life and disease prevention. Current lifelogging/egocentric datasets are not suitable for lifestyle analysis; consequently, there is no research on lifestyle analysis in the field of computer vision. In this work, we investigate the problem of lifestyle analysis and build a visual lifelogging dataset for lifestyle analysis (VLDLA). The VLDLA contains images captured by a wearable camera every 3 seconds from 8:00 am to 6:00 pm for seven days. In contrast to current lifelogging/egocentric datasets, our dataset is suitable for lifestyle analysis as images are taken with short intervals to capture activities of short duration; moreover, images are taken continuously from morning to evening to record all the activities performed by a user. Based on our dataset, we classify the user activities in each frame and use three latent fluents of the user, which change over time and are associated with activities, to measure the healthy degree of the user's lifestyle. The scores for the three latent fluents are computed based on recognized activities, and the healthy degree of the lifestyle for the day is determined based on the scores for the latent fluents. Experimental results show that our method can be used to analyze the healthiness of users' lifestyles.
GR-QCNov 26, 2019
Enabling real-time multi-messenger astrophysics discoveries with deep learningE. A. Huerta, Gabrielle Allen, Igor Andreoni et al.
Multi-messenger astrophysics is a fast-growing, interdisciplinary field that combines data, which vary in volume and speed of data processing, from many different instruments that probe the Universe using different cosmic messengers: electromagnetic waves, cosmic rays, gravitational waves and neutrinos. In this Expert Recommendation, we review the key challenges of real-time observations of gravitational wave sources and their electromagnetic and astroparticle counterparts, and make a number of recommendations to maximize their potential for scientific discovery. These recommendations refer to the design of scalable and computationally efficient machine learning algorithms; the cyber-infrastructure to numerically simulate astrophysical sources, and to process and interpret multi-messenger astrophysics data; the management of gravitational wave detections to trigger real-time alerts for electromagnetic and astroparticle follow-ups; a vision to harness future developments of machine learning and cyber-infrastructure resources to cope with the big-data requirements; and the need to build a community of experts to realize the goals of multi-messenger astrophysics.
COMar 6, 2019
Denoising Gravitational Waves with Enhanced Deep Recurrent Denoising Auto-EncodersHongyu Shen, Daniel George, E. A. Huerta et al.
Denoising of time domain data is a crucial task for many applications such as communication, translation, virtual assistants etc. For this task, a combination of a recurrent neural net (RNNs) with a Denoising Auto-Encoder (DAEs) has shown promising results. However, this combined model is challenged when operating with low signal-to-noise ratio (SNR) data embedded in non-Gaussian and non-stationary noise. To address this issue, we design a novel model, referred to as 'Enhanced Deep Recurrent Denoising Auto-Encoder' (EDRDAE), that incorporates a signal amplifier layer, and applies curriculum learning by first denoising high SNR signals, before gradually decreasing the SNR until the signals become noise dominated. We showcase the performance of EDRDAE using time-series data that describes gravitational waves embedded in very noisy backgrounds. In addition, we show that EDRDAE can accurately denoise signals whose topology is significantly more complex than those used for training, demonstrating that our model generalizes to new classes of gravitational waves that are beyond the scope of established denoising algorithms.
GR-QCMar 5, 2019
Statistically-informed deep learning for gravitational wave parameter estimationHongyu Shen, E. A. Huerta, Eamonn O'Shea et al.
We introduce deep learning models to estimate the masses of the binary components of black hole mergers, $(m_1,m_2)$, and three astrophysical properties of the post-merger compact remnant, namely, the final spin, $a_f$, and the frequency and damping time of the ringdown oscillations of the fundamental $\ell=m=2$ bar mode, $(ω_R, ω_I)$. Our neural networks combine a modified $\texttt{WaveNet}$ architecture with contrastive learning and normalizing flow. We validate these models against a Gaussian conjugate prior family whose posterior distribution is described by a closed analytical expression. Upon confirming that our models produce statistically consistent results, we used them to estimate the astrophysical parameters $(m_1,m_2, a_f, ω_R, ω_I)$ of five binary black holes: $\texttt{GW150914}, \texttt{GW170104}, \texttt{GW170814}, \texttt{GW190521}$ and $\texttt{GW190630}$. We use $\texttt{PyCBC Inference}$ to directly compare traditional Bayesian methodologies for parameter estimation with our deep-learning-based posterior distributions. Our results show that our neural network models predict posterior distributions that encode physical correlations, and that our data-driven median results and 90$\%$ confidence intervals are similar to those produced with gravitational wave Bayesian analyses. This methodology requires a single V100 $\texttt{NVIDIA}$ GPU to produce median values and posterior distributions within two milliseconds for each event. This neural network, and a tutorial for its use, are available at the $\texttt{Data and Learning Hub for Science}$.
IMFeb 1, 2019
Deep Learning for Multi-Messenger Astrophysics: A Gateway for Discovery in the Big Data EraGabrielle Allen, Igor Andreoni, Etienne Bachelet et al.
This report provides an overview of recent work that harnesses the Big Data Revolution and Large Scale Computing to address grand computational challenges in Multi-Messenger Astrophysics, with a particular emphasis on real-time discovery campaigns. Acknowledging the transdisciplinary nature of Multi-Messenger Astrophysics, this document has been prepared by members of the physics, astronomy, computer science, data science, software and cyberinfrastructure communities who attended the NSF-, DOE- and NVIDIA-funded "Deep Learning for Multi-Messenger Astrophysics: Real-time Discovery at Scale" workshop, hosted at the National Center for Supercomputing Applications, October 17-19, 2018. Highlights of this report include unanimous agreement that it is critical to accelerate the development and deployment of novel, signal-processing algorithms that use the synergy between artificial intelligence (AI) and high performance computing to maximize the potential for scientific discovery with Multi-Messenger Astrophysics. We discuss key aspects to realize this endeavor, namely (i) the design and exploitation of scalable and computationally efficient AI algorithms for Multi-Messenger Astrophysics; (ii) cyberinfrastructure requirements to numerically simulate astrophysical sources, and to process and interpret Multi-Messenger Astrophysics data; (iii) management of gravitational wave detections and triggers to enable electromagnetic and astro-particle follow-ups; (iv) a vision to harness future developments of machine and deep learning and cyberinfrastructure resources to cope with the scale of discovery in the Big Data Era; (v) and the need to build a community that brings domain experts together with data scientists on equal footing to maximize and accelerate discovery in the nascent field of Multi-Messenger Astrophysics.
GR-QCNov 27, 2017
Denoising Gravitational Waves using Deep Learning with Recurrent Denoising AutoencodersHongyu Shen, Daniel George, E. A. Huerta et al.
Gravitational wave astronomy is a rapidly growing field of modern astrophysics, with observations being made frequently by the LIGO detectors. Gravitational wave signals are often extremely weak and the data from the detectors, such as LIGO, is contaminated with non-Gaussian and non-stationary noise, often containing transient disturbances which can obscure real signals. Traditional denoising methods, such as principal component analysis and dictionary learning, are not optimal for dealing with this non-Gaussian noise, especially for low signal-to-noise ratio gravitational wave signals. Furthermore, these methods are computationally expensive on large datasets. To overcome these issues, we apply state-of-the-art signal processing techniques, based on recent groundbreaking advancements in deep learning, to denoise gravitational wave signals embedded either in Gaussian noise or in real LIGO noise. We introduce SMTDAE, a Staired Multi-Timestep Denoising Autoencoder, based on sequence-to-sequence bi-directional Long-Short-Term-Memory recurrent neural networks. We demonstrate the advantages of using our unsupervised deep learning approach and show that, after training only using simulated Gaussian noise, SMTDAE achieves superior recovery performance for gravitational wave signals embedded in real non-Gaussian LIGO noise.
IMNov 20, 2017
Glitch Classification and Clustering for LIGO with Deep Transfer LearningDaniel George, Hongyu Shen, E. A. Huerta
The detection of gravitational waves with LIGO and Virgo requires a detailed understanding of the response of these instruments in the presence of environmental and instrumental noise. Of particular interest is the study of anomalous non-Gaussian noise transients known as glitches, since their high occurrence rate in LIGO/Virgo data can obscure or even mimic true gravitational wave signals. Therefore, successfully identifying and excising glitches is of utmost importance to detect and characterize gravitational waves. In this article, we present the first application of Deep Learning combined with Transfer Learning for glitch classification, using real data from LIGO's first discovery campaign labeled by Gravity Spy, showing that knowledge from pre-trained models for real-world object recognition can be transferred for classifying spectrograms of glitches. We demonstrate that this method enables the optimal use of very deep convolutional neural networks for glitch classification given small unbalanced training datasets, significantly reduces the training time, and achieves state-of-the-art accuracy above 98.8%. Once trained via transfer learning, we show that the networks can be truncated and used as feature extractors for unsupervised clustering to automatically group together new classes of glitches and anomalies. This novel capability is of critical importance to identify and remove new types of glitches which will occur as the LIGO/Virgo detectors gradually attain design sensitivity.
GR-QCJun 22, 2017
Deep Transfer Learning: A new deep learning glitch classification method for advanced LIGODaniel George, Hongyu Shen, E. A. Huerta
The exquisite sensitivity of the advanced LIGO detectors has enabled the detection of multiple gravitational wave signals. The sophisticated design of these detectors mitigates the effect of most types of noise. However, advanced LIGO data streams are contaminated by numerous artifacts known as glitches: non-Gaussian noise transients with complex morphologies. Given their high rate of occurrence, glitches can lead to false coincident detections, obscure and even mimic gravitational wave signals. Therefore, successfully characterizing and removing glitches from advanced LIGO data is of utmost importance. Here, we present the first application of Deep Transfer Learning for glitch classification, showing that knowledge from deep learning algorithms trained for real-world object recognition can be transferred for classifying glitches in time-series based on their spectrogram images. Using the Gravity Spy dataset, containing hand-labeled, multi-duration spectrograms obtained from real LIGO data, we demonstrate that this method enables optimal use of very deep convolutional neural networks for classification given small training datasets, significantly reduces the time for training the networks, and achieves state-of-the-art accuracy above 98.8%, with perfect precision-recall on 8 out of 22 classes. Furthermore, new types of glitches can be classified accurately given few labeled examples with this technique. Once trained via transfer learning, we show that the convolutional neural networks can be truncated and used as excellent feature extractors for unsupervised clustering methods to identify new classes based on their morphology, without any labeled examples. Therefore, this provides a new framework for dynamic glitch classification for gravitational wave detectors, which are expected to encounter new types of noise as they undergo gradual improvements to attain design sensitivity.