Su-Yun Huang

ML
5 papers
2 citations
Novelty: 30%
AI Score: 30

5 Papers

ML · Jan 30
Generative and Nonparametric Approaches for Conditional Distribution Estimation: Methods, Perspectives, and Comparative Evaluations

Yen-Shiu Chin, Zhi-Yu Jou, Toshinari Morimoto et al.

The inference of conditional distributions is a fundamental problem in statistics, essential for prediction, uncertainty quantification, and probabilistic modeling. A wide range of methodologies have been developed for this task. This article reviews and compares several representative approaches spanning classical nonparametric methods and modern generative models. We begin with the single-index method of Hall and Yao (2005), which estimates the conditional distribution through a dimension-reducing index and nonparametric smoothing of the resulting one-dimensional cumulative conditional distribution function. We then examine the basis-expansion approaches, including FlexCode (Izbicki and Lee, 2017) and DeepCDE (Dalmasso et al., 2020), which convert conditional density estimation into a set of nonparametric regression problems. In addition, we discuss two recent generative simulation-based methods that leverage modern deep generative architectures: the generative conditional distribution sampler (Zhou et al., 2023) and the conditional denoising diffusion probabilistic model (Fu et al., 2024; Yang et al., 2025). A systematic numerical comparison of these approaches is provided using a unified evaluation framework that ensures fairness and reproducibility. The performance metrics used for the estimated conditional distribution include the mean-squared errors of conditional mean and standard deviation, as well as the Wasserstein distance. We also discuss their flexibility and computational costs, highlighting the distinct advantages and limitations of each approach.
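The evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes that, at each covariate value, we hold equally many Monte Carlo draws from the true and from the estimated conditional distribution of a scalar response; the function names `wasserstein_1d` and `evaluate` and the toy data are hypothetical and are not taken from the paper.

```python
import numpy as np

def wasserstein_1d(a, b):
    """Empirical 1-Wasserstein distance between two equal-size 1-D samples."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    if a.shape != b.shape:
        raise ValueError("this sketch assumes equal sample sizes")
    # For equal-size samples, W1 is the mean absolute gap between order statistics.
    return float(np.mean(np.abs(a - b)))

def evaluate(true_samples, est_samples):
    """Compare conditional draws; both arrays have shape (n_x, n_draws)."""
    # Squared error of the conditional mean and SD, averaged over covariate values.
    mean_mse = np.mean((true_samples.mean(axis=1) - est_samples.mean(axis=1)) ** 2)
    sd_mse = np.mean((true_samples.std(axis=1) - est_samples.std(axis=1)) ** 2)
    w1 = np.mean([wasserstein_1d(t, e) for t, e in zip(true_samples, est_samples)])
    return {"mean_mse": float(mean_mse), "sd_mse": float(sd_mse), "w1": float(w1)}

# Toy check (assumed setup): an "estimator" whose draws are shifted by 0.5.
rng = np.random.default_rng(0)
truth = rng.normal(size=(5, 200))
print(evaluate(truth, truth + 0.5))
```

In this toy setup the shift affects the conditional mean and the Wasserstein distance but not the standard deviation, which is why the three metrics are reported separately.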

ML · Apr 9, 2020
TensorProjection Layer: A Tensor-Based Dimension Reduction Method in Deep Neural Networks

Toshinari Morimoto, Su-Yun Huang

In this paper, we propose a dimension reduction method specifically designed for tensor-structured feature data in deep neural networks. The method is implemented as a hidden layer, called the TensorProjection layer, which transforms input tensors into output tensors with reduced dimensions through mode-wise projections. The projection directions are treated as model parameters of the layer and are optimized during model training. Our method can serve as an alternative to pooling layers for summarizing image data, or to convolutional layers as a technique for reducing the number of channels. We conduct experiments on tasks such as medical image classification and segmentation, integrating the TensorProjection layer into commonly used baseline architectures to evaluate its effectiveness. Numerical experiments indicate that the proposed method can outperform traditional downsampling methods, such as pooling layers, in our tasks, suggesting it as a promising alternative for feature summarization.
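The mode-wise projection described above can be sketched as a forward pass in NumPy. This is an illustration under assumptions, not the authors' implementation: the helper names are hypothetical, the projection matrices are drawn at random with orthonormal columns (an assumed constraint), and training of the projection directions is omitted.

```python
import numpy as np

def orthonormal_columns(p, q, rng):
    """Random p x q matrix with orthonormal columns (requires q <= p)."""
    Q, _ = np.linalg.qr(rng.standard_normal((p, q)))
    return Q

def tensor_projection(X, U1, U2, U3):
    """Project each mode of a batch of order-3 tensors.

    X has shape (batch, p1, p2, p3); Uk has shape (pk, qk).
    The output has shape (batch, q1, q2, q3).
    """
    return np.einsum("nabc,ai,bj,ck->nijk", X, U1, U2, U3)

rng = np.random.default_rng(0)
X = rng.standard_normal((2, 8, 8, 3))   # e.g. two 8x8 feature maps with 3 channels
U1 = orthonormal_columns(8, 4, rng)     # height: 8 -> 4
U2 = orthonormal_columns(8, 4, rng)     # width: 8 -> 4
U3 = orthonormal_columns(3, 2, rng)     # channels: 3 -> 2
Z = tensor_projection(X, U1, U2, U3)
print(Z.shape)
```

Reducing the spatial modes plays the role a pooling layer would, while reducing the channel mode plays the role of a channel-reducing convolution, matching the two uses mentioned in the abstract.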

IV · Nov 22, 2019
Two-stage dimension reduction for noisy high-dimensional images and application to Cryogenic Electron Microscopy

Szu-Chi Chung, Shao-Hsuan Wang, Po-Yao Niu et al.

Principal component analysis (PCA) is arguably the most widely used dimension-reduction method for vector-type data. When applied to a sample of images, PCA requires vectorization of the image data, which in turn entails solving an eigenvalue problem for the sample covariance matrix. We propose herein a two-stage dimension reduction (2SDR) method for image reconstruction from high-dimensional noisy image data. The first stage treats the image as a matrix, which is a tensor of order 2, and uses multilinear principal component analysis (MPCA) for matrix rank reduction and image denoising. The second stage vectorizes the reduced-rank matrix and achieves further dimension and noise reduction. Simulation studies demonstrate excellent performance of 2SDR, for which we also develop an asymptotic theory that establishes consistency of its rank selection. Applications to cryo-EM (cryogenic electron microscopy), which has revolutionized structural biology, organic and medical chemistry, cellular and molecular physiology in the past decade, are also provided and illustrated with benchmark cryo-EM datasets. Connections to other contemporaneous developments in image reconstruction and high-dimensional statistical inference are also discussed.
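The two stages can be sketched in NumPy as follows. This is a simplified illustration rather than the paper's procedure: the first stage uses a single pass over the mode-wise covariance matrices instead of a full MPCA fit, the ranks `p0`, `q0`, and `r` are supplied by hand instead of being selected from the data, and all function names are hypothetical.

```python
import numpy as np

def top_eigvecs(S, k):
    """Leading k eigenvectors of a symmetric matrix S."""
    _, vecs = np.linalg.eigh(S)          # eigenvalues in ascending order
    return vecs[:, ::-1][:, :k]

def two_stage_reduce(X, p0, q0, r):
    """X: (n, p, q) stack of images. Returns reconstructions and stage-2 scores."""
    n = X.shape[0]
    mean = X.mean(axis=0)
    Xc = X - mean
    # Stage 1 (simplified MPCA): row and column bases from mode-wise covariances.
    U = top_eigvecs(np.einsum("nij,nkj->ik", Xc, Xc), p0)   # p x p0
    V = top_eigvecs(np.einsum("nji,njk->ik", Xc, Xc), q0)   # q x q0
    cores = np.einsum("ia,nij,jb->nab", U, Xc, V).reshape(n, p0 * q0)
    # Stage 2: ordinary PCA on the vectorized p0 x q0 cores (already centered).
    W = top_eigvecs(cores.T @ cores, r)                     # (p0*q0) x r
    scores = cores @ W
    cores_hat = (scores @ W.T).reshape(n, p0, q0)
    X_hat = mean + np.einsum("ia,nab,jb->nij", U, cores_hat, V)
    return X_hat, scores

# Toy data (assumed): rank-one images plus noise.
rng = np.random.default_rng(0)
u, v = rng.standard_normal(16), rng.standard_normal(12)
a = rng.standard_normal(50)
clean = a[:, None, None] * np.outer(u, v)
noisy = clean + 0.1 * rng.standard_normal(clean.shape)
X_hat, scores = two_stage_reduce(noisy, p0=2, q0=2, r=1)
print(np.mean((noisy - clean) ** 2), np.mean((X_hat - clean) ** 2))
```

The point of the design is that stage 2 solves an eigenproblem of size `p0*q0` rather than `p*q`, which is what makes PCA affordable after the matrix-level reduction.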

NA · Aug 29, 2016
Integrating multiple random sketches for singular value decomposition

Ting-Li Chen, Dawei D. Chang, Su-Yun Huang et al.

The singular value decomposition (SVD) of large-scale matrices is a key tool in data analytics and scientific computing. The rapid growth in the size of matrices further increases the need for developing efficient large-scale SVD algorithms. Randomized SVD based on one-time sketching has been studied, and its potential has been demonstrated for computing a low-rank SVD. Instead of exploring different single random sketching techniques, we propose a Monte Carlo type integrated SVD algorithm based on multiple random sketches. The proposed integration algorithm takes multiple random sketches and then integrates the results obtained from the multiple sketched subspaces, so that the integrated SVD can achieve higher accuracy and lower stochastic variation. The main component of the integration is an optimization problem with a matrix Stiefel manifold constraint. The optimization problem is solved using Kolmogorov-Nagumo-type averages. Our theoretical analyses show that the singular vectors can be induced by population averaging and ensure consistency between the computed and true subspaces and singular vectors. Statistical analysis further proves a strong Law of Large Numbers and gives a rate of convergence by the Central Limit Theorem. Preliminary numerical results suggest that the proposed integrated SVD algorithm is promising.
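The idea of combining several sketched subspaces can be sketched in NumPy. Note the substitution: instead of the Kolmogorov-Nagumo-type iteration mentioned in the abstract, this example averages the projection matrices of the sketched subspaces and takes the leading eigenvectors of that average, which is a simpler stand-in integration rule. Function names and parameters are hypothetical, and forming the dense m x m projector is only sensible for small examples.

```python
import numpy as np

def sketch_basis(A, k, rng):
    """Orthonormal basis of the range of one Gaussian sketch A @ Omega."""
    Omega = rng.standard_normal((A.shape[1], k))
    Q, _ = np.linalg.qr(A @ Omega)
    return Q                                   # m x k

def integrated_svd(A, k, n_sketches=10, seed=0):
    """Rank-k SVD from several random sketches (projector-averaging variant)."""
    rng = np.random.default_rng(seed)
    m = A.shape[0]
    P_bar = np.zeros((m, m))
    for _ in range(n_sketches):
        Q = sketch_basis(A, k, rng)
        P_bar += (Q @ Q.T) / n_sketches        # average of subspace projectors
    _, vecs = np.linalg.eigh(P_bar)            # eigenvalues in ascending order
    Q_bar = vecs[:, ::-1][:, :k]               # integrated orthonormal basis
    # Standard randomized-SVD finish: SVD of the small projected matrix.
    Ub, s, Vt = np.linalg.svd(Q_bar.T @ A, full_matrices=False)
    return Q_bar @ Ub, s, Vt

# Toy data (assumed): a noisy matrix with three dominant singular values.
rng = np.random.default_rng(1)
A = rng.standard_normal((60, 3)) @ rng.standard_normal((3, 40))
A += 0.01 * rng.standard_normal(A.shape)
U, s, Vt = integrated_svd(A, k=3)
print(np.linalg.norm(A - (U * s) @ Vt) / np.linalg.norm(A))
```

Because each sketch is independent, the per-sketch bases could be computed in parallel before the integration step.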

ST · Mar 12, 2015
Functional Inverse Regression in an Enlarged Dimension Reduction Space

Ting-Li Chen, Su-Yun Huang, Yanyuan Ma et al.

We consider an enlarged dimension reduction space in functional inverse regression. Our operator- and functional-analysis-based approach facilitates a compact and rigorous formulation of the functional inverse regression problem. It also enables us to expand the possible space where the dimension reduction functions belong. Our formulation provides a unified framework so that the classical notions, such as covariance standardization, Mahalanobis distance, sliced inverse regression (SIR), and linear discriminant analysis, can be naturally and smoothly carried out in our enlarged space. This enlarged dimension reduction space also links to the linear discriminant space of Gaussian measures on a separable Hilbert space.
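For orientation, the classical finite-dimensional sliced inverse regression that the functional framework generalizes can be sketched as below. This is the textbook multivariate version, not the paper's functional formulation; it shows covariance standardization followed by an eigenanalysis of slice means. The function name, slice count, and toy model are assumptions.

```python
import numpy as np

def sir_directions(X, y, n_slices=10, k=1):
    """Classical sliced inverse regression for vector-valued X."""
    n, p = X.shape
    mu = X.mean(axis=0)
    Sigma = np.cov(X, rowvar=False)
    # Covariance standardization: Z = (X - mu) Sigma^{-1/2}.
    w, V = np.linalg.eigh(Sigma)
    S_inv_half = V @ np.diag(w ** -0.5) @ V.T
    Z = (X - mu) @ S_inv_half
    # Weighted covariance of the slice means of Z, slicing on sorted y.
    M = np.zeros((p, p))
    for idx in np.array_split(np.argsort(y), n_slices):
        m = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)
    _, E = np.linalg.eigh(M)                 # eigenvalues in ascending order
    eta = E[:, ::-1][:, :k]                  # leading directions on the Z scale
    B = S_inv_half @ eta                     # map back to the original X scale
    return B / np.linalg.norm(B, axis=0)

# Toy model (assumed): y depends on X only through its first coordinate.
rng = np.random.default_rng(0)
X = rng.standard_normal((2000, 5))
y = X[:, 0] + 0.1 * rng.standard_normal(2000)
print(np.round(sir_directions(X, y).ravel(), 2))
```

In the functional setting the covariance is an operator that cannot be inverted in this naive way, which is one reason the abstract's operator-based formulation is needed.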