Halyun Jeong

LG
h-index18
9papers
62citations
Novelty57%
AI Score41

9 Papers

LGFeb 20, 2023
Federated Gradient Matching Pursuit

Halyun Jeong, Deanna Needell, Jing Qin

Traditional machine learning techniques require centralizing all training data on one server or data hub. Due to the development of communication technologies and a huge amount of decentralized data on many clients, collaborative machine learning has become the main interest while providing privacy-preserving frameworks. In particular, federated learning (FL) provides such a solution to learn a shared model while keeping training data at local clients. On the other hand, in a wide range of machine learning and signal processing applications, the desired solution naturally has a certain structure that can be framed as sparsity with respect to a certain dictionary. This problem can be formulated as an optimization problem with sparsity constraints and solving it efficiently has been one of the primary research topics in the traditional centralized setting. In this paper, we propose a novel algorithmic framework, federated gradient matching pursuit (FedGradMP), to solve the sparsity constrained minimization problem in the FL setting. We also generalize our algorithms to accommodate various practical FL scenarios when only a subset of clients participate per round, when the local model estimation at clients could be inexact, or when the model parameters are sparse with respect to general dictionaries. Our theoretical analysis shows the linear convergence of the proposed algorithms. A variety of numerical experiments are conducted to demonstrate the great potential of the proposed framework -- fast convergence both in communication rounds and computation time for many important scenarios without sophisticated parameter tuning.

MLNov 10, 2025
Infinite-Dimensional Operator/Block Kaczmarz Algorithms: Regret Bounds and $λ$-Effectiveness

Halyun Jeong, Palle E. T. Jorgensen, Hyun-Kyoung Kwon et al.

We present a variety of projection-based linear regression algorithms with a focus on modern machine-learning models and their algorithmic performance. We study the role of the relaxation parameter in generalized Kaczmarz algorithms and establish a priori regret bounds with explicit $λ$-dependence to quantify how much an algorithm's performance deviates from its optimal performance. A detailed analysis of relaxation parameter is also provided. Applications include: explicit regret bounds for the framework of Kaczmarz algorithm models, non-orthogonal Fourier expansions, and the use of regret estimates in modern machine learning models, including for noisy data, i.e., regret bounds for the noisy Kaczmarz algorithms. Motivated by machine-learning practice, our wider framework treats bounded operators (on infinite-dimensional Hilbert spaces), with updates realized as (block) Kaczmarz algorithms, leading to new and versatile results.

LGSep 3, 2024
Robust Fourier Neural Networks

Halyun Jeong, Jihun Han

Fourier embedding has shown great promise in removing spectral bias during neural network training. However, it can still suffer from high generalization errors, especially when the labels or measurements are noisy. We demonstrate that introducing a simple diagonal layer after the Fourier embedding layer makes the network more robust to measurement noise, effectively prompting it to learn sparse Fourier features. We provide theoretical justifications for this Fourier feature learning, leveraging recent developments in diagonal networks and implicit regularization in neural networks. Under certain conditions, our proposed approach can also learn functions that are noisy mixtures of nonlinear functions of Fourier features. Numerical experiments validate the effectiveness of our proposed architecture, supporting our theory.

LGMar 2, 2024
Stochastic gradient descent for streaming linear and rectified linear systems with adversarial corruptions

Halyun Jeong, Deanna Needell, Elizaveta Rebrova

We propose SGD-exp, a stochastic gradient descent approach for linear and ReLU regressions under Massart noise (adversarial semi-random corruption model) for the fully streaming setting. We show novel nearly linear convergence guarantees of SGD-exp to the true parameter with up to $50\%$ Massart corruption rate, and with any corruption rate in the case of symmetric oblivious corruptions. This is the first convergence guarantee result for robust ReLU regression in the streaming setting, and it shows the improved convergence rate over previous robust methods for $L_1$ linear regression due to a choice of an exponentially decaying step size, known for its efficiency in practice. Our analysis is based on the drift analysis of a discrete stochastic process, which could also be interesting on its own.

LGNov 28, 2025
Implicit Unitarity Bias in Tensor Factorization: A Theoretical Framework for Symmetry Group Discovery

Dongsung Huh, Halyun Jeong

While modern neural architectures typically generalize via smooth interpolation, it lacks the inductive biases required to uncover algebraic structures essential for systematic generalization. We present the first theoretical analysis of HyperCube, a differentiable tensor factorization architecture designed to bridge this gap. This work establishes an intrinsic geometric property of the HyperCube formulation: we prove that the architecture mediates a fundamental equivalence between geometric alignment and algebraic structure. Independent of the global optimization landscape, we show that the condition of geometric alignment imposes rigid algebraic constraints, proving that the feasible collinear manifold is non-empty if and only if the target is isotopic to a group. Within this manifold, we characterize the objective as a rank-maximizing potential that unconditionally drives factors toward full-rank, unitary representations. Finally, we propose the Collinearity Dominance mechanism to link these structural results to the global landscape. Supported by empirical scaling laws, we establish that global minima are achieved exclusively by unitary regular representations of group isotopes. This formalizes the HyperCube objective as a differentiable proxy for associativity, demonstrating how rigid geometric constraints enable the discovery of latent algebraic symmetry.

LGMay 23, 2025
Beyond Discreteness: Finite-Sample Analysis of Straight-Through Estimator for Quantization

Halyun Jeong, Jack Xin, Penghang Yin

Training quantized neural networks requires addressing the non-differentiable and discrete nature of the underlying optimization problem. To tackle this challenge, the straight-through estimator (STE) has become the most widely adopted heuristic, allowing backpropagation through discrete operations by introducing surrogate gradients. However, its theoretical properties remain largely unexplored, with few existing works simplifying the analysis by assuming an infinite amount of training data. In contrast, this work presents the first finite-sample analysis of STE in the context of neural network quantization. Our theoretical results highlight the critical role of sample size in the success of STE, a key insight absent from existing studies. Specifically, by analyzing the quantization-aware training of a two-layer neural network with binary weights and activations, we derive the sample complexity bound in terms of the data dimensionality that guarantees the convergence of STE-based optimization to the global minimum. Moreover, in the presence of label noises, we uncover an intriguing recurrence property of STE-gradient method, where the iterate repeatedly escape from and return to the optimal binary weights. Our analysis leverages tools from compressed sensing and dynamical systems theory.

IROct 14, 2020
Polar Deconvolution of Mixed Signals

Zhenan Fan, Halyun Jeong, Babhru Joshi et al.

The signal demixing problem seeks to separate a superposition of multiple signals into its constituent components. This paper studies a two-stage approach that first decompresses and subsequently deconvolves the noisy and undersampled observations of the superposition using two convex programs. Probabilistic error bounds are given on the accuracy with which this process approximates the individual signals. The theory of polar convolution of convex sets and gauge functions plays a central role in the analysis and solution process. If the measurements are random and the noise is bounded, this approach stably recovers low-complexity and mutually incoherent signals, with high probability and with near-optimal sample complexity. We develop an efficient algorithm, based on level-set and conditional-gradient methods, that solves the convex optimization problems with sublinear iteration complexity and linear space requirements. Numerical experiments on both real and synthetic data confirm the theory and the efficiency of the approach.

ITJan 28, 2020
Sub-Gaussian Matrices on Sets: Optimal Tail Dependence and Applications

Halyun Jeong, Xiaowei Li, Yaniv Plan et al.

Random linear mappings are widely used in modern signal processing, compressed sensing and machine learning. These mappings may be used to embed the data into a significantly lower dimension while at the same time preserving useful information. This is done by approximately preserving the distances between data points, which are assumed to belong to $\mathbb{R}^n$. Thus, the performance of these mappings is usually captured by how close they are to an isometry on the data. Gaussian linear mappings have been the object of much study, while the sub-Gaussian settings is not yet fully understood. In the latter case, the performance depends on the sub-Gaussian norm of the rows. In many applications, e.g., compressed sensing, this norm may be large, or even growing with dimension, and thus it is important to characterize this dependence. We study when a sub-Gaussian matrix can become a near isometry on a set, show that previous best known dependence on the sub-Gaussian norm was sub-optimal, and present the optimal dependence. Our result not only answers a remaining question posed by Liaw, Mehrabian, Plan and Vershynin in 2017, but also generalizes their work. We also develop a new Bernstein type inequality for sub-exponential random variables, and a new Hanson-Wright inequality for quadratic forms of sub-Gaussian random variables, in both cases improving the bounds in the sub-Gaussian regime under moment constraints. Finally, we illustrate popular applications such as Johnson-Lindenstrauss embeddings, null space property for 0-1 matrices, randomized sketches and blind demodulation, whose theoretical guarantees can be improved by our results (in the sub-Gaussian case).

NAJul 24, 2017
Convergence of the randomized Kaczmarz method for phase retrieval

Halyun Jeong, C. Sinan Güntürk

The classical Kaczmarz iteration and its randomized variants are popular tools for fast inversion of linear overdetermined systems. This method extends naturally to the setting of the phase retrieval problem via substituting at each iteration the phase of any measurement of the available approximate solution for the unknown phase of the measurement of the true solution. Despite the simplicity of the method, rigorous convergence guarantees that are available for the classical linear setting have not been established so far for the phase retrieval setting. In this short note, we provide a convergence result for the randomized Kaczmarz method for phase retrieval in $\mathbb{R}^d$. We show that with high probability a random measurement system of size $m \asymp d$ will be admissible for this method in the sense that convergence in the mean square sense is guaranteed with any prescribed probability. The convergence is exponential and comparable to the linear setting.