Brian M. Kurkoski

h-index20

4papers

124citations

Novelty57%

AI Score41

Ranked #69,569 of 194,257 authors (top 36%)#209 in IT (top 27%)

4 Papers

3.3ITJun 23, 2022

Sufficient Statistic Memory Approximate Message Passing

Lei Liu, Shunqi Huang, Brian M. Kurkoski

Approximate message passing (AMP) type algorithms have been widely used in the signal reconstruction of certain large random linear systems. A key feature of the AMP-type algorithms is that their dynamics can be correctly described by state evolution. However, state evolution does not necessarily guarantee the convergence of iterative algorithms. To solve the convergence problem of AMP-type algorithms in principle, this paper proposes a memory AMP (MAMP) under a sufficient statistic condition, named sufficient statistic MAMP (SS-MAMP). We show that the covariance matrices of SS-MAMP are L-banded and convergent. Given an arbitrary MAMP, we can construct the SS-MAMP by damping, which not only ensures the convergence, but also preserves the orthogonality, i.e., its dynamics can be correctly described by state evolution.

7.1LGSep 25, 2025

Binary Autoencoder for Mechanistic Interpretability of Large Language Models

Hakaze Cho, Haolin Yang, Brian M. Kurkoski et al.

Existing works are dedicated to untangling atomized numerical components (features) from the hidden states of Large Language Models (LLMs) for interpreting their mechanism. However, they typically rely on autoencoders constrained by some implicit training-time regularization on single training instances (i.e., $L_1$ normalization, top-k function, etc.), without an explicit guarantee of global sparsity among instances, causing a large amount of dense (simultaneously inactive) features, harming the feature sparsity and atomization. In this paper, we propose a novel autoencoder variant that enforces minimal entropy on minibatches of hidden activations, thereby promoting feature independence and sparsity across instances. For efficient entropy calculation, we discretize the hidden activations to 1-bit via a step function and apply gradient estimation to enable backpropagation, so that we term it as Binary Autoencoder (BAE) and empirically demonstrate two major applications: (1) Feature set entropy calculation. Entropy can be reliably estimated on binary hidden activations, which we empirically evaluate and leverage to characterize the inference dynamics of LLMs and In-context Learning. (2) Feature untangling. Similar to typical methods, BAE can extract atomized features from LLM's hidden states. To robustly evaluate such feature extraction capability, we refine traditional feature-interpretation methods to avoid unreliable handling of numerical tokens, and show that BAE avoids dense features while producing the largest number of interpretable ones among baselines, which confirms the effectiveness of BAE serving as a feature extractor.

4.3ITJun 4, 2021

Memory Approximate Message Passing

Lei Liu, Shunqi Huang, Brian M. Kurkoski

Approximate message passing (AMP) is a low-cost iterative parameter-estimation technique for certain high-dimensional linear systems with non-Gaussian distributions. However, AMP only applies to independent identically distributed (IID) transform matrices, but may become unreliable for other matrix ensembles, especially for ill-conditioned ones. To handle this difficulty, orthogonal/vector AMP (OAMP/VAMP) was proposed for general right-unitarily-invariant matrices. However, the Bayes-optimal OAMP/VAMP requires high-complexity linear minimum mean square error estimator. To solve the disadvantages of AMP and OAMP/VAMP, this paper proposes a memory AMP (MAMP), in which a long-memory matched filter is proposed for interference suppression. The complexity of MAMP is comparable to AMP. The asymptotic Gaussianity of estimation errors in MAMP is guaranteed by the orthogonality principle. A state evolution is derived to asymptotically characterize the performance of MAMP. Based on the state evolution, the relaxation parameters and damping vector in MAMP are optimized. For all right-unitarily-invariant matrices, the optimized MAMP converges to OAMP/VAMP, and thus is Bayes-optimal if it has a unique fixed point. Finally, simulations are provided to verify the validity and accuracy of the theoretical results.

10.3ITDec 20, 2020Code

Memory AMP

Lei Liu, Shunqi Huang, Brian M. Kurkoski

Approximate message passing (AMP) is a low-cost iterative parameter-estimation technique for certain high-dimensional linear systems with non-Gaussian distributions. AMP only applies to independent identically distributed (IID) transform matrices, but may become unreliable (e.g., perform poorly or even diverge) for other matrix ensembles, especially for ill-conditioned ones. To solve this issue, orthogonal/vector AMP (OAMP/VAMP) was proposed for general right-unitarily-invariant matrices. However, the Bayes-optimal OAMP/VAMP (BO-OAMP/VAMP) requires a high-complexity linear minimum mean square error (MMSE) estimator. This prevents OAMP/VAMP from being used in large-scale systems. To address the drawbacks of AMP and BO-OAMP/VAMP, this paper offers a memory AMP (MAMP) framework based on the orthogonality principle, which ensures that estimation errors in MAMP are asymptotically IID Gaussian. To realize the required orthogonality for MAMP, we provide an orthogonalization procedure for the local memory estimators. In addition, we propose a Bayes-optimal MAMP (BO-MAMP), in which a long-memory matched filter is used for interference suppression. The complexity of BO-MAMP is comparable to AMP. To asymptotically characterize the performance of BO-MAMP, a state evolution is derived. The relaxation parameters and damping vector in BO-MAMP are optimized based on state evolution. Most crucially, the state evolution of the optimized BO-MAMP converges to the same fixed point as that of the high-complexity BO-OAMP/VAMP for all right-unitarily-invariant matrices, and achieves the Bayes optimal MSE predicted by the replica method if its state evolution has a unique fixed point. Finally, simulations are provided to verify the theoretical results' validity and accuracy.