Weiming Feng

DS
h-index10
9papers
25citations
Novelty52%
AI Score52

9 Papers

DSMay 5
On Computing Total Variation Distance Between Mixtures of Product Distributions

Weiming Feng, Yucheng Fu, Minji Yang et al.

We study the problem of approximating the total variation distance between two mixtures of product distributions over an $n$-dimensional discrete domain. Given two mixtures $\mathbb{P}$ and $\mathbb{Q}$ with $k_1$ and $k_2$ product distributions over $[q]^n$, respectively, we give a randomized algorithm that approximates $d_{\mathrm{TV}}\left({\mathbb{P}},{\mathbb{Q}}\right)$ within a multiplicative error of $(1\pm \varepsilon)$ in time $\mathrm{poly}((nq)^{k_1+k_2},1/\varepsilon)$. We also study the special case of mixtures of Boolean subcubes over $\{0,1\}^n$. For this class, we give a deterministic algorithm that exactly computes the total variation distance in time $\mathrm{poly}(n,2^{O(k_1+k_2)})$, and show that exact computation is $\#\mathsf{P}$-hard when $k_1+k_2=Θ(n)$.

DSNov 4, 2025
Learning CNF formulas from uniform random solutions in the local lemma regime

Weiming Feng, Xiongxin Yang, Yixiao Yu et al.

We study the problem of learning a $n$-variables $k$-CNF formula $Φ$ from its i.i.d. uniform random solutions, which is equivalent to learning a Boolean Markov random field (MRF) with $k$-wise hard constraints. Revisiting Valiant's algorithm (Commun. ACM'84), we show that it can exactly learn (1) $k$-CNFs with bounded clause intersection size under Lovász local lemma type conditions, from $O(\log n)$ samples; and (2) random $k$-CNFs near the satisfiability threshold, from $\widetilde{O}(n^{\exp(-\sqrt{k})})$ samples. These results significantly improve the previous $O(n^k)$ sample complexity. We further establish new information-theoretic lower bounds on sample complexity for both exact and approximate learning from i.i.d. uniform random solutions.

DSApr 1
Rapid mixing in positively weighted restricted Boltzmann machines

Weiming Feng, Heng Guo, Minji Yang

We show polylogarithmic mixing time bounds for the alternating-scan sampler for positively weighted restricted Boltzmann machines. This is done via analysing the same chain and the Glauber dynamics for ferromagnetic two-spin systems, where we obtain new mixing time bounds up to the critical thresholds.

DSMar 14, 2025
Approximating the Total Variation Distance between Gaussians

Arnab Bhattacharyya, Weiming Feng, Piyush Srivastava

The total variation distance is a metric of central importance in statistics and probability theory. However, somewhat surprisingly, questions about computing it algorithmically appear not to have been systematically studied until very recently. In this paper, we contribute to this line of work by studying this question in the important special case of multivariate Gaussians. More formally, we consider the problem of approximating the total variation distance between two multivariate Gaussians to within an $ε$-relative error. Previous works achieved a fixed constant relative error approximation via closed-form formulas. In this work, we give algorithms that given any two $n$-dimensional Gaussians $D_1,D_2$, and any error bound $ε> 0$, approximate the total variation distance $D := d_{TV}(D_1,D_2)$ to $ε$-relative accuracy in $\text{poly}(n,\frac{1}ε,\log \frac{1}{D})$ operations. The main technical tool in our work is a reduction that helps us extend the recent progress on computing the TV-distance between discrete random variables to our continuous setting.

DSFeb 8, 2025
Approximating the total variation distance between spin systems

Weiming Feng, Hongyang Liu, Minji Yang

Spin systems form an important class of undirected graphical models. For two Gibbs distributions $μ$ and $ν$ induced by two spin systems on the same graph $G = (V, E)$, we study the problem of approximating the total variation distance $d_{TV}(μ,ν)$ with an $ε$-relative error. We propose a new reduction that connects the problem of approximating the TV-distance to sampling and approximate counting. Our applications include the hardcore model and the antiferromagnetic Ising model in the uniqueness regime, the ferromagnetic Ising model, and the general Ising model satisfying the spectral condition. Additionally, we explore the computational complexity of approximating the total variation distance $d_{TV}(μ_S,ν_S)$ between two marginal distributions on an arbitrary subset $S \subseteq V$. We prove that this problem remains hard even when both $μ$ and $ν$ admit polynomial-time sampling and approximate counting algorithms.

DSSep 16, 2025
Sublinear-Time Algorithms for Diagonally Dominant Systems and Applications to the Friedkin-Johnsen Model

Weiming Feng, Zelin Li, Pan Peng

We study sublinear-time algorithms for solving linear systems $Sz = b$, where $S$ is a diagonally dominant matrix, i.e., $|S_{ii}| \geq δ+ \sum_{j \ne i} |S_{ij}|$ for all $i \in [n]$, for some $δ\geq 0$. We present randomized algorithms that, for any $u \in [n]$, return an estimate $z_u$ of $z^*_u$ with additive error $\varepsilon$ or $\varepsilon \lVert z^*\rVert_\infty$, where $z^*$ is some solution to $Sz^* = b$, and the algorithm only needs to read a small portion of the input $S$ and $b$. For example, when the additive error is $\varepsilon$ and assuming $δ>0$, we give an algorithm that runs in time $O\left( \frac{\|b\|_\infty^2 S_{\max}}{δ^3 \varepsilon^2} \log \frac{\| b \|_\infty}{δ\varepsilon} \right)$, where $S_{\max} = \max_{i \in [n]} |S_{ii}|$. We also prove a matching lower bound, showing that the linear dependence on $S_{\max}$ is optimal. Unlike previous sublinear-time algorithms, which apply only to symmetric diagonally dominant matrices with non-negative diagonal entries, our algorithm works for general strictly diagonally dominant matrices ($δ> 0$) and a broader class of non-strictly diagonally dominant matrices $(δ= 0)$. Our approach is based on analyzing a simple probabilistic recurrence satisfied by the solution. As an application, we obtain an improved sublinear-time algorithm for opinion estimation in the Friedkin--Johnsen model.

DSSep 5, 2025
On approximating the $f$-divergence between two Ising models

Weiming Feng, Yucheng Fu

The $f$-divergence is a fundamental notion that measures the difference between two distributions. In this paper, we study the problem of approximating the $f$-divergence between two Ising models, which is a generalization of recent work on approximating the TV-distance. Given two Ising models $ν$ and $μ$, which are specified by their interaction matrices and external fields, the problem is to approximate the $f$-divergence $D_f(ν\,\|\,μ)$ within an arbitrary relative error $\mathrm{e}^{\pm \varepsilon}$. For $χ^α$-divergence with a constant integer $α$, we establish both algorithmic and hardness results. The algorithm works in a parameter regime that matches the hardness result. Our algorithm can be extended to other $f$-divergences such as $α$-divergence, Kullback-Leibler divergence, Rényi divergence, Jensen-Shannon divergence, and squared Hellinger distance.

DSApr 1, 2019
Distributed Metropolis Sampler with Optimal Parallelism

Weiming Feng, Thomas P. Hayes, Yitong Yin

The Metropolis-Hastings algorithm is a fundamental Markov chain Monte Carlo (MCMC) method for sampling and inference. With the advent of Big Data, distributed and parallel variants of MCMC methods are attracting increased attention. In this paper, we give a distributed algorithm that can correctly simulate sequential single-site Metropolis chains without any bias in a fully asynchronous message-passing model. Furthermore, if a natural Lipschitz condition is satisfied by the Metropolis filters, our algorithm can simulate $N$-step Metropolis chains within $O(N/n+\log n)$ rounds of asynchronous communications, where $n$ is the number of variables. For sequential single-site dynamics, whose mixing requires $Ω(n\log n)$ steps, this achieves an optimal linear speedup. For several well-studied important graphical models, including proper graph coloring, hardcore model, and Ising model, our condition for linear speedup is weaker than the respective uniqueness (mixing) conditions. The novel idea in our algorithm is to resolve updates in advance: the local Metropolis filters can often be executed correctly before the full information about neighboring spins is available. This achieves optimal parallelism without introducing any bias.

DSJul 17, 2018
Dynamic Sampling from Graphical Models

Weiming Feng, Nisheeth K. Vishnoi, Yitong Yin

In this paper, we study the problem of sampling from a graphical model when the model itself is changing dynamically with time. This problem derives its interest from a variety of inference, learning, and sampling settings in machine learning, computer vision, statistical physics, and theoretical computer science. While the problem of sampling from a static graphical model has received considerable attention, theoretical works for its dynamic variants have been largely lacking. The main contribution of this paper is an algorithm that can sample dynamically from a broad class of graphical models over discrete random variables. Our algorithm is parallel and Las Vegas: it knows when to stop and it outputs samples from the exact distribution. We also provide sufficient conditions under which this algorithm runs in time proportional to the size of the update, on general graphical models as well as well-studied specific spin systems. In particular we obtain, for the Ising model (ferromagnetic or anti-ferromagnetic) and for the hardcore model the first dynamic sampling algorithms that can handle both edge and vertex updates (addition, deletion, change of functions), both efficient within regimes that are close to the respective uniqueness regimes, beyond which, even for the static and approximate sampling, no local algorithms were known or the problem itself is intractable. Our dynamic sampling algorithm relies on a local resampling algorithm and a new "equilibrium" property that is shown to be satisfied by our algorithm at each step, and enables us to prove its correctness. This equilibrium property is robust enough to guarantee the correctness of our algorithm, helps us improve bounds on fast convergence on specific models, and should be of independent interest.