Hung Q. Ngo

6 papers

66 citations

Novelty 52%

AI Score 48

Ranked #53,465 of 201,326 authors (top 27%); #89 in DB (top 17%)

6 Papers

DB · May 28

Listing Even Cycles Faster than the Submodular-Width Barrier

Vasileios Nakos, Hung Q. Ngo, Andreas Panayi

A classic result of Alon, Yuster, and Zwick (AYZ, Algorithmica 1997) shows that all $2k$-cycles in an $m$-edge graph can be listed in $\tilde O(m^{2-1/k}+t)$ time, where $t$ is the output size. This bound underlies the {\em submodular width} of Marx (JACM 2013) and the PANDA framework of Abo Khamis, Ngo, and Suciu (PODS 2017), which extend AYZ to arbitrary conjunctive queries with degree constraints. A central open question is whether combinatorial algorithms can beat the submodular-width barrier. Bringmann and Gorbachev (STOC 2025) gave lower-bound evidence that submodular width may be optimal for general conjunctive queries under combinatorial algorithms. The picture changes for $2k$-cycles on undirected graphs, whose queries have self-joins and symmetric EDBs: recent works improve on AYZ for even-cycle detection and listing. Pinning down the complexity of $C_{2k}$-detection and listing is thus a natural step toward overcoming the submodular-width barrier for such queries. For detection, Dahlgaard, Knudsen, and Stöckel (STOC 2017) solved $C_{2k}$-detection in $\tilde O(m^{2k/(k+1)})$ time. Listing is harder. Jin and Xu (STOC 2023), and independently Abboud, Khoury, Leibowitz, and Safier (FSTTCS 2023), listed 4-cycles in $\tilde O(m^{4/3}+t)$ time; Vassilevska Williams and Westover (ITCS 2025) listed 6-cycles in $\tilde O(m^{8/5}+t)$ time, improving the AYZ bounds of $\tilde O(m^{3/2})$ and $\tilde O(m^{5/3})$, respectively. The general case has remained open for 30 years. Building on these works, we list $2k$-cycles in $\tilde O(m^{(2k^2-k+1)/(k^2+1)}+t)$ time, improving AYZ for every $k\geq 3$. The key ingredient is an \emph{asymmetric supersaturation} result for even cycles. Our algorithms use only join and project operators over multiple tree-decomposition plans, making them naturally implementable in database systems, in contrast to prior BFS-based graph approaches.
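As a quick sanity check on the exponents quoted in this abstract, the sketch below compares the AYZ exponent $2-1/k$ with the new exponent $(2k^2-k+1)/(k^2+1)$ using exact rational arithmetic. The helper names `ayz_exponent` and `new_exponent` are hypothetical and chosen only for illustration; the code evaluates the two formulas and says nothing about the algorithms themselves.

```python
from fractions import Fraction


def ayz_exponent(k: int) -> Fraction:
    """Exponent of m in the AYZ listing bound: 2 - 1/k (hypothetical helper)."""
    return 2 - Fraction(1, k)


def new_exponent(k: int) -> Fraction:
    """Exponent of m in the bound stated in the abstract:
    (2k^2 - k + 1) / (k^2 + 1) (hypothetical helper)."""
    return Fraction(2 * k * k - k + 1, k * k + 1)


# Print both exponents side by side for a few small values of k.
for k in range(3, 8):
    a, n = ayz_exponent(k), new_exponent(k)
    print(f"k={k}: AYZ {a} ~ {float(a):.4f}, new {n} ~ {float(n):.4f}")
```

Since $(2k^2-k+1)/(k^2+1) = 2 - (k+1)/(k^2+1)$ and $(k+1)/(k^2+1) > 1/k$ whenever $k > 1$, the new exponent is strictly below the AYZ exponent across the stated range $k \geq 3$. At $k = 3$ the formula evaluates to $8/5$, the same exponent as the 6-cycle bound quoted above.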

DB · Apr 6

PANDAExpress: a Simpler and Faster PANDA Algorithm

Mahmoud Abo Khamis, Hung Q. Ngo, Dan Suciu

PANDA is a powerful generic algorithm for answering conjunctive queries (CQs) and disjunctive datalog rules (DDRs) given input degree constraints. In the special case where degree constraints are cardinality constraints and the query is Boolean, PANDA runs in $\tilde O(N^{subw})$ time, where $N$ is the input size and $subw$ is the submodular width of the query, a notion introduced by Daniel Marx (JACM 2013). When specialized to certain classes of sub-graph pattern finding problems, the $\tilde O(N^{subw})$ runtime matches the optimal runtime possible, modulo some conjectures in fine-grained complexity (Bringmann and Gorbachev, STOC 2025). The PANDA framework is much more general: it handles arbitrary input degree constraints, which capture common statistics and integrity constraints used in relational database management systems; it works for queries with free variables; and it applies to both CQs and DDRs. The key weakness of PANDA is the large $polylog(N)$ factor hidden in the $\tilde O(\cdot)$ notation. This makes PANDA completely impractical and causes it to fall short of what is achievable with specialized algorithms. This paper resolves this weakness with two novel ideas. First, we prove a new probabilistic inequality that upper-bounds the output size of DDRs under arbitrary degree constraints. Second, the proof of this inequality directly leads to a new algorithm named PANDAExpress that is both simpler and faster than PANDA. The novel feature of PANDAExpress is a new partitioning scheme that uses arbitrary hyperplane cuts instead of the axis-parallel hyperplanes used in PANDA. These hyperplanes are dynamically constructed based on data-skewness statistics carefully tracked throughout the algorithm's execution. As a result, PANDAExpress removes the $polylog(N)$ factor from the runtime of PANDA, matching the runtimes of intricate specialized algorithms, while retaining all its generality and power.

DB · Apr 6

Query Optimization and Evaluation via Information Theory: A Tutorial

Mahmoud Abo Khamis, Hung Q. Ngo, Dan Suciu

Database theory is exciting because it studies highly general and practically useful abstractions. Conjunctive query (CQ) evaluation is a prime example: it simultaneously generalizes graph pattern matching, constraint satisfaction, and statistical inference, among others. This generality is both the strength and the central challenge of the field. The query optimization and evaluation problem is fundamentally a "meta-algorithm" problem: given a query $Q$ and statistics $\cal S$ about the input database, how should one best answer $Q$? Because the problem is so general, it is often impossible for such a meta-algorithm to match the runtimes of specialized algorithms designed for a fixed query -- or so it seemed. The past fifteen years have witnessed an exciting development in database theory: a general framework, called PANDA, that emerged from advances in database theory, constraint satisfaction problems (CSP), and graph algorithms, for evaluating conjunctive queries given input data statistics. The key idea is to derive information-theoretically tight upper bounds on the cardinalities of intermediate relations produced during query evaluation. These bounds determine the costs of query plans, and crucially, the query plans themselves are derived directly from the mathematical proof of the upper bound. This tight coupling of proof and algorithm is what makes PANDA both principled and powerful. Remarkably, this generic algorithm matches -- and in some cases subsumes -- the runtimes of specialized algorithms for the same problems, including algorithms that exploit fast matrix multiplication. This paper is a tutorial on the PANDA framework. We illustrate the key ideas through concrete examples, conveying the main intuitions behind the theory.

LG · Oct 11, 2019

Rk-means: Fast Clustering for Relational Data

Ryan Curtin, Ben Moseley, Hung Q. Ngo et al.

Conventional machine learning algorithms cannot be applied until a data matrix is available to process. When the data matrix needs to be obtained from a relational database via a feature extraction query, the computation cost can be prohibitive, as the data matrix may be (much) larger than the total input relation size. This paper introduces Rk-means, or the relational k-means algorithm, for clustering relational data tuples without having to access the full data matrix. As such, we avoid having to run the expensive feature extraction query and store its output. Our algorithm leverages the underlying structures in relational data. It involves the construction of a small {\it grid coreset} of the data matrix for subsequent cluster construction. This gives a constant approximation for the k-means objective, while having asymptotic runtime improvements over the standard approach of first running the database query and then clustering. Empirical results show orders-of-magnitude speedup, and Rk-means can run faster on the database than even just computing the data matrix.

DB · Dec 22, 2018

Functional Aggregate Queries with Additive Inequalities

Mahmoud Abo Khamis, Ryan R. Curtin, Benjamin Moseley et al.

Motivated by fundamental applications in databases and relational machine learning, we formulate and study the problem of answering functional aggregate queries (FAQ) in which some of the input factors are defined by a collection of additive inequalities between variables. We refer to these queries as FAQ-AI for short. To answer FAQ-AI in the Boolean semiring, we define relaxed tree decompositions and relaxed submodular and fractional hypertree width parameters. We show that an extension of the InsideOut algorithm using Chazelle's geometric data structure for solving the semigroup range search problem can answer Boolean FAQ-AI in time given by these new width parameters. This new algorithm achieves lower complexity than known solutions for FAQ-AI. It also recovers some known results in database query answering. Our second contribution is a relaxation of the set of polymatroids that gives rise to the counting version of the submodular width, denoted by #subw. This new width is sandwiched between the submodular and the fractional hypertree widths. Any FAQ and any FAQ-AI over one semiring can be answered in time proportional to #subw and to the relaxed version of #subw, respectively. We present three applications of our FAQ-AI framework to relational machine learning: k-means clustering, training linear support vector machines, and training models using non-polynomial loss. These optimization problems can be solved over a database asymptotically faster than computing the join of the database relations.

ML · May 23, 2016

On Optimality Conditions for Auto-Encoder Signal Recovery

Devansh Arpit, Yingbo Zhou, Hung Q. Ngo et al.

Auto-Encoders are unsupervised models that aim to learn patterns from observed data by minimizing a reconstruction cost. The useful representations learned are often found to be sparse and distributed. On the other hand, compressed sensing and sparse coding assume a data generating process, where the observed data is generated from some true latent signal source, and try to recover the corresponding signal from measurements. Looking at auto-encoders from this \textit{signal recovery perspective} enables us to have a more coherent view of these techniques. In this paper, in particular, we show that the \textit{true} hidden representation can be approximately recovered if the weight matrices are highly incoherent with unit $\ell^{2}$ row length and the bias vectors take values (approximately) equal to the negative of the data mean. The recovery also becomes more and more accurate as the sparsity in hidden signals increases. Additionally, we empirically demonstrate that auto-encoders are capable of recovering the data generating dictionary when only data samples are given.