Qichen Wang

DB
h-index29
9papers
23citations
Novelty54%
AI Score52

9 Papers

12.6DBMar 16
Size Bound-Adorned Datalog

Christian Fattebert, Zhekai Jiang, Christoph Koch et al.

We introduce EDB-bounded datalog, a framework for deriving upper bounds on intermediate result sizes and the asymptotic complexity of recursive queries in datalog. We present an algorithm that, given an arbitrary datalog program, constructs an EDB-bounded datalog program in which every rule is adorned with a (non-recursive) conjunctive query that subsumes the result of the rule, thus acting as an upper bound. From such adornments, we define a notion of width based on (integral or fractional) edge-cover widths. Through the adornments and the width measure, we obtain, for every IDB predicate, worst-case upper bounds on their sizes, which are polynomial in the input data size, given a fixed program structure. Furthermore, with these size bounds, we also derive fixed-parameter tractable, output-sensitive asymptotic complexity bounds for evaluating the entire program. Additionally, by adapting our framework, we obtain a semi-decision procedure for datalog boundedness that efficiently rewrites most practical bounded programs into non-recursive equivalent programs.

28.7DBMar 16
Succinct Structure Representations for Efficient Query Optimization

Zhekai Jiang, Qichen Wang, Christoph Koch

Structural decomposition methods offer powerful theoretical guarantees for join evaluation, yet they are rarely used in real-world query optimizers. A major reason is the difficulty of combining cost-based plan search and structure-based evaluation. In this work, we bridge this gap by introducing meta-decompositions for acyclic queries, a novel representation that succinctly represents all possible join trees and enables their efficient enumeration. Meta-decompositions can be constructed in polynomial time and have sizes linear in the query size. We design an efficient polynomial-time cost-based optimizer based directly on the meta-decomposition, without the need to explicitly enumerate all possible join trees. We characterize plans found by this approach using a novel notion of width, which effectively implies the theoretical worst-case asymptotic bounds of intermediate result sizes and running time of any query plan. Experimental results demonstrate that, in practice, the plans in our class are consistently comparable to -- even in many cases better than -- the optimal ones found by the state-of-the-art dynamic programming approach, especially on large and complex queries, while our planning process runs by orders of magnitude faster, comparable to the time taken by common heuristic methods.

50.9DBMar 16
Towards Parameterized Hardness on Maintaining Conjunctive Queries

Qichen Wang

We investigate the fine-grained complexity of dynamically maintaining the result of fixed self-join free conjunctive queries under single-tuple updates. Prior work shows that free-connex queries can be maintained in update time $O(|D|^δ)$ for some $δ\in [0.5, 1]$, where $|D|$ is the size of the current database. However, a gap remains between the best known upper bound of $O(|D|)$ and lower bounds of $Ω(|D|^{0.5-ε})$ for any $ε\ge 0$. We narrow this gap by introducing two structural parameters to quantify the dynamic complexity of a conjunctive query: the height $k$ and the dimension $d$. We establish new fine-grained lower bounds showing that any algorithm maintaining a query with these parameters must incur update time $Ω(|D|^{1-1/\max(k,d)-ε})$, unless widely believed conjectures fail. These yield the first super-$\sqrt{|D|}$ lower bounds for maintaining free-connex queries, and suggest the tightness of current algorithms when considering arbitrarily large $k$ and~$d$. Complementing our lower bounds, we identify a data-dependent parameter, the generalized $H$-index $h(D)$, which is upper bounded by $|D|^{1/d}$, and design an efficient algorithm for maintaining star queries, a common class of height 2 free-connex queries. The algorithm achieves an instance-specific update time $O(h(D)^{d-1})$ with linear space $O(|D|)$. This matches our parameterized lower bound and provides instance-specific performance in favorable cases.

50.9QUANT-PHApr 13
A Methodological Analysis of Empirical Studies in Quantum Software Testing

Yuechen Li, Minqi Shao, Jianjun Zhao et al.

In quantum software engineering (QSE), quantum software testing (QST) has attracted increasing attention as quantum software systems grow in scale and complexity. Since QST evaluates quantum programs through execution under designed test inputs, empirical studies are widely used to assess the effectiveness of testing approaches. However, the design and reporting of empirical studies in QST remain highly diverse, and a shared methodological understanding has yet to emerge, making it difficult to interpret results and compare findings across studies. This paper presents a methodological analysis of empirical studies in QST through a systematic examination of 59 primary studies identified from a literature pool of size 384. We organize our analysis around ten research questions that cover key methodological dimensions of QST empirical studies, including objects under test, baseline comparison, testing setup, experimental configuration, and tool and artifact support. Through cross-study analysis along these dimensions, we characterize current empirical practices in QST, identify recurring limitations and inconsistencies, and highlight open methodological challenges. Based on our findings, we derive insights and recommendations to inform the design, execution, and reporting of future empirical studies in QST.

ITFeb 5, 2024
Multi-agent Reinforcement Learning for Energy Saving in Multi-Cell Massive MIMO Systems

Tianzhang Cai, Qichen Wang, Shuai Zhang et al.

We develop a multi-agent reinforcement learning (MARL) algorithm to minimize the total energy consumption of multiple massive MIMO (multiple-input multiple-output) base stations (BSs) in a multi-cell network while preserving the overall quality-of-service (QoS) by making decisions on the multi-level advanced sleep modes (ASMs) and antenna switching of these BSs. The problem is modeled as a decentralized partially observable Markov decision process (DEC-POMDP) to enable collaboration between individual BSs, which is necessary to tackle inter-cell interference. A multi-agent proximal policy optimization (MAPPO) algorithm is designed to learn a collaborative BS control policy. To enhance its scalability, a modified version called MAPPO-neighbor policy is further proposed. Simulation results demonstrate that the trained MAPPO agent achieves better performance compared to baseline policies. Specifically, compared to the auto sleep mode 1 (symbol-level sleeping) algorithm, the MAPPO-neighbor policy reduces power consumption by approximately 8.7% during low-traffic hours and improves energy efficiency by approximately 19% during high-traffic hours, respectively.

SIMay 20, 2024
Effective Clustering on Large Attributed Bipartite Graphs

Renchi Yang, Yidu Wu, Xiaoyang Lin et al.

Attributed bipartite graphs (ABGs) are an expressive data model for describing the interactions between two sets of heterogeneous nodes that are associated with rich attributes, such as customer-product purchase networks and author-paper authorship graphs. Partitioning the target node set in such graphs into k disjoint clusters (referred to as k-ABGC) finds widespread use in various domains, including social network analysis, recommendation systems, information retrieval, and bioinformatics. However, the majority of existing solutions towards k-ABGC either overlook attribute information or fail to capture bipartite graph structures accurately, engendering severely compromised result quality. The severity of these issues is accentuated in real ABGs, which often encompass millions of nodes and a sheer volume of attribute data, rendering effective k-ABGC over such graphs highly challenging. In this paper, we propose TPO, an effective and efficient approach to k-ABGC that achieves superb clustering performance on multiple real datasets. TPO obtains high clustering quality through two major contributions: (i) a novel formulation and transformation of the k-ABGC problem based on multi-scale attribute affinity specialized for capturing attribute affinities between nodes with the consideration of their multi-hop connections in ABGs, and (ii) a highly efficient solver that includes a suite of carefully-crafted optimizations for sidestepping explicit affinity matrix construction and facilitating faster convergence. Extensive experiments, comparing TPO against 19 baselines over 5 real ABGs, showcase the superior clustering quality of TPO measured against ground-truth labels. Moreover, compared to the state of the arts, TPO is often more than 40x faster over both small and large ABGs.

73.5ITApr 8
Energy Saving for Cell-Free Massive MIMO Networks: A Multi-Agent Deep Reinforcement Learning Approach

Qichen Wang, Keyu Li, Ozan Alp Topal et al.

This paper focuses on energy savings in downlink operation of cell-free massive MIMO (CF mMIMO) networks under dynamic traffic conditions. We propose a multi-agent deep reinforcement learning (MADRL) algorithm that enables each access point (AP) to autonomously control antenna re-configuration and advanced sleep mode (ASM) selection. After the training process, the proposed framework operates in a fully distributed manner, eliminating the need for centralized control and allowing each AP to dynamically adjust to real-time traffic fluctuations. Simulation results show that the proposed algorithm reduces power consumption (PC) by 56.23% compared to systems without any energy-saving scheme and by 30.12% relative to a non-learning mechanism that only utilizes the lightest sleep mode, with only a slight increase in drop ratio. Moreover, compared to the widely used deep Q-network (DQN) algorithm, it achieves a similar PC level but with a significantly lower drop ratio.

CLMar 12, 2025
ClaimTrust: Propagation Trust Scoring for RAG Systems

Hangkai Qian, Bo Li, Qichen Wang

The rapid adoption of retrieval-augmented generation (RAG) systems has revolutionized large-scale content generation but has also highlighted the challenge of ensuring trustworthiness in retrieved information. This paper introduces ClaimTrust, a propagation-based trust scoring framework that dynamically evaluates the reliability of documents in a RAG system. Using a modified PageRank-inspired algorithm, ClaimTrust propagates trust scores across documents based on relationships derived from extracted factual claims. We preprocess and analyze 814 political news articles from Kaggle's Fake News Detection Dataset to extract 2,173 unique claims and classify 965 meaningful relationships (supporting or contradicting). By representing the dataset as a document graph, ClaimTrust iteratively updates trust scores until convergence, effectively differentiating trustworthy articles from unreliable ones. Our methodology, which leverages embedding-based filtering for efficient claim comparison and relationship classification, achieves a 11.2% of significant connections while maintaining computational scalability. Experimental results demonstrate that ClaimTrust successfully assigns higher trust scores to verified documents while penalizing those containing false information. Future directions include fine-tuned claim extract and compare (Li et al., 2022), parameter optimization, enhanced language model utilization, and robust evaluation metrics to generalize the framework across diverse datasets and domains.

CVJul 8, 2025
AR2: Attention-Guided Repair for the Robustness of CNNs Against Common Corruptions

Fuyuan Zhang, Qichen Wang, Jianjun Zhao

Deep neural networks suffer from significant performance degradation when exposed to common corruptions such as noise, blur, weather, and digital distortions, limiting their reliability in real-world applications. In this paper, we propose AR2 (Attention-Guided Repair for Robustness), a simple yet effective method to enhance the corruption robustness of pretrained CNNs. AR2 operates by explicitly aligning the class activation maps (CAMs) between clean and corrupted images, encouraging the model to maintain consistent attention even under input perturbations. Our approach follows an iterative repair strategy that alternates between CAM-guided refinement and standard fine-tuning, without requiring architectural changes. Extensive experiments show that AR2 consistently outperforms existing state-of-the-art methods in restoring robustness on standard corruption benchmarks (CIFAR-10-C, CIFAR-100-C and ImageNet-C), achieving a favorable balance between accuracy on clean data and corruption robustness. These results demonstrate that AR2 provides a robust and scalable solution for enhancing model reliability in real-world environments with diverse corruptions.