Weidong Liu

ML
h-index44
25papers
1,625citations
Novelty53%
AI Score49

25 Papers

CLSep 9, 2023
Exploring Large Language Models for Communication Games: An Empirical Study on Werewolf

Yuzhuang Xu, Shuo Wang, Peng Li et al. · ibm-research, tsinghua

Communication games, which we refer to as incomplete information games that heavily depend on natural language communication, hold significant research value in fields such as economics, social science, and artificial intelligence. In this work, we explore the problem of how to engage large language models (LLMs) in communication games, and in response, propose a tuning-free framework. Our approach keeps LLMs frozen, and relies on the retrieval and reflection on past communications and experiences for improvement. An empirical study on the representative and widely-studied communication game, ``Werewolf'', demonstrates that our framework can effectively play Werewolf game without tuning the parameters of the LLMs. More importantly, strategic behaviors begin to emerge in our experiments, suggesting that it will be a fruitful journey to engage LLMs in communication games and associated domains.

76.6CYMay 29
Student Competency Assessment and Presentation Methods Based on Algorithm Courses

Yingqi Zhang, Ninghan Zheng, Shanshan Li et al.

This full research paper describes the assessment and presentation of student competencies in algorithm courses, grounded in the CC2020 competency model. With the growing emphasis on bridging the gap between academic training and industry demands, competency-based education, which integrates knowledge, skills, and dispositions, has become pivotal in computer science education. To bridge the gap, we need to develop a comprehensive framework to evaluate competencies (knowledge, skills, and dispositions) in computer science education. The research aims to analyze learning behavior patterns, design methods for competency assessment in algorithm courses, and evaluate the difficulty of course experiments to inform curriculum design. We collected programming experiment and written assignment data from 169 students, adapting it to the xAPI specification for unified analysis. In this work, Markov process modeling was employed to analyze behavioral sequences, revealing cognitive patterns during programming tasks. Multiple methods were applied to quantify competencies (knowledge, skills, dispositions) and identify distinct student clusters. Course difficulty was quantified using proactiveness metrics derived from submission timeliness. This work contributes a scalable framework for competency assessment in algorithm courses and offers actionable insights for personalized teaching and curriculum optimization. Practically, it enables instructors to tailor interventions based on student clusters and optimize task difficulty. Future work will integrate more students' performance to validate competency models and extend the framework to broader computer science curricula.

CVMar 13, 2023Code
Super-Resolution Information Enhancement For Crowd Counting

Jiahao Xie, Wei Xu, Dingkang Liang et al.

Crowd counting is a challenging task due to the heavy occlusions, scales, and density variations. Existing methods handle these challenges effectively while ignoring low-resolution (LR) circumstances. The LR circumstances weaken the counting performance deeply for two crucial reasons: 1) limited detail information; 2) overlapping head regions accumulate in density maps and result in extreme ground-truth values. An intuitive solution is to employ super-resolution (SR) pre-processes for the input LR images. However, it complicates the inference steps and thus limits application potentials when requiring real-time. We propose a more elegant method termed Multi-Scale Super-Resolution Module (MSSRM). It guides the network to estimate the lost de tails and enhances the detailed information in the feature space. Noteworthy that the MSSRM is plug-in plug-out and deals with the LR problems with no inference cost. As the proposed method requires SR labels, we further propose a Super-Resolution Crowd Counting dataset (SR-Crowd). Extensive experiments on three datasets demonstrate the superiority of our method. The code will be available at https://github.com/PRIS-CV/MSSRM.git.

CLJul 12, 2023
Pluggable Neural Machine Translation Models via Memory-augmented Adapters

Yuzhuang Xu, Shuo Wang, Peng Li et al. · tsinghua

Although neural machine translation (NMT) models perform well in the general domain, it remains rather challenging to control their generation behavior to satisfy the requirement of different users. Given the expensive training cost and the data scarcity challenge of learning a new model from scratch for each user requirement, we propose a memory-augmented adapter to steer pretrained NMT models in a pluggable manner. Specifically, we construct a multi-granular memory based on the user-provided text samples and propose a new adapter architecture to combine the model representations and the retrieved results. We also propose a training strategy using memory dropout to reduce spurious dependencies between the NMT model and the memory. We validate our approach on both style- and domain-specific experiments and the results indicate that our method can outperform several representative pluggable baselines.

CRSep 8, 2022
Majority Vote for Distributed Differentially Private Sign Selection

Weidong Liu, Jiyuan Tu, Xiaojun Mao et al.

Privacy-preserving data analysis has become more prevalent in recent years. In this study, we propose a distributed group differentially private Majority Vote mechanism, for the sign selection problem in a distributed setup. To achieve this, we apply the iterative peeling to the stability function and use the exponential mechanism to recover the signs. For enhanced applicability, we study the private sign selection for mean estimation and linear regression problems, in distributed systems. Our method recovers the support and signs with the optimal signal-to-noise ratio as in the non-private scenario, which is better than contemporary works of private variable selections. Moreover, the sign selection consistency is justified by theoretical guarantees. Simulation studies are conducted to demonstrate the effectiveness of the proposed method.

MLJun 17, 2023
Distributed Semi-Supervised Sparse Statistical Inference

Jiyuan Tu, Weidong Liu, Xiaojun Mao et al.

The debiased estimator is a crucial tool in statistical inference for high-dimensional model parameters. However, constructing such an estimator involves estimating the high-dimensional inverse Hessian matrix, incurring significant computational costs. This challenge becomes particularly acute in distributed setups, where traditional methods necessitate computing a debiased estimator on every machine. This becomes unwieldy, especially with a large number of machines. In this paper, we delve into semi-supervised sparse statistical inference in a distributed setup. An efficient multi-round distributed debiased estimator, which integrates both labeled and unlabelled data, is developed. We will show that the additional unlabeled data helps to improve the statistical rate of each round of iteration. Our approach offers tailored debiasing methods for $M$-estimation and generalized linear models according to the specific form of the loss function. Our method also applies to a non-smooth loss like absolute deviation loss. Furthermore, our algorithm is computationally efficient since it requires only one estimation of a high-dimensional inverse covariance matrix. We demonstrate the effectiveness of our method by presenting simulation studies and real data applications that highlight the benefits of incorporating unlabeled data.

MLOct 4, 2023
Online Estimation and Inference for Robust Policy Evaluation in Reinforcement Learning

Weidong Liu, Jiyuan Tu, Xi Chen et al.

Reinforcement learning has emerged as one of the prominent topics attracting attention in modern statistical learning, with policy evaluation being a key component. Unlike the traditional machine learning literature on this topic, our work emphasizes statistical inference for the model parameters and value functions of reinforcement learning algorithms. While most existing analyses assume random rewards to follow standard distributions, we embrace the concept of robust statistics in reinforcement learning by simultaneously addressing issues of outlier contamination and heavy-tailed rewards within a unified framework. In this paper, we develop a fully online robust policy evaluation procedure, and establish the Bahadur-type representation of our estimator. Furthermore, we develop an online procedure to efficiently conduct statistical inference based on the asymptotic distribution. This paper connects robust statistics and statistical inference in reinforcement learning, offering a more versatile and reliable approach to online policy evaluation. Finally, we validate the efficacy of our algorithm through numerical experiments conducted in simulations and real-world reinforcement learning experiments.

CLFeb 17, 2024
OneBit: Towards Extremely Low-bit Large Language Models

Yuzhuang Xu, Xu Han, Zonghan Yang et al. · tsinghua

Model quantification uses low bit-width values to represent the weight matrices of existing models to be quantized, which is a promising approach to reduce both storage and computational overheads of deploying highly anticipated LLMs. However, current quantization methods suffer severe performance degradation when the bit-width is extremely reduced, and thus focus on utilizing 4-bit or 8-bit values to quantize models. This paper boldly quantizes the weight matrices of LLMs to 1-bit, paving the way for the extremely low bit-width deployment of LLMs. For this target, we introduce a 1-bit model compressing framework named OneBit, including a novel 1-bit parameter representation method to better quantize LLMs as well as an effective parameter initialization method based on matrix decomposition to improve the convergence speed of the quantization framework. Sufficient experimental results indicate that OneBit achieves good performance (at least 81% of the non-quantized performance on LLaMA models) with robust training processes when only using 1-bit weight matrices.

MLJan 2, 2024
Efficient Sparse Least Absolute Deviation Regression with Differential Privacy

Weidong Liu, Xiaojun Mao, Xiaofei Zhang et al.

In recent years, privacy-preserving machine learning algorithms have attracted increasing attention because of their important applications in many scientific fields. However, in the literature, most privacy-preserving algorithms demand learning objectives to be strongly convex and Lipschitz smooth, which thus cannot cover a wide class of robust loss functions (e.g., quantile/least absolute loss). In this work, we aim to develop a fast privacy-preserving learning solution for a sparse robust regression problem. Our learning loss consists of a robust least absolute loss and an $\ell_1$ sparse penalty term. To fast solve the non-smooth loss under a given privacy budget, we develop a Fast Robust And Privacy-Preserving Estimation (FRAPPE) algorithm for least absolute deviation regression. Our algorithm achieves a fast estimation by reformulating the sparse LAD problem as a penalized least square estimation problem and adopts a three-stage noise injection to guarantee the $(ε,δ)$-differential privacy. We show that our algorithm can achieve better privacy and statistical accuracy trade-off compared with the state-of-the-art privacy-preserving regression algorithms. In the end, we conduct experiments to verify the efficiency of our proposed FRAPPE algorithm.

MEDec 14, 2025
Robust Variational Bayes by Min-Max Median Aggregation

Jiawei Yan, Ju Liu, Weidong Liu et al.

We propose a robust and scalable variational Bayes (VB) framework designed to effectively handle contamination and outliers in dataset. Our approach partitions the data into $m$ disjoint subsets and formulates a joint optimization problem based on robust aggregation principles. A key insight is that the full posterior distribution is equivalent to the minimizer of the mean Kullback-Leibler (KL) divergence from the $m$-powered local posterior distributions. To enhance robustness, we replace the mean KL divergence with a min-max median formulation. The min-max formulation not only ensures consistency between the KL minimizer and the Evidence Lower Bound (ELBO) maximizer but also facilitates the establishment of improved statistical rates for the mean of variational posterior. We observe a notable discrepancy in the $m$-powered marginal log likelihood function contingent on the presence of local latent variables. To address this, we treat these two scenarios separately to guarantee the consistency of the aggregated variational posterior. Specifically, when local latent variables are present, we introduce an aggregate-and-rescale strategy. Theoretically, we provide a non-asymptotic analysis of our proposed posterior, incorporating a refined analysis of Bernstein-von Mises (BvM) theorem to accommodate a diverging number of subsets $m$. Our findings indicate that the two-stage approach yields a smaller approximation error compared to directly aggregating the $m$-powered local posteriors. Furthermore, we establish a nearly optimal statistical rate for the mean of the proposed posterior, advancing existing theories related to min-max median estimators. The efficacy of our method is demonstrated through extensive simulation studies.

LGJan 31, 2025
A Bias-Correction Decentralized Stochastic Gradient Algorithm with Momentum Acceleration

Yuchen Hu, Xi Chen, Weidong Liu et al.

Distributed stochastic optimization algorithms can simultaneously process large-scale datasets, significantly accelerating model training. However, their effectiveness is often hindered by the sparsity of distributed networks and data heterogeneity. In this paper, we propose a momentum-accelerated distributed stochastic gradient algorithm, termed Exact-Diffusion with Momentum (EDM), which mitigates the bias from data heterogeneity and incorporates momentum techniques commonly used in deep learning to enhance convergence rate. Our theoretical analysis demonstrates that the EDM algorithm converges sub-linearly to the neighborhood of the optimal solution, the radius of which is irrespective of data heterogeneity, when applied to non-convex objective functions; under the Polyak-Lojasiewicz condition, which is a weaker assumption than strong convexity, it converges linearly to the target region. Our analysis techniques employed to handle momentum in complex distributed parameter update structures yield a sufficiently tight convergence upper bound, offering a new perspective for the theoretical analysis of other momentum-based distributed algorithms.

CVDec 12, 2023
Vision-language Assisted Attribute Learning

Kongming Liang, Xinran Wang, Rui Wang et al.

Attribute labeling at large scale is typically incomplete and partial, posing significant challenges to model optimization. Existing attribute learning methods often treat the missing labels as negative or simply ignore them all during training, either of which could hamper the model performance to a great extent. To overcome these limitations, in this paper we leverage the available vision-language knowledge to explicitly disclose the missing labels for enhancing model learning. Given an image, we predict the likelihood of each missing attribute label assisted by an off-the-shelf vision-language model, and randomly select to ignore those with high scores in training. Our strategy strikes a good balance between fully ignoring and negatifying the missing labels, as these high scores are found to be informative on revealing label ambiguity. Extensive experiments show that our proposed vision-language assisted loss can achieve state-of-the-art performance on the newly cleaned VAW dataset. Qualitative evaluation demonstrates the ability of the proposed method in predicting more complete attributes.

LGMay 28, 2023
Acceleration of stochastic gradient descent with momentum by averaging: finite-sample rates and asymptotic normality

Kejie Tang, Weidong Liu, Yichen Zhang et al.

Stochastic gradient descent with momentum (SGDM) has been widely used in many machine learning and statistical applications. Despite the observed empirical benefits of SGDM over traditional SGD, the theoretical understanding of the role of momentum for different learning rates in the optimization process remains widely open. We analyze the finite-sample convergence rate of SGDM under the strongly convex settings and show that, with a large batch size, the mini-batch SGDM converges faster than the mini-batch SGD to a neighborhood of the optimal value. Additionally, our findings, supported by theoretical analysis and numerical experiments, indicate that SGDM permits broader choices of learning rates. Furthermore, we analyze the Polyak-averaging version of the SGDM estimator, establish its asymptotic normality, and justify its asymptotic equivalence to the averaged SGD. The asymptotic distribution of the averaged SGDM enables uncertainty quantification of the algorithm output and statistical inference of the model parameters.

MLFeb 11, 2022
Fast and Robust Sparsity Learning over Networks: A Decentralized Surrogate Median Regression Approach

Weidong Liu, Xiaojun Mao, Xin Zhang

Decentralized sparsity learning has attracted a significant amount of attention recently due to its rapidly growing applications. To obtain the robust and sparse estimators, a natural idea is to adopt the non-smooth median loss combined with a $\ell_1$ sparsity regularizer. However, most of the existing methods suffer from slow convergence performance caused by the {\em double} non-smooth objective. To accelerate the computation, in this paper, we proposed a decentralized surrogate median regression (deSMR) method for efficiently solving the decentralized sparsity learning problem. We show that our proposed algorithm enjoys a linear convergence rate with a simple implementation. We also investigate the statistical guarantee, and it shows that our proposed estimator achieves a near-oracle convergence rate without any restriction on the number of network nodes. Moreover, we establish the theoretical results for sparse support recovery. Thorough numerical experiments and real data study are provided to demonstrate the effectiveness of our method.

CVJun 16, 2021
Structured DropConnect for Uncertainty Inference in Image Classification

Wenqing Zheng, Jiyang Xie, Weidong Liu et al.

With the complexity of the network structure, uncertainty inference has become an important task to improve the classification accuracy for artificial intelligence systems. For image classification tasks, we propose a structured DropConnect (SDC) framework to model the output of a deep neural network by a Dirichlet distribution. We introduce a DropConnect strategy on weights in the fully connected layers during training. In test, we split the network into several sub-networks, and then model the Dirichlet distribution by match its moments with the mean and variance of the outputs of these sub-networks. The entropy of the estimated Dirichlet distribution is finally utilized for uncertainty inference. In this paper, this framework is implemented on LeNet$5$ and VGG$16$ models for misclassification detection and out-of-distribution detection on MNIST and CIFAR-$10$ datasets. Experimental results show that the performance of the proposed SDC can be comparable to other uncertainty inference methods. Furthermore, the SDC is adapted well to different network structures with certain generalization capabilities and research prospects.

MLMar 4, 2021
Variance Reduced Median-of-Means Estimator for Byzantine-Robust Distributed Inference

Jiyuan Tu, Weidong Liu, Xiaojun Mao et al.

This paper develops an efficient distributed inference algorithm, which is robust against a moderate fraction of Byzantine nodes, namely arbitrary and possibly adversarial machines in a distributed learning system. In robust statistics, the median-of-means (MOM) has been a popular approach to hedge against Byzantine failures due to its ease of implementation and computational efficiency. However, the MOM estimator has the shortcoming in terms of statistical efficiency. The first main contribution of the paper is to propose a variance reduced median-of-means (VRMOM) estimator, which improves the statistical efficiency over the vanilla MOM estimator and is computationally as efficient as the MOM. Based on the proposed VRMOM estimator, we develop a general distributed inference algorithm that is robust against Byzantine failures. Theoretically, our distributed algorithm achieves a fast convergence rate with only a constant number of rounds of communications. We also provide the asymptotic normality result for the purpose of statistical inference. To the best of our knowledge, this is the first normality result in the setting of Byzantine-robust distributed learning. The simulation results are also presented to illustrate the effectiveness of our method.

IRJul 4, 2020
Neural Interactive Collaborative Filtering

Lixin Zou, Long Xia, Yulong Gu et al.

In this paper, we study collaborative filtering in an interactive setting, in which the recommender agents iterate between making recommendations and updating the user profile based on the interactive feedback. The most challenging problem in this scenario is how to suggest items when the user profile has not been well established, i.e., recommend for cold-start users or warm-start users with taste drifting. Existing approaches either rely on overly pessimistic linear exploration strategy or adopt meta-learning based algorithms in a full exploitation way. In this work, to quickly catch up with the user's interests, we propose to represent the exploration policy with a neural network and directly learn it from the feedback data. Specifically, the exploration policy is encoded in the weights of multi-channel stacked self-attention neural networks and trained with efficient Q-learning by maximizing users' overall satisfaction in the recommender systems. The key insight is that the satisfied recommendations triggered by the exploration recommendation can be viewed as the exploration bonus (delayed reward) for its contribution on improving the quality of the user profile. Therefore, the proposed exploration policy, to balance between learning the user profile and making accurate recommendations, can be directly optimized by maximizing users' long-term satisfaction with reinforcement learning. Extensive experiments and analysis conducted on three benchmark collaborative filtering datasets have demonstrated the advantage of our method over state-of-the-art methods.

MLJun 18, 2020
Median Matrix Completion: from Embarrassment to Optimality

Weidong Liu, Xiaojun Mao, Raymond K. W. Wong

In this paper, we consider matrix completion with absolute deviation loss and obtain an estimator of the median matrix. Despite several appealing properties of median, the non-smooth absolute deviation loss leads to computational challenge for large-scale data sets which are increasingly common among matrix completion problems. A simple solution to large-scale problems is parallel computing. However, embarrassingly parallel fashion often leads to inefficient estimators. Based on the idea of pseudo data, we propose a novel refinement step, which turns such inefficient estimators into a rate (near-)optimal matrix completion procedure. The refined estimator is an approximation of a regularized least median estimator, and therefore not an ordinary regularized empirical risk estimator. This leads to a non-standard analysis of asymptotic behaviors. Empirical results are also provided to confirm the effectiveness of the proposed method.

MEJun 13, 2019
Distributed High-dimensional Regression Under a Quantile Loss Function

Xi Chen, Weidong Liu, Xiaojun Mao et al.

This paper studies distributed estimation and support recovery for high-dimensional linear regression model with heavy-tailed noise. To deal with heavy-tailed noise whose variance can be infinite, we adopt the quantile regression loss function instead of the commonly used squared loss. However, the non-smooth quantile loss poses new challenges to high-dimensional distributed estimation in both computation and theoretical development. To address the challenge, we transform the response variable and establish a new connection between quantile regression and ordinary linear regression. Then, we provide a distributed estimator that is both computationally and communicationally efficient, where only the gradient information is communicated at each iteration. Theoretically, we show that, after a constant number of iterations, the proposed estimator achieves a near-oracle convergence rate without any restriction on the number of machines. Moreover, we establish the theoretical guarantee for the support recovery. The simulation analysis is provided to demonstrate the effectiveness of our method.

IRFeb 13, 2019
Reinforcement Learning to Optimize Long-term User Engagement in Recommender Systems

Lixin Zou, Long Xia, Zhuoye Ding et al.

Recommender systems play a crucial role in our daily lives. Feed streaming mechanism has been widely used in the recommender system, especially on the mobile Apps. The feed streaming setting provides users the interactive manner of recommendation in never-ending feeds. In such an interactive manner, a good recommender system should pay more attention to user stickiness, which is far beyond classical instant metrics, and typically measured by {\bf long-term user engagement}. Directly optimizing the long-term user engagement is a non-trivial problem, as the learning target is usually not available for conventional supervised learning methods. Though reinforcement learning~(RL) naturally fits the problem of maximizing the long term rewards, applying RL to optimize long-term user engagement is still facing challenges: user behaviors are versatile and difficult to model, which typically consists of both instant feedback~(e.g. clicks, ordering) and delayed feedback~(e.g. dwell time, revisit); in addition, performing effective off-policy learning is still immature, especially when combining bootstrapping and function approximation. To address these issues, in this work, we introduce a reinforcement learning framework --- FeedRec to optimize the long-term user engagement. FeedRec includes two components: 1)~a Q-Network which designed in hierarchical LSTM takes charge of modeling complex user behaviors, and 2)~an S-Network, which simulates the environment, assists the Q-Network and voids the instability of convergence in policy learning. Extensive experiments on synthetic data and a real-world large scale data show that FeedRec effectively optimizes the long-term user engagement and outperforms state-of-the-arts.

CVJan 16, 2019
Attention-aware Multi-stroke Style Transfer

Yuan Yao, Jianqiang Ren, Xuansong Xie et al.

Neural style transfer has drawn considerable attention from both academic and industrial field. Although visual effect and efficiency have been significantly improved, existing methods are unable to coordinate spatial distribution of visual attention between the content image and stylized image, or render diverse level of detail via different brush strokes. In this paper, we tackle these limitations by developing an attention-aware multi-stroke style transfer model. We first propose to assemble self-attention mechanism into a style-agnostic reconstruction autoencoder framework, from which the attention map of a content image can be derived. By performing multi-scale style swap on content features and style features, we produce multiple feature maps reflecting different stroke patterns. A flexible fusion strategy is further presented to incorporate the salient characteristics from the attention map, which allows integrating multiple stroke patterns into different spatial regions of the output image harmoniously. We demonstrate the effectiveness of our method, as well as generate comparable stylized images with multiple stroke patterns against the state-of-the-art methods.

MLNov 29, 2018
Distributed Inference for Linear Support Vector Machine

Xiaozhou Wang, Zhuoyi Yang, Xi Chen et al.

The growing size of modern data brings many new challenges to existing statistical inference methodologies and theories, and calls for the development of distributed inferential approaches. This paper studies distributed inference for linear support vector machine (SVM) for the binary classification task. Despite a vast literature on SVM, much less is known about the inferential properties of SVM, especially in a distributed setting. In this paper, we propose a multi-round distributed linear-type (MDL) estimator for conducting inference for linear SVM. The proposed estimator is computationally efficient. In particular, it only requires an initial SVM estimator and then successively refines the estimator by solving simple weighted least squares problem. Theoretically, we establish the Bahadur representation of the estimator. Based on the representation, the asymptotic normality is further derived, which shows that the MDL estimator achieves the optimal statistical efficiency, i.e., the same efficiency as the classical linear SVM applying to the entire data set in a single machine setup. Moreover, our asymptotic result avoids the condition on the number of machines or data batches, which is commonly assumed in distributed estimation literature, and allows the case of diverging dimension. We provide simulation studies to demonstrate the performance of the proposed MDL estimator.

MLNov 28, 2018
First-order Newton-type Estimator for Distributed Estimation and Inference

Xi Chen, Weidong Liu, Yichen Zhang

This paper studies distributed estimation and inference for a general statistical problem with a convex loss that could be non-differentiable. For the purpose of efficient computation, we restrict ourselves to stochastic first-order optimization, which enjoys low per-iteration complexity. To motivate the proposed method, we first investigate the theoretical properties of a straightforward Divide-and-Conquer Stochastic Gradient Descent (DC-SGD) approach. Our theory shows that there is a restriction on the number of machines and this restriction becomes more stringent when the dimension $p$ is large. To overcome this limitation, this paper proposes a new multi-round distributed estimation procedure that approximates the Newton step only using stochastic subgradient. The key component in our method is the proposal of a computationally efficient estimator of $Σ^{-1} w$, where $Σ$ is the population Hessian matrix and $w$ is any given vector. Instead of estimating $Σ$ (or $Σ^{-1}$) that usually requires the second-order differentiability of the loss, the proposed First-Order Newton-type Estimator (FONE) directly estimates the vector of interest $Σ^{-1} w$ as a whole and is applicable to non-differentiable losses. Our estimator also facilitates the inference for the empirical risk minimizer. It turns out that the key term in the limiting covariance has the form of $Σ^{-1} w$, which can be estimated by FONE.

MEOct 18, 2018
Quantile Regression Under Memory Constraint

Xi Chen, Weidong Liu, Yichen Zhang

This paper studies the inference problem in quantile regression (QR) for a large sample size $n$ but under a limited memory constraint, where the memory can only store a small batch of data of size $m$. A natural method is the naïve divide-and-conquer approach, which splits data into batches of size $m$, computes the local QR estimator for each batch, and then aggregates the estimators via averaging. However, this method only works when $n=o(m^2)$ and is computationally expensive. This paper proposes a computationally efficient method, which only requires an initial QR estimator on a small batch of data and then successively refines the estimator via multiple rounds of aggregations. Theoretically, as long as $n$ grows polynomially in $m$, we establish the asymptotic normality for the obtained estimator and show that our estimator with only a few rounds of aggregations achieves the same efficiency as the QR estimator computed on all the data. Moreover, our result allows the case that the dimensionality $p$ goes to infinity. The proposed method can also be applied to address the QR problem under distributed computing environment (e.g., in a large-scale sensor network) or for real-time streaming data.

MEMar 17, 2012
Fast and Adaptive Sparse Precision Matrix Estimation in High Dimensions

Weidong Liu, Xi Luo

This paper proposes a new method for estimating sparse precision matrices in the high dimensional setting. It has been popular to study fast computation and adaptive procedures for this problem. We propose a novel approach, called Sparse Column-wise Inverse Operator, to address these two issues. We analyze an adaptive procedure based on cross validation, and establish its convergence rate under the Frobenius norm. The convergence rates under other matrix norms are also established. This method also enjoys the advantage of fast computation for large-scale problems, via a coordinate descent algorithm. Numerical merits are illustrated using both simulated and real datasets. In particular, it performs favorably on an HIV brain tissue dataset and an ADHD resting-state fMRI dataset.