Zirui Zhou

h-index16

9papers

1,183citations

Novelty52%

AI Score41

Ranked #63,860 of 194,257 authors (top 33%)#14,397 in LG (top 36%)

9 Papers

2.9OCJan 26, 2018

A Family of Inexact SQA Methods for Non-Smooth Convex Minimization with Provable Convergence Guarantees Based on the Luo-Tseng Error Bound Property

Man-Chung Yue, Zirui Zhou, Anthony Man-Cho So

We propose a new family of inexact sequential quadratic approximation (SQA) methods, which we call the inexact regularized proximal Newton ($\textsf{IRPN}$) method, for minimizing the sum of two closed proper convex functions, one of which is smooth and the other is possibly non-smooth. Our proposed method features strong convergence guarantees even when applied to problems with degenerate solutions while allowing the inner minimization to be solved inexactly. Specifically, we prove that when the problem possesses the so-called Luo-Tseng error bound (EB) property, $\textsf{IRPN}$ converges globally to an optimal solution, and the local convergence rate of the sequence of iterates generated by $\textsf{IRPN}$ is linear, superlinear, or even quadratic, depending on the choice of parameters of the algorithm. Prior to this work, such EB property has been extensively used to establish the linear convergence of various first-order methods. However, to the best of our knowledge, this work is the first to use the Luo-Tseng EB property to establish the superlinear convergence of SQA-type methods for non-smooth convex minimization. As a consequence of our result, $\textsf{IRPN}$ is capable of solving regularized regression or classification problems under the high-dimensional setting with provable convergence guarantees. We compare our proposed $\textsf{IRPN}$ with several empirically efficient algorithms by applying them to the $\ell_1$-regularized logistic regression problem. Experiment results show the competitiveness of our proposed method.

8.2CLDec 22, 2024Code

Evaluating LLM Reasoning in the Operations Research Domain with ORQA

Mahdi Mostajabdaveh, Timothy T. Yu, Samarendra Chandan Bindu Dash et al.

In this paper, we introduce and apply Operations Research Question Answering (ORQA), a new benchmark designed to assess the generalization capabilities of Large Language Models (LLMs) in the specialized technical domain of Operations Research (OR). This benchmark evaluates whether LLMs can emulate the knowledge and reasoning skills of OR experts when confronted with diverse and complex optimization problems. The dataset, developed by OR experts, features real-world optimization problems that demand multistep reasoning to construct their mathematical models. Our evaluations of various open source LLMs, such as LLaMA 3.1, DeepSeek, and Mixtral, reveal their modest performance, highlighting a gap in their ability to generalize to specialized technical domains. This work contributes to the ongoing discourse on LLMs generalization capabilities, offering valuable insights for future research in this area. The dataset and evaluation code are publicly available.

3.3AIJul 23, 2025

SMARTAPS: Tool-augmented LLMs for Operations Management

Timothy Tin Long Yu, Mahdi Mostajabdaveh, Jabo Serge Byusa et al.

Large language models (LLMs) present intriguing opportunities to enhance user interaction with traditional algorithms and tools in real-world applications. An advanced planning system (APS) is a sophisticated software that leverages optimization to help operations planners create, interpret, and modify an operational plan. While highly beneficial, many customers are priced out of using an APS due to the ongoing costs of consultants responsible for customization and maintenance. To address the need for a more accessible APS expressed by supply chain planners, we present SmartAPS, a conversational system built on a tool-augmented LLM. Our system provides operations planners with an intuitive natural language chat interface, allowing them to query information, perform counterfactual reasoning, receive recommendations, and execute scenario analysis to better manage their operation. A short video demonstrating the system has been released: https://youtu.be/KtIrJjlDbyw

15.1LGSep 19, 2021

Improving Fairness for Data Valuation in Horizontal Federated Learning

Zhenan Fan, Huang Fang, Zirui Zhou et al.

Federated learning is an emerging decentralized machine learning scheme that allows multiple data owners to work collaboratively while ensuring data privacy. The success of federated learning depends largely on the participation of data owners. To sustain and encourage data owners' participation, it is crucial to fairly evaluate the quality of the data provided by the data owners and reward them correspondingly. Federated Shapley value, recently proposed by Wang et al. [Federated Learning, 2020], is a measure for data value under the framework of federated learning that satisfies many desired properties for data valuation. However, there are still factors of potential unfairness in the design of federated Shapley value because two data owners with the same local data may not receive the same evaluation. We propose a new measure called completed federated Shapley value to improve the fairness of federated Shapley value. The design depends on completing a matrix consisting of all the possible contributions by different subsets of the data owners. It is shown under mild conditions that this matrix is approximately low-rank by leveraging concepts and tools from optimization. Both theoretical analysis and empirical evaluation verify that the proposed measure does improve fairness in many circumstances.

7.5LGSep 17, 2021Code

Achieving Model Fairness in Vertical Federated Learning

Changxin Liu, Zhenan Fan, Zirui Zhou et al.

Vertical federated learning (VFL) has attracted greater and greater interest since it enables multiple parties possessing non-overlapping features to strengthen their machine learning models without disclosing their private data and model parameters. Similar to other machine learning algorithms, VFL faces demands and challenges of fairness, i.e., the learned model may be unfairly discriminatory over some groups with sensitive attributes. To tackle this problem, we propose a fair VFL framework in this work. First, we systematically formulate the problem of training fair models in VFL, where the learning task is modelled as a constrained optimization problem. To solve it in a federated and privacy-preserving manner, we consider the equivalent dual form of the problem and develop an asynchronous gradient coordinate-descent ascent algorithm, where some active data parties perform multiple parallelized local updates per communication round to effectively reduce the number of communication rounds. The messages that the server sends to passive parties are deliberately designed such that the information necessary for local updates is released without intruding on the privacy of data and sensitive attributes. We rigorously study the convergence of the algorithm when applied to general nonconvex-concave min-max problems. We prove that the algorithm finds a $δ$-stationary point of the dual objective in $\mathcal{O}(δ^{-4})$ communication rounds under mild conditions. Finally, the extensive experiments on three benchmark datasets demonstrate the superior performance of our method in training fair models.

17.9LGSep 13, 2021

Training Fair Models in Federated Learning without Data Privacy Infringement

Xin Che, Jingdi Hu, Zirui Zhou et al.

Training fair machine learning models becomes more and more important. As many powerful models are trained by collaboration among multiple parties, each holding some sensitive data, it is natural to explore the feasibility of training fair models in federated learning so that the fairness of trained models, the data privacy of clients, and the collaboration between clients can be fully respected simultaneously. However, the task of training fair models in federated learning is challenging, since it is far from trivial to estimate the fairness of a model without knowing the private data of the participating parties, which is often constrained by privacy requirements in federated learning. In this paper, we first propose a federated estimation method to accurately estimate the fairness of a model without infringing the data privacy of any party. Then, we use the fairness estimation to formulate a novel problem of training fair models in federated learning. We develop FedFair, a well-designed federated learning framework, which can successfully train a fair model with high performance without data privacy infringement. Our extensive experiments on three real-world data sets demonstrate the excellent fair model training performance of our method.

36.8LGJul 7, 2020

Personalized Cross-Silo Federated Learning on Non-IID Data

Yutao Huang, Lingyang Chu, Zirui Zhou et al.

Non-IID data present a tough challenge for federated learning. In this paper, we explore a novel idea of facilitating pairwise collaborations between clients with similar data. We propose FedAMP, a new method employing federated attentive message passing to facilitate similar clients to collaborate more. We establish the convergence of FedAMP for both convex and non-convex models, and propose a heuristic method to further improve the performance of FedAMP when clients adopt deep neural networks as personalized models. Our extensive experiments on benchmark data sets demonstrate the superior performance of the proposed methods.

8.0OCJun 29, 2020Code

Non-Convex Exact Community Recovery in Stochastic Block Model

Peng Wang, Zirui Zhou, Anthony Man-Cho So

Community detection in graphs that are generated according to stochastic block models (SBMs) has received much attention lately. In this paper, we focus on the binary symmetric SBM -- in which a graph of $n$ vertices is randomly generated by first partitioning the vertices into two equal-sized communities and then connecting each pair of vertices with probability that depends on their community memberships -- and study the associated exact community recovery problem. Although the maximum-likelihood formulation of the problem is non-convex and discrete, we propose to tackle it using a popular iterative method called projected power iterations. To ensure fast convergence of the method, we initialize it using a point that is generated by another iterative method called orthogonal iterations, which is a classic method for computing invariant subspaces of a symmetric matrix. We show that in the logarithmic sparsity regime of the problem, with high probability the proposed two-stage method can exactly recover the two communities down to the information-theoretic limit in $\mathcal{O}(n\log^2n/\log\log n)$ time, which is competitive with a host of existing state-of-the-art methods that have the same recovery performance. We also conduct numerical experiments on both synthetic and real data sets to demonstrate the efficacy of our proposed method and complement our theoretical development.

30.2OCDec 11, 2015

A Unified Approach to Error Bounds for Structured Convex Optimization Problems

Zirui Zhou, Anthony Man-Cho So

Error bounds, which refer to inequalities that bound the distance of vectors in a test set to a given set by a residual function, have proven to be extremely useful in analyzing the convergence rates of a host of iterative methods for solving optimization problems. In this paper, we present a new framework for establishing error bounds for a class of structured convex optimization problems, in which the objective function is the sum of a smooth convex function and a general closed proper convex function. Such a class encapsulates not only fairly general constrained minimization problems but also various regularized loss minimization formulations in machine learning, signal processing, and statistics. Using our framework, we show that a number of existing error bound results can be recovered in a unified and transparent manner. To further demonstrate the power of our framework, we apply it to a class of nuclear-norm regularized loss minimization problems and establish a new error bound for this class under a strict complementarity-type regularity condition. We then complement this result by constructing an example to show that the said error bound could fail to hold without the regularity condition. Consequently, we obtain a rather complete answer to a question raised by Tseng. We believe that our approach will find further applications in the study of error bounds for structured convex optimization problems.