LGOct 27, 2023Code
Learning to Search Feasible and Infeasible Regions of Routing Problems with Flexible Neural k-OptYining Ma, Zhiguang Cao, Yeow Meng Chee
In this paper, we present Neural k-Opt (NeuOpt), a novel learning-to-search (L2S) solver for routing problems. It learns to perform flexible k-opt exchanges based on a tailored action factorization method and a customized recurrent dual-stream decoder. As a pioneering work to circumvent the pure feasibility masking scheme and enable the autonomous exploration of both feasible and infeasible regions, we then propose the Guided Infeasible Region Exploration (GIRE) scheme, which supplements the NeuOpt policy network with feasibility-related features and leverages reward shaping to steer reinforcement learning more effectively. Additionally, we equip NeuOpt with Dynamic Data Augmentation (D2A) for more diverse searches during inference. Extensive experiments on the Traveling Salesman Problem (TSP) and Capacitated Vehicle Routing Problem (CVRP) demonstrate that our NeuOpt not only significantly outstrips existing (masking-based) L2S solvers, but also showcases superiority over the learning-to-construct (L2C) and learning-to-predict (L2P) solvers. Notably, we offer fresh perspectives on how neural solvers can handle VRP constraints. Our code is available: https://github.com/yining043/NeuOpt.
LGOct 14, 2022
Learning Generalizable Models for Vehicle Routing Problems via Knowledge DistillationJieyi Bi, Yining Ma, Jiahai Wang et al.
Recent neural methods for vehicle routing problems always train and test the deep models on the same instance distribution (i.e., uniform). To tackle the consequent cross-distribution generalization concerns, we bring the knowledge distillation to this field and propose an Adaptive Multi-Distribution Knowledge Distillation (AMDKD) scheme for learning more generalizable deep models. Particularly, our AMDKD leverages various knowledge from multiple teachers trained on exemplar distributions to yield a light-weight yet generalist student model. Meanwhile, we equip AMDKD with an adaptive strategy that allows the student to concentrate on difficult distributions, so as to absorb hard-to-master knowledge more effectively. Extensive experimental results show that, compared with the baseline neural methods, our AMDKD is able to achieve competitive results on both unseen in-distribution and out-of-distribution instances, which are either randomly synthesized or adopted from benchmark datasets (i.e., TSPLIB and CVRPLIB). Notably, our AMDKD is generic, and consumes less computational resources for inference.
LGApr 25, 2022
Efficient Neural Neighborhood Search for Pickup and Delivery ProblemsYining Ma, Jingwen Li, Zhiguang Cao et al.
We present an efficient Neural Neighborhood Search (N2S) approach for pickup and delivery problems (PDPs). In specific, we design a powerful Synthesis Attention that allows the vanilla self-attention to synthesize various types of features regarding a route solution. We also exploit two customized decoders that automatically learn to perform removal and reinsertion of a pickup-delivery node pair to tackle the precedence constraint. Additionally, a diversity enhancement scheme is leveraged to further ameliorate the performance. Our N2S is generic, and extensive experiments on two canonical PDP variants show that it can produce state-of-the-art results among existing neural methods. Moreover, it even outstrips the well-known LKH3 solver on the more constrained PDP variant. Our implementation for N2S is available online.
DCFeb 9, 2023
FLAC: A Robust Failure-Aware Atomic Commit Protocol for Distributed TransactionsHexiang Pan, Quang-Trung Ta, Meihui Zhang et al.
In distributed transaction processing, atomic commit protocol (ACP) is used to ensure database consistency. With the use of commodity compute nodes and networks, failures such as system crashes and network partitioning are common. It is therefore important for ACP to dynamically adapt to the operating condition for efficiency while ensuring the consistency of the database. Existing ACPs often assume stable operating conditions, hence, they are either non-generalizable to different environments or slow in practice. In this paper, we propose a novel and practical ACP, called Failure-Aware Atomic Commit (FLAC). In essence, FLAC includes three protocols, which are specifically designed for three different environments: (i) no failure occurs, (ii) participant nodes might crash but there is no delayed connection, or (iii) both crashed nodes and delayed connection can occur. It models these environments as the failure-free, crash-failure, and network-failure robustness levels. During its operation, FLAC can monitor if any failure occurs and dynamically switch to operate the most suitable protocol, using a robustness level state machine, whose parameters are fine-tuned by reinforcement learning. Consequently, it improves both the response time and throughput, and effectively handles nodes distributed across the Internet where crash and network failures might occur. We implement FLAC in a distributed transactional key-value storage system based on Google Percolator and evaluate its performance with both a micro benchmark and a macro benchmark of real workload. The results show that FLAC achieves up to 2.22x throughput improvement and 2.82x latency speedup, compared to existing ACPs for high-contention workloads.
LGOct 14, 2022
Federated Best Arm Identification with Heterogeneous ClientsZhirui Chen, P. N. Karthik, Vincent Y. F. Tan et al.
We study best arm identification in a federated multi-armed bandit setting with a central server and multiple clients, when each client has access to a {\em subset} of arms and each arm yields independent Gaussian observations. The goal is to identify the best arm of each client subject to an upper bound on the error probability; here, the best arm is one that has the largest {\em average} value of the means averaged across all clients having access to the arm. Our interest is in the asymptotics as the error probability vanishes. We provide an asymptotic lower bound on the growth rate of the expected stopping time of any algorithm. Furthermore, we show that for any algorithm whose upper bound on the expected stopping time matches with the lower bound up to a multiplicative constant ({\em almost-optimal} algorithm), the ratio of any two consecutive communication time instants must be {\em bounded}, a result that is of independent interest. We thereby infer that an algorithm can communicate no more sparsely than at exponential time instants in order to be almost-optimal. For the class of almost-optimal algorithms, we present the first-of-its-kind asymptotic lower bound on the expected number of {\em communication rounds} until stoppage. We propose a novel algorithm that communicates at exponential time instants, and demonstrate that it is asymptotically almost-optimal.
CVMay 25, 2022
Primitive3D: 3D Object Dataset Synthesis from Randomly Assembled PrimitivesXinke Li, Henghui Ding, Zekun Tong et al.
Numerous advancements in deep learning can be attributed to the access to large-scale and well-annotated datasets. However, such a dataset is prohibitively expensive in 3D computer vision due to the substantial collection cost. To alleviate this issue, we propose a cost-effective method for automatically generating a large amount of 3D objects with annotations. In particular, we synthesize objects simply by assembling multiple random primitives. These objects are thus auto-annotated with part labels originating from primitives. This allows us to perform multi-task learning by combining the supervised segmentation with unsupervised reconstruction. Considering the large overhead of learning on the generated dataset, we further propose a dataset distillation strategy to remove redundant samples regarding a target dataset. We conduct extensive experiments for the downstream tasks of 3D object classification. The results indicate that our dataset, together with multi-task pretraining on its annotations, achieves the best performance compared to other commonly used datasets. Further study suggests that our strategy can improve the model performance by pretraining and fine-tuning scheme, especially for the dataset with a small scale. In addition, pretraining with the proposed dataset distillation method can save 86\% of the pretraining time with negligible performance degradation. We expect that our attempt provides a new data-centric perspective for training 3D deep models.
87.9QUANT-PHApr 25
A Mixture of Experts Vision Transformer for High-Fidelity Surface Code DecodingHoang Viet Nguyen, Manh Hung Nguyen, Hoang Ta et al.
Quantum error correction is a key ingredient for large scale quantum computation, protecting logical information from physical noise by encoding it into many physical qubits. Topological stabilizer codes are particularly appealing due to their geometric locality and practical relevance. In these codes, stabilizer measurements yield a syndrome that must be decoded into a recovery operation, making decoding a central bottleneck for scalable real time operation. Existing decoders are commonly classified into two categories. Classical algorithmic decoders provide strong and well established baselines, but may incur substantial computational overhead at large code distances or under stringent latency constraints. Machine learning based decoders offer fast GPU inference and flexible function approximation, yet many approaches do not explicitly exploit the lattice geometry and local structure of topological codes, which can limit performance. In this work, we propose QuantumSMoE, a quantum vision transformer based decoder that incorporates code structure through plus shaped embeddings and adaptive masking to capture local interactions and lattice connectivity, and improves scalability via a mixture of experts layer with a novel auxiliary loss. Experiments on the toric code demonstrate that QuantumSMoE outperforms state-of-the-art machine learning decoders as well as widely used classical baselines.
59.0ITApr 23
Sequence Reconstruction for Sticky Insertion/Deletion ChannelsVan Long Phuoc Pham, Yeow Meng Chee, Kui Cai et al.
The sequence reconstruction problem for insertion/deletion channels has attracted significant attention owing to their applications recently in some emerging data storage systems, such as racetrack memories, DNA-based data storage. Our goal is to investigate the reconstruction problem for sticky-insdel channels where both sticky-insertions and sticky-deletions occur. If there are only sticky-insertion errors, the reconstruction problem for sticky-insertion channel is a special case of the reconstruction problem for tandem-duplication channel which has been well-studied. In this work, we consider the $(t, s)$-sticky-insdel channel where there are at most $t$ sticky-insertion errors and $s$ sticky-deletion errors when we transmit a message through the channel. For the reconstruction problem, we are interested in the minimum number of distinct outputs from these channels that are needed to uniquely recover the transmitted vector. We first provide a recursive formula to determine the minimum number of distinct outputs required. Next, we provide an efficient algorithm to reconstruct the transmitted vector from erroneous sequences.
85.8QUANT-PHApr 19
Efficient Approximation of Quantum Channel Fidelity Exploiting SymmetryYeow Meng Chee, Hoang Ta, Van Khu Vu
Determining the optimal fidelity for the transmission of quantum information over noisy quantum channels is one of the central problems in quantum information theory. Recently, [Berta-Borderi-Fawzi-Scholz, Mathematical Programming, 2021] introduced an asymptotically converging semidefinite programming hierarchy of outer bounds for this quantity. However, the size of the semidefinite programs (SDPs) grows exponentially with respect to the level of the hierarchy, thus making their computation unscalable. In this work, by exploiting the symmetries in the SDP, we show that, for a fixed output dimension of the quantum channel, we can compute the SDP in time polynomial with respect to the level of the hierarchy and input dimension. As a direct consequence of our result, the optimal fidelity can be approximated with an accuracy of $ε$ in $\mathrm{poly}(1/ε, \text{input dimension})$ time.
ITJun 7, 2022
Neural Network Decoders for Permutation Codes Correcting Different ErrorsYeow Meng Chee, Hui Zhang
Permutation codes were extensively studied in order to correct different types of errors for the applications on power line communication and rank modulation for flash memory. In this paper, we introduce the neural network decoders for permutation codes to correct these errors with one-shot decoding, which treat the decoding as $n$ classification tasks for non-binary symbols for a code of length $n$. These are actually the first general decoders introduced to deal with any error type for these two applications. The performance of the decoders is evaluated by simulations with different error models.
LGJan 23, 2025
Optimal Multi-Objective Best Arm Identification with Fixed ConfidenceZhirui Chen, P. N. Karthik, Yeow Meng Chee et al.
We consider a multi-armed bandit setting with finitely many arms, in which each arm yields an $M$-dimensional vector reward upon selection. We assume that the reward of each dimension (a.k.a. {\em objective}) is generated independently of the others. The best arm of any given objective is the arm with the largest component of mean corresponding to the objective. The end goal is to identify the best arm of {\em every} objective in the shortest (expected) time subject to an upper bound on the probability of error (i.e., fixed-confidence regime). We establish a problem-dependent lower bound on the limiting growth rate of the expected stopping time, in the limit of vanishing error probabilities. This lower bound, we show, is characterised by a max-min optimisation problem that is computationally expensive to solve at each time step. We propose an algorithm that uses the novel idea of {\em surrogate proportions} to sample the arms at each time step, eliminating the need to solve the max-min optimisation problem at each step. We demonstrate theoretically that our algorithm is asymptotically optimal. In addition, we provide extensive empirical studies to substantiate the efficiency of our algorithm. While existing works on pure exploration with multi-objective multi-armed bandits predominantly focus on {\em Pareto frontier identification}, our work fills the gap in the literature by conducting a formal investigation of the multi-objective best arm identification problem.
LGJan 17, 2024
Fixed-Budget Differentially Private Best Arm IdentificationZhirui Chen, P. N. Karthik, Yeow Meng Chee et al.
We study best arm identification (BAI) in linear bandits in the fixed-budget regime under differential privacy constraints, when the arm rewards are supported on the unit interval. Given a finite budget $T$ and a privacy parameter $\varepsilon>0$, the goal is to minimise the error probability in finding the arm with the largest mean after $T$ sampling rounds, subject to the constraint that the policy of the decision maker satisfies a certain {\em $\varepsilon$-differential privacy} ($\varepsilon$-DP) constraint. We construct a policy satisfying the $\varepsilon$-DP constraint (called {\sc DP-BAI}) by proposing the principle of {\em maximum absolute determinants}, and derive an upper bound on its error probability. Furthermore, we derive a minimax lower bound on the error probability, and demonstrate that the lower and the upper bounds decay exponentially in $T$, with exponents in the two bounds matching order-wise in (a) the sub-optimality gaps of the arms, (b) $\varepsilon$, and (c) the problem complexity that is expressible as the sum of two terms, one characterising the complexity of standard fixed-budget BAI (without privacy constraints), and the other accounting for the $\varepsilon$-DP constraint. Additionally, we present some auxiliary results that contribute to the derivation of the lower bound on the error probability. These results, we posit, may be of independent interest and could prove instrumental in proving lower bounds on error probabilities in several other bandit problems. Whereas prior works provide results for BAI in the fixed-budget regime without privacy constraints or in the fixed-confidence regime with privacy constraints, our work fills the gap in the literature by providing the results for BAI in the fixed-budget regime under the $\varepsilon$-DP constraint.
DCMar 4, 2021
Serverless Data Science -- Are We There Yet? A Case Study of Model ServingYuncheng Wu, Tien Tuan Anh Dinh, Guoyu Hu et al.
Machine learning (ML) is an important part of modern data science applications. Data scientists today have to manage the end-to-end ML life cycle that includes both model training and model serving, the latter of which is essential, as it makes their works available to end-users. Systems of model serving require high performance, low cost, and ease of management. Cloud providers are already offering model serving choices, including managed services and self-rented servers. Recently, serverless computing, whose advantages include high elasticity and a fine-grained cost model, brings another option for model serving. Our goal in this paper is to examine the viability of serverless as a mainstream model serving platform. To this end, we first conduct a comprehensive evaluation of the performance and cost of serverless against other model serving systems on Amazon Web Service and Google Cloud Platform. We find that serverless outperforms many cloud-based alternatives. Further, there are settings under which it even achieves better performance than GPU-based systems. Next, we present the design space of serverless model serving, which comprises multiple dimensions, including cloud platforms, serving runtimes, and other function-specific parameters. For each dimension, we analyze the impact of different choices and provide suggestions for data scientists to better utilize serverless model serving. Finally, we discuss challenges and opportunities in building a more practical serverless model serving system.
CRDec 25, 2019
Efficient Algorithm for the Linear Complexity of Sequences and Some Related ConsequencesYeow Meng Chee, Johan Chrisnata, Tuvi Etzion et al.
The linear complexity of a sequence $s$ is one of the measures of its predictability. It represents the smallest degree of a linear recursion which the sequence satisfies. There are several algorithms to find the linear complexity of a periodic sequence $s$ of length $N$ (where $N$ is of some given form) over a finite field $F_q$ in $O(N)$ symbol field operations. The first such algorithm is The Games-Chan Algorithm which considers binary sequences of period $2^n$, and is known for its extreme simplicity. We generalize this algorithm and apply it efficiently for several families of binary sequences. Our algorithm is very simple, it requires $βN$ bit operations for a small constant $β$, where $N$ is the period of the sequence. We make an analysis on the number of bit operations required by the algorithm and compare it with previous algorithms. In the process, the algorithm also finds the recursion for the shortest linear feedback shift-register which generates the sequence. Some other interesting properties related to shift-register sequences, which might not be too surprising but generally unnoted, are also consequences of our exposition.
CRAug 26, 2012
On Bringer-Chabanne EPIR Protocol for Polynomial EvaluationYeow Meng Chee, Huaxiong Wang, Liang Feng Zhang
Extended private information retrieval (EPIR) was defined by \cite{BCPT07} at CANS'07 and generalized by \cite{BC09} at AFRICACRYPT'09. In the generalized setting, EPIR allows a user to evaluate a function on a database block such that the database can learn neither which function has been evaluated nor on which block the function has been evaluated and the user learns no more information on the database blocks except for the expected result. An EPIR protocol for evaluating polynomials over a finite field $L$ was proposed by Bringer and Chabanne in \cite{BC09}. We show that the protocol does not satisfy the correctness requirement as they have claimed. In particular, we show that it does not give the user the expected result with large probability if one of the coefficients of the polynomial to be evaluated is primitive in $L$ and the others belong to the prime subfield of $L$.