Wei Cui

LG
h-index25
34papers
919citations
Novelty47%
AI Score56

34 Papers

94.0CLMay 28Code
CanLegalRAGBench: Evaluating Retrieval-Augmented Generation on Canadian Case Law

Ethan Zhao, Maksym Taranukhin, Wei Cui et al.

RAG-based legal assistants have been growing in popularity, but LLM hallucinations remain a key issue and potentially undermines justice. While benchmarks have been developed to evaluate progress, many rely on synthetic queries rather than realistic legal scenarios. Moreover, Canadian law remains underrepresented in existing evaluations. To address this gap, we introduce CanLegalRAGBench, a Canadian legal QA benchmark based on realistic queries and expert-annotated answers grounded in case law. Our evaluation shows that retrieval performance is sensitive to design choices and that open-source embedding models are competitive with closed source models. However, it also reveals the limitation of automatic evaluations that penalize systems for retrieving alternative relevant documents. We also find that generated answers often diverge from gold responses, either with hallucinations or by producing overly detailed or irrelevant content, with 8-29% of claims not being supported by the retrieved documents. We hope this benchmark will help drive continued progress in addressing limitations of legal RAG systems.

DCJun 7, 2022
Tutel: Adaptive Mixture-of-Experts at Scale

Changho Hwang, Wei Cui, Yifan Xiong et al. · microsoft-research

Sparsely-gated mixture-of-experts (MoE) has been widely adopted to scale deep learning models to trillion-plus parameters with fixed computational cost. The algorithmic performance of MoE relies on its token routing mechanism that forwards each input token to the right sub-models or experts. While token routing dynamically determines the amount of expert workload at runtime, existing systems suffer inefficient computation due to their static execution, namely static parallelism and pipelining, which does not adapt to the dynamic workload. We present Flex, a highly scalable stack design and implementation for MoE with dynamically adaptive parallelism and pipelining. Flex designs an identical layout for distributing MoE model parameters and input data, which can be leveraged by all possible parallelism or pipelining methods without any mathematical inequivalence or tensor migration overhead. This enables adaptive parallelism/pipelining optimization at zero cost during runtime. Based on this key design, Flex also implements various MoE acceleration techniques. Aggregating all techniques, Flex finally delivers huge speedup at any scale -- 4.96x and 5.75x speedup of a single MoE layer over 16 and 2,048 A100 GPUs, respectively, over the previous state-of-the-art. Our evaluation shows that Flex efficiently and effectively runs a real-world MoE-based model named SwinV2-MoE, built upon Swin Transformer V2, a state-of-the-art computer vision architecture. On efficiency, Flex accelerates SwinV2-MoE, achieving up to 1.55x and 2.11x speedup in training and inference over Fairseq, respectively. On effectiveness, the SwinV2-MoE model achieves superior accuracy in both pre-training and down-stream computer vision tasks such as COCO object detection than the counterpart dense model, indicating the readiness of Flex for end-to-end real-world model training and inference.

81.3LGMay 27
Conf-Gen: Conformal Uncertainty Quantification for Generative Models

Gabriel Loaiza-Ganem, Kevin Zhang, Wei Cui et al.

Conformal prediction (CP) and its extension, conformal risk control (CRC), are established frameworks for quantifying uncertainty in supervised machine learning through formal guarantees. However, recent breakthroughs in artificial intelligence (AI) have been driven by unsupervised generative models, such as large language models (LLMs) and image generators, which are not directly compatible with CP or CRC. In this work we introduce conformal generation (Conf-Gen), a general framework adapting CRC to generative tasks while relaxing its theoretical assumptions. Conf-Gen unifies and generalizes previous attempts to apply CP to LLMs, and extends conformal methodology to entirely new domains. We demonstrate the flexibility of Conf-Gen through some novel applications, including obtaining conformal guarantees on: image generators producing non-memorized images, conversational AI systems having asked enough clarifying questions, and the output of AI agents being correct.

CVSep 21, 2022
AirFi: Empowering WiFi-based Passive Human Gesture Recognition to Unseen Environment via Domain Generalization

Dazhuo Wang, Jianfei Yang, Wei Cui et al.

WiFi-based smart human sensing technology enabled by Channel State Information (CSI) has received great attention in recent years. However, CSI-based sensing systems suffer from performance degradation when deployed in different environments. Existing works solve this problem by domain adaptation using massive unlabeled high-quality data from the new environment, which is usually unavailable in practice. In this paper, we propose a novel augmented environment-invariant robust WiFi gesture recognition system named AirFi that deals with the issue of environment dependency from a new perspective. The AirFi is a novel domain generalization framework that learns the critical part of CSI regardless of different environments and generalizes the model to unseen scenarios, which does not require collecting any data for adaptation to the new environment. AirFi extracts the common features from several training environment settings and minimizes the distribution differences among them. The feature is further augmented to be more robust to environments. Moreover, the system can be further improved by few-shot learning techniques. Compared to state-of-the-art methods, AirFi is able to work in different environment settings without acquiring any CSI data from the new environment. The experimental results demonstrate that our system remains robust in the new environment and outperforms the compared systems.

MENov 3, 2015
A Coherent Integration Method Based on Radon Non-uniform FRFT for Random Pulse Repetition Interval (RPRI) Radar

Jing Tian, Xiang-Gen Xia, Gang Yang et al.

To solve the range cell migration (RCM) and spectrum spread during the integration time induced by the motion of a target, this paper proposes a new coherent integration method based on Radon non-uniform FRFT (NUFRFT) for random pulse repetition interval (RPRI) radar. In this method, RCM is eliminated via searching in the motion parameters space and the spectrum spread is resolved by using NUFRFT. Comparisons with other popular methods, moving target detection (MTD), Radon-Fourier transform (RFT), and Radon-Fractional Fourier Transform (RFRFT) are performed. The simulation results demonstrate that the proposed method can detect the moving target even in low SNR scenario and is superior to the other two methods.

AISep 16, 2023
Empowering In-Browser Deep Learning Inference on Edge Devices with Just-in-Time Kernel Optimizations

Fucheng Jia, Shiqi Jiang, Ting Cao et al.

Web is increasingly becoming the primary platform to deliver AI services onto edge devices, making in-browser deep learning (DL) inference more prominent. Nevertheless, the heterogeneity of edge devices, combined with the underdeveloped state of Web hardware acceleration practices, hinders current in-browser inference from achieving its full performance potential on target devices. To address this issue, this paper presents the pioneering inbrowser inference system, nnJIT, which enables just-in-time (JIT) auto-generation of optimized computing kernels for edge devices. nnJIT is built upon two novel techniques that significantly reduce kernel search and compilation overhead while improving performance firmly: Tensor-Web Compiling Co-Design lowers compiling costs by around 100X through eliminating redundant and ineffective compiling passes; Web-Specific Lite Kernel Optimization Space reduces kernel tuning costs by focusing on Web programming requirements and efficient device resource utilization, pruning the optimization space from millions to only dozens. nnJIT is evaluated for modern models, e.g., BART, T5, and Llama 2, on a range of edge devices including laptops and smartphones using different browsers and hardware from ARM, Intel, AMD and Nvidia. The results show that nnJIT can achieve up to 8.2X faster within 30 seconds compared to the existing baselines.

CVDec 18, 2025Code
SARMAE: Masked Autoencoder for SAR Representation Learning

Danxu Liu, Di Wang, Hebaixu Wang et al.

Synthetic Aperture Radar (SAR) imagery plays a critical role in all-weather, day-and-night remote sensing applications. However, existing SAR-oriented deep learning is constrained by data scarcity, while the physically grounded speckle noise in SAR imagery further hampers fine-grained semantic representation learning. To address these challenges, we propose SARMAE, a Noise-Aware Masked Autoencoder for self-supervised SAR representation learning. Specifically, we construct SAR-1M, the first million-scale SAR dataset, with additional paired optical images, to enable large-scale pre-training. Building upon this, we design Speckle-Aware Representation Enhancement (SARE), which injects SAR-specific speckle noise into masked autoencoders to facilitate noise-aware and robust representation learning. Furthermore, we introduce Semantic Anchor Representation Constraint (SARC), which leverages paired optical priors to align SAR features and ensure semantic consistency. Extensive experiments across multiple SAR datasets demonstrate that SARMAE achieves state-of-the-art performance on classification, detection, and segmentation tasks. Code and models will be available at https://github.com/MiliLab/SARMAE.

96.4ROApr 15
RobotPan: A 360$^\circ$ Surround-View Robotic Vision System for Embodied Perception

Jiahao Ma, Qiang Zhang, Peiran Liu et al.

Surround-view perception is increasingly important for robotic navigation and loco-manipulation, especially in human-in-the-loop settings such as teleoperation, data collection, and emergency takeover. However, current robotic visual interfaces are often limited to narrow forward-facing views, or, when multiple on-board cameras are available, require cumbersome manual switching that interrupts the operator's workflow. Both configurations suffer from motion-induced jitter that causes simulator sickness in head-mounted displays. We introduce a surround-view robotic vision system that combines six cameras with LiDAR to provide full 360$^\circ$ visual coverage, while meeting the geometric and real-time constraints of embodied deployment. We further present \textsc{RobotPan}, a feed-forward framework that predicts \emph{metric-scaled} and \emph{compact} 3D Gaussians from calibrated sparse-view inputs for real-time rendering, reconstruction, and streaming. \textsc{RobotPan} lifts multi-view features into a unified spherical coordinate representation and decodes Gaussians using hierarchical spherical voxel priors, allocating fine resolution near the robot and coarser resolution at larger radii to reduce computational redundancy without sacrificing fidelity. To support long sequences, our online fusion updates dynamic content while preventing unbounded growth in static regions by selectively updating appearance. Finally, we release a multi-sensor dataset tailored to 360$^\circ$ novel view synthesis and metric 3D reconstruction for robotics, covering navigation, manipulation, and locomotion on real platforms. Experiments show that \textsc{RobotPan} achieves competitive quality against prior feed-forward reconstruction and view-synthesis methods while producing substantially fewer Gaussians, enabling practical real-time embodied deployment. Project website: https://robotpan.github.io/

31.6LGApr 16
One-shot learning for the complex dynamical behaviors of weakly nonlinear forced oscillators

Teng Ma, Luca Rosafalco, Wei Cui et al.

Extrapolative prediction of complex nonlinear dynamics remains a central challenge in engineering. This study proposes a one-shot learning method to identify global frequency-response curves from a single excitation time history by learning governing equations. We introduce MEv-SINDy (Multi-frequency Evolutionary Sparse Identification of Nonlinear Dynamics) to infer the governing equations of non-autonomous and multi-frequency systems. The methodology leverages the Generalized Harmonic Balance (GHB) method to decompose complex forced responses into a set of slow-varying evolution equations. We validated the capabilities of MEv-SINDy on two critical Micro-Electro-Mechanical Systems (MEMS). These applications include a nonlinear beam resonator and a MEMS micromirror. Our results show that the model trained on a single point accurately predicts softening/hardening effects and jump phenomena across a wide range of excitation levels. This approach significantly reduces the data acquisition burden for the characterization and design of nonlinear microsystems.

HEP-THSep 21, 2022
Machine Learning on generalized Complete Intersection Calabi-Yau Manifolds

Wei Cui, Xin Gao, Juntao Wang

Generalized Complete Intersection Calabi-Yau Manifold (gCICY) is a new construction of Calabi-Yau manifolds established recently. However, the generation of new gCICYs using standard algebraic method is very laborious. Due to this complexity, the number of gCICYs and their classification still remain unknown. In this paper, we try to make some progress in this direction using neural network. The results showed that our trained models can have a high precision on the existing type $(1,1)$ and type $(2,1)$ gCICYs in the literature. Moreover, They can achieve a $97\%$ precision in predicting new gCICY which is generated differently from those used for training and testing. This shows that machine learning could be an effective method to classify and generate new gCICY.

LGFeb 23, 2023
Uncertainty Injection: A Deep Learning Method for Robust Optimization

Wei Cui, Wei Yu

This paper proposes a paradigm of uncertainty injection for training deep learning model to solve robust optimization problems. The majority of existing studies on deep learning focus on the model learning capability, while assuming the quality and accuracy of the inputs data can be guaranteed. However, in realistic applications of deep learning for solving optimization problems, the accuracy of inputs, which are the problem parameters in this case, plays a large role. This is because, in many situations, it is often costly or sometime impossible to obtain the problem parameters accurately, and correspondingly, it is highly desirable to develop learning algorithms that can account for the uncertainties in the input and produce solutions that are robust against these uncertainties. This paper presents a novel uncertainty injection scheme for training machine learning models that are capable of implicitly accounting for the uncertainties and producing statistically robust solutions. We further identify the wireless communications as an application field where uncertainties are prevalent in problem parameters such as the channel coefficients. We show the effectiveness of the proposed training scheme in two applications: the robust power loading for multiuser multiple-input-multiple-output (MIMO) downlink transmissions; and the robust power control for device-to-device (D2D) networks.

LGJul 11, 2023
Reinforcement Learning with Non-Cumulative Objective

Wei Cui, Wei Yu

In reinforcement learning, the objective is almost always defined as a \emph{cumulative} function over the rewards along the process. However, there are many optimal control and reinforcement learning problems in various application fields, especially in communications and networking, where the objectives are not naturally expressed as summations of the rewards. In this paper, we recognize the prevalence of non-cumulative objectives in various problems, and propose a modification to existing algorithms for optimizing such objectives. Specifically, we dive into the fundamental building block for many optimal control and reinforcement learning algorithms: the Bellman optimality equation. To optimize a non-cumulative objective, we replace the original summation operation in the Bellman update rule with a generalized operation corresponding to the objective. Furthermore, we provide sufficient conditions on the form of the generalized operation as well as assumptions on the Markov decision process under which the globally optimal convergence of the generalized Bellman updates can be guaranteed. We demonstrate the idea experimentally with the bottleneck objective, i.e., the objectives determined by the minimum reward along the process, on classical optimal control and reinforcement learning tasks, as well as on two network routing problems on maximizing the flow rates.

LGApr 26, 2024Code
Tabular Data Contrastive Learning via Class-Conditioned and Feature-Correlation Based Augmentation

Wei Cui, Rasa Hosseinzadeh, Junwei Ma et al.

Contrastive learning is a model pre-training technique by first creating similar views of the original data, and then encouraging the data and its corresponding views to be close in the embedding space. Contrastive learning has witnessed success in image and natural language data, thanks to the domain-specific augmentation techniques that are both intuitive and effective. Nonetheless, in tabular domain, the predominant augmentation technique for creating views is through corrupting tabular entries via swapping values, which is not as sound or effective. We propose a simple yet powerful improvement to this augmentation technique: corrupting tabular data conditioned on class identity. Specifically, when corrupting a specific tabular entry from an anchor row, instead of randomly sampling a value in the same feature column from the entire table uniformly, we only sample from rows that are identified to be within the same class as the anchor row. We assume the semi-supervised learning setting, and adopt the pseudo labeling technique for obtaining class identities over all table rows. We also explore the novel idea of selecting features to be corrupted based on feature correlation structures. Extensive experiments show that the proposed approach consistently outperforms the conventional corruption method for tabular data classification tasks. Our code is available at https://github.com/willtop/Tabular-Class-Conditioned-SSL.

ROFeb 17
MeshMimic: Geometry-Aware Humanoid Motion Learning through 3D Scene Reconstruction

Qiang Zhang, Jiahao Ma, Peiran Liu et al.

Humanoid motion control has witnessed significant breakthroughs in recent years, with deep reinforcement learning (RL) emerging as a primary catalyst for achieving complex, human-like behaviors. However, the high dimensionality and intricate dynamics of humanoid robots make manual motion design impractical, leading to a heavy reliance on expensive motion capture (MoCap) data. These datasets are not only costly to acquire but also frequently lack the necessary geometric context of the surrounding physical environment. Consequently, existing motion synthesis frameworks often suffer from a decoupling of motion and scene, resulting in physical inconsistencies such as contact slippage or mesh penetration during terrain-aware tasks. In this work, we present MeshMimic, an innovative framework that bridges 3D scene reconstruction and embodied intelligence to enable humanoid robots to learn coupled "motion-terrain" interactions directly from video. By leveraging state-of-the-art 3D vision models, our framework precisely segments and reconstructs both human trajectories and the underlying 3D geometry of terrains and objects. We introduce an optimization algorithm based on kinematic consistency to extract high-quality motion data from noisy visual reconstructions, alongside a contact-invariant retargeting method that transfers human-environment interaction features to the humanoid agent. Experimental results demonstrate that MeshMimic achieves robust, highly dynamic performance across diverse and challenging terrains. Our approach proves that a low-cost pipeline utilizing only consumer-grade monocular sensors can facilitate the training of complex physical interactions, offering a scalable path toward the autonomous evolution of humanoid robots in unstructured environments.

QUANT-PHFeb 2, 2024
Variational Quantum Circuits Enhanced Generative Adversarial Network

Runqiu Shu, Xusheng Xu, Man-Hong Yung et al.

Generative adversarial network (GAN) is one of the widely-adopted machine-learning frameworks for a wide range of applications such as generating high-quality images, video, and audio contents. However, training a GAN could become computationally expensive for large neural networks. In this work, we propose a hybrid quantum-classical architecture for improving GAN (denoted as QC-GAN). The performance was examed numerically by benchmarking with a classical GAN using MindSpore Quantum on the task of hand-written image generation. The generator of the QC-GAN consists of a quantum variational circuit together with a one-layer neural network, and the discriminator consists of a traditional neural network. Leveraging the entangling and expressive power of quantum circuits, our hybrid architecture achieved better performance (Frechet Inception Distance) than the classical GAN, with much fewer training parameters and number of iterations for convergence. We have also demonstrated the superiority of QC-GAN over an alternative quantum GAN, namely pathGAN, which could hardly generate 16$\times$16 or larger images. This work demonstrates the value of combining ideas from quantum computing with machine learning for both areas of Quantum-for-AI and AI-for-Quantum.

CHEM-PHJan 22, 2024
Quantum-Inspired Machine Learning for Molecular Docking

Runqiu Shu, Bowen Liu, Zhaoping Xiong et al.

Molecular docking is an important tool for structure-based drug design, accelerating the efficiency of drug development. Complex and dynamic binding processes between proteins and small molecules require searching and sampling over a wide spatial range. Traditional docking by searching for possible binding sites and conformations is computationally complex and results poorly under blind docking. Quantum-inspired algorithms combining quantum properties and annealing show great advantages in solving combinatorial optimization problems. Inspired by this, we achieve an improved in blind docking by using quantum-inspired combined with gradients learned by deep learning in the encoded molecular space. Numerical simulation shows that our method outperforms traditional docking algorithms and deep learning-based algorithms over 10\%. Compared to the current state-of-the-art deep learning-based docking algorithm DiffDock, the success rate of Top-1 (RMSD<2) achieves an improvement from 33\% to 35\% in our same setup. In particular, a 6\% improvement is realized in the high-precision region(RMSD<1) on molecules data unseen in DiffDock, which demonstrates the well-generalized of our method.

CVMay 7, 2025
Occupancy World Model for Robots

Zhang Zhang, Qiang Zhang, Wei Cui et al.

Understanding and forecasting the scene evolutions deeply affect the exploration and decision of embodied agents. While traditional methods simulate scene evolutions through trajectory prediction of potential instances, current works use the occupancy world model as a generative framework for describing fine-grained overall scene dynamics. However, existing methods cluster on the outdoor structured road scenes, while ignoring the exploration of forecasting 3D occupancy scene evolutions for robots in indoor scenes. In this work, we explore a new framework for learning the scene evolutions of observed fine-grained occupancy and propose an occupancy world model based on the combined spatio-temporal receptive field and guided autoregressive transformer to forecast the scene evolutions, called RoboOccWorld. We propose the Conditional Causal State Attention (CCSA), which utilizes camera poses of next state as conditions to guide the autoregressive transformer to adapt and understand the indoor robotics scenarios. In order to effectively exploit the spatio-temporal cues from historical observations, Hybrid Spatio-Temporal Aggregation (HSTA) is proposed to obtain the combined spatio-temporal receptive field based on multi-scale spatio-temporal windows. In addition, we restructure the OccWorld-ScanNet benchmark based on local annotations to facilitate the evaluation of the indoor 3D occupancy scene evolution prediction task. Experimental results demonstrate that our RoboOccWorld outperforms state-of-the-art methods in indoor 3D occupancy scene evolution prediction task. The code will be released soon.

LGMar 31, 2024
Transfer Learning with Reconstruction Loss

Wei Cui, Wei Yu

In most applications of utilizing neural networks for mathematical optimization, a dedicated model is trained for each specific optimization objective. However, in many scenarios, several distinct yet correlated objectives or tasks often need to be optimized on the same set of problem inputs. Instead of independently training a different neural network for each problem separately, it would be more efficient to exploit the correlations between these objectives and to train multiple neural network models with shared model parameters and feature representations. To achieve this, this paper first establishes the concept of common information: the shared knowledge required for solving the correlated tasks, then proposes a novel approach for model training by adding into the model an additional reconstruction stage associated with a new reconstruction loss. This loss is for reconstructing the common information starting from a selected hidden layer in the model. The proposed approach encourages the learned features to be general and transferable, and therefore can be readily used for efficient transfer learning. For numerical simulations, three applications are studied: transfer learning on classifying MNIST handwritten digits, the device-to-device wireless network power allocation, and the multiple-input-single-output network downlink beamforming and localization. Simulation results suggest that the proposed approach is highly efficient in data and model complexity, is resilient to over-fitting, and has competitive performances.

ROJul 27, 2025
Humanoid Occupancy: Enabling A Generalized Multimodal Occupancy Perception System on Humanoid Robots

Wei Cui, Haoyu Wang, Wenkang Qin et al.

Humanoid robot technology is advancing rapidly, with manufacturers introducing diverse heterogeneous visual perception modules tailored to specific scenarios. Among various perception paradigms, occupancy-based representation has become widely recognized as particularly suitable for humanoid robots, as it provides both rich semantic and 3D geometric information essential for comprehensive environmental understanding. In this work, we present Humanoid Occupancy, a generalized multimodal occupancy perception system that integrates hardware and software components, data acquisition devices, and a dedicated annotation pipeline. Our framework employs advanced multi-modal fusion techniques to generate grid-based occupancy outputs encoding both occupancy status and semantic labels, thereby enabling holistic environmental understanding for downstream tasks such as task planning and navigation. To address the unique challenges of humanoid robots, we overcome issues such as kinematic interference and occlusion, and establish an effective sensor layout strategy. Furthermore, we have developed the first panoramic occupancy dataset specifically for humanoid robots, offering a valuable benchmark and resource for future research and development in this domain. The network architecture incorporates multi-modal feature fusion and temporal information integration to ensure robust perception. Overall, Humanoid Occupancy delivers effective environmental perception for humanoid robots and establishes a technical foundation for standardizing universal visual modules, paving the way for the widespread deployment of humanoid robots in complex real-world scenarios.

CVOct 9, 2025
SimCast: Enhancing Precipitation Nowcasting with Short-to-Long Term Knowledge Distillation

Yifang Yin, Shengkai Chen, Yiyao Li et al.

Precipitation nowcasting predicts future radar sequences based on current observations, which is a highly challenging task driven by the inherent complexity of the Earth system. Accurate nowcasting is of utmost importance for addressing various societal needs, including disaster management, agriculture, transportation, and energy optimization. As a complementary to existing non-autoregressive nowcasting approaches, we investigate the impact of prediction horizons on nowcasting models and propose SimCast, a novel training pipeline featuring a short-to-long term knowledge distillation technique coupled with a weighted MSE loss to prioritize heavy rainfall regions. Improved nowcasting predictions can be obtained without introducing additional overhead during inference. As SimCast generates deterministic predictions, we further integrate it into a diffusion-based framework named CasCast, leveraging the strengths from probabilistic models to overcome limitations such as blurriness and distribution shift in deterministic outputs. Extensive experimental results on three benchmark datasets validate the effectiveness of the proposed framework, achieving mean CSI scores of 0.452 on SEVIR, 0.474 on HKO-7, and 0.361 on MeteoNet, which outperforms existing approaches by a significant margin.

CLSep 21, 2025
FlagEval Findings Report: A Preliminary Evaluation of Large Reasoning Models on Automatically Verifiable Textual and Visual Questions

Bowen Qin, Chen Yue, Fang Yin et al.

We conduct a moderate-scale contamination-free (to some extent) evaluation of current large reasoning models (LRMs) with some preliminary findings. We also release ROME, our evaluation benchmark for vision language models intended to test reasoning from visual clues. We attach links to the benchmark, evaluation data, and other updates on this website: https://flageval-baai.github.io/LRM-Eval/

LGMar 12, 2025
DRESS: Disentangled Representation-based Self-Supervised Meta-Learning for Diverse Tasks

Wei Cui, Tongzi Wu, Jesse C. Cresswell et al.

Meta-learning represents a strong class of approaches for solving few-shot learning tasks. Nonetheless, recent research suggests that simply pre-training a generic encoder can potentially surpass meta-learning algorithms. In this paper, we first discuss the reasons why meta-learning fails to stand out in these few-shot learning experiments, and hypothesize that it is due to the few-shot learning tasks lacking diversity. We propose DRESS, a task-agnostic Disentangled REpresentation-based Self-Supervised meta-learning approach that enables fast model adaptation on highly diversified few-shot learning tasks. Specifically, DRESS utilizes disentangled representation learning to create self-supervised tasks that can fuel the meta-training process. Furthermore, we also propose a class-partition based metric for quantifying the task diversity directly on the input space. We validate the effectiveness of DRESS through experiments on datasets with multiple factors of variation and varying complexity. The results suggest that DRESS is able to outperform competing methods on the majority of the datasets and task setups. Through this paper, we advocate for a re-examination of proper setups for task adaptation studies, and aim to reignite interest in the potential of meta-learning for solving few-shot learning tasks via disentangled representations.

QUANT-PHJan 4, 2022
Efficient Quantum Feature Extraction for CNN-based Learning

Tong Dou, Guofeng Zhang, Wei Cui

Recent work has begun to explore the potential of parametrized quantum circuits (PQCs) as general function approximators. In this work, we propose a quantum-classical deep network structure to enhance classical CNN model discriminability. The convolutional layer uses linear filters to scan the input data. Moreover, we build PQC, which is a more potent function approximator, with more complex structures to capture the features within the receptive field. The feature maps are obtained by sliding the PQCs over the input in a similar way as CNN. We also give a training algorithm for the proposed model. The hybrid models used in our design are validated by numerical simulation. We demonstrate the reasonable classification performances on MNIST and we compare the performances with models in different settings. The results disclose that the model with ansatz in high expressibility achieves lower cost and higher accuracy.

ITDec 8, 2021
Active Sensing for Communications by Learning

Foad Sohrabi, Tao Jiang, Wei Cui et al.

This paper proposes a deep learning approach to a class of active sensing problems in wireless communications in which an agent sequentially interacts with an environment over a predetermined number of time frames to gather information in order to perform a sensing or actuation task for maximizing some utility function. In such an active learning setting, the agent needs to design an adaptive sensing strategy sequentially based on the observations made so far. To tackle such a challenging problem in which the dimension of historical observations increases over time, we propose to use a long short-term memory (LSTM) network to exploit the temporal correlations in the sequence of observations and to map each observation to a fixed-size state information vector. We then use a deep neural network (DNN) to map the LSTM state at each time frame to the design of the next measurement step. Finally, we employ another DNN to map the final LSTM state to the desired solution. We investigate the performance of the proposed framework for adaptive channel sensing problems in wireless communications. In particular, we consider the adaptive beamforming problem for mmWave beam alignment and the adaptive reconfigurable intelligent surface sensing problem for reflection alignment. Numerical results demonstrate that the proposed deep active sensing strategy outperforms the existing adaptive or nonadaptive sensing schemes.

ITJan 3, 2021
Phase Transitions in Recovery of Structured Signals from Corrupted Measurements

Zhongxing Sun, Wei Cui, Yulong Liu

This paper is concerned with the problem of recovering a structured signal from a relatively small number of corrupted random measurements. Sharp phase transitions have been numerically observed in practice when different convex programming procedures are used to solve this problem. This paper is devoted to presenting theoretical explanations for these phenomenons by employing some basic tools from Gaussian process theory. Specifically, we identify the precise locations of the phase transitions for both constrained and penalized recovery procedures. Our theoretical results show that these phase transitions are determined by some geometric measures of structure, e.g., the spherical Gaussian width of a tangent cone and the Gaussian (squared) distance to a scaled subdifferential. By utilizing the established phase transition theory, we further investigate the relationship between these two kinds of recovery procedures, which also reveals an optimal strategy (in the sense of Lagrange theory) for choosing the tradeoff parameter in the penalized recovery procedure. Numerical experiments are provided to verify our theoretical results.

SPDec 22, 2020
Scalable Deep Reinforcement Learning for Routing and Spectrum Access in Physical Layer

Wei Cui, Wei Yu

This paper proposes a novel scalable reinforcement learning approach for simultaneous routing and spectrum access in wireless ad-hoc networks. In most previous works on reinforcement learning for network optimization, the network topology is assumed to be fixed, and a different agent is trained for each transmission node -- this limits scalability and generalizability. Further, routing and spectrum access are typically treated as separate tasks. Moreover, the optimization objective is usually a cumulative metric along the route, e.g., number of hops or delay. In this paper, we account for the physical-layer signal-to-interference-plus-noise ratio (SINR) in a wireless network and further show that bottleneck objective such as the minimum SINR along the route can also be optimized effectively using reinforcement learning. Specifically, we propose a scalable approach in which a single agent is associated with each flow and makes routing and spectrum access decisions as it moves along the frontier nodes. The agent is trained according to the physical-layer characteristics of the environment using a novel rewarding scheme based on the Monte Carlo estimation of the future bottleneck SINR. It learns to avoid interference by intelligently making joint routing and spectrum allocation decisions based on the geographical location information of the neighbouring nodes.

CRNov 6, 2020
Threats and Opportunities: Blockchain Meets Quantum Computation

Wei Cui, Tong Dou, Shilu Yan

This article considered deficiencies of the flourishing blockchain technology manifested by the development of quantum computation. We show that the future blockchain technology would under constant threats from the following aspects: 1) Speed up the generation of nonces; 2) Faster searching for hash collisions; 3) Break the security of the classical encryption. We also demonstrate that incorporating some quantum properties into blockchain makes it more robust and more efficient. For example people can establish a quantum-security blockchain system that utilizes quantum key distribution (QKD), and quantum synchronization and detectable Byzantine agreement (DBA) can help the blockchain systems achieve faster consensus even if there exist a number of malicious nodes.

LGJun 18, 2020
Algorithmic Decision Making with Conditional Fairness

Renzhe Xu, Peng Cui, Kun Kuang et al.

Nowadays fairness issues have raised great concerns in decision-making systems. Various fairness notions have been proposed to measure the degree to which an algorithm is unfair. In practice, there frequently exist a certain set of variables we term as fair variables, which are pre-decision covariates such as users' choices. The effects of fair variables are irrelevant in assessing the fairness of the decision support algorithm. We thus define conditional fairness as a more sound fairness metric by conditioning on the fairness variables. Given different prior knowledge of fair variables, we demonstrate that traditional fairness notations, such as demographic parity and equalized odds, are special cases of our conditional fairness notations. Moreover, we propose a Derivable Conditional Fairness Regularizer (DCFR), which can be integrated into any decision-making model, to track the trade-off between precision and fairness of algorithmic decision making. Specifically, an adversarial representation based conditional independence loss is proposed in our DCFR to measure the degree of unfairness. With extensive experiments on three real-world datasets, we demonstrate the advantages of our conditional fairness notation and DCFR.

MLAug 29, 2019
Data-based wind disaster climate identification algorithm and extreme wind speed prediction

Wei Cui, Teng Ma, Lin Zhao et al.

An extreme wind speed estimation method that considers wind hazard climate types is critical for design wind load calculation for building structures affected by mixed climates. However, it is very difficult to obtain wind hazard climate types from meteorological data records, because they restrict the application of extreme wind speed estimation in mixed climates. This paper first proposes a wind hazard type identification algorithm based on a numerical pattern recognition method that utilizes feature extraction and generalization. Next, it compares six commonly used machine learning models using K-fold cross-validation. Finally, it takes meteorological data from three locations near the southeast coast of China as examples to examine the algorithm performance. Based on classification results, the extreme wind speeds calculated based on mixed wind hazard types is compared with those obtained from conventional methods, and the effects on structural design for different return periods are discussed.

SPMay 11, 2019
ECG Identification under Exercise and Rest Situations via Various Learning Methods

Zihan Wang, Yaoguang Li, Wei Cui

As the advancement of information security, human recognition as its core technology, has absorbed an increasing amount of attention in the past few years. A myriad of biometric features including fingerprint, face, iris, have been applied to security systems, which are occasionally considered vulnerable to forgery and spoofing attacks. Due to the difficulty of being fabricated, electrocardiogram (ECG) has attracted much attention. Though many works have shown the excellent human identification provided by ECG, most current ECG human identification (ECGID) researches only focus on rest situation. In this manuscript, we overcome the oversimplification of previous researches and evaluate the performance under both exercise and rest situations, especially the influence of exercise on ECGID. By applying various existing learning methods to our ECG dataset, we find that current methods which can well support the identification of individuals under rests, do not suffice to present satisfying ECGID performance under exercise situations, therefore exposing the deficiency of existing ECG identification methods.

CYJan 29, 2019
Performance comparison of an AI-based Adaptive Learning System in China

Wei Cui, Zhen Xue, Khanh-Phuong Thai

Adaptive learning systems stand apart from traditional learning systems by offering a personalized learning experience to students according to their different knowledge states. Adaptive systems collect and analyse students' behavior data, update learner profiles, then accordingly provide timely individualized feedback to each student. Such interactions between the learning system and students can improve the engagement of students and the efficiency of learning. This paper evaluates the effectiveness of an adaptive learning system, "Yixue Squirrel AI" (or Yixue), on English and math learning in middle school. The effectiveness of the Yixue's math and English learning systems is respectively compared against (1) traditional classroom math instruction conducted by expert human teachers and (2) BOXFiSH, another adaptive learning platform for English language learning. Results suggest that students achieved better performance using Yixue adaptive learning system than both traditional classroom instruction by expert teachers and another adaptive learning platform.

SPAug 4, 2018
Spatial Deep Learning for Wireless Scheduling

Wei Cui, Kaiming Shen, Wei Yu

The optimal scheduling of interfering links in a dense wireless network with full frequency reuse is a challenging task. The traditional method involves first estimating all the interfering channel strengths then optimizing the scheduling based on the model. This model-based method is however resource intensive and computationally hard because channel estimation is expensive in dense networks; furthermore, finding even a locally optimal solution of the resulting optimization problem may be computationally complex. This paper shows that by using a deep learning approach, it is possible to bypass the channel estimation and to schedule links efficiently based solely on the geographic locations of the transmitters and the receivers, due to the fact that in many propagation environments, the wireless channel strength is largely a function of the distance dependent path-loss. This is accomplished by unsupervised training over randomly deployed networks, and by using a novel neural network architecture that computes the geographic spatial convolutions of the interfering or interfered neighboring nodes along with subsequent multiple feedback stages to learn the optimum solution. The resulting neural network gives near-optimal performance for sum-rate maximization and is capable of generalizing to larger deployment areas and to deployments of different link densities. Moreover, to provide fairness, this paper proposes a novel scheduling approach that utilizes the sum-rate optimal scheduling algorithm over judiciously chosen subsets of links for maximizing a proportional fairness objective over the network. The proposed approach shows highly competitive and generalizable network utility maximization results.

SPDec 11, 2017
Identifying the Mislabeled Training Samples of ECG Signals using Machine Learning

Yaoguang Li, Wei Cui, Cong Wang

The classification accuracy of electrocardiogram signal is often affected by diverse factors in which mislabeled training samples issue is one of the most influential problems. In order to mitigate this negative effect, the method of cross validation is introduced to identify the mislabeled samples. The method utilizes the cooperative advantages of different classifiers to act as a filter for the training samples. The filter removes the mislabeled training samples and retains the correctly labeled ones with the help of 10-fold cross validation. Consequently, a new training set is provided to the final classifiers to acquire higher classification accuracies. Finally, we numerically show the effectiveness of the proposed method with the MIT-BIH arrhythmia database.

CVMay 24, 2017
Optical Mapping Near-eye Three-dimensional Display with Correct Focus Cues

Wei Cui, Liang Gao

We present an optical mapping near-eye (OMNI) three-dimensional display method for wearable devices. By dividing a display screen into different sub-panels and optically mapping them to various depths, we create a multiplane volumetric image with correct focus cues for depth perception. The resultant system can drive the eye's accommodation to the distance that is consistent with binocular stereopsis, thereby alleviating the vergence-accommodation conflict, the primary cause for eye fatigue and discomfort. Compared with the previous methods, the OMNI display offers prominent advantages in adaptability, image dynamic range, and refresh rate.