57.9ROJun 2Code
OpenEAI-Platform: An Open-source Embodied Artificial Intelligence Hardware-Software Unified PlatformJinyuan Zhang, Luoyi Fan, Leiyu Wang et al.
Embodied AI in the real world requires both accurate hardware and robust vision-language-action (VLA) policies. We present OpenEAI-Platform, a fully open-source platform that integrates a low-cost 6+1 degree-of-freedom (dof) robotic arm (OpenEAI-Arm) and a reproducible VLA model (OpenEAI-VLA). OpenEAI-Arm provides open-source mechanical designs for low manufacturing cost and compliant control methods for higher accuracy. OpenEAI-VLA builds on Qwen3-VL-4B and uses a Diffusion Transformer action head, and is trained in two stages with only open-source robot and multimodal datasets. Across four real-world manipulation tasks, OpenEAI-Arm outperforms two commercial 6+1-dof arms under the same policy, and OpenEAI-VLA achieves success rates comparable to the large-scale pretrained pi0 baseline with only limited pretraining data. We will release the full hardware designs, drivers, models, and training/data pipelines to support reproducible research and scalable data collection. Our codes, layouts, and models will be released after the paper is accepted.
LGAug 11, 2024
Pareto Front Shape-Agnostic Pareto Set Learning in Multi-Objective OptimizationRongguang Ye, Longcan Chen, Wei-Bin Kou et al.
Pareto set learning (PSL) is an emerging approach for acquiring the complete Pareto set of a multi-objective optimization problem. Existing methods primarily rely on the mapping of preference vectors in the objective space to Pareto optimal solutions in the decision space. However, the sampling of preference vectors theoretically requires prior knowledge of the Pareto front shape to ensure high performance of the PSL methods. Designing a sampling strategy of preference vectors is difficult since the Pareto front shape cannot be known in advance. To make Pareto set learning work effectively in any Pareto front shape, we propose a Pareto front shape-agnostic Pareto Set Learning (GPSL) that does not require the prior information about the Pareto front. The fundamental concept behind GPSL is to treat the learning of the Pareto set as a distribution transformation problem. Specifically, GPSL can transform an arbitrary distribution into the Pareto set distribution. We demonstrate that training a neural network by maximizing hypervolume enables the process of distribution transformation. Our proposed method can handle any shape of the Pareto front and learn the Pareto set without requiring prior knowledge. Experimental results show the high performance of our proposed method on diverse test problems compared with recent Pareto set learning algorithms.
CVApr 22, 2023
Detecting Adversarial Faces Using Only Real Face Self-PerturbationsQian Wang, Yongqin Xian, Hefei Ling et al.
Adversarial attacks aim to disturb the functionality of a target system by adding specific noise to the input samples, bringing potential threats to security and robustness when applied to facial recognition systems. Although existing defense techniques achieve high accuracy in detecting some specific adversarial faces (adv-faces), new attack methods especially GAN-based attacks with completely different noise patterns circumvent them and reach a higher attack success rate. Even worse, existing techniques require attack data before implementing the defense, making it impractical to defend newly emerging attacks that are unseen to defenders. In this paper, we investigate the intrinsic generality of adv-faces and propose to generate pseudo adv-faces by perturbing real faces with three heuristically designed noise patterns. We are the first to train an adv-face detector using only real faces and their self-perturbations, agnostic to victim facial recognition systems, and agnostic to unseen attacks. By regarding adv-faces as out-of-distribution data, we then naturally introduce a novel cascaded system for adv-face detection, which consists of training data self-perturbations, decision boundary regularization, and a max-pooling-based binary classifier focusing on abnormal local color aberrations. Experiments conducted on LFW and CelebA-HQ datasets with eight gradient-based and two GAN-based attacks validate that our method generalizes to a variety of unseen adversarial attacks.
CVJul 22, 2023
A Stronger Stitching Algorithm for Fisheye Images based on Deblurring and RegistrationJing Hao, Jingming Xie, Jinyuan Zhang et al.
Fisheye lens, which is suitable for panoramic imaging, has the prominent advantage of a large field of view and low cost. However, the fisheye image has a severe geometric distortion which may interfere with the stage of image registration and stitching. Aiming to resolve this drawback, we devise a stronger stitching algorithm for fisheye images by combining the traditional image processing method with deep learning. In the stage of fisheye image correction, we propose the Attention-based Nonlinear Activation Free Network (ANAFNet) to deblur fisheye images corrected by Zhang calibration method. Specifically, ANAFNet adopts the classical single-stage U-shaped architecture based on convolutional neural networks with soft-attention technique and it can restore a sharp image from a blurred image effectively. In the part of image registration, we propose the ORB-FREAK-GMS (OFG), a comprehensive image matching algorithm, to improve the accuracy of image registration. Experimental results demonstrate that panoramic images of superior quality stitching by fisheye images can be obtained through our method.
NEApr 12, 2024Code
Evolutionary Preference Sampling for Pareto Set LearningRongguang Ye, Longcan Chen, Jinyuan Zhang et al.
Recently, Pareto Set Learning (PSL) has been proposed for learning the entire Pareto set using a neural network. PSL employs preference vectors to scalarize multiple objectives, facilitating the learning of mappings from preference vectors to specific Pareto optimal solutions. Previous PSL methods have shown their effectiveness in solving artificial multi-objective optimization problems (MOPs) with uniform preference vector sampling. The quality of the learned Pareto set is influenced by the sampling strategy of the preference vector, and the sampling of the preference vector needs to be decided based on the Pareto front shape. However, a fixed preference sampling strategy cannot simultaneously adapt the Pareto front of multiple MOPs. To address this limitation, this paper proposes an Evolutionary Preference Sampling (EPS) strategy to efficiently sample preference vectors. Inspired by evolutionary algorithms, we consider preference sampling as an evolutionary process to generate preference vectors for neural network training. We integrate the EPS strategy into five advanced PSL methods. Extensive experiments demonstrate that our proposed method has a faster convergence speed than baseline algorithms on 7 testing problems. Our implementation is available at https://github.com/rG223/EPS.
41.6CRMay 14
Capacitive Touchscreens at Risk: A Practical Side-Channel Attack on Smartphones via Electromagnetic EmanationsYukun Cheng, Changhai Ou, Shiyu Zhu et al.
Capacitive touchscreens in modern smartphones introduce severe side-channel vulnerabilities. However, existing attacks often require restrictive conditions or invasive measurements. This paper presents TESLA, a novel, contactless electromagnetic (EM) side-channel attack that exploits inherent EM emanations during touchscreen scanning. We demonstrate that these emanations encode the spatiotemporal evolution of touch interactions, forming a unified leakage basis. By secretly placing an EM probe near the victim's device, TESLA enables attackers to extract highly sensitive information, including screen-unlocking PIN codes, keyboard inputs, interacting application categories, and continuous handwriting trajectories. Compared to existing attacks, TESLA offers a broader range of attack targets, more efficient sample acquisition, and operations in practical attack scenarios. Extensive evaluations on popular commercial smartphones, specifically the iPhone X, Xiaomi 10 Pro, Samsung S10, and Huawei Mate 30 Pro, validate the effectiveness of TESLA. It achieves remarkable inference accuracy in diverse settings such as private meeting rooms and public libraries, with success rates of 99.3% for PIN code recognition, 97.6% for keyboard input reconstruction, and 95.0% for application inference, respectively. Simultaneously, it attains a 76.8% character recognition accuracy and a high geometric similarity (Jaccard index of 0.74) for 2D handwriting trajectory reconstruction.
86.2AIApr 27
Can Current Agents Close the Discovery-to-Application Gap? A Case Study in MinecraftZhou Ziheng, Huacong Tang, Jinyuan Zhang et al.
Discovering causal regularities and applying them to build functional systems--the discovery-to-application loop--is a hallmark of general intelligence, yet evaluating this capacity has been hindered by the vast complexity gap between scientific discovery and real-world engineering. We introduce SciCrafter, a Minecraft-based benchmark that operationalizes this loop through parameterized redstone circuit tasks. Agents must ignite lamps in specified patterns (e.g., simultaneously or in timed sequences); scaling target parameters substantially increases construction complexity and required knowledge, forcing genuine discovery rather than reliance on memorized solutions. Evaluating frontier models including GPT-5.2, Gemini-3-Pro, and Claude-Opus-4.5 under a general-purpose code agent scaffold, we find that all plateau at approximately 26% success rate. To diagnose these failures, we decompose the loop into four capacities--knowledge gap identification, experimental discovery, knowledge consolidation, and knowledge application--and design targeted interventions whose marginal contributions serve as proxies for corresponding gaps. Our analysis reveals that although the general knowledge application capability still remains as the biggest gap across all models, for frontier models the knowledge gap identification starts to become a major hurdle--indicating the bottleneck is shifting from solving problems right to raising the right problems for current AI. We release SciCrafter as a diagnostic probe for future research on AI systems that navigate the full discovery-to-application loop.
94.2MAApr 28
Pythia: Toward Predictability-Driven Agent-Native LLM ServingShan Yu, Junyi Shu, Yuanjiang Ni et al.
As LLM applications grow more complex, developers are increasingly adopting multi-agent architectures to decompose workflows into specialized, collaborative components, introducing structure that constrains agent behavior and exposes useful semantic predictability. Unlike traditional LLM serving, which operates under highly dynamic and uncertain conditions, this structured topology enables opportunities to reduce runtime uncertainty -- yet existing systems fail to exploit it, treating agentic workloads as generic traffic and incurring significant inefficiencies. Our analysis of production traces from an agent-serving platform and an internal coding assistant reveals key bottlenecks, including low prefix cache hit rates, severe resource contention from long-context requests, and substantial queuing delays due to suboptimal scaling. To address these challenges, we propose Pythia, a multi-agent serving system that captures workflow semantics through a simple interface at the serving layer, unlocking new optimization opportunities and substantially improving throughput and job completion time over state-of-the-art baselines.
LGApr 12, 2024
Data-Driven Preference Sampling for Pareto Front LearningRongguang Ye, Lei Chen, Weiduo Liao et al.
Pareto front learning is a technique that introduces preference vectors in a neural network to approximate the Pareto front. Previous Pareto front learning methods have demonstrated high performance in approximating simple Pareto fronts. These methods often sample preference vectors from a fixed Dirichlet distribution. However, no fixed sampling distribution can be adapted to diverse Pareto fronts. Efficiently sampling preference vectors and accurately estimating the Pareto front is a challenge. To address this challenge, we propose a data-driven preference vector sampling framework for Pareto front learning. We utilize the posterior information of the objective functions to adjust the parameters of the sampling distribution flexibly. In this manner, the proposed method can sample preference vectors from the location of the Pareto front with a high probability. Moreover, we design the distribution of the preference vector as a mixture of Dirichlet distributions to improve the performance of the model in disconnected Pareto fronts. Extensive experiments validate the superiority of the proposed method compared with state-of-the-art algorithms.
70.6IRApr 6
FAVE: Flow-based Average Velocity Establishment for Sequential RecommendationKe Shi, Yao Zhang, Feng Guo et al.
Generative recommendation has emerged as a transformative paradigm for capturing the dynamic evolution of user intents in sequential recommendation. While flow-based methods improve the efficiency of diffusion models, they remain hindered by the ``Noise-to-Data'' paradigm, which introduces two critical inefficiencies: prior mismatch, where generation starts from uninformative noise, forcing a lengthy recovery trajectory; and linear redundancy, where iterative solvers waste computation on modeling deterministic preference transitions. To address these limitations, we propose a Flow-based Average Velocity Establishment (Fave) framework for one-step generation recommendation that learns a direct trajectory from an informative prior to the target distribution. Fave is structured via a progressive two-stage training strategy. In Stage 1, we establish a stable preference space through dual-end semantic alignment, applying constraints at both the source (user history) and target (next item) to prevent representation collapse. In Stage 2, we directly resolve the efficiency bottlenecks by introducing a semantic anchor prior, which initializes the flow with a masked embedding from the user's interaction history, providing an informative starting point. Then we learn a global average velocity, consolidating the multi-step trajectory into a single displacement vector, and enforce trajectory straightness via a JVP-based consistency constraint to ensure one-step generation. Extensive experiments on three benchmarks demonstrate that Fave not only achieves state-of-the-art recommendation performance but also delivers an order-of-magnitude improvement in inference efficiency, making it practical for latency-sensitive scenarios.
NEAug 3, 2017
Preselection via Classification: A Case Study on Evolutionary Multiobjective OptimizationJinyuan Zhang, Aimin Zhou, Ke Tang et al.
In evolutionary algorithms, a preselection operator aims to select the promising offspring solutions from a candidate offspring set. It is usually based on the estimated or real objective values of the candidate offspring solutions. In a sense, the preselection can be treated as a classification procedure, which classifies the candidate offspring solutions into promising ones and unpromising ones. Following this idea, we propose a classification based preselection (CPS) strategy for evolutionary multiobjective optimization. When applying classification based preselection, an evolutionary algorithm maintains two external populations (training data set) that consist of some selected good and bad solutions found so far; then it trains a classifier based on the training data set in each generation. Finally it uses the classifier to filter the unpromising candidate offspring solutions and choose a promising one from the generated candidate offspring set for each parent solution. In such cases, it is not necessary to estimate or evaluate the objective values of the candidate offspring solutions. The classification based preselection is applied to three state-of-the-art multiobjective evolutionary algorithms (MOEAs) and is empirically studied on two sets of test instances. The experimental results suggest that classification based preselection can successfully improve the performance of these MOEAs.