SYMar 18
STLts-Div: Diversified Trace Synthesis from STL Specifications Using MILP (Extended Version)Martin Jouve-Genty, Han Su, Sota Sato et al.
Modern cyber-physical systems are complex, and requirements are often written in Signal Temporal Logic (STL). Writing the right STL is difficult in practice; engineers benefit from concrete executions that illustrate what a specification actually admits. Trace synthesis addresses this need, but a single witness rarely suffices to understand intent or explore edge cases - diverse satisfying behaviors are far more informative. We introduce diversified trace synthesis: the automatic generation of sets of behaviorally diverse traces that satisfy a given STL formula. Building on a MILP encoding of STL and system model, we formalize three complementary diversification objectives - Boolean distance, random Boolean distance, and value distance - all captured by an objective function and solved iteratively. We implement these ideas in STLts-Div, a lightweight Python tool that integrates with Gurobi.
CVNov 30, 2025
S2AM3D: Scale-controllable Part Segmentation of 3D Point CloudHan Su, Tianyu Huang, Zichen Wan et al.
Part-level point cloud segmentation has recently attracted significant attention in 3D computer vision. Nevertheless, existing research is constrained by two major challenges: native 3D models lack generalization due to data scarcity, while introducing 2D pre-trained knowledge often leads to inconsistent segmentation results across different views. To address these challenges, we propose S2AM3D, which incorporates 2D segmentation priors with 3D consistent supervision. We design a point-consistent part encoder that aggregates multi-view 2D features through native 3D contrastive learning, producing globally consistent point features. A scale-aware prompt decoder is then proposed to enable real-time adjustment of segmentation granularity via continuous scale signals. Simultaneously, we introduce a large-scale, high-quality part-level point cloud dataset with more than 100k samples, providing ample supervision signals for model training. Extensive experiments demonstrate that S2AM3D achieves leading performance across multiple evaluation settings, exhibiting exceptional robustness and controllability when handling complex structures and parts with significant size variations.
AIDec 19, 2023
Parameterized Decision-making with Multi-modal Perception for Autonomous DrivingYuyang Xia, Shuncheng Liu, Quanlin Yu et al.
Autonomous driving is an emerging technology that has advanced rapidly over the last decade. Modern transportation is expected to benefit greatly from a wise decision-making framework of autonomous vehicles, including the improvement of mobility and the minimization of risks and travel time. However, existing methods either ignore the complexity of environments only fitting straight roads, or ignore the impact on surrounding vehicles during optimization phases, leading to weak environmental adaptability and incomplete optimization objectives. To address these limitations, we propose a parameterized decision-making framework with multi-modal perception based on deep reinforcement learning, called AUTO. We conduct a comprehensive perception to capture the state features of various traffic participants around the autonomous vehicle, based on which we design a graph-based model to learn a state representation of the multi-modal semantic features. To distinguish between lane-following and lane-changing, we decompose an action of the autonomous vehicle into a parameterized action structure that first decides whether to change lanes and then computes an exact action to execute. A hybrid reward function takes into account aspects of safety, traffic efficiency, passenger comfort, and impact to guide the framework to generate optimal actions. In addition, we design a regularization term and a multi-worker paradigm to enhance the training. Extensive experiments offer evidence that AUTO can advance state-of-the-art in terms of both macroscopic and microscopic effectiveness.
HCMay 22, 2024
A New Era in Human Factors Engineering: A Survey of the Applications and Prospects of Large Multimodal ModelsLi Fan, Lee Ching-Hung, Han Su et al.
In recent years, the potential applications of Large Multimodal Models (LMMs) in fields such as healthcare, social psychology, and industrial design have attracted wide research attention, providing new directions for human factors research. For instance, LMM-based smart systems have become novel research subjects of human factors studies, and LMM introduces new research paradigms and methodologies to this field. Therefore, this paper aims to explore the applications, challenges, and future prospects of LMM in the domain of human factors and ergonomics through an expert-LMM collaborated literature review. Specifically, a novel literature review method is proposed, and research studies of LMM-based accident analysis, human modelling and intervention design are introduced. Subsequently, the paper discusses future trends of the research paradigm and challenges of human factors and ergonomics studies in the era of LMMs. It is expected that this study can provide a valuable perspective and serve as a reference for integrating human factors with artificial intelligence.
MLJan 31, 2025
Statistical Inference for Generative Model ComparisonZijun Gao, Yan Sun, Han Su
Generative models have achieved remarkable success across a range of applications, yet their evaluation still lacks principled uncertainty quantification. In this paper, we develop a method for comparing how close different generative models are to the underlying distribution of test samples. Particularly, our approach employs the Kullback-Leibler (KL) divergence to measure the distance between a generative model and the unknown test distribution, as KL requires no tuning parameters such as the kernels used by RKHS-based distances, and is the only $f$-divergence that admits a crucial cancellation to enable the uncertainty quantification. Furthermore, we extend our method to comparing conditional generative models and leverage Edgeworth expansions to address limited-data settings. On simulated datasets with known ground truth, we show that our approach realizes effective coverage rates, and has higher power compared to kernel-based methods. When applied to generative models on image and text datasets, our procedure yields conclusions consistent with benchmark metrics but with statistical confidence.
LGNov 20, 2025
Broad stochastic configuration residual learning system for norm-convergent universal approximationHan Su, Zhongyan Li, Wanquan Liu
Universal approximation serves as the foundation of neural network learning algorithms. However, some networks establish their universal approximation property by demonstrating that the iterative errors converge in probability measure rather than the more rigorous norm convergence, which makes the universal approximation property of randomized learning networks highly sensitive to random parameter selection, Broad residual learning system (BRLS), as a member of randomized learning models, also encounters this issue. We theoretically demonstrate the limitation of its universal approximation property, that is, the iterative errors do not satisfy norm convergence if the selection of random parameters is inappropriate and the convergence rate meets certain conditions. To address this issue, we propose the broad stochastic configuration residual learning system (BSCRLS) algorithm, which features a novel supervisory mechanism adaptively constraining the range settings of random parameters on the basis of BRLS framework, Furthermore, we prove the universal approximation theorem of BSCRLS based on the more stringent norm convergence. Three versions of incremental BSCRLS algorithms are presented to satisfy the application requirements of various network updates. Solar panels dust detection experiments are performed on publicly available dataset and compared with 13 deep and broad learning algorithms. Experimental results reveal the effectiveness and superiority of BSCRLS algorithms.
CVNov 28, 2025
Convolutional Feature Noise Reduction for 2D Cardiac MR Image SegmentationHong Zheng, Nan Mu, Han Su et al.
Noise reduction constitutes a crucial operation within Digital Signal Processing. Regrettably, it frequently remains neglected when dealing with the processing of convolutional features in segmentation networks. This oversight could trigger the butterfly effect, impairing the subsequent outcomes within the entire feature system. To complete this void, we consider convolutional features following Gaussian distributions as feature signal matrices and then present a simple and effective feature filter in this study. The proposed filter is fundamentally a low-amplitude pass filter primarily aimed at minimizing noise in feature signal inputs and is named Convolutional Feature Filter (CFF). We conducted experiments on two established 2D segmentation networks and two public cardiac MR image datasets to validate the effectiveness of the CFF, and the experimental findings demonstrated a decrease in noise within the feature signal matrices. To enable a numerical observation and analysis of this reduction, we developed a binarization equation to calculate the information entropy of feature signals.
AIOct 29, 2025
Energy-Efficient Autonomous Driving with Adaptive Perception and Robust DecisionYuyang Xia, Zibo Liang, Liwei Deng et al.
Autonomous driving is an emerging technology that is expected to bring significant social, economic, and environmental benefits. However, these benefits come with rising energy consumption by computation engines, limiting the driving range of vehicles, especially electric ones. Perception computing is typically the most power-intensive component, as it relies on largescale deep learning models to extract environmental features. Recently, numerous studies have employed model compression techniques, such as sparsification, quantization, and distillation, to reduce computational consumption. However, these methods often result in either a substantial model size or a significant drop in perception accuracy compared to high-computation models. To address these challenges, we propose an energy-efficient autonomous driving framework, called EneAD. In the adaptive perception module, a perception optimization strategy is designed from the perspective of data management and tuning. Firstly, we manage multiple perception models with different computational consumption and adjust the execution framerate dynamically. Then, we define them as knobs and design a transferable tuning method based on Bayesian optimization to identify promising knob values that achieve low computation while maintaining desired accuracy. To adaptively switch the knob values in various traffic scenarios, a lightweight classification model is proposed to distinguish the perception difficulty in different scenarios. In the robust decision module, we propose a decision model based on reinforcement learning and design a regularization term to enhance driving stability in the face of perturbed perception results. Extensive experiments evidence the superiority of our framework in both energy consumption and driving performance. EneAD can reduce perception consumption by 1.9x to 3.5x and thus improve driving range by 3.9% to 8.5%
CVMay 27, 2019
Giant Panda Face Recognition Using Small DatasetWojciech Michal Matkowski, Adams Wai Kin Kong, Han Su et al.
Giant panda (panda) is a highly endangered animal. Significant efforts and resources have been put on panda conservation. To measure effectiveness of conservation schemes, estimating its population size in wild is an important task. The current population estimation approaches, including capture-recapture, human visual identification and collection of DNA from hair or feces, are invasive, subjective, costly or even dangerous to the workers who perform these tasks in wild. Cameras have been widely installed in the regions where pandas live. It opens a new possibility for non-invasive image based panda recognition. Panda face recognition is naturally a small dataset problem, because of the number of pandas in the world and the number of qualified images captured by the cameras in each encounter. In this paper, a panda face recognition algorithm, which includes alignment, large feature set extraction and matching is proposed and evaluated on a dataset consisting of 163 images. The experimental results are encouraging.
LGMay 16, 2019
STAR: A Concise Deep Learning Framework for Citywide Human Mobility PredictionHongnian Wang, Han Su
Human mobility forecasting in a city is of utmost importance to transportation and public safety, but with the process of urbanization and the generation of big data, intensive computing and determination of mobility pattern have become challenging. This study focuses on how to improve the accuracy and efficiency of predicting citywide human mobility via a simpler solution. A spatio-temporal mobility event prediction framework based on a single fully-convolutional residual network (STAR) is proposed. STAR is a highly simple, general and effective method for learning a single tensor representing the mobility event. Residual learning is utilized for training the deep network to derive the detailed result for scenarios of citywide prediction. Extensive benchmark evaluation results on real-world data demonstrate that STAR outperforms state-of-the-art approaches in single- and multi-step prediction while utilizing fewer parameters and achieving higher efficiency.