Jiamin Xu

CV
h-index: 7
14 papers
313 citations
Novelty: 50%
AI Score: 55

14 Papers

CV · Mar 23, 2022
Hybrid Mesh-neural Representation for 3D Transparent Object Reconstruction

Jiamin Xu, Zihan Zhu, Hujun Bao et al. · eth-zurich

We propose a novel method to reconstruct the 3D shapes of transparent objects using hand-held captured images under natural light conditions. It combines the advantages of an explicit mesh and a multi-layer perceptron (MLP) network in a hybrid representation to simplify the capture setting used in recent contributions. After obtaining an initial shape through the multi-view silhouettes, we introduce surface-based local MLPs to encode the vertex displacement field (VDF) for the reconstruction of surface details. The design of local MLPs allows the VDF to be represented in a piece-wise manner using two-layer MLP networks, which is beneficial to the optimization algorithm. Defining local MLPs on the surface instead of the volume also reduces the search space. Such a hybrid representation enables us to relax the ray-pixel correspondences that represent the light path constraint to our designed ray-cell correspondences, which significantly simplifies the implementation of the single-image-based environment matting algorithm. We evaluate our representation and reconstruction algorithm on several transparent objects with ground truth models. Our experiments show that our method can produce high-quality reconstruction results superior to state-of-the-art methods using a simplified data acquisition setup.

81.0 · SY · Mar 24
Influence Functions for Data Attribution in Linear System Identification and LQR Control

Jiachen Li, Shihao Li, Soovadeep Bakshi et al.

When a controller is designed from an identified model, its performance ultimately depends on the trajectories used for identification, but pinpointing which ones help or hurt remains an open problem. We bring influence functions, a data attribution tool from machine learning, into this setting by chaining two closed-form sensitivity analyses across a regularized least-squares identification and an infinite-horizon LQR pipeline. On the identification side, the quadratic loss admits an exact leave-one-trajectory-out (LOTO) parameter shift and a reusable first-order approximation with a Neumann series error bound. On the control side, we implicitly differentiate through the DARE via its discrete Lyapunov structure and compress the cost gradient to a single adjoint Lyapunov solve. The resulting scores track true LOTO retraining with Pearson correlations above 0.99 and speedups of 7 to 60 times on linear systems of dimension 2 to 10.
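The identification-side idea can be illustrated with a minimal sketch, assuming a plain ridge-regression setup in which each trajectory contributes a regressor block `Z_j` and a target block `Y_j`. The function names, the synthetic data, and the exact form of the update (a Woodbury-style downdate) are assumptions made for illustration, not the paper's implementation. The exact shift applies the inverse of `I - Z_j H^{-1} Z_j^T` to the residuals; the first-order variant drops that factor, which is the leading term of its Neumann series.

```python
import numpy as np

def ridge_fit(Zs, Ys, lam):
    """Regularized least squares over a list of trajectories; returns (theta, H)."""
    d = Zs[0].shape[1]
    H = lam * np.eye(d) + sum(Z.T @ Z for Z in Zs)   # regularized Gram matrix
    G = sum(Z.T @ Y for Z, Y in zip(Zs, Ys))         # cross term Z^T Y
    return np.linalg.solve(H, G), H

def loto_theta(theta, H, Z_j, Y_j, exact=True):
    """Parameters after removing trajectory j, without refitting from scratch."""
    E_j = Y_j - Z_j @ theta                          # residuals of trajectory j
    HinvZt = np.linalg.solve(H, Z_j.T)               # H^{-1} Z_j^T
    if exact:
        M = Z_j @ HinvZt                             # leverage block of trajectory j
        E_j = np.linalg.solve(np.eye(M.shape[0]) - M, E_j)
    # exact=False keeps only the first-order (influence-function) term
    return theta - HinvZt @ E_j

# Synthetic data (hypothetical): 5 trajectories, 20 samples each
rng = np.random.default_rng(0)
theta_true = rng.normal(size=(3, 2))
Zs = [rng.normal(size=(20, 3)) for _ in range(5)]
Ys = [Z @ theta_true + 0.1 * rng.normal(size=(20, 2)) for Z in Zs]
lam = 1e-2

theta, H = ridge_fit(Zs, Ys, lam)
theta_exact = loto_theta(theta, H, Zs[0], Ys[0], exact=True)
theta_approx = loto_theta(theta, H, Zs[0], Ys[0], exact=False)
theta_retrain, _ = ridge_fit(Zs[1:], Ys[1:], lam)    # ground truth by retraining

print("exact vs retrain:", np.abs(theta_exact - theta_retrain).max())
print("first-order vs retrain:", np.abs(theta_approx - theta_retrain).max())
```

In this sketch the first-order variant reuses `H` unchanged for every trajectory, whereas the exact variant needs one extra small solve per trajectory.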

16.7 · SY · Mar 25
Datamodel-Based Data Selection for Nonlinear Data-Enabled Predictive Control

Jiachen Li, Shihao Li, Jiamin Xu et al.

Data-Enabled Predictive Control (DeePC) has emerged as a powerful framework for controlling unknown systems directly from input-output data. For nonlinear systems, recent work has proposed selecting relevant subsets of data columns based on geometric proximity to the current operating point. However, such proximity-based selection ignores the control objective: different reference trajectories may benefit from different data even at the same operating point. In this paper, we propose a datamodel-based approach that learns a context-dependent influence function mapping the current initial trajectory and reference trajectory to column importance scores. Adapting the linear datamodel framework from machine learning, we model closed-loop cost as a linear function of column inclusion indicators, with coefficients that depend on the control context. Training on closed-loop simulations, our method captures which data columns actually improve tracking performance for specific control tasks. Experimental results demonstrate that task-aware selection substantially outperforms geometry-based heuristics, particularly when using small data subsets.

68.4 · SY · Mar 23
Stochastic Trajectory Influence Functions for LQR: Joint Sensitivity Through Dynamics and Noise Covariance

Jiachen Li, Shihao Li, Soovadeep Bakshi et al.

Model-based controllers learned from data inherit the biases and noise of their training trajectories, making it important to know which trajectories help or hurt closed-loop performance. Influence functions, widely used in machine learning for data attribution, approximate this effect through first-order parameter-shift surrogates, avoiding costly retraining. Applying them to stochastic LQR, however, is nontrivial because the cost depends on the learned dynamics through the Riccati equation, and the process-noise covariance is estimated from the same residuals. We develop a three-level influence hierarchy that accounts for both channels.

56.7 · SY · Mar 23
IF-CPS: Influence Functions for Cyber-Physical Systems -- A Unified Framework for Diagnosis, Curation, and Safety Attribution

Jiachen Li, Shihao Li, Soovadeep Bakshi et al.

Neural network controllers trained via behavior cloning are increasingly deployed in cyber-physical systems (CPS), yet practitioners lack tools to trace controller failures back to training data. Existing data attribution methods assume i.i.d. data and standard loss targets, ignoring CPS-specific properties: closed-loop dynamics, safety constraints, and temporal trajectory structure. We propose IF-CPS, a modular influence function framework with three CPS-adapted variants: safety influence (attributing constraint violations), trajectory influence (temporal discounting over trajectories), and propagated influence (tracing effects through plant dynamics). We evaluate IF-CPS on six benchmarks across diagnosis, curation, and safety attribution tasks. IF-CPS improves over standard influence functions in the majority of settings, achieving AUROC $1.00$ in Pendulum (5-10% poisoning), $0.92$ vs. $0.50$ in HVAC (10%), and the strongest constraint-boundary correlation (Spearman $\rho = 0.55$ in Pendulum).

59.9 · CV · May 13
HetScene: Heterogeneity-Aware Diffusion for Dense Indoor Scene Generation

Zini Chen, Junming Huang, Rong Zhang et al.

Generating controllable and physically plausible indoor scenes is a pivotal prerequisite for constructing high-fidelity simulation environments for embodied AI. However, existing deep-learning-based methods usually treat all objects as homogeneous instances within a unified generation process. While effective for sparse and simplistic layouts, they struggle to model realistic layouts with dense object arrangements and complex spatial dependencies, leading to limited scalability and degraded physical plausibility. To deal with these challenges, we revisit indoor layout generation from the perspective of structural heterogeneity and decompose the objects into primary objects and secondary objects according to their distinct roles in shaping a scene. Based on this decomposition, we propose HetScene, a heterogeneous two-stage generation framework that decouples indoor layout synthesis into Structural Layout Generation (SLG) and Contextual Layout Generation (CLG). SLG first generates globally coherent structural layouts with only primary objects conditioned on text descriptions, top-down binary room masks, and spatial relation graphs, establishing a stable global macro-skeleton of large core furniture.

CV · Apr 27, 2024
High-quality Surface Reconstruction using Gaussian Surfels

Pinxuan Dai, Jiamin Xu, Wenxiang Xie et al.

We propose a novel point-based representation, Gaussian surfels, to combine the advantages of the flexible optimization procedure in 3D Gaussian points and the surface alignment property of surfels. This is achieved by directly setting the z-scale of 3D Gaussian points to 0, effectively flattening the original 3D ellipsoid into a 2D ellipse. Such a design provides clear guidance to the optimizer. By treating the local z-axis as the normal direction, it greatly improves optimization stability and surface alignment. While the derivatives with respect to the local z-axis computed from the covariance matrix are zero in this setting, we design a self-supervised normal-depth consistency loss to remedy this issue. Monocular normal priors and foreground masks are incorporated to enhance the quality of the reconstruction, mitigating issues related to highlights and background. We propose a volumetric cutting method to aggregate the information of Gaussian surfels so as to remove erroneous points in depth maps generated by alpha blending. Finally, we apply the screened Poisson reconstruction method to the fused depth maps to extract the surface mesh. Experimental results show that our method demonstrates superior performance in surface reconstruction compared to state-of-the-art neural volume rendering and point-based rendering methods.
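The flattening step can be made concrete with a small sketch. It assumes the common 3D Gaussian parameterization in which the covariance is built from a rotation `R` (here from a unit quaternion in `(w, x, y, z)` order) and per-axis scales; that parameterization and the helper names `quat_to_rot` and `surfel_covariance` are assumptions for illustration, not taken from the paper's code. With the z-scale set to 0, the covariance becomes rank 2 and the third column of `R` serves as the surfel normal.

```python
import numpy as np

def quat_to_rot(q):
    """Rotation matrix from a quaternion given as (w, x, y, z)."""
    w, x, y, z = q / np.linalg.norm(q)
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z),     2 * (x * z + w * y)],
        [2 * (x * y + w * z),     1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y),     2 * (y * z + w * x),     1 - 2 * (x * x + y * y)],
    ])

def surfel_covariance(q, sx, sy):
    """Covariance of a flattened Gaussian (z-scale = 0) and its normal."""
    R = quat_to_rot(np.asarray(q, dtype=float))
    S = np.diag([sx, sy, 0.0])        # z-scale fixed to zero
    cov = R @ S @ S.T @ R.T           # rank-2 covariance: a 2D ellipse in 3D
    normal = R[:, 2]                  # local z-axis, used as the normal
    return cov, normal

cov, normal = surfel_covariance([0.9, 0.1, 0.3, 0.2], sx=0.5, sy=0.2)
print(cov @ normal)                   # numerically zero: no extent along the normal
```

Because the covariance has no extent along `normal`, it carries no gradient signal for that axis on its own, which is consistent with the abstract's remark that a separate consistency loss is needed.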

59.5 · LG · May 8
Integrating Causal DAGs in Deep RL: Activating Minimal Markovian States with Multi-Order Exposure

Jiamin Xu, Jacqueline Maasch, Kyra Gan

Online reinforcement learning (RL) relies on the Markov property for guaranteed performance, but real-world applications often lack well-defined states given raw observed variables. While causal RL has attracted growing interest, existing work typically assumes Markovian states are provided and focuses on using causality to accelerate learning, leaving a fundamental gap: given a longitudinal causal graph over observed variables, how does one construct MDP states that provably satisfy the Markov property? We address this by providing a procedure that constructs a provably minimal state representation. In deep RL, we observe that the minimal representation alone empirically fails to improve performance, indicating that neural networks cannot directly exploit Markovian minimality. To address this, we propose MOSE (Multi-Order State Exposure), which feeds multi-order historical state constructions into the same $Q$-function. MOSE consistently outperforms both the minimal state construction and single-window policies on common benchmarks and synthetic datasets. Including the minimal representation alongside MOSE can further improve performance. Our results establish a core principle for causal deep RL: minimal sufficiency is not enough, and controlled redundancy is necessary to unlock the benefit of causal state information.

LG · Nov 11, 2025
Algorithm-Relative Trajectory Valuation in Policy Gradient Control

Shihao Li, Jiachen Li, Jiamin Xu et al.

We study how trajectory value depends on the learning algorithm in policy-gradient control. Using Trajectory Shapley in an uncertain LQR, we find a negative correlation between Persistence of Excitation (PE) and marginal value under vanilla REINFORCE ($r\approx-0.38$). We prove a variance-mediated mechanism: (i) for fixed energy, higher PE yields lower gradient variance; (ii) near saddles, higher variance increases escape probability, raising marginal contribution. When stabilized (state whitening or Fisher preconditioning), this variance channel is neutralized and information content dominates, flipping the correlation positive ($r\approx+0.29$). Hence, trajectory value is algorithm-relative. Experiments validate the mechanism and show decision-aligned scores (Leave-One-Out) complement Shapley for pruning, while Shapley identifies toxic subsets.

CV · Dec 23, 2024
Detail-Preserving Latent Diffusion for Stable Shadow Removal

Jiamin Xu, Yuxin Zheng, Zelong Li et al.

Achieving high-quality shadow removal with strong generalizability is challenging in scenes with complex global illumination. Due to the limited diversity in shadow removal datasets, current methods are prone to overfitting training data, often leading to reduced performance on unseen cases. To address this, we leverage the rich visual priors of a pre-trained Stable Diffusion (SD) model and propose a two-stage fine-tuning pipeline to adapt the SD model for stable and efficient shadow removal. In the first stage, we fix the VAE and fine-tune the denoiser in latent space, which yields substantial shadow removal but may lose some high-frequency details. To resolve this, we introduce a second stage, called the detail injection stage. This stage selectively extracts features from the VAE encoder to modulate the decoder, injecting fine details into the final results. Experimental results show that our method outperforms state-of-the-art shadow removal techniques. The cross-dataset evaluation further demonstrates that our method generalizes effectively to unseen data, enhancing the applicability of shadow removal methods.

67.6 · RO · Apr 1
Behavioral Score Diffusion: Model-Free Trajectory Planning via Kernel-Based Score Estimation from Data

Shihao Li, Jiachen Li, Jiamin Xu et al.

Diffusion-based trajectory optimization has emerged as a powerful planning paradigm, but existing methods require either learned score networks trained on large datasets or analytical dynamics models for score computation. We introduce Behavioral Score Diffusion (BSD), a training-free and model-free trajectory planner that computes the diffusion score function directly from a library of trajectory data via kernel-weighted estimation. At each denoising step, BSD retrieves relevant trajectories using a triple-kernel weighting scheme (diffusion proximity, state context, and goal relevance) and computes a Nadaraya-Watson estimate of the denoised trajectory. The diffusion noise schedule naturally controls kernel bandwidths, creating a multi-scale nonparametric regression: broad averaging of global behavioral patterns at high noise, fine-grained local interpolation at low noise. This coarse-to-fine structure handles nonlinear dynamics without linearization or parametric assumptions. Safety is preserved by applying shielded rollout on kernel-estimated state trajectories, identical to existing model-based approaches. We evaluate BSD on four robotic systems of increasing complexity (3D to 6D state spaces) in a parking scenario. BSD with fixed bandwidth achieves 98.5% of the model-based baseline's average reward across systems while requiring no dynamics model, using only 1,000 pre-collected trajectories. BSD substantially outperforms nearest-neighbor retrieval (18-63% improvement), confirming that the diffusion denoising mechanism is essential for effective data-driven planning.
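A Nadaraya-Watson estimate of this kind can be sketched in a few lines. The sketch below is a simplification under stated assumptions: it uses a single Gaussian kernel on the distance between the noisy trajectory and each library trajectory (standing in for the diffusion-proximity kernel only, with the state-context and goal-relevance kernels omitted), and the function name `nw_denoise`, the array shapes, and the synthetic library are hypothetical.

```python
import numpy as np

def nw_denoise(noisy, library, bandwidth):
    """Kernel-weighted average of library trajectories.

    noisy:   (T, d) current noisy trajectory
    library: (N, T, d) stored trajectories
    Returns the (T, d) estimate and the (N,) normalized weights.
    """
    sq_dist = np.sum((library - noisy) ** 2, axis=(1, 2))
    logits = -sq_dist / (2.0 * bandwidth ** 2)   # Gaussian kernel in log space
    logits -= logits.max()                       # stabilize the exponentials
    weights = np.exp(logits)
    weights /= weights.sum()
    return np.tensordot(weights, library, axes=1), weights

# Synthetic library (hypothetical): 50 trajectories, 10 steps, 2D state
rng = np.random.default_rng(0)
library = rng.normal(size=(50, 10, 2))
noisy = library[3] + 0.01 * rng.normal(size=(10, 2))

coarse, w_coarse = nw_denoise(noisy, library, bandwidth=1e6)   # broad averaging
fine, w_fine = nw_denoise(noisy, library, bandwidth=1e-3)      # local interpolation
print(w_coarse.max(), w_fine.max())
```

A large bandwidth spreads the weights almost uniformly, so the estimate approaches the library mean, while a small bandwidth concentrates the weight on the closest trajectory. This mirrors the coarse-to-fine behavior the abstract attributes to the noise schedule.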

LG · Feb 7, 2025
From Restless to Contextual: A Thresholding Bandit Reformulation For Finite-horizon Performance

Jiamin Xu, Ivan Nazarov, Aditya Rastogi et al.

This paper addresses the poor finite-horizon performance of existing online restless bandit (RB) algorithms, which stems from the prohibitive sample complexity of learning a full Markov decision process (MDP) for each agent. We argue that superior finite-horizon performance requires rapid convergence to a high-quality policy. Thus motivated, we introduce a reformulation of online RBs as a budgeted thresholding contextual bandit, which simplifies the learning problem by encoding long-term state transitions into a scalar reward. We prove the first non-asymptotic optimality of an oracle policy for a simplified finite-horizon setting. We propose a practical learning policy under a heterogeneous-agent, multi-state setting, and show that it achieves sublinear regret and faster convergence than existing methods. This directly translates to higher cumulative reward, as empirically validated by significant gains over state-of-the-art algorithms in large-scale heterogeneous environments. Our work provides a new pathway for achieving practical, sample-efficient learning in finite-horizon RBs.

CV · May 23, 2023
SAR-to-Optical Image Translation via Thermodynamics-inspired Network

Mingjin Zhang, Jiamin Xu, Chengyu He et al.

Synthetic aperture radar (SAR) is prevalent in the remote sensing field but is difficult to interpret in human visual perception. Recently, SAR-to-optical (S2O) image conversion methods have provided a prospective solution for interpretation. However, since there is a huge domain difference between optical and SAR images, they suffer from low image quality and geometric distortion in the produced optical images. Motivated by the analogy between pixels during the S2O image translation and molecules in a heat field, Thermodynamics-inspired Network for SAR-to-Optical Image Translation (S2O-TDN) is proposed in this paper. Specifically, we design a Third-order Finite Difference (TFD) residual structure in light of the TFD equation of thermodynamics, which allows us to efficiently extract inter-domain invariant features and facilitate the learning of the nonlinear translation mapping. In addition, we exploit the first law of thermodynamics (FLT) to devise an FLT-guided branch that promotes the state transition of the feature values from the unstable diffusion state to the stable one, aiming to regularize the feature diffusion and preserve image structures during S2O image translation. S2O-TDN follows an explicit design principle derived from thermodynamic theory and enjoys the advantage of explainability. Experiments on the public SEN1-2 dataset show the advantages of the proposed S2O-TDN over the current methods with more delicate textures and higher quantitative results.

SY · Nov 16, 2021
Graph neural network-based fault diagnosis: a review

Zhiwen Chen, Jiamin Xu, Cesare Alippi et al.

Graph neural network (GNN)-based fault diagnosis (FD) has received increasing attention in recent years, because data coming from several application domains can be advantageously represented as graphs. Indeed, this particular representation form has led to superior performance compared to traditional FD approaches. In this review, an accessible introduction to GNNs, potential applications to the field of fault diagnosis, and future perspectives are given. First, the paper reviews neural network-based FD methods by focusing on their data representations, namely, time-series, images, and graphs. Second, basic principles and principal architectures of GNNs are introduced, with attention to graph convolutional networks, graph attention networks, graph sample and aggregate, graph auto-encoder, and spatial-temporal graph convolutional networks. Third, the most relevant fault diagnosis methods based on GNNs are validated through detailed experiments, and it is concluded that the GNN-based methods can achieve good fault diagnosis performance. Finally, discussions and future challenges are provided.