IVJul 25, 2023
One for Multiple: Physics-informed Synthetic Data Boosts Generalizable Deep Learning for Fast MRI ReconstructionZi Wang, Xiaotong Yu, Chengyan Wang et al.
Magnetic resonance imaging (MRI) is a widely used radiological modality renowned for its radiation-free, comprehensive insights into the human body, facilitating medical diagnoses. However, the drawback of prolonged scan times hinders its accessibility. The k-space undersampling offers a solution, yet the resultant artifacts necessitate meticulous removal during image reconstruction. Although Deep Learning (DL) has proven effective for fast MRI image reconstruction, its broader applicability across various imaging scenarios has been constrained. Challenges include the high cost and privacy restrictions associated with acquiring large-scale, diverse training data, coupled with the inherent difficulty of addressing mismatches between training and target data in existing DL methodologies. Here, we present a novel Physics-Informed Synthetic data learning framework for Fast MRI, called PISF. PISF marks a breakthrough by enabling generalized DL for multi-scenario MRI reconstruction through a single trained model. Our approach separates the reconstruction of a 2D image into many 1D basic problems, commencing with 1D data synthesis to facilitate generalization. We demonstrate that training DL models on synthetic data, coupled with enhanced learning techniques, yields in vivo MRI reconstructions comparable to or surpassing those of models trained on matched realistic datasets, reducing the reliance on real-world MRI data by up to 96%. Additionally, PISF exhibits remarkable generalizability across multiple vendors and imaging centers. Its adaptability to diverse patient populations has been validated through evaluations by ten experienced medical professionals. PISF presents a feasible and cost-effective way to significantly boost the widespread adoption of DL in various fast MRI applications.
CRFeb 24
AdapTools: Adaptive Tool-based Indirect Prompt Injection Attacks on Agentic LLMsChe Wang, Jiaming Zhang, Ziqi Zhang et al. · gatech
The integration of external data services (e.g., Model Context Protocol, MCP) has made large language model-based agents increasingly powerful for complex task execution. However, this advancement introduces critical security vulnerabilities, particularly indirect prompt injection (IPI) attacks. Existing attack methods are limited by their reliance on static patterns and evaluation on simple language models, failing to address the fast-evolving nature of modern AI agents. We introduce AdapTools, a novel adaptive IPI attack framework that selects stealthier attack tools and generates adaptive attack prompts to create a rigorous security evaluation environment. Our approach comprises two key components: (1) Adaptive Attack Strategy Construction, which develops transferable adversarial strategies for prompt optimization, and (2) Attack Enhancement, which identifies stealthy tools capable of circumventing task-relevance defenses. Comprehensive experimental evaluation shows that AdapTools achieves a 2.13 times improvement in attack success rate while degrading system utility by a factor of 1.78. Notably, the framework maintains its effectiveness even against state-of-the-art defense mechanisms. Our method advances the understanding of IPI attacks and provides a useful reference for future research.
LGApr 14, 2023
Convex Dual Theory Analysis of Two-Layer Convolutional Neural Networks with Soft-ThresholdingChunyan Xiong, Mengli Lu, Xiaotong Yu et al.
Soft-thresholding has been widely used in neural networks. Its basic network structure is a two-layer convolution neural network with soft-thresholding. Due to the network's nature of nonlinearity and nonconvexity, the training process heavily depends on an appropriate initialization of network parameters, resulting in the difficulty of obtaining a globally optimal solution. To address this issue, a convex dual network is designed here. We theoretically analyze the network convexity and numerically confirm that the strong duality holds. This conclusion is further verified in the linear fitting and denoising experiments. This work provides a new way to convexify soft-thresholding neural networks.
AIFeb 24
ICON: Indirect Prompt Injection Defense for Agents based on Inference-Time CorrectionChe Wang, Fuyao Zhang, Jiaming Zhang et al.
Large Language Model (LLM) agents are susceptible to Indirect Prompt Injection (IPI) attacks, where malicious instructions in retrieved content hijack the agent's execution. Existing defenses typically rely on strict filtering or refusal mechanisms, which suffer from a critical limitation: over-refusal, prematurely terminating valid agentic workflows. We propose ICON, a probing-to-mitigation framework that neutralizes attacks while preserving task continuity. Our key insight is that IPI attacks leave distinct over-focusing signatures in the latent space. We introduce a Latent Space Trace Prober to detect attacks based on high intensity scores. Subsequently, a Mitigating Rectifier performs surgical attention steering that selectively manipulate adversarial query key dependencies while amplifying task relevant elements to restore the LLM's functional trajectory. Extensive evaluations on multiple backbones show that ICON achieves a competitive 0.4% ASR, matching commercial grade detectors, while yielding a over 50% task utility gain. Furthermore, ICON demonstrates robust Out of Distribution(OOD) generalization and extends effectively to multi-modal agents, establishing a superior balance between security and efficiency.
CVNov 27, 2024Code
MvKeTR: Chest CT Report Generation with Multi-View Perception and Knowledge EnhancementXiwei Deng, Xianchun He, Jianfeng Bao et al.
CT report generation (CTRG) aims to automatically generate diagnostic reports for 3D volumes, relieving clinicians' workload and improving patient care. Despite clinical value, existing works fail to effectively incorporate diagnostic information from multiple anatomical views and lack related clinical expertise essential for accurate and reliable diagnosis. To resolve these limitations, we propose a novel Multi-view perception Knowledge-enhanced TansfoRmer (MvKeTR) to mimic the diagnostic workflow of clinicians. Just as radiologists first examine CT scans from multiple planes, a Multi-View Perception Aggregator (MVPA) with view-aware attention is proposed to synthesize diagnostic information from multiple anatomical views effectively. Then, inspired by how radiologists further refer to relevant clinical records to guide diagnostic decision-making, a Cross-Modal Knowledge Enhancer (CMKE) is devised to retrieve the most similar reports based on the query volume to incorporate domain knowledge into the diagnosis procedure. Furthermore, instead of traditional MLPs, we employ Kolmogorov-Arnold Networks (KANs) as the fundamental building blocks of both modules, which exhibit superior parameter efficiency and reduced spectral bias to better capture high-frequency components critical for CT interpretation while mitigating overfitting. Extensive experiments on the public CTRG-Chest-548 K dataset demonstrate that our method outpaces prior state-of-the-art (SOTA) models across almost all metrics. The code is available at https://github.com/xiweideng/MvKeTR.
53.1LGMar 19
MARLIN: Multi-Agent Reinforcement Learning for Incremental DAG DiscoveryDong Li, Zhengzhang Chen, Xujiang Zhao et al.
Uncovering causal structures from observational data is crucial for understanding complex systems and making informed decisions. While reinforcement learning (RL) has shown promise in identifying these structures in the form of a directed acyclic graph (DAG), existing methods often lack efficiency, making them unsuitable for online applications. In this paper, we propose MARLIN, an efficient multi agent RL based approach for incremental DAG learning. MARLIN uses a DAG generation policy that maps a continuous real valued space to the DAG space as an intra batch strategy, then incorporates two RL agents state specific and state invariant to uncover causal relationships and integrates these agents into an incremental learning framework. Furthermore, the framework leverages a factored action space to enhance parallelization efficiency. Extensive experiments on synthetic and real datasets demonstrate that MARLIN outperforms state of the art methods in terms of both efficiency and effectiveness.
74.2CRMay 2
Tracing the Dynamics of Refusal: Exploiting Latent Refusal Trajectories for Robust Jailbreak DetectionXulin Hu, Che Wang, Wei Yang Bryan Lim et al.
Representation Engineering typically relies on static refusal vectors derived from terminal representations. We move beyond this paradigm, demonstrating that refusal is a dynamic and sparse process rather than a localized outcome. Using Causal Tracing, we uncover the Refusal Trajectory-a persistent upstream signature that remains intact even when adversarial attacks (e.g., GCG) suppress terminal signals. Leveraging this, we propose SALO (Sparse Activation Localization Operator), an inference-time detector designed to capture these latent patterns. SALO effectively recovers defense capabilities against forced-decoding attacks, improving detection rates from ~0% to >90% where methods relying on terminal states perform poorly.
45.6AIApr 30
Intent2Tx: Benchmarking LLMs for Translating Natural Language Intents into Ethereum TransactionsZhuoran Pan, Yue Li, Zhi Guan et al.
The emergence of Large Language Models (LLMs) offers a transformative interface for Web3, yet existing benchmarks fail to capture the complexity of translating high-level user intents into functionally correct, state-dependent on-chain transactions. We present \textsc{Intent2Tx}, a high-fidelity benchmark featuring 29,921 single-step and 1,575 multi-step instances meticulously derived from 300 days of real-world Ethereum mainnet traces. Unlike prior works that rely on synthetic instructions, \textsc{Intent2Tx} grounds natural language intents in real-world protocol interactions across 11 categories, including diverse long-tail Decentralized Finance (DeFi) primitives. To enable rigorous evaluation, we propose an execution-aware framework that transcends surface-level text matching by employing differential state analysis on forked mainnet environments. Our extensive evaluation of 16 state-of-the-art LLMs reveals that while scaling and retrieval-augmentation enhance logical consistency and parameter precision, current models struggle with out-of-distribution generalization and multi-step planning. Crucially, our execution-based analysis demonstrates that syntactically valid outputs often fail to achieve intended state transitions, highlighting a significant gap in current "reasoning-to-execution" capabilities. \textsc{Intent2Tx} serves as a critical foundation for developing autonomous, reliable agents in intent-centric Web3 ecosystems. Code and data: https://anonymous.4open.science/r/Intent2Tx_Bench-97FF .
MLJan 27, 2025
Measuring Heterogeneity in Machine Learning with Distributed Energy DistanceMengchen Fan, Baocheng Geng, Roman Shterenberg et al.
In distributed and federated learning, heterogeneity across data sources remains a major obstacle to effective model aggregation and convergence. We focus on feature heterogeneity and introduce energy distance as a sensitive measure for quantifying distributional discrepancies. While we show that energy distance is robust for detecting data distribution shifts, its direct use in large-scale systems can be prohibitively expensive. To address this, we develop Taylor approximations that preserve key theoretical quantitative properties while reducing computational overhead. Through simulation studies, we show how accurately capturing feature discrepancies boosts convergence in distributed learning. Finally, we propose a novel application of energy distance to assign penalty weights for aligning predictions across heterogeneous nodes, ultimately enhancing coordination in federated and distributed settings.
LGNov 16, 2024
An Oversampling-enhanced Multi-class Imbalanced Classification Framework for Patient Health Status Prediction Using Patient-reported OutcomesYang Yan, Zhong Chen, Cai Xu et al.
Patient-reported outcomes (PROs) directly collected from cancer patients being treated with radiation therapy play a vital role in assisting clinicians in counseling patients regarding likely toxicities. Precise prediction and evaluation of symptoms or health status associated with PROs are fundamental to enhancing decision-making and planning for the required services and support as patients transition into survivorship. However, the raw PRO data collected from hospitals exhibits some intrinsic challenges such as incomplete item reports and imbalance patient toxicities. To the end, in this study, we explore various machine learning techniques to predict patient outcomes related to health status such as pain levels and sleep discomfort using PRO datasets from a cancer photon/proton therapy center. Specifically, we deploy six advanced machine learning classifiers -- Random Forest (RF), XGBoost, Gradient Boosting (GB), Support Vector Machine (SVM), Multi-Layer Perceptron with Bagging (MLP-Bagging), and Logistic Regression (LR) -- to tackle a multi-class imbalance classification problem across three prevalent cancer types: head and neck, prostate, and breast cancers. To address the class imbalance issue, we employ an oversampling strategy, adjusting the training set sample sizes through interpolations of in-class neighboring samples, thereby augmenting minority classes without deviating from the original skewed class distribution. Our experimental findings across multiple PRO datasets indicate that the RF and XGB methods achieve robust generalization performance, evidenced by weighted AUC and detailed confusion matrices, in categorizing outcomes as mild, intermediate, and severe post-radiation therapy. These results underscore the models' effectiveness and potential utility in clinical settings.
CVNov 6, 2025
A Linear Fractional Transformation Model and Calibration Method for Light Field CameraZhong Chen, Changfeng Chen
Accurate calibration of internal parameters is a crucial yet challenging prerequisite for 3D reconstruction using light field cameras. In this paper, we propose a linear fractional transformation(LFT) parameter $α$ to decoupled the main lens and micro lens array (MLA). The proposed method includes an analytical solution based on least squares, followed by nonlinear refinement. The method for detecting features from the raw images is also introduced. Experimental results on both physical and simulated data have verified the performance of proposed method. Based on proposed model, the simulation of raw light field images becomes faster, which is crucial for data-driven deep learning methods. The corresponding code can be obtained from the author's website.
LGOct 19, 2025
SolverLLM: Leveraging Test-Time Scaling for Optimization Problem via LLM-Guided SearchDong Li, Xujiang Zhao, Linlin Yu et al.
Large Language Models (LLMs) offer promising capabilities for tackling complex reasoning tasks, including optimization problems. However, existing methods either rely on prompt engineering, which leads to poor generalization across problem types, or require costly supervised training. We introduce SolverLLM, a training-free framework that leverages test-time scaling to solve diverse optimization problems. Rather than solving directly, SolverLLM generates mathematical formulations and translates them into solver-ready code, guided by a novel Monte Carlo Tree Search (MCTS) strategy. To enhance the search process, we modify classical MCTS with (1) dynamic expansion for adaptive formulation generation, (2) prompt backpropagation to guide exploration via outcome-driven feedback, and (3) uncertainty backpropagation to incorporate reward reliability into decision-making. Experiments on six standard benchmark datasets demonstrate that SolverLLM outperforms both prompt-based and learning-based baselines, achieving strong generalization without additional training.
CVAug 31, 2025
Face4FairShifts: A Large Image Benchmark for Fairness and Robust Learning across Visual DomainsYumeng Lin, Dong Li, Xintao Wu et al.
Ensuring fairness and robustness in machine learning models remains a challenge, particularly under domain shifts. We present Face4FairShifts, a large-scale facial image benchmark designed to systematically evaluate fairness-aware learning and domain generalization. The dataset includes 100,000 images across four visually distinct domains with 39 annotations within 14 attributes covering demographic and facial features. Through extensive experiments, we analyze model performance under distribution shifts and identify significant gaps. Our findings emphasize the limitations of existing related datasets and the need for more effective fairness-aware domain adaptation techniques. Face4FairShifts provides a comprehensive testbed for advancing equitable and reliable AI systems. The dataset is available online at https://meviuslab.github.io/Face4FairShifts/.
CRMar 23, 2021
TrustCross: Enabling Confidential Interoperability across Blockchains Using Trusted HardwareYing Lan, Jianbo Gao, Ke Wang et al.
With the rapid development of blockchain technology, different types of blockchains are adopted and interoperability across blockchains has received widespread attention. There have been many cross-chain solutions proposed in recent years, including notary scheme, sidechain, and relay chain. However, most of the existing platforms do not take confidentiality into account, although privacy has become an important concern for blockchain. In this paper, we present TrustCross, a privacy-preserving cross-chain platform to enable confidential interoperability across blockchains. The key insight behind TrustCross is to encrypt cross-chain communication data on the relay chain with the assistance of trusted execution environment and employ fine-grained access control to protect user privacy. Our experimental results show that TrustCross achieves reasonable latency and high scalability on the contract calls across heterogeneous blockchains.
SEJun 2, 2020
Kaya: A Testing Framework for Blockchain-based Decentralized ApplicationsZhenhao Wu, Jiashuo Zhang, Jianbo Gao et al.
In recent years, many decentralized applications based on blockchain (DApp) have been developed. However, due to inadequate testing, DApps are easily exposed to serious vulnerabilities. We find three main challenges for DApp testing, i.e., the inherent complexity of DApp, inconvenient pre-state setting, and not-so-readable logs. In this paper, we propose a testing framework named Kaya to bridge these gaps. Kaya has three main functions. Firstly, Kaya proposes DApp behavior description language (DBDL) to make writing test cases easier. Test cases written in DBDL can also be automatically executed by Kaya. Secondly, Kaya supports a flexible and convenient way for test engineers to set the blockchain pre-states easily. Thirdly, Kaya transforms incomprehensible addresses into readable variables for easy comprehension. With these functions, Kaya can help test engineers test DApps more easily. Besides, to fit the various application environments, we provide two ways for test engineers to use Kaya, i.e., UI and command-line. Our experimental case demonstrates the potential of Kaya in helping test engineers to test DApps more easily.
MED-PHApr 9, 2019
Accelerated Nuclear Magnetic Resonance Spectroscopy with Deep LearningXiaobo Qu, Yihui Huang, Hengfa Lu et al.
Nuclear magnetic resonance (NMR) spectroscopy serves as an indispensable tool in chemistry and biology but often suffers from long experimental time. We present a proof-of-concept of application of deep learning and neural network for high-quality, reliable, and very fast NMR spectra reconstruction from limited experimental data. We show that the neural network training can be achieved using solely synthetic NMR signal, which lifts the prohibiting demand for a large volume of realistic training data usually required in the deep learning approach.
SENov 9, 2018
EASYFLOW: Keep Ethereum Away From OverflowJianbo Gao, Han Liu, Chao Liu et al.
While Ethereum smart contracts enabled a wide range of blockchain applications, they are extremely vulnerable to different forms of security attacks. Due to the fact that transactions to smart contracts commonly involve cryptocurrency transfer, any successful attacks can lead to money loss or even financial disorder. In this paper, we focus on the overflow attacks in Ethereum , mainly because they widely rooted in many smart contracts and comparatively easy to exploit. We have developed EASYFLOW , an overflow detector at Ethereum Virtual Machine level. The key insight behind EASYFLOW is a taint analysis based tracking technique to analyze the propagation of involved taints. Specifically, EASYFLOW can not only divide smart contracts into safe contracts, manifested overflows, well-protected overflows and potential overflows, but also automatically generate transactions to trigger potential overflows. In our preliminary evaluation, EASYFLOW managed to find potentially vulnerable Ethereum contracts with little runtime overhead.
CVAug 17, 2017
High Efficient Reconstruction of Single-shot T2 Mapping from OverLapping-Echo Detachment Planar Imaging Based on Deep Residual NetworkCongbo Cai, Yiqing Zeng, Chao Wang et al.
Purpose: An end-to-end deep convolutional neural network (CNN) based on deep residual network (ResNet) was proposed to efficiently reconstruct reliable T2 mapping from single-shot OverLapping-Echo Detachment (OLED) planar imaging. Methods: The training dataset was obtained from simulations carried out on SPROM software developed by our group. The relationship between the original OLED image containing two echo signals and the corresponded T2 mapping was learned by ResNet training. After the ResNet was trained, it was applied to reconstruct the T2 mapping from simulation and in vivo human brain data. Results: Though the ResNet was trained entirely on simulated data, the trained network was generalized well to real human brain data. The results from simulation and in vivo human brain experiments show that the proposed method significantly outperformed the echo-detachment-based method. Reliable T2 mapping was achieved within tens of milliseconds after the network had been trained while the echo-detachment-based OLED reconstruction method took minutes. Conclusion: The proposed method will greatly facilitate real-time dynamic and quantitative MR imaging via OLED sequence, and ResNet has the potential to reconstruct images from complex MRI sequence efficiently.
MLApr 6, 2016
Hankel Matrix Nuclear Norm Regularized Tensor Completion for $N$-dimensional Exponential SignalsJiaxi Ying, Hengfa Lu, Qingtao Wei et al.
Signals are generally modeled as a superposition of exponential functions in spectroscopy of chemistry, biology and medical imaging. For fast data acquisition or other inevitable reasons, however, only a small amount of samples may be acquired and thus how to recover the full signal becomes an active research topic. But existing approaches can not efficiently recover $N$-dimensional exponential signals with $N\geq 3$. In this paper, we study the problem of recovering N-dimensional (particularly $N\geq 3$) exponential signals from partial observations, and formulate this problem as a low-rank tensor completion problem with exponential factor vectors. The full signal is reconstructed by simultaneously exploiting the CANDECOMP/PARAFAC structure and the exponential structure of the associated factor vectors. The latter is promoted by minimizing an objective function involving the nuclear norm of Hankel matrices. Experimental results on simulated and real magnetic resonance spectroscopy data show that the proposed approach can successfully recover full signals from very limited samples and is robust to the estimated tensor rank.
CRNov 26, 2015
The Scale-free Network of Passwords : Visualization and Estimation of Empirical PasswordsXiujia Guo, Haibo Chen, Xuqin Liu et al.
In this paper, we present a novel vision of large scale of empirical password sets available and improve the understanding of passwords by revealing their interconnections and considering the security on a level of the whole password set instead of one single password level. Through the visualization of Yahoo, Phpbb, 12306, etc. we, for the first time, show what the spatial structure of empirical password sets are like and take the community and clustering patterns of the passwords into account to shed lights on the definition of popularity of a password based on their frequency and degree separately. Furthermore, we propose a model of statistical guessing attack from the perspective of the data's topological space, which provide an explanation of the "cracking curve". We also give a lower bound of the minimum size of the dictionary needed to compromise arbitrary ratio of any given password set by proving that it is equivalent to the minimum dominating set problem, which is a NP-complete problem. Hence the minimal dictionary problem is also NP-complete.
MED-PHApr 29, 2015
Projected Iterative Soft-thresholding Algorithm for Tight Frames in Compressed Sensing Magnetic Resonance ImagingYunsong Liu, Zhifang Zhan, Jian-Feng Cai et al.
Compressed sensing has shown great potentials in accelerating magnetic resonance imaging. Fast image reconstruction and high image quality are two main issues faced by this new technology. It has been shown that, redundant image representations, e.g. tight frames, can significantly improve the image quality. But how to efficiently solve the reconstruction problem with these redundant representation systems is still challenging. This paper attempts to address the problem of applying iterative soft-thresholding algorithm (ISTA) to tight frames based magnetic resonance image reconstruction. By introducing the canonical dual frame to construct the orthogonal projection operator on the range of the analysis sparsity operator, we propose a projected iterative soft-thresholding algorithm (pISTA) and further accelerate it by incorporating the strategy proposed by Beck and Teboulle in 2009. We theoretically prove that pISTA converges to the minimum of a function with a balanced tight frame sparsity. Experimental results demonstrate that the proposed algorithm achieves better reconstruction than the widely used synthesis sparse model and the accelerated pISTA converges faster or comparable to the state-of-art smoothing FISTA. One major advantage of pISTA is that only one extra parameter, the step size, is introduced and the numerical solution is stable to it in terms of image reconstruction errors, thus allowing easily setting in many fast magnetic resonance imaging applications.
CVMar 10, 2015
Fast Multi-class Dictionaries Learning with Geometrical Directions in MRI ReconstructionZhifang Zhan, Jian-Feng Cai, Di Guo et al.
Objective: Improve the reconstructed image with fast and multi-class dictionaries learning when magnetic resonance imaging is accelerated by undersampling the k-space data. Methods: A fast orthogonal dictionary learning method is introduced into magnetic resonance image reconstruction to providing adaptive sparse representation of images. To enhance the sparsity, image is divided into classified patches according to the same geometrical direction and dictionary is trained within each class. A new sparse reconstruction model with the multi-class dictionaries is proposed and solved using a fast alternating direction method of multipliers. Results: Experiments on phantom and brain imaging data with acceleration factor up to 10 and various undersampling patterns are conducted. The proposed method is compared with state-of-the-art magnetic resonance image reconstruction methods. Conclusion: Artifacts are better suppressed and image edges are better preserved than the compared methods. Besides, the computation of the proposed approach is much faster than the typical K-SVD dictionary learning method in magnetic resonance image reconstruction. Significance: The proposed method can be exploited in undersapmled magnetic resonance imaging to reduce data acquisition time and reconstruct images with better image quality.
CVJan 23, 2013
Spread spectrum compressed sensing MRI using chirp radio frequency pulsesXiaobo Qu, Ying Chen, Xiaoxing Zhuang et al.
Compressed sensing has shown great potential in reducing data acquisition time in magnetic resonance imaging (MRI). Recently, a spread spectrum compressed sensing MRI method modulates an image with a quadratic phase. It performs better than the conventional compressed sensing MRI with variable density sampling, since the coherence between the sensing and sparsity bases are reduced. However, spread spectrum in that method is implemented via a shim coil which limits its modulation intensity and is not convenient to operate. In this letter, we propose to apply chirp (linear frequency-swept) radio frequency pulses to easily control the spread spectrum. To accelerate the image reconstruction, an alternating direction algorithm is modified by exploiting the complex orthogonality of the quadratic phase encoding. Reconstruction on the acquired data demonstrates that more image features are preserved using the proposed approach than those of conventional CS-MRI.