LG | May 19, 2022
Time Series Anomaly Detection via Reinforcement Learning-Based Model Selection
Jiuqi Elise Zhang, Di Wu, Benoit Boulet
Time series anomaly detection has been recognized as of critical importance for the reliable and efficient operation of real-world systems. Many anomaly detection methods have been developed based on various assumptions on anomaly characteristics. However, due to the complex nature of real-world data, different anomalies within a time series usually have diverse profiles supporting different anomaly assumptions. This makes it difficult to find a single anomaly detector that can consistently outperform other models. In this work, to harness the benefits of different base models, we propose a reinforcement learning-based model selection framework. Specifically, we first learn a pool of different anomaly detection models, and then utilize reinforcement learning to dynamically select a candidate model from these base models. Experiments on real-world data have demonstrated that the proposed strategy can indeed outplay all baseline models in terms of overall performance.
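The idea of dynamically choosing one detector from a pool can be illustrated with a much simpler learner than the one the paper presumably uses. The following is a minimal sketch, assuming an epsilon-greedy bandit with a 0/1 reward for a correct prediction; the abstract does not specify the state, reward, or RL algorithm, so `select_models`, its parameters, and the reward definition are all hypothetical.

```python
import random

def select_models(predictions, labels, epsilon=0.1, seed=0):
    """Pick one detector per time step with an epsilon-greedy bandit.

    predictions[t][k] is detector k's 0/1 output at step t and labels[t]
    is the true 0/1 label. Both the reward (1 if the chosen detector is
    right) and the bandit itself are illustrative assumptions.
    """
    rng = random.Random(seed)
    n_models = len(predictions[0])
    value = [1.0] * n_models  # optimistic start so every detector gets tried
    count = [0] * n_models
    chosen = []
    for preds, label in zip(predictions, labels):
        if rng.random() < epsilon:
            k = rng.randrange(n_models)  # explore
        else:
            k = max(range(n_models), key=lambda i: value[i])  # exploit
        reward = 1.0 if preds[k] == label else 0.0
        count[k] += 1
        value[k] += (reward - value[k]) / count[k]  # running mean reward
        chosen.append(k)
    return chosen, value
```

With `epsilon=0.0` the selector tries each detector once (because of the optimistic initial values) and then stays with the one that has the best running accuracy.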
AI | Oct 23, 2022
Meta-Reinforcement Learning for Building Energy Management System
Huiliang Zhang, Di Wu, Arnaud Zinflou et al.
The building sector is one of the largest contributors to global energy consumption. Improving its energy efficiency is essential for reducing operational costs and greenhouse gas emissions. Energy management systems (EMS) play a key role in monitoring and controlling building appliances efficiently and reliably. With the increasing integration of renewable energy, intelligent EMS solutions have received growing attention. Reinforcement learning (RL) has recently been explored for this purpose and shows strong potential. However, most RL-based EMS methods require a large number of training steps to learn effective control policies, especially when adapting to unseen buildings, which limits their practical deployment. This paper introduces MetaEMS, a meta-reinforcement learning framework for EMS. MetaEMS improves learning efficiency by transferring knowledge from previously solved tasks to new ones through group-level and building-level adaptation, enabling fast adaptation and effective control across diverse building environments. Experimental results demonstrate that MetaEMS adapts more rapidly to unseen buildings and consistently outperforms baseline methods across various scenarios.
LG | Apr 27, 2022
Meta-Learning Based Early Fault Detection for Rolling Bearings via Few-Shot Anomaly Detection
Wenbin Song, Di Wu, Weiming Shen et al.
Early fault detection (EFD) of rolling bearings can recognize slight deviations in health state and contribute to the stability of mechanical systems. In practice, very limited target bearing data are available to conduct EFD, which makes it hard to adapt to the EFD task of new bearings. To address this problem, many transfer learning based EFD methods utilize historical data to learn transferable domain knowledge and conduct early fault detection on new target bearings. However, most existing methods only consider the distribution drift across different working conditions but ignore the difference between bearings under the same working condition, which is called Unit-to-Unit Variability (UtUV). The setting of EFD with limited target data considering UtUV can be formulated as a Few-shot Anomaly Detection task. Therefore, this paper proposes a novel EFD method based on meta-learning considering UtUV. The proposed method can learn a generic metric based on Relation Network (RN) to measure the similarity between normal data and the newly arrived target bearing data. In addition, the proposed method utilizes a health state embedding strategy to decrease false alarms. The performance of the proposed method is tested on two bearing datasets. The results show that the proposed method can detect incipient faults earlier than the baselines with fewer false alarms.
LG | Mar 11, 2023
Anomaly Detection with Ensemble of Encoder and Decoder
Xijuan Sun, Di Wu, Arnaud Zinflou et al.
Hacking and false data injection from adversaries can threaten power grids' everyday operations and cause significant economic loss. Anomaly detection in power grids aims to detect and discriminate anomalies caused by cyber attacks against the power system, which is essential for keeping power grids working correctly and efficiently. Different methods have been applied for anomaly detection, such as statistical methods and machine learning-based methods. Usually, machine learning-based methods need to model the normal data distribution. In this work, we propose a novel anomaly detection method by modeling the data distribution of normal samples via multiple encoders and decoders. Specifically, the proposed method maps input samples into a latent space and then reconstructs output samples from latent vectors. The extra encoder finally maps reconstructed samples to latent representations. During the training phase, we optimize parameters by minimizing the reconstruction loss and encoding loss. Training samples are re-weighted to focus more on missed correlations between features of normal data. Furthermore, we employ the long short-term memory model as encoders and decoders to test its effectiveness. We also investigate a meta-learning-based framework for hyper-parameter tuning of our approach. Experiment results on network intrusion and power system datasets demonstrate the effectiveness of our proposed method, where our models consistently outperform all baselines.
LG | May 1, 2022
An Early Fault Detection Method of Rotating Machines Based on Multiple Feature Fusion with Stacking Architecture
Wenbin Song, Di Wu, Weiming Shen et al.
Early fault detection (EFD) of rotating machines is important to decrease the maintenance cost and improve the mechanical system stability. One of the key points of EFD is developing a generic model to extract robust and discriminative features from different equipment for early fault detection. Most existing EFD methods focus on learning fault representation by one type of feature. However, a combination of multiple features can capture a more comprehensive representation of system state. In this paper, we propose an EFD method based on multiple feature fusion with stacking architecture (M2FSA). The proposed method can extract generic and discriminative features to detect early faults by combining time domain (TD), frequency domain (FD), and time-frequency domain (TFD) features. To unify the dimensions of the different domain features, a Stacked Denoising Autoencoder (SDAE) is utilized to learn deep features in the three domains. The architecture of the proposed M2FSA consists of two layers. The first layer contains three base models, whose corresponding inputs are different deep features. The outputs of the first layer are concatenated to generate the input to the second layer, which consists of a meta model. The proposed method is tested on three bearing datasets. The results demonstrate that the proposed method is better than existing methods in both sensitivity and reliability.
LG | Feb 7, 2023
Adaptive Aggregation for Safety-Critical Control
Huiliang Zhang, Di Wu, Benoit Boulet
Safety has been recognized as the central obstacle preventing the use of reinforcement learning (RL) for real-world applications. Different methods have been developed to deal with safety concerns in RL. However, learning reliable RL-based solutions usually requires a large number of interactions with the environment. Moreover, how to improve the learning efficiency, specifically, how to utilize transfer learning for safe reinforcement learning, has not been well studied. In this work, we propose an adaptive aggregation framework for safety-critical control. Our method comprises two key techniques: 1) we learn to transfer the safety knowledge by aggregating the multiple source tasks and a target task through the attention network; 2) we separate the goal of improving task performance and reducing constraint violations by utilizing a safeguard. Experiment results demonstrate that our algorithm can achieve fewer safety violations while showing better data efficiency compared with several baselines.
AI | Oct 29, 2024
Robot Policy Learning with Temporal Optimal Transport Reward
Yuwei Fu, Haichao Zhang, Di Wu et al.
Reward specification is one of the most tricky problems in Reinforcement Learning, which usually requires tedious hand engineering in practice. One promising approach to tackle this challenge is to adopt existing expert video demonstrations for policy learning. Some recent work investigates how to learn robot policies from only a single/few expert video demonstrations. For example, reward labeling via Optimal Transport (OT) has been shown to be an effective strategy to generate a proxy reward by measuring the alignment between the robot trajectory and the expert demonstrations. However, previous work mostly overlooks that the OT reward is invariant to temporal order information, which could bring extra noise to the reward signal. To address this issue, in this paper, we introduce the Temporal Optimal Transport (TemporalOT) reward to incorporate temporal order information for learning a more accurate OT-based proxy reward. Extensive experiments on the Meta-world benchmark tasks validate the efficacy of the proposed method. Code is available at: https://github.com/fuyw/TemporalOT
CL | Mar 20
A Training-Free Regeneration Paradigm: Contrastive Reflection Memory Guided Self-Verification and Self-Improvement
Yuran Li, Di Wu, Benoit Boulet
Verification-guided self-improvement has recently emerged as a promising approach to improving the accuracy of large language model (LLM) outputs. However, existing approaches face a trade-off between inference efficiency and accuracy: iterative verification-rectification is computationally expensive and prone to being trapped in faulty reasoning, while best-of-N selection requires extensive sampling without addressing internal model flaws. We propose a training-free regeneration paradigm that leverages an offline-curated contrastive Reflection Memory (RM) to provide corrective guidance, while regenerating from scratch helps break out of faulty reasoning. At inference time, the method performs RM-guided self-verification followed by a single RM-guided regeneration, avoiding both iterative correction and multi-sample selection. We evaluated our method on nine benchmarks that span algorithmic, reasoning, symbolic, and domain-specific tasks in both small- and large-scale LLMs. Experiment results show that our method outperforms prior methods while maintaining low computational cost.
LG | Jun 2, 2024
FuRL: Visual-Language Models as Fuzzy Rewards for Reinforcement Learning
Yuwei Fu, Haichao Zhang, Di Wu et al.
In this work, we investigate how to leverage pre-trained visual-language models (VLM) for online Reinforcement Learning (RL). In particular, we focus on sparse reward tasks with pre-defined textual task descriptions. We first identify the problem of reward misalignment when applying VLM as a reward in RL tasks. To address this issue, we introduce a lightweight fine-tuning method, named Fuzzy VLM reward-aided RL (FuRL), based on reward alignment and relay RL. Specifically, we enhance the performance of SAC/DrQ baseline agents on sparse reward tasks by fine-tuning VLM representations and using relay RL to avoid local minima. Extensive experiments on the Meta-world benchmark tasks demonstrate the efficacy of the proposed method. Code is available at: https://github.com/fuyw/FuRL.
CL | Jan 22, 2025
OnionEval: An Unified Evaluation of Fact-conflicting Hallucination for Small-Large Language Models
Chongren Sun, Yuran Li, Di Wu et al.
Large Language Models (LLMs) are highly capable but require significant computational resources for both training and inference. Within the LLM family, smaller models (those with fewer than 10 billion parameters) also perform well across various tasks. However, these smaller models share similar limitations to their larger counterparts, including the tendency to hallucinate. Despite the existence of many benchmarks to evaluate hallucination in LLMs, few have specifically focused on small LLMs (SLLMs). Additionally, SLLMs show widely varying performance across different benchmarks. In this paper, we introduce OnionEval, a multi-layer structured framework with a specific metric called the context-influence score (CI), designed to effectively assess the fact-conflicting hallucination tendencies of small LLMs across different contextual levels. Our experimental results reveal a key feature of SLLMs: they excel in factual analysis but face challenges with context reasoning. Further investigation shows that a simple Chain-of-Thought strategy can significantly reduce these limitations, improving the practical usefulness of SLLMs in real-world applications.
LG | Dec 12, 2023
Traffic Signal Control Using Lightweight Transformers: An Offline-to-Online RL Approach
Xingshuai Huang, Di Wu, Benoit Boulet
Efficient traffic signal control is critical for reducing traffic congestion and improving overall transportation efficiency. The dynamic nature of traffic flow has prompted researchers to explore Reinforcement Learning (RL) for traffic signal control (TSC). Compared with traditional methods, RL-based solutions have shown better performance. However, the application of RL-based traffic signal controllers in the real world is limited by the low sample efficiency and high computational requirements of these solutions. In this work, we propose DTLight, a simple yet powerful lightweight Decision Transformer-based TSC method that can learn a policy from easily accessible offline datasets. DTLight leverages knowledge distillation to learn a lightweight controller from a well-trained larger teacher model, reducing the computation required at deployment. Additionally, it integrates adapter modules to mitigate the expenses associated with fine-tuning, which makes DTLight practical for online adaptation with minimal computation and only a few fine-tuning steps during real deployment. Moreover, DTLight is further enhanced to be more applicable to real-world TSC problems. Extensive experiments on synthetic and real-world scenarios show that DTLight pre-trained purely on offline datasets can outperform state-of-the-art online RL-based methods in most scenarios. Experiment results also show that online fine-tuning further improves the performance of DTLight by up to 42.6% over the best online RL baseline methods. In this work, we also introduce datasets specifically designed for TSC with offline RL (referred to as DTRL). Our datasets and code are publicly available.
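Knowledge distillation, as mentioned above, is commonly implemented by training the student to match the teacher's temperature-softened output distribution. The following is a minimal sketch of that standard loss, not DTLight's actual objective, which the abstract does not give; the function names and the default temperature are assumptions.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened action distributions.

    The T^2 factor is the usual scaling that keeps gradient magnitudes
    comparable across temperatures. Illustrative only.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl
```

The loss is zero when the student reproduces the teacher's logits and grows as the two action distributions diverge.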
AI | Apr 23, 2025
Leveraging LLMs as Meta-Judges: A Multi-Agent Framework for Evaluating LLM Judgments
Yuran Li, Jama Hussein Mohamud, Chongren Sun et al.
Large language models (LLMs) are being widely applied across various fields, but as tasks become more complex, evaluating their responses is increasingly challenging. Compared to human evaluators, the use of LLMs to support performance evaluation offers a more efficient alternative. However, most studies focus mainly on aligning LLMs' judgments with human preferences, overlooking the existence of biases and mistakes in human judgment. Furthermore, how to select suitable LLM judgments given multiple potential LLM responses remains underexplored. To address these two issues, we propose a three-stage meta-judge selection pipeline: 1) developing a comprehensive rubric with GPT-4 and human experts, 2) using three advanced LLM agents to score judgments, and 3) applying a threshold to filter out low-scoring judgments. Compared to methods using a single LLM as both judge and meta-judge, our pipeline introduces multi-agent collaboration and a more comprehensive rubric. Experimental results on the JudgeBench dataset show about 15.55% improvement compared to raw judgments and about 8.37% improvement over the single-agent baseline. Our work demonstrates the potential of LLMs as meta-judges and lays the foundation for future research on constructing preference datasets for LLM-as-a-judge reinforcement learning.
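Stages 2 and 3 of the pipeline (multi-agent scoring, then threshold filtering) can be sketched in a few lines. This is only an illustration: the abstract does not say how the three agents' scores are combined, so the simple average below is an assumption, and `filter_judgments` and its data layout are hypothetical.

```python
from statistics import mean

def filter_judgments(agent_scores, threshold):
    """Keep judgments whose combined agent score reaches the threshold.

    agent_scores maps a judgment id to the rubric scores given by each
    agent. Averaging the scores is an assumed aggregation rule.
    """
    kept = {}
    for judgment_id, scores in agent_scores.items():
        combined = mean(scores)
        if combined >= threshold:
            kept[judgment_id] = combined
    return kept

# Example: three agents score three candidate judgments on a 1-5 rubric.
scores = {"j1": [4, 5, 3], "j2": [1, 2, 3], "j3": [3, 3, 3]}
selected = filter_judgments(scores, threshold=3)
```

Here `selected` keeps `j1` and `j3` and drops `j2`, whose average falls below the threshold.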
LG | Dec 29, 2024
Goal-Conditioned Data Augmentation for Offline Reinforcement Learning
Xingshuai Huang, Di Wu, Benoit Boulet
Offline reinforcement learning (RL) enables policy learning from pre-collected offline datasets, relaxing the need to interact directly with the environment. However, limited by the quality of offline datasets, it generally fails to learn well-qualified policies in suboptimal datasets. To address datasets with insufficient optimal demonstrations, we introduce Goal-cOnditioned Data Augmentation (GODA), a novel goal-conditioned diffusion-based method for augmenting samples with higher quality. Leveraging recent advancements in generative modelling, GODA incorporates a novel return-oriented goal condition with various selection mechanisms. Specifically, we introduce a controllable scaling technique to provide enhanced return-based guidance during data sampling. GODA learns a comprehensive distribution representation of the original offline datasets while generating new data with selectively higher-return goals, thereby maximizing the utility of limited optimal demonstrations. Furthermore, we propose a novel adaptive gated conditioning method for processing noisy inputs and conditions, enhancing the capture of goal-oriented guidance. We conduct experiments on the D4RL benchmark and real-world challenges, specifically traffic signal control (TSC) tasks, to demonstrate GODA's effectiveness in enhancing data quality and superior performance compared to state-of-the-art data augmentation methods across various offline RL algorithms.
RO | Oct 15, 2024
Trajectory Prediction for Autonomous Driving using Agent-Interaction Graph Embedding
Jilan Samiuddin, Benoit Boulet, Di Wu
The trajectory prediction module in an autonomous driving system is crucial for the decision-making and safety of the autonomous agent car and its surroundings. This work presents a novel scheme called AiGem (Agent-Interaction Graph Embedding) to predict traffic vehicle trajectories around the autonomous car. AiGem tackles this problem in four steps. First, AiGem formulates the historical traffic interaction with the autonomous agent as a graph in two stages: (1) at each time step of the history frames, agent interactions are captured using spatial edges between the agents (nodes of the graph), and then (2) the spatial graphs are connected in chronological order using temporal edges. Then, AiGem applies a depthwise graph encoder network on the spatial-temporal graph to generate the graph embedding, i.e., the embedding of all the nodes in the graph. Next, a sequential Gated Recurrent Unit decoder network uses the embedding of the current timestamp to get the decoded states. Finally, an output network comprising a Multilayer Perceptron is used to predict the trajectories utilizing the decoded states as its inputs. Results show that AiGem outperforms the state-of-the-art deep learning algorithms for longer prediction horizons.
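The graph construction in the first step can be sketched as follows. This is a minimal illustration under stated assumptions: the abstract does not say when two agents share a spatial edge or which nodes a temporal edge joins, so the distance threshold `radius` and the rule "connect each agent to itself in the next frame" are hypothetical, as is the function name.

```python
from itertools import combinations
from math import dist

def build_st_graph(frames, radius):
    """Build spatial and temporal edge lists from a history of frames.

    frames[t] maps an agent id to its (x, y) position at time step t.
    Nodes are (t, agent) pairs. Spatial edges join agents closer than
    `radius` within one frame; temporal edges join the same agent in
    consecutive frames. Both rules are illustrative assumptions.
    """
    spatial, temporal = [], []
    for t, frame in enumerate(frames):
        agents = sorted(frame)
        for a, b in combinations(agents, 2):
            if dist(frame[a], frame[b]) <= radius:
                spatial.append(((t, a), (t, b)))
        if t + 1 < len(frames):
            for a in agents:
                if a in frames[t + 1]:  # agent still present in next frame
                    temporal.append(((t, a), (t + 1, a)))
    return spatial, temporal

history = [
    {"ego": (0.0, 0.0), "car1": (3.0, 4.0), "car2": (100.0, 0.0)},
    {"ego": (1.0, 0.0), "car1": (4.0, 4.0)},
]
spatial_edges, temporal_edges = build_st_graph(history, radius=10.0)
```

In this toy history, `car2` is too far away to interact with the others and leaves the scene after the first frame, so it contributes no edges.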
RO | Apr 18, 2024
An Online Spatial-Temporal Graph Trajectory Planner for Autonomous Vehicles
Jilan Samiuddin, Benoit Boulet, Di Wu
The autonomous driving industry is expected to grow by over 20 times in the coming decade, which motivates researchers to study it. The primary focus of their research is to ensure safety, comfort, and efficiency. An autonomous vehicle has several modules responsible for one or more of the aforementioned items. Among these modules, the trajectory planner plays a pivotal role in the safety of the vehicle and the comfort of its passengers. The module is also responsible for respecting kinematic constraints and any applicable road constraints. In this paper, a novel online spatial-temporal graph trajectory planner is introduced to generate safe and comfortable trajectories. First, a spatial-temporal graph is constructed using the autonomous vehicle, its surrounding vehicles, and virtual nodes along the road with respect to the vehicle itself. Next, the graph is forwarded into a sequential network to obtain the desired states. To support the planner, a simple behavioral layer is also presented that determines kinematic constraints for the planner. Furthermore, a novel potential function is proposed to train the network. Finally, the proposed planner is tested on three different complex driving tasks, and the performance is compared with two frequently used methods. The results show that the proposed planner generates safe and feasible trajectories while achieving similar or longer distances in the forward direction and comparable ride comfort.
LG | Jan 12, 2025
DRDT3: Diffusion-Refined Decision Test-Time Training Model
Xingshuai Huang, Di Wu, Benoit Boulet
Decision Transformer (DT), a trajectory modelling method, has shown competitive performance compared to traditional offline reinforcement learning (RL) approaches on various classic control tasks. However, it struggles to learn optimal policies from suboptimal, reward-labelled trajectories. In this study, we explore the use of conditional generative modelling to facilitate trajectory stitching given its high-quality data generation ability. Additionally, recent advancements in Recurrent Neural Networks (RNNs) have shown their linear complexity and competitive sequence modelling performance over Transformers. We leverage the Test-Time Training (TTT) layer, an RNN that updates hidden states during testing, to model trajectories in the form of DT. We introduce a unified framework, called Diffusion-Refined Decision TTT (DRDT3), to achieve performance beyond DT models. Specifically, we propose the Decision TTT (DT3) module, which harnesses the sequence modelling strengths of both self-attention and the TTT layer to capture recent contextual information and make coarse action predictions. DRDT3 iteratively refines the coarse action predictions through the generative diffusion model, progressively moving closer to the optimal actions. We further integrate DT3 with the diffusion model using a unified optimization objective. With experiments on multiple tasks in the D4RL benchmark, our DT3 model without diffusion refinement demonstrates improved performance over standard DT, while DRDT3 further achieves superior results compared to state-of-the-art DT-based and offline RL methods.
AI | Oct 15, 2025
STEMS: Spatial-Temporal Enhanced Safe Multi-Agent Coordination for Building Energy Management
Huiliang Zhang, Di Wu, Arnaud Zinflou et al.
Building energy management is essential for achieving carbon reduction goals, improving occupant comfort, and reducing energy costs. Coordinated building energy management faces critical challenges in exploiting spatial-temporal dependencies while ensuring operational safety across multi-building systems. Current multi-building energy systems face three key challenges: insufficient spatial-temporal information exploitation, lack of rigorous safety guarantees, and system complexity. This paper proposes Spatial-Temporal Enhanced Safe Multi-Agent Coordination (STEMS), a novel safety-constrained multi-agent reinforcement learning framework for coordinated building energy management. STEMS integrates two core components: (1) a spatial-temporal graph representation learning framework using a GCN-Transformer fusion architecture to capture inter-building relationships and temporal patterns, and (2) a safety-constrained multi-agent RL algorithm incorporating Control Barrier Functions to provide mathematical safety guarantees. Extensive experiments on real-world building datasets demonstrate STEMS's superior performance over existing methods, showing that STEMS achieves 21% cost reduction, 18% emission reduction, and dramatically reduces safety violations from 35.1% to 5.6% while maintaining optimal comfort with only 0.13 discomfort proportion. The framework also demonstrates strong robustness during extreme weather conditions and maintains effectiveness across different building types.
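To give a feel for how a Control Barrier Function can enforce a safety constraint on top of a learned policy, here is a toy one-dimensional sketch. It is not the STEMS algorithm: the dynamics `x_next = x + u`, the barrier `h(x) = x_max - x`, and the names `cbf_filter` and `alpha` are all assumptions chosen for illustration (think of `x` as a battery state of charge and `u` as a charging action).

```python
def cbf_filter(x, u_nominal, x_max, alpha=0.5):
    """Clip a nominal action so a discrete-time CBF condition holds.

    Assumed dynamics: x_next = x + u. Barrier: h(x) = x_max - x >= 0.
    The condition h(x_next) >= (1 - alpha) * h(x), with 0 < alpha <= 1,
    rearranges to u <= alpha * (x_max - x), so the filter caps u there.
    """
    h = x_max - x
    return min(u_nominal, alpha * h)

# Roll out an aggressive nominal policy through the filter.
x, x_max = 0.0, 1.0
trajectory = [x]
for _ in range(50):
    x = x + cbf_filter(x, u_nominal=0.3, x_max=x_max)
    trajectory.append(x)
```

The filtered state approaches `x_max` but never crosses it, while actions that already satisfy the condition pass through unchanged.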
LG | May 20, 2025
Leveraging Multivariate Long-Term History Representation for Time Series Forecasting
Huiliang Zhang, Di Wu, Arnaud Zinflou et al.
Multivariate Time Series (MTS) forecasting has a wide range of applications in both industry and academia. Recent advances in Spatial-Temporal Graph Neural Network (STGNN) have achieved great progress in modelling spatial-temporal correlations. Limited by computational complexity, most STGNNs for MTS forecasting focus primarily on short-term and local spatial-temporal dependencies. Although some recent methods attempt to incorporate univariate history into modeling, they still overlook crucial long-term spatial-temporal similarities and correlations across MTS, which are essential for accurate forecasting. To fill this gap, we propose a framework called the Long-term Multivariate History Representation (LMHR) Enhanced STGNN for MTS forecasting. Specifically, a Long-term History Encoder (LHEncoder) is adopted to effectively encode the long-term history into segment-level contextual representations and reduce point-level noise. A non-parametric Hierarchical Representation Retriever (HRetriever) is designed to include the spatial information in the long-term spatial-temporal dependency modelling and pick out the most valuable representations with no additional training. A Transformer-based Aggregator (TAggregator) selectively fuses the sparsely retrieved contextual representations based on the ranking positional embedding efficiently. Experimental results demonstrate that LMHR outperforms typical STGNNs by 10.72% on the average prediction horizons and state-of-the-art methods by 4.12% on several real-world datasets. Additionally, it consistently improves prediction accuracy by 9.8% on the top 10% of rapidly changing patterns across the datasets.
LG | Feb 6, 2025
MXMap: A Multivariate Cross Mapping Framework for Causal Discovery in Dynamical Systems
Elise Zhang, François Mirallès, Raphaël Rousseau-Rizzi et al.
Convergent Cross Mapping (CCM) is a powerful method for detecting causality in coupled nonlinear dynamical systems, providing a model-free approach to capture dynamic causal interactions. Partial Cross Mapping (PCM) was introduced as an extension of CCM to address indirect causality in three-variable systems by comparing cross-mapping quality between direct cause-effect mapping and indirect mapping through an intermediate conditioning variable. However, PCM remains limited to univariate delay embeddings in its cross-mapping processes. In this work, we extend PCM to the multivariate setting, introducing multiPCM, which leverages multivariate embeddings to more effectively distinguish indirect causal relationships. We further propose a multivariate cross-mapping framework (MXMap) for causal discovery in dynamical systems. This two-phase framework combines (1) pairwise CCM tests to establish an initial causal graph and (2) multiPCM to refine the graph by pruning indirect causal connections. Through experiments on simulated data and the ERA5 Reanalysis weather dataset, we demonstrate the effectiveness of MXMap. Additionally, MXMap is compared against several baseline methods, showing advantages in accuracy and causal graph refinement.
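Cross-mapping methods operate on delay embeddings of the observed series. The sketch below shows a univariate delay embedding and one possible multivariate variant; the abstract does not describe how multiPCM builds its multivariate embeddings, so concatenating per-variable delay vectors is an assumption, and both function names are hypothetical.

```python
def delay_embed(series, dim, tau):
    """Return delay vectors (x[t], x[t - tau], ..., x[t - (dim - 1) * tau])."""
    start = (dim - 1) * tau
    return [
        tuple(series[t - k * tau] for k in range(dim))
        for t in range(start, len(series))
    ]

def multivariate_embed(series_list, dim, tau):
    """Concatenate the delay vectors of several equal-length series.

    Concatenation is an assumed construction, used here only to show
    what a multivariate embedding can look like.
    """
    embedded = [delay_embed(s, dim, tau) for s in series_list]
    return [sum(vectors, ()) for vectors in zip(*embedded)]
```

For example, `delay_embed([1, 2, 3, 4, 5], dim=2, tau=1)` yields the points `(2, 1), (3, 2), (4, 3), (5, 4)`, each pairing a value with its predecessor.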
LG | Nov 15, 2021
ModelLight: Model-Based Meta-Reinforcement Learning for Traffic Signal Control
Xingshuai Huang, Di Wu, Michael Jenkin et al.
Traffic signal control is of critical importance for the effective use of transportation infrastructures. The rapid increase of vehicle traffic and changes in traffic patterns make traffic signal control more and more challenging. Reinforcement Learning (RL)-based algorithms have demonstrated their potential in dealing with traffic signal control. However, most existing solutions require a large amount of training data, which is unacceptable for many real-world scenarios. This paper proposes a novel model-based meta-reinforcement learning framework (ModelLight) for traffic signal control. Within ModelLight, an ensemble of models for road intersections and the optimization-based meta-learning method are used to improve the data efficiency of an RL-based traffic light control method. Experiments on real-world datasets demonstrate that ModelLight can outperform state-of-the-art traffic light control algorithms while substantially reducing the number of required interactions with the real-world environment.
LG | Jul 16, 2021
Time Series Anomaly Detection for Smart Grids: A Survey
Jiuqi Elise Zhang, Di Wu, Benoit Boulet
With the rapid increase in the integration of renewable energy generation and the wide adoption of various electric appliances, power grids are now faced with more and more challenges. One prominent challenge is to implement efficient anomaly detection for different types of anomalous behaviors within power grids. These anomalous behaviors might be induced by unusual consumption patterns of the users, faulty grid infrastructures, outages, external cyberattacks, or energy fraud. Identifying such anomalies is of critical importance for the reliable and efficient operation of modern power grids. Various methods have been proposed for anomaly detection on power grid time-series data. This paper presents a short survey of the recent advances in anomaly detection for power grid time-series data. Specifically, we first outline current research challenges in the power grid anomaly detection domain and further review the major anomaly detection approaches. Finally, we conclude the survey by identifying the potential directions for future research.
RO | Dec 19, 2020
Image-based Intraluminal Contact Force Monitoring in Robotic Vascular Navigation
Masoud Razban, Javad Dargahi, Benoit Boulet
Embolization, stroke, ischaemic lesion, and perforation remain significant concerns in endovascular interventions. Intravascular sensing of tool interaction with the arteries is advantageous to minimize such complications and enhance navigation safety. Intraluminal information is currently limited due to the lack of intravascular contact sensing technologies. We present monitoring of the intraluminal tool interaction with the arterial wall using an image-based estimation approach within vascular robotic navigation. The proposed image-based method employs continuous finite element simulation of the tool using imaging data to estimate multi-point forces along tool-vessel wall interaction. We implemented imaging algorithms to detect and track contacts, and compute pose measurements. The model is constructed based on the nonlinear beam element and flexural rigidity profile over the tool length. During remote cannulation of aortic arteries, intraluminal monitoring achieved tracking local contact forces, building a contour map of force on the arterial wall and estimating tool structural stress. Results suggest that high risk intraluminal forces may happen even with low insertion force. The presented online monitoring system delivers insight into the intraluminal behavior of endovascular tools and is well suited for intraoperative visual guidance for the clinician, robotic control of vascular procedures and research on interventional device design.