CV Aug 30, 2023
Two-Stage Violence Detection Using ViTPose and Classification Models at Smart Airports
İrem Üstek, Jay Desai, Iván López Torrecillas et al.
This study introduces an innovative violence detection framework tailored to the unique requirements of smart airports, where prompt responses to violent situations are crucial. The proposed framework harnesses the power of ViTPose for human pose estimation. It employs a CNN-BiLSTM network to analyse spatial and temporal information within keypoint sequences, enabling the accurate classification of violent behaviour in real time. Seamlessly integrated within the SAFE (Situational Awareness for Enhanced Security) framework of SAAB, the solution underwent integrated testing to ensure robust performance in real-world scenarios. The AIRTLab dataset, characterized by its high video quality and relevance to surveillance scenarios, is utilized in this study to enhance the model's accuracy and mitigate false positives. As airports face increased foot traffic in the post-pandemic era, implementing AI-driven violence detection systems, such as the one proposed, is paramount for improving security, expediting response times, and promoting data-informed decision making. The implementation of this framework not only diminishes the probability of violent events but also assists surveillance teams in effectively addressing potential threats, ultimately fostering a more secure and protected aviation sector. Code is available at: https://github.com/Asami-1/GDP.
LG May 30, 2022
A Transistor Operations Model for Deep Learning Energy Consumption Scaling Law
Chen Li, Antonios Tsourdos, Weisi Guo
Deep Learning (DL) has transformed the automation of a wide range of industries and finds increasing ubiquity in society. The high complexity of DL models and their widespread adoption have led to global energy consumption doubling every 3-4 months. Currently, the relationship between the DL model configuration and energy consumption is not well established. At a general computational energy model level, there is a strong dependency on both the hardware architecture (e.g. generic processors with different configurations of inner components such as CPU and GPU, or programmable integrated circuits such as FPGA) and the different interacting energy consumption aspects (e.g., data movement, calculation, control). At the DL model level, we need to translate non-linear activation functions and their interaction with data into calculation tasks. Current methods mainly linearize nonlinear DL models to approximate their theoretical FLOPs and MACs as a proxy for energy consumption. Yet, this is inaccurate (est. 93% accuracy) due to the highly nonlinear nature of many convolutional neural networks (CNNs), for example. In this paper, we develop a bottom-level Transistor Operations (TOs) method to expose the role of non-linear activation functions and neural network structure in energy consumption. We translate a range of feedforward and CNN models into ALU calculation tasks and then TO steps. This is then statistically linked to real energy consumption values via a regression model for different hardware configurations and data sets. We show that our proposed TOs method can achieve a 93.61% - 99.51% precision in predicting its energy consumption.
LG Jul 29, 2023
Dynamic Deep-Reinforcement-Learning Algorithm in Partially Observable Markov Decision Processes
Saki Omi, Hyo-Sang Shin, Namhoon Cho et al.
Recent studies have greatly improved reinforcement learning, and an increased interest in real-world implementation has emerged. In many cases, the implementation is challenged by time-varying disturbances as it introduces hidden states, which makes the problem best described with Partially Observable Markov Decision Processes. An effective approach to address this problem is to introduce a Recurrent Neural Network (RNN) in place of a state estimator. However, only a few studies have investigated the types of information to be supplied to the RNN and the network architecture to handle them. This study discusses the effectiveness of the inclusion of action along with observation and the impact of network architecture to handle them by providing interpretations of how the trajectories are summarized at LSTM networks. Specifically, three novel approaches with different architectures are introduced. All algorithms demonstrated the effectiveness of the inclusion of the action trajectories in simulation environments. In particular, one of the developed algorithms, H-TD3, differs from the typical actor and critic network as the critic network is trained by utilizing the hidden states generated by the actor network as the summarized trajectory information. This novel approach exhibited the potential improvement of the computational time while maintaining the performance.
OC Sep 8, 2022
Incremental Correction in Dynamic Systems Modelled with Neural Networks for Constraint Satisfaction
Namhoon Cho, Hyo-Sang Shin, Antonios Tsourdos et al.
This study presents incremental correction methods for refining neural network parameters or control functions entering into a continuous-time dynamic system to achieve improved solution accuracy in satisfying the interim point constraints placed on the performance output variables. The proposed approach is to linearise the dynamics around the baseline values of its arguments, and then to solve for the corrective input required to transfer the perturbed trajectory to precisely known or desired values at specific time points, i.e., the interim points. Depending on the type of decision variables to adjust, parameter correction and control function correction methods are developed. These incremental correction methods can be utilised as a means to compensate for the prediction errors of pre-trained neural networks in real-time applications where high accuracy of the prediction of dynamical systems at prescribed time points is imperative. In this regard, the online update approach can be useful for enhancing the overall targeting accuracy of finite-horizon control subject to point constraints using a neural policy. A numerical example demonstrates the effectiveness of the proposed approach in an application to a powered descent problem at Mars.
LG Mar 5, 2022
Bayesian Learning Approach to Model Predictive Control
Namhoon Cho, Seokwon Lee, Hyo-Sang Shin et al.
This study presents a Bayesian learning perspective towards model predictive control algorithms. High-level frameworks have been developed separately in the earlier studies on Bayesian learning and sampling-based model predictive control. On one hand, the Bayesian learning rule provides a general framework capable of generating various machine learning algorithms as special instances. On the other hand, the dynamic mirror descent model predictive control framework is capable of diversifying sample-rollout-based control algorithms. However, connections between the two frameworks have still not been fully appreciated in the context of stochastic optimal control. This study combines the Bayesian learning rule point of view into the model predictive control setting by taking inspirations from the view of understanding model predictive controller as an online learner. The selection of posterior class and natural gradient approximation for the variational formulation governs diversification of model predictive control algorithms in the Bayesian learning approach to model predictive control. This alternative viewpoint complements the dynamic mirror descent framework through streamlining the explanation of design choices.
AI May 4, 2024
Explainable Interface for Human-Autonomy Teaming: A Survey
Xiangqi Kong, Yang Xing, Antonios Tsourdos et al.
Nowadays, large-scale foundation models are being increasingly integrated into numerous safety-critical applications, including human-autonomy teaming (HAT) within transportation, medical, and defence domains. Consequently, the inherent 'black-box' nature of these sophisticated deep neural networks heightens the significance of fostering mutual understanding and trust between humans and autonomous systems. To tackle the transparency challenges in HAT, this paper conducts a thoughtful study on the underexplored domain of Explainable Interface (EI) in HAT systems from a human-centric perspective, thereby enriching the existing body of research in Explainable Artificial Intelligence (XAI). We explore the design, development, and evaluation of EI within XAI-enhanced HAT systems. To do so, we first clarify the distinctions between these concepts: EI, explanations and model explainability, aiming to provide researchers and practitioners with a structured understanding. Second, we contribute to a novel framework for EI, addressing the unique challenges in HAT. Last, our summarized evaluation framework for ongoing EI offers a holistic perspective, encompassing model performance, human-centered factors, and group task objectives. Based on extensive surveys across XAI, HAT, psychology, and Human-Computer Interaction (HCI), this review offers multiple novel insights into incorporating XAI into HAT systems and outlines future directions.
RO Oct 18, 2024
Benchmarking Deep Reinforcement Learning for Navigation in Denied Sensor Environments
Mariusz Wisniewski, Paraskevas Chatzithanos, Weisi Guo et al.
Deep reinforcement learning (DRL) is used to enable autonomous navigation in unknown environments. Most research assumes perfect sensor data, but real-world environments may contain natural and artificial sensor noise and denial. Here, we present a benchmark of both well-used and emerging DRL algorithms in a navigation task with configurable sensor denial effects. In particular, we are interested in comparing how different DRL methods (e.g. model-free PPO vs. model-based DreamerV3) are affected by sensor denial. We show that DreamerV3 outperforms other methods in the visual end-to-end navigation task with a dynamic goal, which the other methods are not able to learn. Furthermore, DreamerV3 generally outperforms other methods in sensor-denied environments. In order to improve robustness, we use adversarial training and demonstrate an improved performance in denied environments, although this generally comes with a performance cost on the vanilla environments. We anticipate this benchmark of different DRL methods and the usage of adversarial training to be a starting point for the development of more elaborate navigation strategies that are capable of dealing with uncertain and denied sensor readings.
IT Oct 25, 2021
Variational Probabilistic Multi-Hypothesis Tracking
Shuoyuan Xu, Hyo-Sang Shin, Antonios Tsourdos
This paper proposes a novel multi-target tracking (MTT) algorithm for scenarios with arbitrary numbers of measurements per target. We propose the variational probabilistic multi-hypothesis tracking (VPMHT) algorithm based on the variational Bayesian expectation-maximisation (VBEM) algorithm to resolve the MTT problem in the classic PMHT algorithm. With the introduction of variational inference, the proposed VPMHT handles track-loss much better than the conventional probabilistic multi-hypothesis tracking (PMHT) while preserving a similar or even better tracking accuracy. Extensive numerical simulations are conducted to demonstrate the effectiveness of the proposed algorithm.
CV Aug 18, 2021
Scarce Data Driven Deep Learning of Drones via Generalized Data Distribution Space
Chen Li, Schyler C. Sun, Zhuangkun Wei et al.
Increased drone proliferation in civilian and professional settings has created new threat vectors for airports and national infrastructures. The economic damage for a single major airport from drone incursions is estimated to be millions per day. Due to the lack of diverse drone training data, accurate training of deep learning detection algorithms under scarce data is an open challenge. Existing methods largely rely on collecting diverse and comprehensive experimental drone footage data, artificially induced data augmentation, transfer and meta-learning, as well as physics-informed learning. However, these methods cannot guarantee capturing diverse drone designs and fully understanding the deep feature space of drones. Here, we show how understanding the general distribution of the drone data via a Generative Adversarial Network (GAN) and explaining the missing features using Topological Data Analysis (TDA) can allow us to acquire missing data to achieve rapid and more accurate learning. We demonstrate our results on a drone image dataset, which contains both real drone images as well as simulated images from computer-aided design. When compared to random data collection (usual practice, with a discriminator accuracy of 94.67% after 200 epochs), our proposed GAN-TDA informed data collection method offers a significant improvement of over 4 percentage points (99.42% after 200 epochs). We believe that this approach of exploiting general data distribution knowledge from neural networks can be applied to a wide range of scarce data open challenges.
LG Mar 9, 2021
A Learning-Based Computational Impact Time Guidance
Zichao Liu, Jiang Wang, Shaoming He et al.
This paper investigates the problem of impact-time-control and proposes a learning-based computational guidance algorithm to solve this problem. The proposed guidance algorithm is developed based on a general prediction-correction concept: the exact time-to-go under proportional navigation guidance with realistic aerodynamic characteristics is estimated by a deep neural network and a biased command to nullify the impact time error is developed by utilizing the emerging reinforcement learning techniques. The deep neural network is augmented into the reinforcement learning block to resolve the issue of sparse reward that has been observed in typical reinforcement learning formulation. Extensive numerical simulations are conducted to support the proposed algorithm.
CV Nov 25, 2020
Relation3DMOT: Exploiting Deep Affinity for 3D Multi-Object Tracking from View Aggregation
Can Chen, Luca Zanotti Fragonara, Antonios Tsourdos
Autonomous systems need to localize and track surrounding objects in 3D space for safe motion planning. As a result, 3D multi-object tracking (MOT) plays a vital role in autonomous navigation. Most MOT methods use a tracking-by-detection pipeline, which includes object detection and data association processing. However, many approaches detect objects in 2D RGB sequences for tracking, which lacks reliability when localizing objects in 3D space. Furthermore, it is still challenging to learn discriminative features for temporally-consistent detection in different frames, and the affinity matrix is normally learned from independent object features without considering the feature interaction between detected objects in the different frames. To address these problems, we first employ a joint feature extractor to fuse the 2D and 3D appearance features captured from 2D RGB images and 3D point clouds respectively, and then propose a novel convolutional operation, named RelationConv, to better exploit the correlation between each pair of objects in the adjacent frames and to learn a deep affinity matrix for further data association. Finally, we provide an extensive evaluation showing that our proposed model achieves state-of-the-art performance on the KITTI tracking benchmark.
CV Sep 9, 2020
RoIFusion: 3D Object Detection from LiDAR and Vision
Can Chen, Luca Zanotti Fragonara, Antonios Tsourdos
When localizing and detecting 3D objects for autonomous driving scenes, obtaining information from multiple sensors (e.g. camera, LIDAR) typically increases the robustness of 3D detectors. However, the efficient and effective fusion of different features captured from LIDAR and camera is still challenging, especially due to the sparsity and irregularity of point cloud distributions. This notwithstanding, point clouds offer useful complementary information. In this paper, we would like to leverage the advantages of LIDAR and camera sensors by proposing a deep neural network architecture for the fusion and the efficient detection of 3D objects by identifying their corresponding 3D bounding boxes with orientation. In order to achieve this task, instead of densely combining the point-wise features of the point cloud and the related pixel features, we propose a novel fusion algorithm that projects a set of 3D Regions of Interest (RoIs) from the point clouds to the 2D RoIs of the corresponding images. Finally, we demonstrate that our deep fusion approach achieves state-of-the-art performance on the challenging KITTI 3D object detection benchmark.
LG Jun 10, 2020
Scalable Partial Explainability in Neural Networks via Flexible Activation Functions
Schyler C. Sun, Chen Li, Zhuangkun Wei et al.
Achieving transparency in black-box deep learning algorithms is still an open challenge. High dimensional features and decisions given by deep neural networks (NN) require new algorithms and methods to expose their mechanisms. Current state-of-the-art NN interpretation methods (e.g. Saliency maps, DeepLIFT, LIME, etc.) focus more on the direct relationship between NN outputs and inputs than on the NN structure and operations themselves. In current deep NN operations, there is uncertainty over the exact role played by neurons with fixed activation functions. In this paper, we achieve a partially explainable learning model by symbolically explaining the role of activation functions (AF) under a scalable topology. This is carried out by modeling the AFs as adaptive Gaussian Processes (GP), which sit within a novel scalable NN topology, based on the Kolmogorov-Arnold Superposition Theorem (KST). In this scalable NN architecture, the AFs are generated by GP interpolation between control points and can thus be tuned during the back-propagation procedure via gradient descent. The control points act as the core enabler of both local and global adjustability of the AFs, where the GP interpolation constrains the intrinsic autocorrelation to avoid over-fitting. We show that there exists a trade-off between the NN's expressive power and interpretation complexity, under linear KST topology scaling. To demonstrate this, we perform a case study on a binary classification dataset of banknote authentication. By quantitatively and qualitatively investigating the mapping relationship between inputs and output, our explainable model can provide interpretation over each of the one-dimensional attributes. These early results suggest that our model has the potential to act as the final interpretation layer for deep neural networks.
CV Feb 27, 2020
Improving Learning Effectiveness For Object Detection and Classification in Cluttered Backgrounds
Vinorth Varatharasan, Hyo-Sang Shin, Antonios Tsourdos et al.
Usually, neural network models are trained with a large dataset of images in homogeneous backgrounds. The issue is that the performance of the trained network models could be significantly degraded in a complex and heterogeneous environment. To mitigate the issue, this paper develops a framework that autonomously generates a training dataset in heterogeneous cluttered backgrounds. The learning effectiveness of the proposed framework is expected to improve in complex and heterogeneous environments, compared with training on a typical dataset. In our framework, a state-of-the-art image segmentation technique called DeepLab is used to extract objects of interest from a picture, and a Chroma-key technique is then used to merge the extracted objects of interest into specific heterogeneous backgrounds. The performance of the proposed framework is investigated through empirical tests and compared with that of the model trained with the COCO dataset. The results show that the proposed framework outperforms the model it is compared with. This implies that the learning effectiveness of the developed framework is superior to that of models trained with the typical dataset.
CV Sep 23, 2019
Go Wider: An Efficient Neural Network for Point Cloud Analysis via Group Convolutions
Can Chen, Luca Zanotti Fragonara, Antonios Tsourdos
In order to achieve better performance for point cloud analysis, many researchers apply deeper neural networks using stacked Multi-Layer-Perceptron (MLP) convolutions over irregular point clouds. However, applying dense MLP convolutions over a large number of points (e.g. in autonomous driving applications) leads to inefficiency in memory and computation. To achieve high performance with less complexity, we propose a deep-wide neural network, called ShufflePointNet, to exploit fine-grained local features and reduce redundancy in parallel using group convolution and a channel shuffle operation. Unlike the conventional operation that directly applies MLPs on high-dimensional features of the point cloud, our model goes wider by splitting features into groups in advance, and each group, with a certain smaller depth, is only responsible for its respective MLP operation, which reduces complexity and allows more useful information to be encoded. Meanwhile, we enable communication between groups by shuffling groups in the feature channel to capture fine-grained features. We claim that the multi-branch method for wider neural networks is also beneficial to feature extraction for point clouds. We present extensive experiments for the shape classification task on the ModelNet40 dataset and the semantic segmentation task on the large-scale datasets ShapeNet part, S3DIS and KITTI. We further perform an ablation study and compare our model to other state-of-the-art algorithms in terms of complexity and accuracy.
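The channel shuffle step mentioned in this abstract can be illustrated with a minimal NumPy sketch. This is not the ShufflePointNet implementation: the function name `channel_shuffle`, the `(num_points, channels)` feature layout and the toy input are assumptions made purely for illustration.

```python
import numpy as np


def channel_shuffle(features: np.ndarray, groups: int) -> np.ndarray:
    """Interleave channels so each group receives channels from every other group.

    features: array of shape (num_points, channels); this layout is an assumption.
    groups:   number of channel groups; must divide the channel count.
    """
    num_points, channels = features.shape
    if channels % groups != 0:
        raise ValueError("channels must be divisible by groups")
    per_group = channels // groups
    # (N, C) -> (N, groups, per_group) -> swap the last two axes -> (N, C)
    grouped = features.reshape(num_points, groups, per_group)
    return grouped.transpose(0, 2, 1).reshape(num_points, channels)


# Toy input: one point with six channels, split into two groups of three.
toy = np.arange(6).reshape(1, 6)
shuffled = channel_shuffle(toy, groups=2)
print(shuffled)  # [[0 3 1 4 2 5]]
```

After the shuffle, a second grouped operation on the result would see channels that originated in both groups, which is the cross-group communication the abstract refers to.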
AI Aug 19, 2019
A Domain-Knowledge-Aided Deep Reinforcement Learning Approach for Flight Control Design
Hyo-Sang Shin, Shaoming He, Antonios Tsourdos
This paper aims to examine the potential of using the emerging deep reinforcement learning techniques in flight control. Instead of learning from scratch, we suggest leveraging available domain knowledge in learning to improve learning efficiency and generalisability. More specifically, the proposed approach fixes the autopilot structure as a typical three-loop autopilot, and deep reinforcement learning is utilised to learn the autopilot gains. To solve the flight control problem, we then formulate a Markovian decision process with a proper reward function that enables the application of reinforcement learning theory. Another type of domain knowledge is exploited for defining the reward function, by shaping reference inputs in consideration of important control objectives and using the shaped reference inputs in the reward function. The state-of-the-art deep deterministic policy gradient algorithm is utilised to learn an action policy that maps the observed states to the autopilot gains. Extensive empirical numerical simulations are performed to validate the proposed computational control algorithm.
CV Jun 10, 2019
Fast Hierarchical Neural Network for Feature Learning on Point Cloud
Can Chen, Luca Zanotti Fragonara, Antonios Tsourdos
The analysis of 3D point clouds is a highly complex task, often involving millions of points, yet it also requires computationally efficient algorithms because of many real-time applications, e.g. autonomous vehicles. However, point clouds are intrinsically irregular and the points are sparsely distributed in a non-Euclidean space, which normally requires point-wise processing to achieve high performance. Although shared filter matrices and pooling layers in convolutional neural networks (CNNs) are capable of reducing the dimensionality of the problem and extracting high-level information simultaneously, grids and a highly regular data format are required as input. In order to balance model performance and complexity, we introduce a novel neural network architecture exploiting local features from a manually subsampled point set. In our network, a recursive farthest point sampling method is first applied to efficiently cover the entire point set. Subsequently, we employ the k-nearest neighbours (knn) algorithm to gather a local neighbourhood for each group of the subsampled points. Finally, a multi-layer perceptron (MLP) is applied on the subsampled points and the edges that connect each point to its neighbours to extract local features. The architecture has been tested for both shape classification and segmentation using the ModelNet40 and ShapeNet part datasets, in order to show that the network achieves the best trade-off in terms of competitive performance when compared to other state-of-the-art algorithms.
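The sampling and grouping steps described in this abstract can be sketched in NumPy as follows. This is a minimal illustration using the standard iterative form of farthest point sampling and a brute-force k-nearest-neighbour search; it is not the authors' code, and the function names, the fixed starting index and the toy points are assumptions.

```python
import numpy as np


def farthest_point_sampling(points: np.ndarray, num_samples: int, start_index: int = 0) -> np.ndarray:
    """Return indices of num_samples points chosen greedily to be far apart.

    points: array of shape (N, 3). start_index is an illustrative assumption.
    """
    selected = np.empty(num_samples, dtype=np.int64)
    # Squared distance from every point to its closest already-selected point.
    min_dist = np.full(points.shape[0], np.inf)
    current = start_index
    for i in range(num_samples):
        selected[i] = current
        dist = np.sum((points - points[current]) ** 2, axis=1)
        min_dist = np.minimum(min_dist, dist)
        current = int(np.argmax(min_dist))
    return selected


def knn_indices(points: np.ndarray, centroids: np.ndarray, k: int) -> np.ndarray:
    """For each centroid, return the indices of its k nearest points (brute force)."""
    diff = centroids[:, None, :] - points[None, :, :]
    dist = np.sum(diff ** 2, axis=-1)            # shape (M, N)
    return np.argsort(dist, axis=1)[:, :k]       # shape (M, k)


# Toy point set along the x-axis.
pts = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [10.0, 0.0, 0.0], [0.5, 0.0, 0.0]])
centre_idx = farthest_point_sampling(pts, num_samples=2)
neighbours = knn_indices(pts, pts[centre_idx], k=2)
# Edge vectors from each sampled point to its neighbours, one possible MLP input.
edges = pts[neighbours] - pts[centre_idx][:, None, :]
print(centre_idx)   # [0 2]
print(neighbours)   # [[0 3]
                    #  [2 1]]
```

The brute-force distance matrix costs O(MN) memory, so a real pipeline over millions of points would likely use a spatial index or a GPU kernel instead.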
CV May 21, 2019
GAPNet: Graph Attention based Point Neural Network for Exploiting Local Feature of Point Cloud
Can Chen, Luca Zanotti Fragonara, Antonios Tsourdos
Exploiting fine-grained semantic features on point cloud is still challenging due to its irregular and sparse structure in a non-Euclidean space. Among existing studies, PointNet provides an efficient and promising approach to learn shape features directly on unordered 3D point cloud and has achieved competitive performance. However, local feature that is helpful towards better contextual learning is not considered. Meanwhile, attention mechanism shows efficiency in capturing node representation on graph-based data by attending over neighboring nodes. In this paper, we propose a novel neural network for point cloud, dubbed GAPNet, to learn local geometric representations by embedding graph attention mechanism within stacked Multi-Layer-Perceptron (MLP) layers. Firstly, we introduce a GAPLayer to learn attention features for each point by highlighting different attention weights on neighborhood. Secondly, in order to exploit sufficient features, a multi-head mechanism is employed to allow GAPLayer to aggregate different features from independent heads. Thirdly, we propose an attention pooling layer over neighbors to capture local signature aimed at enhancing network robustness. Finally, GAPNet applies stacked MLP layers to attention features and local signature to fully extract local geometric structures. The proposed GAPNet architecture is tested on the ModelNet40 and ShapeNet part datasets, and achieves state-of-the-art performance in both shape classification and part segmentation tasks.
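The idea of attending over neighbouring points can be illustrated with a generic single-head attention sketch in NumPy. It follows the common graph-attention recipe (a score per neighbour, a leaky ReLU, a softmax over the neighbourhood, then a weighted sum) and is not the GAPLayer itself; the function names, the weight vectors `w_self` and `w_nbr`, and the 0.2 leaky-ReLU slope are assumptions.

```python
import numpy as np


def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def neighbour_attention(point_feat, neighbour_feat, w_self, w_nbr):
    """Single-head attention over each point's k neighbours.

    point_feat:     (N, F) features of the centre points.
    neighbour_feat: (N, K, F) features of each point's K neighbours.
    w_self, w_nbr:  (F,) scoring vectors (hypothetical learnable parameters).
    Returns the attention weights (N, K) and the aggregated features (N, F).
    """
    self_score = point_feat @ w_self                 # (N,)
    nbr_score = neighbour_feat @ w_nbr               # (N, K)
    logits = self_score[:, None] + nbr_score
    logits = np.where(logits > 0, logits, 0.2 * logits)   # leaky ReLU
    weights = softmax(logits, axis=1)                # each row sums to 1
    aggregated = np.einsum("nk,nkf->nf", weights, neighbour_feat)
    return weights, aggregated


rng = np.random.default_rng(0)
centre = rng.normal(size=(5, 4))
nbrs = rng.normal(size=(5, 3, 4))
attn, pooled = neighbour_attention(centre, nbrs, rng.normal(size=4), rng.normal(size=4))
print(attn.shape, pooled.shape)  # (5, 3) (5, 4)
```

A multi-head variant, as mentioned in the abstract, would run several such heads with independent scoring vectors and concatenate their outputs.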
MA Nov 18, 2017
Anonymous Hedonic Game for Task Allocation in a Large-Scale Multiple Agent System
Inmo Jang, Hyo-Sang Shin, Antonios Tsourdos
This paper proposes a novel game-theoretical autonomous decision-making framework to address a task allocation problem for a swarm of multiple agents. We consider cooperation of self-interested agents, and show that our proposed decentralized algorithm guarantees convergence of agents with social inhibition to a Nash stable partition (i.e., social agreement) within polynomial time. The algorithm is simple and executable based on local interactions with neighbor agents under a strongly-connected communication network and even in asynchronous environments. We analytically present a mathematical formulation for computing the lower bound of suboptimality of the solution, and additionally show that at least 50% of suboptimality can be guaranteed if social utilities are non-decreasing functions with respect to the number of co-working agents. The results of numerical experiments confirm that the proposed framework is scalable, quickly adaptable to dynamic environments, and robust even in a realistic situation.
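The notion of a Nash stable partition can be illustrated with a small centralized best-response loop in Python. This is only a sketch of the stability concept under stated assumptions, not the paper's decentralized algorithm: the function names, the all-on-task-0 starting assignment, the fixed agent ordering and the equal-split utility are all hypothetical.

```python
from typing import Callable, List


def best_response_partition(
    num_agents: int,
    num_tasks: int,
    utility: Callable[[int, int, int], float],
    max_rounds: int = 1000,
) -> List[int]:
    """Let agents switch tasks one at a time until nobody wants to deviate.

    utility(agent, task, participants) is the agent's payoff for working on
    `task` together with `participants` agents in total (itself included).
    """
    assignment = [0] * num_agents          # assumption: everyone starts on task 0
    counts = [0] * num_tasks
    counts[0] = num_agents
    for _ in range(max_rounds):
        changed = False
        for agent in range(num_agents):
            current = assignment[agent]
            best_task = current
            best_value = utility(agent, current, counts[current])
            for task in range(num_tasks):
                if task == current:
                    continue
                value = utility(agent, task, counts[task] + 1)
                if value > best_value:     # move only on strict improvement
                    best_task, best_value = task, value
            if best_task != current:
                counts[current] -= 1
                counts[best_task] += 1
                assignment[agent] = best_task
                changed = True
        if not changed:
            return assignment              # no unilateral deviation pays off
    raise RuntimeError("no stable partition found within max_rounds")


REWARDS = [6.0, 3.0]                       # hypothetical task rewards


def shared_reward(agent: int, task: int, participants: int) -> float:
    """Equal split of the task reward; decreases as more agents join."""
    return REWARDS[task] / participants


print(best_response_partition(3, 2, shared_reward))  # [1, 0, 0]
```

The equal-split utility decreases with the number of participants, which matches the social inhibition setting for which the abstract states convergence; for other utilities the loop may hit `max_rounds`.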