Erik Blasch

CV
h-index39
32papers
1,422citations
Novelty38%
AI Score50

32 Papers

CVMar 19, 2023Code
CCTV-Gun: Benchmarking Handgun Detection in CCTV Images

Srikar Yellapragada, Zhenghong Li, Kevin Bhadresh Doshi et al.

Gun violence is a critical security problem, and it is imperative for the computer vision community to develop effective gun detection algorithms for real-world scenarios, particularly in Closed Circuit Television (CCTV) surveillance data. Despite significant progress in visual object detection, detecting guns in real-world CCTV images remains a challenging and under-explored task. Firearms, especially handguns, are typically very small in size, non-salient in appearance, and often severely occluded or indistinguishable from other small objects. Additionally, the lack of principled benchmarks and difficulty collecting relevant datasets further hinder algorithmic development. In this paper, we present a meticulously crafted and annotated benchmark, called \textbf{CCTV-Gun}, which addresses the challenges of detecting handguns in real-world CCTV images. Our contribution is three-fold. Firstly, we carefully select and analyze real-world CCTV images from three datasets, manually annotate handguns and their holders, and assign each image with relevant challenge factors such as blur and occlusion. Secondly, we propose a new cross-dataset evaluation protocol in addition to the standard intra-dataset protocol, which is vital for gun detection in practical settings. Finally, we comprehensively evaluate both classical and state-of-the-art object detection algorithms, providing an in-depth analysis of their generalizing abilities. The benchmark will facilitate further research and development on this topic and ultimately enhance security. Code, annotations, and trained models are available at https://github.com/srikarym/CCTV-Gun.

CVSep 13, 2023Code
Transparent Object Tracking with Enhanced Fusion Module

Kalyan Garigapati, Erik Blasch, Jie Wei et al.

Accurate tracking of transparent objects, such as glasses, plays a critical role in many robotic tasks such as robot-assisted living. Due to the adaptive and often reflective texture of such objects, traditional tracking algorithms that rely on general-purpose learned features suffer from reduced performance. Recent research has proposed to instill transparency awareness into existing general object trackers by fusing purpose-built features. However, with the existing fusion techniques, the addition of new features causes a change in the latent space making it impossible to incorporate transparency awareness on trackers with fixed latent spaces. For example, many of the current days transformer-based trackers are fully pre-trained and are sensitive to any latent space perturbations. In this paper, we present a new feature fusion technique that integrates transparency information into a fixed feature space, enabling its use in a broader range of trackers. Our proposed fusion module, composed of a transformer encoder and an MLP module, leverages key query-based transformations to embed the transparency information into the tracking pipeline. We also present a new two-step training strategy for our fusion module to effectively merge transparency features. We propose a new tracker architecture that uses our fusion techniques to achieve superior results for transparent object tracking. Our proposed method achieves competitive results with state-of-the-art trackers on TOTB, which is the largest transparent object tracking benchmark recently released. Our results and the implementation of code will be made publicly available at https://github.com/kalyan0510/TOTEM.

CVJun 21, 2022
Automatic Concept Extraction for Concept Bottleneck-based Video Classification

Jeya Vikranth Jeyakumar, Luke Dickens, Luis Garcia et al.

Recent efforts in interpretable deep learning models have shown that concept-based explanation methods achieve competitive accuracy with standard end-to-end models and enable reasoning and intervention about extracted high-level visual concepts from images, e.g., identifying the wing color and beak length for bird-species classification. However, these concept bottleneck models rely on a necessary and sufficient set of predefined concepts-which is intractable for complex tasks such as video classification. For complex tasks, the labels and the relationship between visual elements span many frames, e.g., identifying a bird flying or catching prey-necessitating concepts with various levels of abstraction. To this end, we present CoDEx, an automatic Concept Discovery and Extraction module that rigorously composes a necessary and sufficient set of concept abstractions for concept-based video classification. CoDEx identifies a rich set of complex concept abstractions from natural language explanations of videos-obviating the need to predefine the amorphous set of concepts. To demonstrate our method's viability, we construct two new public datasets that combine existing complex video classification datasets with short, crowd-sourced natural language explanations for their labels. Our method elicits inherent complex concept abstractions in natural language to generalize concept-bottleneck methods to complex tasks.

CRJul 22, 2022
DeFakePro: Decentralized DeepFake Attacks Detection using ENF Authentication

Deeraj Nagothu, Ronghua Xu, Yu Chen et al.

Advancements in generative models, like Deepfake allows users to imitate a targeted person and manipulate online interactions. It has been recognized that disinformation may cause disturbance in society and ruin the foundation of trust. This article presents DeFakePro, a decentralized consensus mechanism-based Deepfake detection technique in online video conferencing tools. Leveraging Electrical Network Frequency (ENF), an environmental fingerprint embedded in digital media recording, affords a consensus mechanism design called Proof-of-ENF (PoENF) algorithm. The similarity in ENF signal fluctuations is utilized in the PoENF algorithm to authenticate the media broadcasted in conferencing tools. By utilizing the video conferencing setup with malicious participants to broadcast deep fake video recordings to other participants, the DeFakePro system verifies the authenticity of the incoming media in both audio and video channels.

13.3AIApr 16
Predicting Power-System Dynamic Trajectories with Foundation Models

Haoran Li, Lihao Mai, Chenhan Xiao et al.

As power systems transition toward renewable-rich and inverter-dominated operations, accurate time-domain dynamic analysis becomes increasingly critical. Such analysis supports key operational tasks, including transient stability assessment, dynamic security analysis, contingency screening, and post-fault trajectory evaluation. In practice, these tasks may operate under several challenges, including unknown and time-varying system parameters, privacy constraints on data sharing, and the need for fast online inference. Existing learning-based approaches are typically trained for individual systems and therefore lack generalization across operating conditions and physical parameters. Hence, this paper proposes LArge Scale Small ODE (LASS)-ODE-Power, a learning framework for general-purpose time-domain prediction. The proposed approach leverages large-scale pretraining on more than 40 GB of DAE or ordinary differential-equation (ODE) trajectories to learn transferable representations. The resulting model supports trajectory prediction from short measurement prefixes across diverse dynamic regimes, including electromechanical and inverter-driven systems. Hence, the model can be directly used without data sharing in a zero-shot setting. In addition, the proposed architecture incorporates parallel and linearized computation to achieve fast inference. Moreover, to enhance task-specific performance in power systems, a specialized fine-tuning strategy is developed based on approximately 1 GB of heterogeneous power-system dynamic data. Extensive experiments over diverse power-system simulation scenarios demonstrate that LASS-ODE-Power consistently outperforms existing learning-based models in trajectory prediction accuracy with efficient inference.

CYJan 22, 2025
PADTHAI-MM: Principles-based Approach for Designing Trustworthy, Human-centered AI using MAST Methodology

Myke C. Cohen, Nayoung Kim, Yang Ba et al.

Despite an extensive body of literature on trust in technology, designing trustworthy AI systems for high-stakes decision domains remains a significant challenge, further compounded by the lack of actionable design and evaluation tools. The Multisource AI Scorecard Table (MAST) was designed to bridge this gap by offering a systematic, tradecraft-centered approach to evaluating AI-enabled decision support systems. Expanding on MAST, we introduce an iterative design framework called \textit{Principles-based Approach for Designing Trustworthy, Human-centered AI using MAST Methodology} (PADTHAI-MM). We demonstrate this framework in our development of the Reporting Assistant for Defense and Intelligence Tasks (READIT), a research platform that leverages data visualizations and natural language processing-based text analysis, emulating an AI-enabled system supporting intelligence reporting work. To empirically assess the efficacy of MAST on trust in AI, we developed two distinct iterations of READIT for comparison: a High-MAST version, which incorporates AI contextual information and explanations, and a Low-MAST version, akin to a ``black box'' system. This iterative design process, guided by stakeholder feedback and contemporary AI architectures, culminated in a prototype that was evaluated through its use in an intelligence reporting task. We further discuss the potential benefits of employing the MAST-inspired design framework to address context-specific needs. We also explore the relationship between stakeholder evaluators' MAST ratings and three categories of information known to impact trust: \textit{process}, \textit{purpose}, and \textit{performance}. Overall, our study supports the practical benefits and theoretical validity for PADTHAI-MM as a viable method for designing trustable, context-specific AI systems.

7.3LGMar 27
Context-specific Credibility-aware Multimodal Fusion with Conditional Probabilistic Circuits

Pranuthi Tenali, Sahil Sidheekh, Saurabh Mathur et al.

Multimodal fusion requires integrating information from multiple sources that may conflict depending on context. Existing fusion approaches typically rely on static assumptions about source reliability, limiting their ability to resolve conflicts when a modality becomes unreliable due to situational factors such as sensor degradation or class-specific corruption. We introduce C$^2$MF, a context-specfic credibility-aware multimodal fusion framework that models per-instance source reliability using a Conditional Probabilistic Circuit (CPC). We formalize instance-level reliability through Context-Specific Information Credibility (CSIC), a KL-divergence-based measure computed exactly from the CPC. CSIC generalizes conventional static credibility estimates as a special case, enabling principled and adaptive reliability assessment. To evaluate robustness under cross-modal conflicts, we propose the Conflict benchmark, in which class-specific corruptions deliberately induce discrepancies between different modalities. Experimental results show that C$^2$MF improves predictive accuracy by up to 29% over static-reliability baselines in high-noise settings, while preserving the interpretability advantages of probabilistic circuit-based fusion.

CVOct 26, 2023
Image Prior and Posterior Conditional Probability Representation for Efficient Damage Assessment

Jie Wei, Weicong Feng, Erik Blasch et al.

It is important to quantify Damage Assessment (DA) for Human Assistance and Disaster Response (HADR) applications. In this paper, to achieve efficient and scalable DA in HADR, an image prior and posterior conditional probability (IP2CP) is developed as an effective computational imaging representation. Equipped with the IP2CP representation, the matching pre- and post-disaster images are effectively encoded into one image that is then processed using deep learning approaches to determine the damage levels. Two scenarios of crucial importance for the practical use of DA in HADR applications are examined: pixel-wise semantic segmentation and patch-based contrastive learning-based global damage classification. Results achieved by IP2CP in both scenarios demonstrate promising performances, showing that our IP2CP-based methods within the deep learning framework can effectively achieve data and computational efficiency, which is of utmost importance for the DA in HADR applications.

CLFeb 8, 2025
Towards Trustworthy Retrieval Augmented Generation for Large Language Models: A Survey

Bo Ni, Zheyuan Liu, Leyao Wang et al.

Retrieval-Augmented Generation (RAG) is an advanced technique designed to address the challenges of Artificial Intelligence-Generated Content (AIGC). By integrating context retrieval into content generation, RAG provides reliable and up-to-date external knowledge, reduces hallucinations, and ensures relevant context across a wide range of tasks. However, despite RAG's success and potential, recent studies have shown that the RAG paradigm also introduces new risks, including robustness issues, privacy concerns, adversarial attacks, and accountability issues. Addressing these risks is critical for future applications of RAG systems, as they directly impact their trustworthiness. Although various methods have been developed to improve the trustworthiness of RAG methods, there is a lack of a unified perspective and framework for research in this topic. Thus, in this paper, we aim to address this gap by providing a comprehensive roadmap for developing trustworthy RAG systems. We place our discussion around five key perspectives: reliability, privacy, safety, fairness, explainability, and accountability. For each perspective, we present a general framework and taxonomy, offering a structured approach to understanding the current challenges, evaluating existing solutions, and identifying promising future research directions. To encourage broader adoption and innovation, we also highlight the downstream applications where trustworthy RAG systems have a significant impact.

AIOct 11, 2024
Towards Trustworthy Knowledge Graph Reasoning: An Uncertainty Aware Perspective

Bo Ni, Yu Wang, Lu Cheng et al.

Recently, Knowledge Graphs (KGs) have been successfully coupled with Large Language Models (LLMs) to mitigate their hallucinations and enhance their reasoning capability, such as in KG-based retrieval-augmented frameworks. However, current KG-LLM frameworks lack rigorous uncertainty estimation, limiting their reliable deployment in high-stakes applications. Directly incorporating uncertainty quantification into KG-LLM frameworks presents challenges due to their complex architectures and the intricate interactions between the knowledge graph and language model components. To address this gap, we propose a new trustworthy KG-LLM framework, Uncertainty Aware Knowledge-Graph Reasoning (UAG), which incorporates uncertainty quantification into the KG-LLM framework. We design an uncertainty-aware multi-step reasoning framework that leverages conformal prediction to provide a theoretical guarantee on the prediction set. To manage the error rate of the multi-step process, we additionally introduce an error rate control module to adjust the error rate within the individual components. Extensive experiments show that our proposed UAG can achieve any pre-defined coverage rate while reducing the prediction set/interval size by 40% on average over the baselines.

LGMar 5, 2024
Credibility-Aware Multi-Modal Fusion Using Probabilistic Circuits

Sahil Sidheekh, Pranuthi Tenali, Saurabh Mathur et al.

We consider the problem of late multi-modal fusion for discriminative learning. Motivated by noisy, multi-source domains that require understanding the reliability of each data source, we explore the notion of credibility in the context of multi-modal fusion. We propose a combination function that uses probabilistic circuits (PCs) to combine predictive distributions over individual modalities. We also define a probabilistic measure to evaluate the credibility of each modality via inference queries over the PC. Our experimental evaluation demonstrates that our fusion method can reliably infer credibility while maintaining competitive performance with the state-of-the-art.

CVAug 2, 2025
Effective Damage Data Generation by Fusing Imagery with Human Knowledge Using Vision-Language Models

Jie Wei, Erika Ardiles-Cruz, Aleksey Panasyuk et al.

It is of crucial importance to assess damages promptly and accurately in humanitarian assistance and disaster response (HADR). Current deep learning approaches struggle to generalize effectively due to the imbalance of data classes, scarcity of moderate damage examples, and human inaccuracy in pixel labeling during HADR situations. To accommodate for these limitations and exploit state-of-the-art techniques in vision-language models (VLMs) to fuse imagery with human knowledge understanding, there is an opportunity to generate a diversified set of image-based damage data effectively. Our initial experimental results suggest encouraging data generation quality, which demonstrates an improvement in classifying scenes with different levels of structural damage to buildings, roads, and infrastructures.

LGFeb 1
LASS-ODE: Scaling ODE Computations to Connect Foundation Models with Dynamical Physical Systems

Haoran Li, Chenhan Xiao, Lihao Mai et al.

Foundation models have transformed language, vision, and time series data analysis, yet progress on dynamic predictions for physical systems remains limited. Given the complexity of physical constraints, two challenges stand out. $(i)$ Physics-computation scalability: physics-informed learning can enforce physical regularization, but its computation (e.g., ODE integration) does not scale to extensive systems. $(ii)$ Knowledge-sharing efficiency: the attention mechanism is primarily computed within each system, which limits the extraction of shared ODE structures across systems. We show that enforcing ODE consistency does not require expensive nonlinear integration: a token-wise locally linear ODE representation preserves physical fidelity while scaling to foundation-model regimes. Thus, we propose novel token representations that respect locally linear ODE evolution. Such linearity substantially accelerates integration while accurately approximating the local data manifold. Second, we introduce a simple yet effective inter-system attention that augments attention with a common structure hub (CSH) that stores shared tokens and aggregates knowledge across systems. The resulting model, termed LASS-ODE (\underline{LA}rge-\underline{S}cale \underline{S}mall \underline{ODE}), is pretrained on our $40$GB ODE trajectory collections to enable strong in-domain performance, zero-shot generalization across diverse ODE systems, and additional improvements through fine-tuning.

AINov 3, 2021
The Powerful Use of AI in the Energy Sector: Intelligent Forecasting

Erik Blasch, Haoran Li, Zhihao Ma et al.

Artificial Intelligence (AI) techniques continue to broaden across governmental and public sectors, such as power and energy - which serve as critical infrastructures for most societal operations. However, due to the requirements of reliability, accountability, and explainability, it is risky to directly apply AI-based methods to power systems because society cannot afford cascading failures and large-scale blackouts, which easily cost billions of dollars. To meet society requirements, this paper proposes a methodology to develop, deploy, and evaluate AI systems in the energy sector by: (1) understanding the power system measurements with physics, (2) designing AI algorithms to forecast the need, (3) developing robust and accountable AI methods, and (4) creating reliable measures to evaluate the performance of the AI model. The goal is to provide a high level of confidence to energy utility users. For illustration purposes, the paper uses power system event forecasting (PEF) as an example, which carefully analyzes synchrophasor patterns measured by the Phasor Measurement Units (PMUs). Such a physical understanding leads to a data-driven framework that reduces the dimensionality with physics and forecasts the event with high credibility. Specifically, for dimensionality reduction, machine learning arranges physical information from different dimensions, resulting inefficient information extraction. For event forecasting, the supervised learning model fuses the results of different models to increase the confidence. Finally, comprehensive experiments demonstrate the high accuracy, efficiency, and reliability as compared to other state-of-the-art machine learning methods.

AINov 3, 2021
Certifiable Artificial Intelligence Through Data Fusion

Erik Blasch, Junchi Bin, Zheng Liu

This paper reviews and proposes concerns in adopting, fielding, and maintaining artificial intelligence (AI) systems. While the AI community has made rapid progress, there are challenges in certifying AI systems. Using procedures from design and operational test and evaluation, there are opportunities towards determining performance bounds to manage expectations of intended use. A notional use case is presented with image data fusion to support AI object recognition certifiability considering precision versus distance.

LGOct 27, 2021
NIDA-CLIFGAN: Natural Infrastructure Damage Assessment through Efficient Classification Combining Contrastive Learning, Information Fusion and Generative Adversarial Networks

Jie Wei, Zhigang Zhu, Erik Blasch et al.

During natural disasters, aircraft and satellites are used to survey the impacted regions. Usually human experts are needed to manually label the degrees of the building damage so that proper humanitarian assistance and disaster response (HADR) can be achieved, which is labor-intensive and time-consuming. Expecting human labeling of major disasters over a wide area gravely slows down the HADR efforts. It is thus of crucial interest to take advantage of the cutting-edge Artificial Intelligence and Machine Learning techniques to speed up the natural infrastructure damage assessment process to achieve effective HADR. Accordingly, the paper demonstrates a systematic effort to achieve efficient building damage classification. First, two novel generative adversarial nets (GANs) are designed to augment data used to train the deep-learning-based classifier. Second, a contrastive learning based method using novel data structures is developed to achieve great performance. Third, by using information fusion, the classifier is effectively trained with very few training data samples for transfer learning. All the classifiers are small enough to be loaded in a smart phone or simple laptop for first responders. Based on the available overhead imagery dataset, results demonstrate data and computational efficiency with 10% of the collected data combined with a GAN reducing the time of computation from roughly half a day to about 1 hour with roughly similar classification performances.

NIApr 2, 2021
UAV-Assisted Communication in Remote Disaster Areas using Imitation Learning

Alireza Shamsoshoara, Fatemeh Afghah, Erik Blasch et al.

The damage to cellular towers during natural and man-made disasters can disturb the communication services for cellular users. One solution to the problem is using unmanned aerial vehicles to augment the desired communication network. The paper demonstrates the design of a UAV-Assisted Imitation Learning (UnVAIL) communication system that relays the cellular users' information to a neighbor base station. Since the user equipment (UEs) are equipped with buffers with limited capacity to hold packets, UnVAIL alternates between different UEs to reduce the chance of buffer overflow, positions itself optimally close to the selected UE to reduce service time, and uncovers a network pathway by acting as a relay node. UnVAIL utilizes Imitation Learning (IL) as a data-driven behavioral cloning approach to accomplish an optimal scheduling solution. Results demonstrate that UnVAIL performs similar to a human expert knowledge-based planning in communication timeliness, position accuracy, and energy consumption with an accuracy of 97.52% when evaluated on a developed simulator to train the UAV.

AIFeb 8, 2021
Multisource AI Scorecard Table for System Evaluation

Erik Blasch, James Sung, Tao Nguyen

The paper describes a Multisource AI Scorecard Table (MAST) that provides the developer and user of an artificial intelligence (AI)/machine learning (ML) system with a standard checklist focused on the principles of good analysis adopted by the intelligence community (IC) to help promote the development of more understandable systems and engender trust in AI outputs. Such a scorecard enables a transparent, consistent, and meaningful understanding of AI tools applied for commercial and government use. A standard is built on compliance and agreement through policy, which requires buy-in from the stakeholders. While consistency for testing might only exist across a standard data set, the community requires discussion on verification and validation approaches which can lead to interpretability, explainability, and proper use. The paper explores how the analytic tradecraft standards outlined in Intelligence Community Directive (ICD) 203 can provide a framework for assessing the performance of an AI system supporting various operational needs. These include sourcing, uncertainty, consistency, accuracy, and visualization. Three use cases are presented as notional examples that support security for comparative analysis.

CVDec 28, 2020
Aerial Imagery Pile burn detection using Deep Learning: the FLAME dataset

Alireza Shamsoshoara, Fatemeh Afghah, Abolfazl Razi et al.

Wildfires are one of the costliest and deadliest natural disasters in the US, causing damage to millions of hectares of forest resources and threatening the lives of people and animals. Of particular importance are risks to firefighters and operational forces, which highlights the need for leveraging technology to minimize danger to people and property. FLAME (Fire Luminosity Airborne-based Machine learning Evaluation) offers a dataset of aerial images of fires along with methods for fire detection and segmentation which can help firefighters and researchers to develop optimal fire management strategies. This paper provides a fire image dataset collected by drones during a prescribed burning piled detritus in an Arizona pine forest. The dataset includes video recordings and thermal heatmaps captured by infrared cameras. The captured videos and images are annotated and labeled frame-wise to help researchers easily apply their fire detection and modeling algorithms. The paper also highlights solutions to two machine learning problems: (1) Binary classification of video frames based on the presence [and absence] of fire flames. An Artificial Neural Network (ANN) method is developed that achieved a 76% classification accuracy. (2) Fire detection using segmentation methods to precisely determine fire borders. A deep learning method is designed based on the U-Net up-sampling and down-sampling approach to extract a fire mask from the video frames. Our FLAME method approached a precision of 92% and a recall of 84%. Future research will expand the technique for free burning broadcast fire using thermal images.

CRApr 16, 2020
Hybrid Blockchain-Enabled Secure Microservices Fabric for Decentralized Multi-Domain Avionics Systems

Ronghua Xu, Yu Chen, Erik Blasch et al.

Advancement in artificial intelligence (AI) and machine learning (ML), dynamic data driven application systems (DDDAS), and hierarchical cloud-fog-edge computing paradigm provide opportunities for enhancing multi-domain systems performance. As one example that represents multi-domain scenario, a "fly-by-feel" system utilizes DDDAS framework to support autonomous operations and improve maneuverability, safety and fuel efficiency. The DDDAS "fly-by-feel" avionics system can enhance multi-domain coordination to support domain specific operations. However, conventional enabling technologies rely on a centralized manner for data aggregation, sharing and security policy enforcement, and it incurs critical issues related to bottleneck of performance, data provenance and consistency. Inspired by the containerized microservices and blockchain technology, this paper introduces BLEM, a hybrid BLockchain-Enabled secure Microservices fabric to support decentralized, secure and efficient data fusion and multi-domain operations for avionics systems. Leveraging the fine-granularity and loose-coupling features of the microservices architecture, multidomain operations and security functionalities are decoupled into multiple containerized microservices. A hybrid blockchain fabric based on two-level committee consensus protocols is proposed to enable decentralized security architecture and support immutability, auditability and traceability for data provenience in existing multi-domain avionics system. Our evaluation results show the feasibility of the proposed BLEM mechanism to support decentralized security service and guarantee immutability, auditability and traceability for data provenience across domain boundaries.

CVMar 9, 2020
I-ViSE: Interactive Video Surveillance as an Edge Service using Unsupervised Feature Queries

Seyed Yahya Nikouei, Yu Chen, Alexander Aved et al.

Situation AWareness (SAW) is essential for many mission critical applications. However, SAW is very challenging when trying to immediately identify objects of interest or zoom in on suspicious activities from thousands of video frames. This work aims at developing a queryable system to instantly select interesting content. While face recognition technology is mature, in many scenarios like public safety monitoring, the features of objects of interest may be much more complicated than face features. In addition, human operators may not be always able to provide a descriptive, simple, and accurate query. Actually, it is more often that there are only rough, general descriptions of certain suspicious objects or accidents. This paper proposes an Interactive Video Surveillance as an Edge service (I-ViSE) based on unsupervised feature queries. Adopting unsupervised methods that do not reveal any private information, the I-ViSE scheme utilizes general features of a human body and color of clothes. An I-ViSE prototype is built following the edge-fog computing paradigm and the experimental results verified the I-ViSE scheme meets the design goal of scene recognition in less than two seconds.

CYNov 3, 2019
Artificial Intelligence Strategies for National Security and Safety Standards

Erik Blasch, James Sung, Tao Nguyen et al.

Recent advances in artificial intelligence (AI) have lead to an explosion of multimedia applications (e.g., computer vision (CV) and natural language processing (NLP)) for different domains such as commercial, industrial, and intelligence. In particular, the use of AI applications in a national security environment is often problematic because the opaque nature of the systems leads to an inability for a human to understand how the results came about. A reliance on 'black boxes' to generate predictions and inform decisions is potentially disastrous. This paper explores how the application of standards during each stage of the development of an AI system deployed and used in a national security environment would help enable trust. Specifically, we focus on the standards outlined in Intelligence Community Directive 203 (Analytic Standards) to subject machine outputs to the same rigorous standards as analysis performed by humans.

CVSep 12, 2019
I-SAFE: Instant Suspicious Activity identiFication at the Edge using Fuzzy Decision Making

Seyed Yahya Nikouei, Yu Chen, Alexander Aved et al.

Urban imagery usually serves as forensic analysis and by design is available for incident mitigation. As more imagery collected, it is harder to narrow down to certain frames among thousands of video clips to a specific incident. A real-time, proactive surveillance system is desirable, which could instantly detect dubious personnel, identify suspicious activities, or raise momentous alerts. The recent proliferation of the edge computing paradigm allows more data-intensive tasks to be accomplished by smart edge devices with lightweight but powerful algorithms. This paper presents a forensic surveillance strategy by introducing an Instant Suspicious Activity identiFication at the Edge (I-SAFE) using fuzzy decision making. A fuzzy control system is proposed to mimic the decision-making process of a security officer. Decisions are made based on video features extracted by a lightweight Deep Machine Learning (DML) model. Based on the requirements from the first-line law enforcement officers, several features are selected and fuzzified to cope with the state of uncertainty that exists in the officers' decision-making process. Using features in the edge hierarchy minimizes the communication delay such that instant alerting is achieved. Additionally, leveraging the Microservices architecture, the I-SAFE scheme possesses good scalability given the increasing complexities at the network edge. Implemented as an edge-based application and tested using exemplary and various labeled dataset surveillance videos, the I-SAFE scheme raises alerts by identifying the suspicious activity in an average of 0.002 seconds. Compared to four other state-of-the-art methods over two other data sets, the experimental study verified the superiority of the I-SAFE decentralized method.

CVApr 16, 2019
Clustered Object Detection in Aerial Images

Fan Yang, Heng Fan, Peng Chu et al.

Detecting objects in aerial images is challenging for at least two reasons: (1) target objects like pedestrians are very small in pixels, making them hardly distinguished from surrounding background; and (2) targets are in general sparsely and non-uniformly distributed, making the detection very inefficient. In this paper, we address both issues inspired by observing that these targets are often clustered. In particular, we propose a Clustered Detection (ClusDet) network that unifies object clustering and detection in an end-to-end framework. The key components in ClusDet include a cluster proposal sub-network (CPNet), a scale estimation sub-network (ScaleNet), and a dedicated detection network (DetecNet). Given an input image, CPNet produces object cluster regions and ScaleNet estimates object scales for these regions. Then, each scale-normalized cluster region is fed into DetecNet for object detection. ClusDet has several advantages over previous solutions: (1) it greatly reduces the number of chips for final object detection and hence achieves high running time efficiency, (2) the cluster-based scale estimation is more accurate than previously used single-object based ones, hence effectively improves the detection for small objects, and (3) the final DetecNet is dedicated for clustered regions and implicitly models the prior context information so as to boost detection accuracy. The proposed method is tested on three popular aerial image datasets including VisDrone, UAVDT and DOTA. In all experiments, ClusDet achieves promising performance in comparison with state-of-the-art detectors. Code will be available in \url{https://github.com/fyangneil}.

DCMar 11, 2019
Decentralized Smart Surveillance through Microservices Platform

Seyed Yahya Nikouei, Ronghua Xu, Yu Chen et al.

Connected societies require reliable measures to assure the safety, privacy, and security of members. Public safety technology has made fundamental improvements since the first generation of surveillance cameras were introduced, which aims to reduce the role of observer agents so that no abnormality goes unnoticed. While the edge computing paradigm promises solutions to address the shortcomings of cloud computing, e.g., the extra communication delay and network security issues, it also introduces new challenges. One of the main concerns is the limited computing power at the edge to meet the on-site dynamic data processing. In this paper, a Lightweight IoT (Internet of Things) based Smart Public Safety (LISPS) framework is proposed on top of microservices architecture. As a computing hierarchy at the edge, the LISPS system possesses high flexibility in the design process, loose coupling to add new services or update existing functions without interrupting the normal operations, and efficient power balancing. A real-world public safety monitoring scenario is selected to verify the effectiveness of LISPS, which detects, tracks human objects and identify suspicious activities. The experimental results demonstrate the feasibility of the approach.

CRMar 8, 2019
A Study on Smart Online Frame Forging Attacks against Video Surveillance System

Deeraj Nagothu, Jacob Schwell, Yu Chen et al.

Video Surveillance Systems (VSS) have become an essential infrastructural element of smart cities by increasing public safety and countering criminal activities. A VSS is normally deployed in a secure network to prevent access from unauthorized personnel. Compared to traditional systems that continuously record video regardless of the actions in the frame, a smart VSS has the capability of capturing video data upon motion detection or object detection, and then extracts essential information and send to users. This increasing design complexity of the surveillance system, however, also introduces new security vulnerabilities. In this work, a smart, real-time frame duplication attack is investigated. We show the feasibility of forging the video streams in real-time as the camera's surroundings change. The generated frames are compared constantly and instantly to identify changes in the pixel values that could represent motion detection or changes in light intensities outdoors. An attacker (intruder) can remotely trigger the replay of some previously duplicated video streams manually or automatically, via a special quick response (QR) code or when the face of an intruder appears in the camera field of view. A detection technique is proposed by leveraging the real-time electrical network frequency (ENF) reference database to match with the power grid frequency.

CROct 1, 2018
An Exploration of Blockchain Enabled Decentralized Capability based Access Control Strategy for Space Situation Awareness

Ronghua Xu, Yu Chen, Erik Blasch et al.

Space situation awareness (SSA) includes tracking of active and inactive resident space objects (RSOs) and assessing the space environment through sensor data collection and processing. To enhance SSA, the dynamic data-driven applications systems (DDDAS) framework couples on-line data with off-line models to enhance system performance. Using feedback control, sensor management, and communications reliability. For information management, there is a need for identity authentication and access control to ensure the integrity of exchanged data as well as to grant authorized entities access right to data and services. Due to decentralization and heterogeneity of SSA systems, it is challenging to build an efficient centralized access control system, which could either be a performance bottleneck or the single point of failure. Inspired by the blockchain and smart contract technology, this paper introduces BlendCAC, a decentralized authentication and capability-based access control mechanism to enable effective protection for devices, services and information in SSA networks. To achieve secure identity authentication, the BlendCAC leverages the blockchain to create virtual trust zones and a robust identity-based capability token management strategy is proposed. A proof-of-concept prototype has been implemented on both resources-constrained devices and more powerful computing devices, and is tested on a private Ethereum blockchain network. The experimental results demonstrate the feasibility of the BlendCAC scheme to offer a decentralized, scalable, lightweight and fine-grained access control solution for space system towards SSA.

DCJul 17, 2018
Real-Time Index Authentication for Event-Oriented Surveillance Video Query using Blockchain

Seyed Yahya Nikouei, Ronghua Xu, Deeraj Nagothu et al.

Information from surveillance video is essential for situational awareness (SAW). Nowadays, a prohibitively large amount of surveillance data is being generated continuously by ubiquitously distributed video sensors. It is very challenging to immediately identify the objects of interest or zoom in suspicious actions from thousands of video frames. Making the big data indexable is critical to tackle this problem. It is ideal to generate pattern indexes in a real-time, on-site manner on the video streaming instead of depending on the batch processing at the cloud centers. The modern edge-fog-cloud computing paradigm allows implementation of time sensitive tasks at the edge of the network. The on-site edge devices collect the information sensed in format of frames and extracts useful features. The near-site fog nodes conduct the contextualization and classification of the features. The remote cloud center is in charge of more data intensive and computing intensive tasks. However, exchanging the index information among devices in different layers raises security concerns where an adversary can capture or tamper with features to mislead the surveillance system. In this paper, a blockchain enabled scheme is proposed to protect the index data through an encrypted secure channel between the edge and fog nodes. It reduces the chance of attacks on the small edge and fog devices. The feasibility of the proposal is validated through intensive experimental analysis.

CVJun 25, 2018
IR2VI: Enhanced Night Environmental Perception by Unsupervised Thermal Image Translation

Shuo Liu, Vijay John, Erik Blasch et al.

Context enhancement is critical for night vision (NV) applications, especially for the dark night situation without any artificial lights. In this paper, we present the infrared-to-visual (IR2VI) algorithm, a novel unsupervised thermal-to-visible image translation framework based on generative adversarial networks (GANs). IR2VI is able to learn the intrinsic characteristics from VI images and integrate them into IR images. Since the existing unsupervised GAN-based image translation approaches face several challenges, such as incorrect mapping and lack of fine details, we propose a structure connection module and a region-of-interest (ROI) focal loss method to address the current limitations. Experimental results show the superiority of the IR2VI algorithm over baseline methods.

NIApr 24, 2018
BlendCAC: A BLockchain-ENabled Decentralized Capability-based Access Control for IoTs

Ronghua Xu, Yu Chen, Erik Blasch et al.

The prevalence of Internet of Things (IoTs) allows heterogeneous embedded smart devices to collaboratively provide smart services with or without human intervention. While leveraging the large scale IoT based applications like Smart Gird or Smart Cities, IoTs also incur more concerns on privacy and security. Among the top security challenges that IoTs face, access authorization is critical in resource sharing and information protection. One of the weaknesses in today's access control (AC) is the centralized authorization server, which can be the performance bottleneck or the single point of failure. In this paper, BlendCAC, a blockchain enabled decentralized capability based AC is proposed for the security of IoTs. The BlendCAC aims at an effective access control processes to devices, services and information in large scale IoT systems. Based on the blockchain network, a capability delegation mechanism is suggested for access permission propagation. A robust identity based capability token management strategy is proposed, which takes advantage of smart contract for registering, propagation and revocation of the access authorization. In the proposed BlendCAC scheme, IoT devices are their own master to control their resources instead of being supervised by a centralized authority. Implemented and tested on a Raspberry Pi device and on a local private blockchain network, our experimental results demonstrate the feasibility of the proposed BlendCAC approach to offer a decentralized, scalable, lightweight and fine grained AC solution to IoT systems.

CVJan 18, 2016
A Comparative Study of Object Trackers for Infrared Flying Bird Tracking

Ying Huang, Hong Zheng, Haibin Ling et al.

Bird strikes present a huge risk for aircraft, especially since traditional airport bird surveillance is mainly dependent on inefficient human observation. Computer vision based technology has been proposed to automatically detect birds, determine bird flying trajectories, and predict aircraft takeoff delays. However, the characteristics of bird flight using imagery and the performance of existing methods applied to flying bird task are not well known. Therefore, we perform infrared flying bird tracking experiments using 12 state-of-the-art algorithms on a real BIRDSITE-IR dataset to obtain useful clues and recommend feature analysis. We also develop a Struck-scale method to demonstrate the effectiveness of multiple scale sampling adaption in handling the object of flying bird with varying shape and scale. The general analysis can be used to develop specialized bird tracking methods for airport safety, wildness and urban bird population studies.