CVJul 5, 2022
Aesthetic Attribute Assessment of Images Numerically on Mixed Multi-attribute DatasetsXin Jin, Xinning Li, Hao Lou et al.
With the continuous development of social software and multimedia technology, images have become a kind of important carrier for spreading information and socializing. How to evaluate an image comprehensively has become the focus of recent researches. The traditional image aesthetic assessment methods often adopt single numerical overall assessment scores, which has certain subjectivity and can no longer meet the higher aesthetic requirements. In this paper, we construct an new image attribute dataset called aesthetic mixed dataset with attributes(AMD-A) and design external attribute features for fusion. Besides, we propose a efficient method for image aesthetic attribute assessment on mixed multi-attribute dataset and construct a multitasking network architecture by using the EfficientNet-B0 as the backbone network. Our model can achieve aesthetic classification, overall scoring and attribute scoring. In each sub-network, we improve the feature extraction through ECA channel attention module. As for the final overall scoring, we adopt the idea of the teacher-student network and use the classification sub-network to guide the aesthetic overall fine-grain regression. Experimental results, using the MindSpore, show that our proposed method can effectively improve the performance of the aesthetic overall and attribute assessment.
DCJan 14, 2016Code
Benchmarking, System Design and Case-studies for Multi-core based Embedded Automotive SystemsPiotr Dziurzanski, Amit Kumar Singh, Leandro S. Indrusiak et al.
In this paper, using of automotive use cases as benchmarks for real-time system design has been proposed. The use cases are described in a format supported by AMALTHEA platform, which is a model based open source development environment for automotive multi-core systems. An example of a simple Electronic Control Unit has been analysed and presented with enough details to reconstruct this system in any format. For researchers willing to use AMALTHEA file format directly, an appropriate parser has been developed and offered. An example of applying this parser and benchmark for optimising makespan while not violating the timing constraints by allocating functionality to different Network on Chip resource is demonstrated.
ROApr 8, 2025
Deep RL-based Autonomous Navigation of Micro Aerial Vehicles (MAVs) in a complex GPS-denied Indoor EnvironmentAmit Kumar Singh, Prasanth Kumar Duba, P. Rajalakshmi
The Autonomy of Unmanned Aerial Vehicles (UAVs) in indoor environments poses significant challenges due to the lack of reliable GPS signals in enclosed spaces such as warehouses, factories, and indoor facilities. Micro Aerial Vehicles (MAVs) are preferred for navigating in these complex, GPS-denied scenarios because of their agility, low power consumption, and limited computational capabilities. In this paper, we propose a Reinforcement Learning based Deep-Proximal Policy Optimization (D-PPO) algorithm to enhance realtime navigation through improving the computation efficiency. The end-to-end network is trained in 3D realistic meta-environments created using the Unreal Engine. With these trained meta-weights, the MAV system underwent extensive experimental trials in real-world indoor environments. The results indicate that the proposed method reduces computational latency by 91\% during training period without significant degradation in performance. The algorithm was tested on a DJI Tello drone, yielding similar results.
CVJan 17, 2024
Fluid Dynamic DNNs for Reliable and Adaptive Distributed Inference on Edge DevicesLei Xun, Mingyu Hu, Hengrui Zhao et al.
Distributed inference is a popular approach for efficient DNN inference at the edge. However, traditional Static and Dynamic DNNs are not distribution-friendly, causing system reliability and adaptability issues. In this paper, we introduce Fluid Dynamic DNNs (Fluid DyDNNs), tailored for distributed inference. Distinct from Static and Dynamic DNNs, Fluid DyDNNs utilize a novel nested incremental training algorithm to enable independent and combined operation of its sub-networks, enhancing system reliability and adaptability. Evaluation on embedded Arm CPUs with a DNN model and the MNIST dataset, shows that in scenarios of single device failure, Fluid DyDNNs ensure continued inference, whereas Static and Dynamic DNNs fail. When devices are fully operational, Fluid DyDNNs can operate in either a High-Accuracy mode and achieve comparable accuracy with Static DNNs, or in a High-Throughput mode and achieve 2.5x and 2x throughput compared with Static and Dynamic DNNs, respectively.
CVOct 12, 2024
Advanced Gesture Recognition for Autism Spectrum Disorder Detection: Integrating YOLOv7, Video Augmentation, and VideoMAE for Naturalistic Video AnalysisAmit Kumar Singh, Vrijendra Singh
Deep learning and contactless sensing technologies have significantly advanced the automated assessment of human behaviors in healthcare. In the context of autism spectrum disorder (ASD), repetitive motor behaviors such as spinning, head banging, and arm flapping are key indicators for diagnosis. This study focuses on distinguishing between children with ASD and typically developed (TD) peers by analyzing videos captured in natural, uncontrolled environments. Using the publicly available Self-Stimulatory Behavior Dataset (SSBD), we address the classification task as a binary problem, ASD vs. TD, based on stereotypical repetitive gestures. We adopt a pipeline integrating YOLOv7-based detection, extensive video augmentations, and the VideoMAE framework, which efficiently captures both spatial and temporal features through a high-ratio masking and reconstruction strategy. Our proposed approach achieves 95% accuracy, 0.93 precision, 0.94 recall, and 0.94 F1 score, surpassing the previous state-of-the-art by a significant margin. These results demonstrate the effectiveness of combining advanced object detection, robust data augmentation, and masked autoencoder-based video modeling for reliable ASD vs. TD classification in naturalistic settings.
CRDec 19, 2020
Confused Modulo Projection based Somewhat Homomorphic Encryption -- Cryptosystem, Library and Applications on Secure Smart CitiesXin Jin, Hongyu Zhang, Xiaodong Li et al.
With the development of cloud computing, the storage and processing of massive visual media data has gradually transferred to the cloud server. For example, if the intelligent video monitoring system cannot process a large amount of data locally, the data will be uploaded to the cloud. Therefore, how to process data in the cloud without exposing the original data has become an important research topic. We propose a single-server version of somewhat homomorphic encryption cryptosystem based on confused modulo projection theorem named CMP-SWHE, which allows the server to complete blind data processing without \emph{seeing} the effective information of user data. On the client side, the original data is encrypted by amplification, randomization, and setting confusing redundancy. Operating on the encrypted data on the server side is equivalent to operating on the original data. As an extension, we designed and implemented a blind computing scheme of accelerated version based on batch processing technology to improve efficiency. To make this algorithm easy to use, we also designed and implemented an efficient general blind computing library based on CMP-SWHE. We have applied this library to foreground extraction, optical flow tracking and object detection with satisfactory results, which are helpful for building smart cities. We also discuss how to extend the algorithm to deep learning applications. Compared with other homomorphic encryption cryptosystems and libraries, the results show that our method has obvious advantages in computing efficiency. Although our algorithm has some tiny errors ($10^{-6}$) when the data is too large, it is very efficient and practical, especially suitable for blind image and video processing.
CRJul 2, 2020
DATE: Defense Against TEmperature Side-Channel Attacks in DVFS Enabled MPSoCsSomdip Dey, Amit Kumar Singh, Xiaohang Wang et al.
Given the constant rise in utilizing embedded devices in daily life, side channels remain a challenge to information flow control and security in such systems. One such important security flaw could be exploited through temperature side-channel attacks, where heat dissipation and propagation from the processing elements are observed over time in order to deduce security flaws. In our proposed methodology, DATE: Defense Against TEmperature side-channel attacks, we propose a novel approach of reducing spatial and temporal thermal gradient, which makes the system more secure against temperature side-channel attacks, and at the same time increases the reliability of the device in terms of lifespan. In this paper, we have also introduced a new metric, Thermal-Security-in-Multi-Processors (TSMP), which is capable of quantifying the security against temperature side-channel attacks on computing systems, and DATE is evaluated to be 139.24% more secure at the most for certain applications than the state-of-the-art, while reducing thermal cycle by 67.42% at the most.
CVJul 5, 2018
MAT-CNN-SOPC: Motionless Analysis of Traffic Using Convolutional Neural Networks on System-On-a-Programmable-ChipSomdip Dey, Grigorios Kalliatakis, Sangeet Saha et al.
Intelligent Transportation Systems (ITS) have become an important pillar in modern "smart city" framework which demands intelligent involvement of machines. Traffic load recognition can be categorized as an important and challenging issue for such systems. Recently, Convolutional Neural Network (CNN) models have drawn considerable amount of interest in many areas such as weather classification, human rights violation detection through images, due to its accurate prediction capabilities. This work tackles real-life traffic load recognition problem on System-On-a-Programmable-Chip (SOPC) platform and coin it as MAT-CNN- SOPC, which uses an intelligent re-training mechanism of the CNN with known environments. The proposed methodology is capable of enhancing the efficacy of the approach by 2.44x in comparison to the state-of-art and proven through experimental analysis. We have also introduced a mathematical equation, which is capable of quantifying the suitability of using different CNN models over the other for a particular application based implementation.