AIJul 1, 2024
Hybrid RAG-empowered Multi-modal LLM for Secure Data Management in Internet of Medical Things: A Diffusion-based Contract ApproachCheng Su, Jinbo Wen, Jiawen Kang et al.
Secure data management and effective data sharing have become paramount in the rapidly evolving healthcare landscape, especially with the growing integration of the Internet of Medical Things (IoMT). The rise of generative artificial intelligence has further elevated Multi-modal Large Language Models (MLLMs) as essential tools for managing and optimizing healthcare data in IoMT. MLLMs can support multi-modal inputs and generate diverse types of content by leveraging large-scale training on vast amounts of multi-modal data. However, critical challenges persist in developing medical MLLMs, including security and freshness issues of healthcare data, affecting the output quality of MLLMs. To this end, in this paper, we propose a hybrid Retrieval-Augmented Generation (RAG)-empowered medical MLLM framework for healthcare data management. This framework leverages a hierarchical cross-chain architecture to facilitate secure data training. Moreover, it enhances the output quality of MLLMs through hybrid RAG, which employs multi-modal metrics to filter various unimodal RAG results and incorporates these retrieval results as additional inputs to MLLMs. Additionally, we employ age of information to indirectly evaluate the data freshness impact of MLLMs and utilize contract theory to incentivize healthcare data holders to share their fresh data, mitigating information asymmetry during data sharing. Finally, we utilize a generative diffusion model-based deep reinforcement learning algorithm to identify the optimal contract for efficient data sharing. Numerical results demonstrate the effectiveness of the proposed schemes, which achieve secure and efficient healthcare data management.
NIApr 30, 2024
Harnessing Federated Generative Learning for Green and Sustainable Internet of ThingsYuanhang Qi, M. Shamim Hossain
The rapid proliferation of devices in the Internet of Things (IoT) has ushered in a transformative era of data-driven connectivity across various domains. However, this exponential growth has raised pressing concerns about environmental sustainability and data privacy. In response to these challenges, this paper introduces One-shot Federated Learning (OSFL), an innovative paradigm that harmonizes sustainability and machine learning within IoT ecosystems. OSFL revolutionizes the traditional Federated Learning (FL) workflow by condensing multiple iterative communication rounds into a single operation, thus significantly reducing energy consumption, communication overhead, and latency. This breakthrough is coupled with the strategic integration of generative learning techniques, ensuring robust data privacy while promoting efficient knowledge sharing among IoT devices. By curtailing resource utilization, OSFL aligns seamlessly with the vision of green and sustainable IoT, effectively extending device lifespans and mitigating their environmental footprint. Our research underscores the transformative potential of OSFL, poised to reshape the landscape of IoT applications across domains such as energy-efficient smart cities and groundbreaking healthcare solutions. This contribution marks a pivotal step towards a more responsible, sustainable, and technologically advanced future.
CVDec 31, 2023
Generative Model-Driven Synthetic Training Image Generation: An Approach to Cognition in Rail Defect DetectionRahatara Ferdousi, Chunsheng Yang, M. Anwar Hossain et al.
Recent advancements in cognitive computing, with the integration of deep learning techniques, have facilitated the development of intelligent cognitive systems (ICS). This is particularly beneficial in the context of rail defect detection, where the ICS would emulate human-like analysis of image data for defect patterns. Despite the success of Convolutional Neural Networks (CNN) in visual defect classification, the scarcity of large datasets for rail defect detection remains a challenge due to infrequent accident events that would result in defective parts and images. Contemporary researchers have addressed this data scarcity challenge by exploring rule-based and generative data augmentation models. Among these, Variational Autoencoder (VAE) models can generate realistic data without extensive baseline datasets for noise modeling. This study proposes a VAE-based synthetic image generation technique for rail defects, incorporating weight decay regularization and image reconstruction loss to prevent overfitting. The proposed method is applied to create a synthetic dataset for the Canadian Pacific Railway (CPR) with just 50 real samples across five classes. Remarkably, 500 synthetic samples are generated with a minimal reconstruction loss of 0.021. A Visual Transformer (ViT) model underwent fine-tuning using this synthetic CPR dataset, achieving high accuracy rates (98%-99%) in classifying the five defect classes. This research offers a promising solution to the data scarcity challenge in rail defect detection, showcasing the potential for robust ICS development in this domain.
CVJan 31, 2025
EcoWeedNet: A Lightweight and Automated Weed Detection Method for Sustainable Next-Generation Agricultural Consumer ElectronicsOmar H. Khater, Abdul Jabbar Siddiqui, M. Shamim Hossain et al.
Sustainable agriculture plays a crucial role in ensuring world food security for consumers. A critical challenge faced by sustainable precision agriculture is weed growth, as weeds compete for essential resources with crops, such as water, soil nutrients, and sunlight, which notably affect crop yields. The adoption of automated computer vision technologies and ground agricultural consumer electronic vehicles in precision agriculture offers sustainable, low-carbon solutions. However, prior works suffer from issues such as low accuracy and precision, as well as high computational expense. This work proposes EcoWeedNet, a novel model that enhances weed detection performance without introducing significant computational complexity, aligning with the goals of low-carbon agricultural practices. The effectiveness of the proposed model is demonstrated through comprehensive experiments on the CottonWeedDet12 benchmark dataset, which reflects real-world scenarios. EcoWeedNet achieves performance comparable to that of large models (mAP@0.5 = 95.2%), yet with significantly fewer parameters (approximately 4.21% of the parameters of YOLOv4), lower computational complexity and better computational efficiency 6.59% of the GFLOPs of YOLOv4). These key findings indicate EcoWeedNet's deployability on low-power consumer hardware, lower energy consumption, and hence reduced carbon footprint, thereby emphasizing the application prospects of EcoWeedNet in next-generation sustainable agriculture. These findings provide the way forward for increased application of environmentally-friendly agricultural consumer technologies.
CRNov 8, 2024
ViT Enhanced Privacy-Preserving Secure Medical Data Sharing and ClassificationAl Amin, Kamrul Hasan, Sharif Ullah et al.
Privacy-preserving and secure data sharing are critical for medical image analysis while maintaining accuracy and minimizing computational overhead are also crucial. Applying existing deep neural networks (DNNs) to encrypted medical data is not always easy and often compromises performance and security. To address these limitations, this research introduces a secure framework consisting of a learnable encryption method based on the block-pixel operation to encrypt the data and subsequently integrate it with the Vision Transformer (ViT). The proposed framework ensures data privacy and security by creating unique scrambling patterns per key, providing robust performance against leading bit attacks and minimum difference attacks.
CVSep 19, 2025
TinyEcoWeedNet: Edge Efficient Real-Time Aerial Agricultural Weed DetectionOmar H. Khater, Abdul Jabbar Siddiqui, Aiman El-Maleh et al.
Deploying deep learning models in agriculture is difficult because edge devices have limited resources, but this work presents a compressed version of EcoWeedNet using structured channel pruning, quantization-aware training (QAT), and acceleration with NVIDIA's TensorRT on the Jetson Orin Nano. Despite the challenges of pruning complex architectures with residual shortcuts, attention mechanisms, concatenations, and CSP blocks, the model size was reduced by up to 68.5% and computations by 3.2 GFLOPs, while inference speed reached 184 FPS at FP16, 28.7% faster than the baseline. On the CottonWeedDet12 dataset, the pruned EcoWeedNet with a 39.5% pruning ratio outperformed YOLO11n and YOLO12n (with only 20% pruning), achieving 83.7% precision, 77.5% recall, and 85.9% mAP50, proving it to be both efficient and effective for precision agriculture.
GTMay 9, 2025
Bi-LSTM based Multi-Agent DRL with Computation-aware Pruning for Agent Twins Migration in Vehicular Embodied AI NetworksYuxiang Wei, Zhuoqi Zeng, Yue Zhong et al.
With the advancement of large language models and embodied Artificial Intelligence (AI) in the intelligent transportation scenarios, the combination of them in intelligent transportation spawns the Vehicular Embodied AI Network (VEANs). In VEANs, Autonomous Vehicles (AVs) are typical agents whose local advanced AI applications are defined as vehicular embodied AI agents, enabling capabilities such as environment perception and multi-agent collaboration. Due to computation latency and resource constraints, the local AI applications and services running on vehicular embodied AI agents need to be migrated, and subsequently referred to as vehicular embodied AI agent twins, which drive the advancement of vehicular embodied AI networks to offload intensive tasks to Roadside Units (RSUs), mitigating latency problems while maintaining service quality. Recognizing workload imbalance among RSUs in traditional approaches, we model AV-RSU interactions as a Stackelberg game to optimize bandwidth resource allocation for efficient migration. A Tiny Multi-Agent Bidirectional LSTM Proximal Policy Optimization (TMABLPPO) algorithm is designed to approximate the Stackelberg equilibrium through decentralized coordination. Furthermore, a personalized neural network pruning algorithm based on Path eXclusion (PX) dynamically adapts to heterogeneous AV computation capabilities by identifying task-critical parameters in trained models, reducing model complexity with less performance degradation. Experimental validation confirms the algorithm's effectiveness in balancing system load and minimizing delays, demonstrating significant improvements in vehicular embodied AI agent deployment.
LGJul 19, 2020
Deep Anomaly Detection for Time-series Data in Industrial IoT: A Communication-Efficient On-device Federated Learning ApproachYi Liu, Sahil Garg, Jiangtian Nie et al.
Since edge device failures (i.e., anomalies) seriously affect the production of industrial products in Industrial IoT (IIoT), accurately and timely detecting anomalies is becoming increasingly important. Furthermore, data collected by the edge device may contain the user's private data, which is challenging the current detection approaches as user privacy is calling for the public concern in recent years. With this focus, this paper proposes a new communication-efficient on-device federated learning (FL)-based deep anomaly detection framework for sensing time-series data in IIoT. Specifically, we first introduce a FL framework to enable decentralized edge devices to collaboratively train an anomaly detection model, which can improve its generalization ability. Second, we propose an Attention Mechanism-based Convolutional Neural Network-Long Short Term Memory (AMCNN-LSTM) model to accurately detect anomalies. The AMCNN-LSTM model uses attention mechanism-based CNN units to capture important fine-grained features, thereby preventing memory loss and gradient dispersion problems. Furthermore, this model retains the advantages of LSTM unit in predicting time series data. Third, to adapt the proposed framework to the timeliness of industrial anomaly detection, we propose a gradient compression mechanism based on Top-\textit{k} selection to improve communication efficiency. Extensive experiment studies on four real-world datasets demonstrate that the proposed framework can accurately and timely detect anomalies and also reduce the communication overhead by 50\% compared to the federated learning framework that does not use a gradient compression scheme.