Zhichao Cao

h-index10

7papers

106citations

Novelty39%

AI Score43

Ranked #79,204 of 201,326 authors (top 39%)#27,342 in CV (top 46%)

7 Papers

LGSep 29, 2023Code

FedAIoT: A Federated Learning Benchmark for Artificial Intelligence of Things

Samiul Alam, Tuo Zhang, Tiantian Feng et al.

There is a significant relevance of federated learning (FL) in the realm of Artificial Intelligence of Things (AIoT). However, most existing FL works do not use datasets collected from authentic IoT devices and thus do not capture unique modalities and inherent challenges of IoT data. To fill this critical gap, in this work, we introduce FedAIoT, an FL benchmark for AIoT. FedAIoT includes eight datasets collected from a wide range of IoT devices. These datasets cover unique IoT modalities and target representative applications of AIoT. FedAIoT also includes a unified end-to-end FL framework for AIoT that simplifies benchmarking the performance of the datasets. Our benchmark results shed light on the opportunities and challenges of FL for AIoT. We hope FedAIoT could serve as an invaluable resource to foster advancements in the important field of FL for AIoT. The repository of FedAIoT is maintained at https://github.com/AIoT-MLSys-Lab/FedAIoT.

NIApr 20, 2023Code

NELoRa-Bench: A Benchmark for Neural-enhanced LoRa Demodulation

Jialuo Du, Yidong Ren, Mi Zhang et al.

Low-Power Wide-Area Networks (LPWANs) are an emerging Internet-of-Things (IoT) paradigm marked by low-power and long-distance communication. Among them, LoRa is widely deployed for its unique characteristics and open-source technology. By adopting the Chirp Spread Spectrum (CSS) modulation, LoRa enables low signal-to-noise ratio (SNR) communication. The standard LoRa demodulation method accumulates the chirp power of the whole chirp into an energy peak in the frequency domain. In this way, it can support communication even when SNR is lower than -15 dB. Beyond that, we proposed NELoRa, a neural-enhanced decoder that exploits multi-dimensional information to achieve significant SNR gain. This paper presents the dataset used to train/test NELoRa, which includes 27,329 LoRa symbols with spreading factors from 7 to 10, for further improvement of neural-enhanced LoRa demodulation. The dataset shows that NELoRa can achieve 1.84-2.35 dB SNR gain over the standard LoRa decoder. The dataset and codes can be found at https://github.com/daibiaoxuwu/NeLoRa_Dataset.

NIOct 25, 2024Code

Artificial Intelligence of Things: A Survey

Shakhrul Iman Siam, Hyunho Ahn, Li Liu et al.

The integration of the Internet of Things (IoT) and modern Artificial Intelligence (AI) has given rise to a new paradigm known as the Artificial Intelligence of Things (AIoT). In this survey, we provide a systematic and comprehensive review of AIoT research. We examine AIoT literature related to sensing, computing, and networking & communication, which form the three key components of AIoT. In addition to advancements in these areas, we review domain-specific AIoT systems that are designed for various important application domains. We have also created an accompanying GitHub repository, where we compile the papers included in this survey: https://github.com/AIoT-MLSys-Lab/AIoT-Survey. This repository will be actively maintained and updated with new research as it becomes available. As both IoT and AI become increasingly critical to our society, we believe AIoT is emerging as an essential research field at the intersection of IoT and modern AI. We hope this survey will serve as a valuable resource for those engaged in AIoT research and act as a catalyst for future explorations to bridge gaps and drive advancements in this exciting field.

CVAug 4, 2025

Hydra: Accurate Multi-Modal Leaf Wetness Sensing with mm-Wave and Camera Fusion

Yimeng Liu, Maolin Gan, Huaili Zeng et al.

Leaf Wetness Duration (LWD), the time that water remains on leaf surfaces, is crucial in the development of plant diseases. Existing LWD detection lacks standardized measurement techniques, and variations across different plant characteristics limit its effectiveness. Prior research proposes diverse approaches, but they fail to measure real natural leaves directly and lack resilience in various environmental conditions. This reduces the precision and robustness, revealing a notable practical application and effectiveness gap in real-world agricultural settings. This paper presents Hydra, an innovative approach that integrates millimeter-wave (mm-Wave) radar with camera technology to detect leaf wetness by determining if there is water on the leaf. We can measure the time to determine the LWD based on this detection. Firstly, we design a Convolutional Neural Network (CNN) to selectively fuse multiple mm-Wave depth images with an RGB image to generate multiple feature images. Then, we develop a transformer-based encoder to capture the inherent connection among the multiple feature images to generate a feature map, which is further fed to a classifier for detection. Moreover, we augment the dataset during training to generalize our model. Implemented using a frequency-modulated continuous-wave (FMCW) radar within the 76 to 81 GHz band, Hydra's performance is meticulously evaluated on plants, demonstrating the potential to classify leaf wetness with up to 96% accuracy across varying scenarios. Deploying Hydra in the farm, including rainy, dawn, or poorly light nights, it still achieves an accuracy rate of around 90%.

LGOct 17, 2024

Harnessing Your DRAM and SSD for Sustainable and Accessible LLM Inference with Mixed-Precision and Multi-level Caching

Jie Peng, Zhang Cao, Huaizhi Qu et al.

Although Large Language Models (LLMs) have demonstrated remarkable capabilities, their massive parameter counts and associated extensive computing make LLMs' deployment the main part of carbon emission from nowadays AI applications. Compared to modern GPUs like H$100$, it would be significantly carbon-sustainable if we could leverage old-fashioned GPUs such as M$40$ (as shown in Figure 1, M$40$ only has one third carbon emission of H$100$'s) for LLM servings. However, the limited High Bandwidth Memory (HBM) available on such GPU often cannot support the loading of LLMs due to the gigantic model size and intermediate activation data, making their serving challenging. For instance, a LLaMA2 model with $70$B parameters typically requires $128$GB for inference, which substantially surpasses $24$GB HBM in a $3090$ GPU and remains infeasible even considering the additional $64$GB DRAM. To address this challenge, this paper proposes a mixed-precision with a model modularization algorithm to enable LLM inference on outdated hardware with resource constraints. (The precision denotes the numerical precision like FP16, INT8, INT4) and multi-level caching (M2Cache).) Specifically, our M2Cache first modulizes neurons in LLM and creates their importance ranking. Then, it adopts a dynamic sparse mixed-precision quantization mechanism in weight space to reduce computational demands and communication overhead at each decoding step. It collectively lowers the operational carbon emissions associated with LLM inference. Moreover, M2Cache introduces a three-level cache management system with HBM, DRAM, and SSDs that complements the dynamic sparse mixed-precision inference. To enhance communication efficiency, M2Cache maintains a neuron-level mixed-precision LRU cache in HBM, a larger layer-aware cache in DRAM, and a full model in SSD.

DBOct 28, 2025

StorageXTuner: An LLM Agent-Driven Automatic Tuning Framework for Heterogeneous Storage Systems

Qi Lin, Zhenyu Zhang, Viraj Thakkar et al.

Automatically configuring storage systems is hard: parameter spaces are large and conditions vary across workloads, deployments, and versions. Heuristic and ML tuners are often system specific, require manual glue, and degrade under changes. Recent LLM-based approaches help but usually treat tuning as a single-shot, system-specific task, which limits cross-system reuse, constrains exploration, and weakens validation. We present StorageXTuner, an LLM agent-driven auto-tuning framework for heterogeneous storage engines. StorageXTuner separates concerns across four agents - Executor (sandboxed benchmarking), Extractor (performance digest), Searcher (insight-guided configuration exploration), and Reflector (insight generation and management). The design couples an insight-driven tree search with layered memory that promotes empirically validated insights and employs lightweight checkers to guard against unsafe actions. We implement a prototype and evaluate it on RocksDB, LevelDB, CacheLib, and MySQL InnoDB with YCSB, MixGraph, and TPC-H/C. Relative to out-of-the-box settings and to ELMo-Tune, StorageXTuner reaches up to 575% and 111% higher throughput, reduces p99 latency by as much as 88% and 56%, and converges with fewer trials.

CVJul 30, 2025

Hydra-Bench: A Benchmark for Multi-Modal Leaf Wetness Sensing

Yimeng Liu, Maolin Gan, Yidong Ren et al.

Leaf wetness detection is a crucial task in agricultural monitoring, as it directly impacts the prediction and protection of plant diseases. However, existing sensing systems suffer from limitations in robustness, accuracy, and environmental resilience when applied to natural leaves under dynamic real-world conditions. To address these challenges, we introduce a new multi-modal dataset specifically designed for evaluating and advancing machine learning algorithms in leaf wetness detection. Our dataset comprises synchronized mmWave raw data, Synthetic Aperture Radar (SAR) images, and RGB images collected over six months from five diverse plant species in both controlled and outdoor field environments. We provide detailed benchmarks using the Hydra model, including comparisons against single modality baselines and multiple fusion strategies, as well as performance under varying scan distances. Additionally, our dataset can serve as a benchmark for future SAR imaging algorithm optimization, enabling a systematic evaluation of detection accuracy under diverse conditions.