Kyeongpil Min

25.3 · AR · May 20

CMAX-CAMEL: A Coarse-to-Fine Adaptive, Memory-Efficient, and Low-Power Edge Processor for Contrast Maximization

Kyeongpil Min, Jongin Choi, Kyeongwon Lee et al.

Contrast maximization (CMAX) is a direct geometric framework for event-based motion estimation, but its iterative warp-and-accumulate pipeline incurs input-dependent computation and frequent memory accesses, challenging real-time, low-power edge deployment. We present CMAX-CAMEL, a coarse-to-fine adaptive, memory-efficient, low-power edge processor for CMAX. CMAX-CAMEL combines a runtime-adaptive execution strategy with a memory-centric processor architecture. It adjusts coarse-to-fine execution according to the observed event distribution, prioritizing stages likely to improve estimation accuracy while suppressing low-value iterations and unnecessary stage transitions. Architecturally, a banked parallel memory organization sustains real-time throughput while reducing latency, and a subsampling-coupled accumulation structure lowers memory-access activity along the warp-and-accumulate dataflow. On a Virtex FPGA prototype operating at 200 MHz, CMAX-CAMEL improves estimation accuracy by up to 19% over fixed coarse-to-fine schedules, reduces processing latency by 53.3%, lowers effective memory accesses by 42%, and cuts total system energy by 52.2%, including adaptation overheads. These results show that CMAX-CAMEL is an HW-SW co-design that co-optimizes execution policy and data movement for real-time, low-power event-based motion estimation at the edge.
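
The warp-and-accumulate loop mentioned in the abstract can be illustrated with a minimal sketch. It is not the CMAX-CAMEL implementation: it assumes a constant 2D translational motion model, uses image variance as the contrast objective, and runs a fixed coarse-to-fine grid search (the kind of fixed schedule the paper compares against, not its adaptive policy). All function names and parameters here are hypothetical.

```python
from statistics import pvariance


def warp_and_accumulate(events, velocity, width, height, t_ref=0.0):
    """Warp events (x, y, t) along a candidate velocity and count them per pixel."""
    vx, vy = velocity
    image = [[0] * width for _ in range(height)]
    for x, y, t in events:
        # Move each event back to the reference time along the candidate motion.
        xw = round(x - vx * (t - t_ref))
        yw = round(y - vy * (t - t_ref))
        if 0 <= xw < width and 0 <= yw < height:
            image[yw][xw] += 1  # one memory access per in-bounds event
    return image


def contrast(image):
    """Variance of the accumulated image (assumed objective for this sketch)."""
    return pvariance([v for row in image for v in row])


def coarse_to_fine_search(events, width, height, step=2.0, levels=3):
    """Fixed-schedule search: 5x5 velocity grid, halving the step at each level."""
    best = (0.0, 0.0)
    offsets = (-2, -1, 0, 1, 2)
    for _ in range(levels):
        candidates = [
            (best[0] + i * step, best[1] + j * step)
            for i in offsets
            for j in offsets
        ]
        best = max(
            candidates,
            key=lambda v: contrast(warp_and_accumulate(events, v, width, height)),
        )
        step /= 2
    return best


# Synthetic events: two points moving at (2, 0) pixels per time unit.
events = [(x0 + 2 * t, y0, t) for (x0, y0) in [(10, 10), (20, 15)] for t in range(5)]
estimate = coarse_to_fine_search(events, width=32, height=32)
```

Every candidate velocity costs a full pass over the events and the accumulation buffer, which is why the iteration count and memory-access pattern dominate the cost of this pipeline.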

CR · Feb 24

TT-SEAL: TTD-Aware Selective Encryption for Adversarially-Robust and Low-Latency Edge AI

Kyeongpil Min, Sangmin Jeon, Jae-Jin Lee et al.

Cloud-edge AI must jointly satisfy model compression and security under tight device budgets. While Tensor-Train Decomposition (TTD) shrinks on-device models, prior selective-encryption studies largely assume dense weights, leaving the practicality of selective encryption under TTD compression unclear. We present TT-SEAL, a selective-encryption framework for TT-decomposed networks. TT-SEAL ranks TT cores with a sensitivity-based importance metric, calibrates a one-time robustness threshold, and uses a value-DP optimizer to encrypt the minimum set of critical cores with AES. Under TTD-aware, transfer-based threat models and on an FPGA-prototyped edge processor, TT-SEAL matches the robustness of full (black-box) encryption while encrypting as little as 4.89-15.92% of parameters across ResNet-18, MobileNetV2, and VGG-16, and reduces the share of AES decryption in end-to-end latency to low single digits (e.g., from 58% to 2.76% on ResNet-18), enabling secure, low-latency edge AI.
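
One way to picture the core-selection step is as a knapsack-style dynamic program over score values. The sketch below is an interpretation, not the TT-SEAL optimizer: it assumes each TT core has a non-negative integer sensitivity score, that scores add up across cores, and that robustness is reached once the summed score meets a calibrated threshold. The function name, the tuple layout, and the example numbers are all hypothetical, and AES encryption itself is not shown.

```python
def select_cores(cores, threshold):
    """Pick the cheapest set of cores whose summed score reaches the threshold.

    cores: list of (name, num_params, score), with integer score >= 0 (assumed).
    Returns (total_params, names); total_params is inf if the threshold
    cannot be reached.
    """
    inf = float("inf")
    # best[v] = (fewest parameters, chosen cores) reaching a score of v,
    # where scores above the threshold are capped at the threshold.
    best = [(inf, [])] * (threshold + 1)
    best[0] = (0, [])
    for name, params, score in cores:
        # Walk scores downward so each core is selected at most once.
        for v in range(threshold, -1, -1):
            cost, chosen = best[v]
            if cost == inf:
                continue
            nv = min(threshold, v + score)
            if cost + params < best[nv][0]:
                best[nv] = (cost + params, chosen + [name])
    return best[threshold]


# Hypothetical cores: (name, parameter count, sensitivity score).
cores = [("c0", 100, 3), ("c1", 40, 2), ("c2", 50, 2), ("c3", 500, 5)]
total_params, selected = select_cores(cores, threshold=4)
encrypted_share = total_params / sum(p for _, p, _ in cores)
```

In this toy instance the two small cores together meet the threshold with fewer parameters than any single large core, which is the kind of trade-off that keeps the encrypted share low.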

2 Papers