CVMar 20, 2022Code
TVConv: Efficient Translation Variant Convolution for Layout-aware Visual ProcessingJierun Chen, Tianlang He, Weipeng Zhuo et al.
As convolution has empowered many smart applications, dynamic convolution further equips it with the ability to adapt to diverse inputs. However, the static and dynamic convolutions are either layout-agnostic or computation-heavy, making it inappropriate for layout-specific applications, e.g., face recognition and medical image segmentation. We observe that these applications naturally exhibit the characteristics of large intra-image (spatial) variance and small cross-image variance. This observation motivates our efficient translation variant convolution (TVConv) for layout-aware visual processing. Technically, TVConv is composed of affinity maps and a weight-generating block. While affinity maps depict pixel-paired relationships gracefully, the weight-generating block can be explicitly overparameterized for better training while maintaining efficient inference. Although conceptually simple, TVConv significantly improves the efficiency of the convolution and can be readily plugged into various network architectures. Extensive experiments on face recognition show that TVConv reduces the computational cost by up to 3.1x and improves the corresponding throughput by 2.3x while maintaining a high accuracy compared to the depthwise convolution. Moreover, for the same computation cost, we boost the mean accuracy by up to 4.21%. We also conduct experiments on the optic disc/cup segmentation task and obtain better generalization performance, which helps mitigate the critical data scarcity issue. Code is available at https://github.com/JierunChen/TVConv.
LGNov 14, 2025
LoRaCompass: Robust Reinforcement Learning to Efficiently Search for a LoRa TagTianlang He, Zhongming Lin, Tianrui Jiang et al.
The Long-Range (LoRa) protocol, known for its extensive range and low power, has increasingly been adopted in tags worn by mentally incapacitated persons (MIPs) and others at risk of going missing. We study the sequential decision-making process for a mobile sensor to locate a periodically broadcasting LoRa tag with the fewest moves (hops) in general, unknown environments, guided by the received signal strength indicator (RSSI). While existing methods leverage reinforcement learning for search, they remain vulnerable to domain shift and signal fluctuation, resulting in cascading decision errors that culminate in substantial localization inaccuracies. To bridge this gap, we propose LoRaCompass, a reinforcement learning model designed to achieve robust and efficient search for a LoRa tag. For exploitation under domain shift and signal fluctuation, LoRaCompass learns a robust spatial representation from RSSI to maximize the probability of moving closer to a tag, via a spatially-aware feature extractor and a policy distillation loss function. It further introduces an exploration function inspired by the upper confidence bound (UCB) that guides the sensor toward the tag with increasing confidence. We have validated LoRaCompass in ground-based and drone-assisted scenarios within diverse unseen environments covering an area of over 80km^2. It has demonstrated high success rate (>90%) in locating the tag within 100m proximity (a 40% improvement over existing methods) and high efficiency with a search path length (in hops) that scales linearly with the initial distance.
CVMay 6, 2024
Elevator, Escalator, or Neither? Classifying Conveyor State Using Smartphone under Arbitrary Pedestrian BehaviorTianlang He, Zhiqiu Xia, S. -H. Gary Chan
Knowing a pedestrian's conveyor state of ''elevator,'' ''escalator,'' or ''neither'' is fundamental to many applications such as indoor navigation and people flow management. Previous studies on classifying the conveyor state often rely on specially designed body-worn sensors or make strong assumptions on pedestrian behaviors, which greatly strangles their deployability. To overcome this, we study the classification problem under arbitrary pedestrian behaviors using the inertial navigation system (INS) of the commonly available smartphones (including accelerometer, gyroscope, and magnetometer). This problem is challenging, because the INS signals of the conveyor states are entangled by the arbitrary and diverse pedestrian behaviors. We propose ELESON, a novel and lightweight deep-learning approach that uses phone INS to classify a pedestrian to elevator, escalator, or neither. Using causal decomposition and adversarial learning, ELESON extracts the motion and magnetic features of conveyor state independent of pedestrian behavior, based on which it estimates the state confidence by means of an evidential classifier. We curate a large and diverse dataset with 36,420 instances of pedestrians randomly taking elevators and escalators under arbitrary unknown behaviors. Our extensive experiments show that ELESON is robust against pedestrian behavior, achieving a high accuracy of over 0.9 in F1 score, strong confidence discriminability of 0.81 in AUROC (Area Under the Receiver Operating Characteristics), and low computational and memory requirements fit for common smartphone deployment.
HCJan 11, 2022
Tackling Multipath and Biased Training Data for IMU-Assisted BLE Proximity DetectionTianlang He, Jiajie Tan, Weipeng Zhuo et al.
Proximity detection is to determine whether an IoT receiver is within a certain distance from a signal transmitter. Due to its low cost and high popularity, Bluetooth low energy (BLE) has been used to detect proximity based on the received signal strength indicator (RSSI). To address the fact that RSSI can be markedly influenced by device carriage states, previous works have incorporated RSSI with inertial measurement unit (IMU) using deep learning. However, they have not sufficiently accounted for the impact of multipath. Furthermore, due to the special setup, the IMU data collected in the training process may be biased, which hampers the system's robustness and generalizability. This issue has not been studied before. We propose PRID, an IMU-assisted BLE proximity detection approach robust against RSSI fluctuation and IMU data bias. PRID histogramizes RSSI to extract multipath features and uses carriage state regularization to mitigate overfitting due to IMU data bias. We further propose PRID-lite based on a binarized neural network to substantially cut memory requirements for resource-constrained devices. We have conducted extensive experiments under different multipath environments, data bias levels, and a crowdsourced dataset. Our results show that PRID significantly reduces false detection cases compared with the existing arts (by over 50%). PRID-lite further reduces over 90% PRID model size and extends 60% battery life, with a minor compromise in accuracy (7%).