ITDec 3, 2022
Quantify the Causes of Causal Emergence: Critical Conditions of Uncertainty and Asymmetry in Causal StructureLiye Jia, Fengyufan Yang, Ka Lok Man et al.
Beneficial to advanced computing devices, models with massive parameters are increasingly employed to extract more information to enhance the precision in describing and predicting the patterns of objective systems. This phenomenon is particularly pronounced in research domains associated with deep learning. However, investigations of causal relationships based on statistical and informational theories have posed an interesting and valuable challenge to large-scale models in the recent decade. Macroscopic models with fewer parameters can outperform their microscopic counterparts with more parameters in effectively representing the system. This valuable situation is called "Causal Emergence." This paper introduces a quantification framework, according to the Effective Information and Transition Probability Matrix, for assessing numerical conditions of Causal Emergence as theoretical constraints of its occurrence. Specifically, our results quantitatively prove the cause of Causal Emergence. By a particular coarse-graining strategy, optimizing uncertainty and asymmetry within the model's causal structure is significantly more influential than losing maximum information due to variations in model scales. Moreover, by delving into the potential exhibited by Partial Information Decomposition and Deep Learning networks in the study of Causal Emergence, we discuss potential application scenarios where our quantification framework could play a role in future investigations of Causal Emergence.
CVMar 19, 2024
WaterVG: Waterway Visual Grounding based on Text-Guided Vision and mmWave RadarRunwei Guan, Liye Jia, Fengyufan Yang et al.
The perception of waterways based on human intent is significant for autonomous navigation and operations of Unmanned Surface Vehicles (USVs) in water environments. Inspired by visual grounding, we introduce WaterVG, the first visual grounding dataset designed for USV-based waterway perception based on human prompts. WaterVG encompasses prompts describing multiple targets, with annotations at the instance level including bounding boxes and masks. Notably, WaterVG includes 11,568 samples with 34,987 referred targets, whose prompts integrates both visual and radar characteristics. The pattern of text-guided two sensors equips a finer granularity of text prompts with visual and radar features of referred targets. Moreover, we propose a low-power visual grounding model, Potamoi, which is a multi-task model with a well-designed Phased Heterogeneous Modality Fusion (PHMF) mode, including Adaptive Radar Weighting (ARW) and Multi-Head Slim Cross Attention (MHSCA). Exactly, ARW extracts required radar features to fuse with vision for prompt alignment. MHSCA is an efficient fusion module with a remarkably small parameter count and FLOPs, elegantly fusing scenario context captured by two sensors with linguistic features, which performs expressively on visual grounding tasks. Comprehensive experiments and evaluations have been conducted on WaterVG, where our Potamoi archives state-of-the-art performances compared with counterparts.