MLAug 22, 2024
Deconfounding Multi-Cause Latent Confounders: A Factor-Model Approach to Climate Model Bias CorrectionWentao Gao, Jiuyong Li, Debo Cheng et al.
Global Climate Models (GCMs) are crucial for predicting future climate changes by simulating the Earth systems. However, the GCM Outputs exhibit systematic biases due to model uncertainties, parameterization simplifications, and inadequate representation of complex climate phenomena. Traditional bias correction methods, which rely on historical observation data and statistical techniques, often neglect unobserved confounders, leading to biased results. This paper proposes a novel bias correction approach to utilize both GCM and observational data to learn a factor model that captures multi-cause latent confounders. Inspired by recent advances in causality based time series deconfounding, our method first constructs a factor model to learn latent confounders from historical data and then applies them to enhance the bias correction process using advanced time series forecasting models. The experimental results demonstrate significant improvements in the accuracy of precipitation outputs. By addressing unobserved confounders, our approach offers a robust and theoretically grounded solution for climate model bias correction.
SIMar 4
Identifying the Group to Intervene on to Maximise Effect Under Cross-Group InterferenceXiaojing Du, Jiuyong Li, Lin Liu et al.
In many networked systems, interventions applied to one group of units can induce substantial causal effects on another group through cross-group interference pathways. Despite its practical importance in domains such as public health, digital marketing, and social policy, the problem of identifying which intervention subset in a source group maximizes the benefit on a target group remains largely unaddressed. We formalize this problem as cross-group causal influence estimation and introduce the core-to-group causal effect (Co2G), a formally defined causal estimand that quantifies the contrast in target-group outcomes under intervention versus non-intervention on a candidate source subset. We establish the nonparametric identifiability of Co2G from observational network data using do-calculus under standard causal assumptions, and develop a graph neural network-based estimator that captures cross-group interference patterns. To navigate the combinatorial search space of candidate subsets, we propose CauMax, an uncertainty-aware causal effect maximization framework with two scalable selection algorithms: (i)CauMax-G, an iterative greedy search with Monte Carlo dropout--based lower confidence bounds, and (ii)CauMax-D, a differentiable gradient-based optimization via Gumbel-Softmax relaxation. Extensive experiments on two real-world social networks demonstrate that CauMax achieves an order-of-magnitude reduction in regret compared with structural heuristics and diffusion-based baselines, and that moderate uncertainty penalization consistently improves subset selection quality.
AIAug 21, 2024
Estimating Peer Direct and Indirect Effects in Observational Network DataXiaojing Du, Jiuyong Li, Debo Cheng et al.
Estimating causal effects is crucial for decision-makers in many applications, but it is particularly challenging with observational network data due to peer interactions. Many algorithms have been proposed to estimate causal effects involving network data, particularly peer effects, but they often overlook the variety of peer effects. To address this issue, we propose a general setting which considers both peer direct effects and peer indirect effects, and the effect of an individual's own treatment, and provide identification conditions of these causal effects and proofs. To estimate these causal effects, we utilize attention mechanisms to distinguish the influences of different neighbors and explore high-order neighbor effects through multi-layer graph neural networks (GNNs). Additionally, to control the dependency between node features and representations, we incorporate the Hilbert-Schmidt Independence Criterion (HSIC) into the GNN, fully utilizing the structural information of the graph, to enhance the robustness and accuracy of the model. Extensive experiments on two semi-synthetic datasets confirm the effectiveness of our approach. Our theoretical findings have the potential to improve intervention strategies in networked systems, with applications in areas such as social networks and epidemiology.
CLJan 20
Temporal-Spatial Decouple before Act: Disentangled Representation Learning for Multimodal Sentiment AnalysisChunlei Meng, Ziyang Zhou, Lucas He et al.
Multimodal Sentiment Analysis integrates Linguistic, Visual, and Acoustic. Mainstream approaches based on modality-invariant and modality-specific factorization or on complex fusion still rely on spatiotemporal mixed modeling. This ignores spatiotemporal heterogeneity, leading to spatiotemporal information asymmetry and thus limited performance. Hence, we propose TSDA, Temporal-Spatial Decouple before Act, which explicitly decouples each modality into temporal dynamics and spatial structural context before any interaction. For every modality, a temporal encoder and a spatial encoder project signals into separate temporal and spatial body. Factor-Consistent Cross-Modal Alignment then aligns temporal features only with their temporal counterparts across modalities, and spatial features only with their spatial counterparts. Factor specific supervision and decorrelation regularization reduce cross factor leakage while preserving complementarity. A Gated Recouple module subsequently recouples the aligned streams for task. Extensive experiments show that TSDA outperforms baselines. Ablation analysis studies confirm the necessity and interpretability of the design.
LGSep 13, 2024
Causal GNNs: A GNN-Driven Instrumental Variable Approach for Causal Inference in NetworksXiaojing Du, Feiyu Yang, Wentao Gao et al.
As network data applications continue to expand, causal inference within networks has garnered increasing attention. However, hidden confounders complicate the estimation of causal effects. Most methods rely on the strong ignorability assumption, which presumes the absence of hidden confounders-an assumption that is both difficult to validate and often unrealistic in practice. To address this issue, we propose CgNN, a novel approach that leverages network structure as instrumental variables (IVs), combined with graph neural networks (GNNs) and attention mechanisms, to mitigate hidden confounder bias and improve causal effect estimation. By utilizing network structure as IVs, we reduce confounder bias while preserving the correlation with treatment. Our integration of attention mechanisms enhances robustness and improves the identification of important nodes. Validated on two real-world datasets, our results demonstrate that CgNN effectively mitigates hidden confounder bias and offers a robust GNN-driven IV framework for causal inference in complex network data.
LGMay 1
Group Cognition Learning: Making Everything Better Through Governed Two-Stage Agents CollaborationChunlei Meng, Pengbin Feng, Rong Fu et al.
Centralized multimodal learning commonly compresses language, acoustic, and visual signals into a single fused representation for prediction. While effective, this paradigm suffers from two limitations: modality dominance, where optimization gravitates towards the path of least resistance, ignoring weaker but informative modalities, and spurious modality coupling, where models overfit to incidental cross-modal correlations. To address these, we propose Group Cognition Learning (GCL), a governed collaboration paradigm that applies a two-stage protocol after modality-specific encoding. In Stage 1 (Selective Interaction), a Routing Agent proposes directed interaction routes, and an Auditing Agent assigns sample-wise gates to emphasize exchanges that yield positive marginal predictive gain while suppressing redundant coupling. In Stage 2 (Consensus Formation), a Public-Factor Agent maintains an explicit shared factor, and an Aggregation Agent produces the final prediction through contribution-aware weighting while keeping each modality representation as a specialization channel. Extensive experiments on CMU-MOSI, CMU-MOSEI, and MIntRec demonstrate that GCL mitigates dominance and coupling, establishing state-of-the-art results across both regression and classification benchmarks. Analysis experiments further demonstrate the effectiveness of the design.
CLMar 23, 2024
MRC-based Nested Medical NER with Co-prediction and Adaptive Pre-trainingXiaojing Du, Hanjie Zhao, Danyan Xing et al.
In medical information extraction, medical Named Entity Recognition (NER) is indispensable, playing a crucial role in developing medical knowledge graphs, enhancing medical question-answering systems, and analyzing electronic medical records. The challenge in medical NER arises from the complex nested structures and sophisticated medical terminologies, distinguishing it from its counterparts in traditional domains. In response to these complexities, we propose a medical NER model based on Machine Reading Comprehension (MRC), which uses a task-adaptive pre-training strategy to improve the model's capability in the medical field. Meanwhile, our model introduces multiple word-pair embeddings and multi-granularity dilated convolution to enhance the model's representation ability and uses a combined predictor of Biaffine and MLP to improve the model's recognition performance. Experimental evaluations conducted on the CMeEE, a benchmark for Chinese nested medical NER, demonstrate that our proposed model outperforms the compared state-of-the-art (SOTA) models.
CLFeb 17
NeuroSymActive: Differentiable Neural-Symbolic Reasoning with Active Exploration for Knowledge Graph Question AnsweringRong Fu, Yang Li, Zeyu Zhang et al.
Large pretrained language models and neural reasoning systems have advanced many natural language tasks, yet they remain challenged by knowledge-intensive queries that require precise, structured multi-hop inference. Knowledge graphs provide a compact symbolic substrate for factual grounding, but integrating graph structure with neural models is nontrivial: naively embedding graph facts into prompts leads to inefficiency and fragility, while purely symbolic or search-heavy approaches can be costly in retrievals and lack gradient-based refinement. We introduce NeuroSymActive, a modular framework that combines a differentiable neural-symbolic reasoning layer with an active, value-guided exploration controller for Knowledge Graph Question Answering. The method couples soft-unification style symbolic modules with a neural path evaluator and a Monte-Carlo style exploration policy that prioritizes high-value path expansions. Empirical results on standard KGQA benchmarks show that NeuroSymActive attains strong answer accuracy while reducing the number of expensive graph lookups and model calls compared to common retrieval-augmented baselines.
LGFeb 20
TempoNet: Slack-Quantized Transformer-Guided Reinforcement Scheduler for Adaptive Deadline-Centric Real-Time DispatchsRong Fu, Yibo Meng, Guangzhen Yao et al.
Real-time schedulers must reason about tight deadlines under strict compute budgets. We present TempoNet, a reinforcement learning scheduler that pairs a permutation-invariant Transformer with a deep Q-approximation. An Urgency Tokenizer discretizes temporal slack into learnable embeddings, stabilizing value learning and capturing deadline proximity. A latency-aware sparse attention stack with blockwise top-k selection and locality-sensitive chunking enables global reasoning over unordered task sets with near-linear scaling and sub-millisecond inference. A multicore mapping layer converts contextualized Q-scores into processor assignments through masked-greedy selection or differentiable matching. Extensive evaluations on industrial mixed-criticality traces and large multiprocessor settings show consistent gains in deadline fulfillment over analytic schedulers and neural baselines, together with improved optimization stability. Diagnostics include sensitivity analyses for slack quantization, attention-driven policy interpretation, hardware-in-the-loop and kernel micro-benchmarks, and robustness under stress with simple runtime mitigations; we also report sample-efficiency benefits from behavioral-cloning pretraining and compatibility with an actor-critic variant without altering the inference pipeline. These results establish a practical framework for Transformer-based decision making in high-throughput real-time scheduling.
LGSep 1, 2025
From Noise to Precision: A Diffusion-Driven Approach to Zero-Inflated Precipitation PredictionWentao Gao, Jiuyong Li, Lin Liu et al.
Zero-inflated data pose significant challenges in precipitation forecasting due to the predominance of zeros with sparse non-zero events. To address this, we propose the Zero Inflation Diffusion Framework (ZIDF), which integrates Gaussian perturbation for smoothing zero-inflated distributions, Transformer-based prediction for capturing temporal patterns, and diffusion-based denoising to restore the original data structure. In our experiments, we use observational precipitation data collected from South Australia along with synthetically generated zero-inflated data. Results show that ZIDF demonstrates significant performance improvements over multiple state-of-the-art precipitation forecasting models, achieving up to 56.7\% reduction in MSE and 21.1\% reduction in MAE relative to the baseline Non-stationary Transformer. These findings highlight ZIDF's ability to robustly handle sparse time series data and suggest its potential generalizability to other domains where zero inflation is a key challenge.
LGAug 5, 2025
Peer Effect Estimation in the Presence of Simultaneous Feedback and Unobserved ConfoundersXiaojing Du, Jiuyong Li, Lin Liu et al.
Estimating peer causal effects within complex real-world networks such as social networks is challenging, primarily due to simultaneous feedback between peers and unobserved confounders. Existing methods either address unobserved confounders while ignoring the simultaneous feedback, or account for feedback but under restrictive linear assumptions, thus failing to obtain accurate peer effect estimation. In this paper, we propose DIG2RSI, a novel Deep learning framework which leverages I-G transformation (matrix operation) and 2SRI (an instrumental variable or IV technique) to address both simultaneous feedback and unobserved confounding, while accommodating complex, nonlinear and high-dimensional relationships. DIG2RSI first applies the I-G transformation to disentangle mutual peer influences and eliminate the bias due to the simultaneous feedback. To deal with unobserved confounding, we first construct valid IVs from network data. In stage 1 of 2RSI, we train a neural network on these IVs to predict peer exposure, and extract residuals as proxies for the unobserved confounders. In the stage 2, we fit a separate neural network augmented by an adversarial discriminator that incorporates these residuals as a control function and enforces the learned representation to contain no residual confounding signal. The expressive power of deep learning models in capturing complex non-linear relationships and adversarial debiasing enhances the effectiveness of DIG2RSI in eliminating bias from both feedback loops and hidden confounders. We prove consistency of our estimator under standard regularity conditions, ensuring asymptotic recovery of the true peer effect. Empirical results on two semi-synthetic benchmarks and a real-world dataset demonstrate that DIG2RSI outperforms existing approaches.
LGOct 27, 2024
Deconfounding Time Series ForecastingWentao Gao, Feiyu Yang, Mengze Hong et al.
Time series forecasting is a critical task in various domains, where accurate predictions can drive informed decision-making. Traditional forecasting methods often rely on current observations of variables to predict future outcomes, typically overlooking the influence of latent confounders, unobserved variables that simultaneously affect both the predictors and the target outcomes. This oversight can introduce bias and degrade the performance of predictive models. In this study, we address this challenge by proposing an enhanced forecasting approach that incorporates representations of latent confounders derived from historical data. By integrating these confounders into the predictive process, our method aims to improve the accuracy and robustness of time series forecasts. The proposed approach is demonstrated through its application to climate science data, showing significant improvements over traditional methods that do not account for confounders.