LGApr 1

CGCMA: Conditionally-Gated Cross-Modal Attention for Event-Conditioned Asynchronous Fusion

arXiv:2604.164112.4
Predicted impact top 98% in LG · last 90 daysOriginality Incremental advance
AI Analysis

For researchers in multimodal learning and financial AI, this work addresses the underexplored problem of asynchronous alignment with explicit freshness and trust reasoning, though results are on a single stress-test domain.

The paper tackles asynchronous multimodal fusion where a dense primary stream (e.g., price sequences) must be integrated with sporadic, time-dependent external context (e.g., news). The proposed CGCMA method achieves a mean downstream Sharpe ratio of +0.449 ± 0.257 on a new cryptocurrency benchmark, outperforming baselines.

We study asynchronous alignment, a first-class multimodal learning setting in which a dense primary stream must be fused with sporadic external context whose value depends on when it arrives. Unlike standard multimodal benchmarks that assume structural synchrony, this setting requires models to reason explicitly about freshness and trust. We focus on the event-conditioned case in which continuous market states are paired with delayed web intelligence, and we use high-frequency cryptocurrency markets only as a timestamped, high-noise stress test for this broader problem. We propose CGCMA (Conditionally-Gated Cross-Modal Attention), whose central design principle is to separate text-conditioned grounding from lag-aware trust control. Text first attends over price sequences to identify event-relevant market states, after which a conditional gate uses modality agreement, web features, and lag $τ_{\mathrm{lag}}$ to regulate residual injection and fall back toward unimodal prediction when external context is stale or contradictory. We introduce CMI (Crypto Market Intelligence), an asynchronous evaluation corpus with 27,914 real-news samples pairing high-frequency price sequences with lagged web intelligence. On the current short real-news corpus, CGCMA attains the highest mean downstream Sharpe ratio ($+0.449 \pm 0.257$) among the evaluated baselines under a shared zero-cost threshold-trading evaluation on news-available bars. Additional controls show that the gain is not explained by web scalars alone and is not recovered by simple freshness heuristics. The resulting evidence supports problem validity and a promising asynchronous multimodal gain on this stress-test setting.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes