LGMay 14

NEST: Nested Event Stream Transformer for Sequences of Multisets

arXiv:2602.0052024.4h-index: 3
AI Analysis

For researchers working with hierarchical event stream data (e.g., EHRs), NEST provides a more efficient and effective foundation model that avoids spurious within-set relationships and improves set-level representations.

NEST introduces a transformer architecture that preserves the hierarchical structure of event streams (sequences of multisets), improving pretraining efficiency and downstream performance on EHR data compared to flattening approaches.

Event stream data often exhibit hierarchical structure in which multiple events co-occur, resulting in a sequence of multisets (i.e., bags of events). In electronic health records (EHRs), for example, medical events are grouped into a sequence of clinical encounters with well-defined temporal structure, but the order and timing of events within each encounter may be unknown or unreliable. Most existing foundation models (FMs) for event stream data flatten this hierarchy into a one-dimensional sequence, leading to (i) computational inefficiency associated with dense attention and learning spurious within-set relationships, and (ii) lower-quality set-level representations from heuristic post-training pooling for downstream tasks. Here, we show that preserving the original hierarchy in the FM architecture provides a useful inductive bias that improves both computational efficiency and representation quality. We then introduce Nested Event Stream Transformer (NEST), a FM for event streams comprised of sequences of multisets. Building on this architecture, we formulate Masked Set Modeling (MSM), an efficient paradigm that promotes improved set-level representation learning. Experiments on real-world multiset sequence data show that NEST captures real-world dynamics while improving both pretraining efficiency and downstream performance.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes