Representing Outcome-driven Higher-order Dependencies in Graphs of Disease Trajectories
This work addresses the challenge of improving interpretability and accuracy in predicting disease trajectories for clinical applications, though it appears incremental, as it combines existing graph and sequence modeling techniques.
The authors model disease progression with a method that identifies higher-order dependencies and encodes them in graphs. Using data from 913,475 type 2 diabetes patients, they found that their networks encode significantly more information about progression toward various outcomes than other approaches do.
The widespread application of machine learning techniques to biomedical data has produced many new insights into disease progression and into improving clinical care. Inspired by the flexibility and interpretability of graphs (networks), as well as the potency of sequence models like transformers and higher-order networks (HONs), we propose a method that identifies combinations of risk factors for a given outcome and accurately encodes these higher-order relationships in a graph. Using historical data from 913,475 type 2 diabetes (T2D) patients, we found that, compared to other approaches, the proposed networks encode significantly more information about the progression of T2D toward a variety of outcomes. We additionally demonstrate how structural information from the proposed graph can be used to augment the performance of transformer-based models on predictive tasks, especially when the data are noisy. By increasing the order, or memory, of the graph, we show how the proposed method illuminates key risk factors while successfully ignoring noisy elements, which facilitates analysis that is simultaneously accurate and interpretable.
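
To make the notion of graph order, or memory, concrete, the following is a minimal sketch of a generic fixed-order transition graph. It is not the outcome-driven method proposed here, which selects which higher-order combinations to encode. The function name `build_transition_graph` and the toy trajectories are hypothetical and chosen only for illustration; they are not drawn from the patient data.

```python
from collections import Counter, defaultdict


def build_transition_graph(sequences, order=1):
    """Estimate transition probabilities where each source node is the
    tuple of the last `order` events (a fixed-order illustration only)."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for i in range(order, len(seq)):
            context = tuple(seq[i - order:i])  # the node's "memory"
            counts[context][seq[i]] += 1

    # Normalize raw counts into per-node transition probabilities.
    graph = {}
    for context, next_events in counts.items():
        total = sum(next_events.values())
        graph[context] = {event: n / total for event, n in next_events.items()}
    return graph


# Hypothetical toy trajectories, invented for illustration.
trajectories = [
    ["obesity", "T2D", "retinopathy"],
    ["obesity", "T2D", "retinopathy"],
    ["hypertension", "T2D", "kidney_disease"],
    ["hypertension", "T2D", "kidney_disease"],
]

first_order = build_transition_graph(trajectories, order=1)
second_order = build_transition_graph(trajectories, order=2)

# With order=1 the node ("T2D",) cannot tell the two pathways apart.
print(first_order[("T2D",)])
# With order=2 the preceding risk factor is part of the node itself.
print(second_order[("obesity", "T2D")])
print(second_order[("hypertension", "T2D")])
```

In this toy case the first-order node for T2D splits its probability evenly between the two outcomes, whereas the second-order nodes separate the pathways by their preceding risk factor. A fixed order applied everywhere also multiplies the number of nodes, which is one reason to add higher-order nodes selectively.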