LGNov 29, 2022
An Extreme-Adaptive Time Series Prediction Model Based on Probability-Enhanced LSTM Neural NetworksYanhong Li, Jack Xu, David C. Anastasiu
Forecasting time series with extreme events has been a challenging and prevalent research topic, especially when the time series data are affected by complicated uncertain factors, such as is the case in hydrologic prediction. Diverse traditional and deep learning models have been applied to discover the nonlinear relationships and recognize the complex patterns in these types of data. However, existing methods usually ignore the negative influence of imbalanced data, or severe events, on model training. Moreover, methods are usually evaluated on a small number of generally well-behaved time series, which does not show their ability to generalize. To tackle these issues, we propose a novel probability-enhanced neural network model, called NEC+, which concurrently learns extreme and normal prediction functions and a way to choose among them via selective back propagation. We evaluate the proposed model on the difficult 3-day ahead hourly water level prediction task applied to 9 reservoirs in California. Experimental results demonstrate that the proposed model significantly outperforms state-of-the-art baselines and exhibits superior generalization ability on data with diverse distributions.
LGNov 21, 2023
Fast and Interpretable Mortality Risk Scores for Critical Care PatientsChloe Qinyu Zhu, Muhang Tian, Lesia Semenova et al.
Prediction of mortality in intensive care unit (ICU) patients typically relies on black box models (that are unacceptable for use in hospitals) or hand-tuned interpretable models (that might lead to the loss in performance). We aim to bridge the gap between these two categories by building on modern interpretable ML techniques to design interpretable mortality risk scores that are as accurate as black boxes. We developed a new algorithm, GroupFasterRisk, which has several important benefits: it uses both hard and soft direct sparsity regularization, it incorporates group sparsity to allow more cohesive models, it allows for monotonicity constraint to include domain knowledge, and it produces many equally-good models, which allows domain experts to choose among them. For evaluation, we leveraged the largest existing public ICU monitoring datasets (MIMIC III and eICU). Models produced by GroupFasterRisk outperformed OASIS and SAPS II scores and performed similarly to APACHE IV/IVa while using at most a third of the parameters. For patients with sepsis/septicemia, acute myocardial infarction, heart failure, and acute kidney failure, GroupFasterRisk models outperformed OASIS and SOFA. Finally, different mortality prediction ML approaches performed better based on variables selected by GroupFasterRisk as compared to OASIS variables. GroupFasterRisk's models performed better than risk scores currently used in hospitals, and on par with black box ML models, while being orders of magnitude sparser. Because GroupFasterRisk produces a variety of risk scores, it allows design flexibility - the key enabler of practical model creation. GroupFasterRisk is a fast, accessible, and flexible procedure that allows learning a diverse set of sparse risk scores for mortality prediction.
LGDec 14, 2023
Learning from Polar Representation: An Extreme-Adaptive Model for Long-Term Time Series ForecastingYanhong Li, Jack Xu, David C. Anastasiu
In the hydrology field, time series forecasting is crucial for efficient water resource management, improving flood and drought control and increasing the safety and quality of life for the general population. However, predicting long-term streamflow is a complex task due to the presence of extreme events. It requires the capture of long-range dependencies and the modeling of rare but important extreme values. Existing approaches often struggle to tackle these dual challenges simultaneously. In this paper, we specifically delve into these issues and propose Distance-weighted Auto-regularized Neural network (DAN), a novel extreme-adaptive model for long-range forecasting of stremflow enhanced by polar representation learning. DAN utilizes a distance-weighted multi-loss mechanism and stackable blocks to dynamically refine indicator sequences from exogenous data, while also being able to handle uni-variate time-series by employing Gaussian Mixture probability modeling to improve robustness to severe events. We also introduce Kruskal-Wallis sampling and gate control vectors to handle imbalanced extreme data. On four real-life hydrologic streamflow datasets, we demonstrate that DAN significantly outperforms both state-of-the-art hydrologic time series prediction methods and general methods designed for long-term time series prediction.