Zhenbang Jiao

h-index: 1
2 papers

2 Papers

ML, Feb 20, 2025
Policy-Oriented Binary Classification: Improving (KD-)CART Final Splits for Subpopulation Targeting

Lei Bill Wang, Zhenbang Jiao, Fangyi Wang

Policymakers often use recursive binary split rules to partition populations based on binary outcomes and target subpopulations whose probability of the binary event exceeds a threshold. We call such problems Latent Probability Classification (LPC). Practitioners typically employ Classification and Regression Trees (CART) for LPC. We prove that, in the context of LPC, both classic CART and knowledge distillation with a CART student model (referred to as KD-CART) are suboptimal. We propose Maximizing Distance Final Split (MDFS), which generates split rules that strictly dominate CART/KD-CART under the unique intersect assumption. MDFS identifies the unique best split rule, is consistent, and targets more vulnerable subpopulations than CART/KD-CART. To relax the unique intersect assumption, we additionally propose Penalized Final Split (PFS) and weighted Empirical risk Final Split (wEFS). Through extensive simulation studies, we demonstrate that the proposed methods predominantly outperform CART/KD-CART. When applied to real-world datasets, MDFS generates policies that target more vulnerable subpopulations than CART/KD-CART.

SR, Dec 27, 2019
Interpreting LSTM Prediction on Solar Flare Eruption with Time-series Clustering

Hu Sun, Ward Manchester, Zhenbang Jiao et al.

We conduct a post hoc analysis of solar flare predictions made by a Long Short-Term Memory (LSTM) model employing data in the form of Space-weather HMI Active Region Patches (SHARP) parameters calculated from data in proximity to the magnetic polarity inversion line where the flares originate. We train the LSTM model for binary classification to provide a prediction score for the probability of M/X class flares occurring in the next hour. We then develop a dimension-reduction technique to reduce the dimensions of the SHARP parameters (LSTM inputs) and demonstrate the different patterns of SHARP parameters corresponding to the transition from low to high prediction score. Our work shows that a subset of SHARP parameters contains the key signals that strong solar flare eruptions are imminent. The dynamics of these parameters have a highly uniform trajectory for many events whose LSTM prediction scores for M/X class flares transition from very low to very high. The results demonstrate the existence of a few threshold values of SHARP parameters that, when surpassed, indicate a high probability of the eruption of a strong flare. Our method distills the knowledge of solar flare eruption learned by the deep learning model into a more interpretable approximation, which provides physical insight into the processes driving solar flares.