Heidi A. Hanson

h-index 58

5 papers

8 citations

Novelty 39%

AI Score 38

Ranked #108,897 of 201,326 authors (top 54%); #19,743 in CL (top 61%)

5 Papers

92.6 AP Apr 21

Ground-Level Near Real-Time Modeling for PM2.5 Pollution Prediction

Zachary R. Fox, Janet O. Agbaje, Dakotah Maguire et al.

Air pollution is a worldwide public health threat that can cause or exacerbate many illnesses, including respiratory disease, cardiovascular disease, and some cancers. However, epidemiological studies and public health decision-making are stymied by the inability to assess pollution exposure impacts in near real time. To address this, developing accurate digital twins of environmental pollutants will enable timely data-driven analytics, a crucial step in modernizing health policy and decision-making. Although other models predict and analyze fine particulate matter exposure, they often rely on modeled input data sources and data streams that are not regularly updated. Another challenge is that current models rely on predefined grids. In contrast, our deep-learning approach interpolates surface-level PM2.5 concentrations between sparsely distributed US EPA monitoring stations in a grid-free manner. By incorporating additional, readily available datasets (topographic, meteorological, and land-use data), we improve the model's ability to predict pollutant concentrations with high spatial and temporal resolution. The grid-free design allows the model to be queried at any spatial location for rapid predictions without computing over the entire grid. To ensure robustness, we randomize spatial sampling during training so that the model performs well in both densely and sparsely monitored regions. This model is well suited for near real-time deployment because its lightweight architecture allows for fast updates in response to streaming data. Moreover, model flexibility and scalability allow it to be adapted to various geographical contexts and scales, making it a practical tool for delivering accurate and timely air quality assessments. Its capacity to rapidly evaluate multiple scenarios can be especially valuable for decision-making during public health crises.

IV Apr 24, 2024

Enhancing Diagnosis through AI-driven Analysis of Reflectance Confocal Microscopy

Hong-Jun Yoon, Chris Keum, Alexander Witkowski et al.

Reflectance Confocal Microscopy (RCM) is a non-invasive imaging technique used in biomedical research and clinical dermatology. It provides virtual high-resolution images of the skin and superficial tissues, reducing the need for physical biopsies. RCM employs a laser light source to illuminate the tissue, capturing the reflected light to generate detailed images of microscopic structures at various depths. Recent studies have explored AI and machine learning, particularly convolutional neural networks (CNNs), for analyzing RCM images. Our study proposes a segmentation strategy based on textural features to identify clinically significant regions, helping dermatologists interpret images effectively and boosting diagnostic confidence. This approach promises to advance dermatological diagnosis and treatment.

CL Jul 28, 2025

Can human clinical rationales improve the performance and explainability of clinical text classification models?

Christoph Metzner, Shang Gao, Drahomira Herrmannova et al.

AI-driven clinical text classification is vital for explainable automated retrieval of population-level health information. This work investigates whether human-based clinical rationales can serve as additional supervision to improve both performance and explainability of transformer-based models that automatically encode clinical documents. We analyzed 99,125 human-based clinical rationales that provide plausible explanations for primary cancer site diagnoses, using them as additional training samples alongside 128,649 electronic pathology reports to evaluate transformer-based models for extracting primary cancer sites. We also investigated sufficiency as a way to measure rationale quality for pre-selecting rationales. Our results showed that clinical rationales as additional training data can improve model performance in high-resource scenarios but produce inconsistent behavior when resources are limited. Using sufficiency as an automatic metric to preselect rationales also leads to inconsistent results. Importantly, models trained on rationales were consistently outperformed by models trained on additional reports instead. This suggests that clinical rationales don't consistently improve model performance and are outperformed by simply using more reports. Therefore, if the goal is optimizing accuracy, annotation efforts should focus on labeling more reports rather than creating rationales. However, if explainability is the priority, training models on rationale-supplemented data may help them better identify rationale-like features. We conclude that using clinical rationales as additional training data results in smaller performance improvements and only slightly better explainability (measured as average token-level rationale coverage) compared to training on additional reports.
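
The abstract above reports explainability as average token-level rationale coverage but does not define the metric. As a rough illustration only, the sketch below assumes one plausible reading: the fraction of human-marked rationale tokens that appear among the model's top-k attributed tokens. The function name `rationale_coverage`, the parameter `k`, and the toy values are hypothetical and are not taken from the paper.

```python
def rationale_coverage(attributions, rationale_mask, k):
    """Fraction of human-rationale tokens found among the top-k attributed tokens.

    Assumed definition for illustration; the paper's exact metric may differ.
    attributions: one attribution score per token (higher = more important).
    rationale_mask: 1/True where a human marked the token as part of the rationale.
    """
    # Indices of the k tokens with the highest attribution scores.
    ranked = sorted(range(len(attributions)),
                    key=lambda i: attributions[i], reverse=True)
    top_k = set(ranked[:k])
    rationale_idx = {i for i, flag in enumerate(rationale_mask) if flag}
    if not rationale_idx:
        return 0.0  # no rationale tokens to cover
    return len(rationale_idx & top_k) / len(rationale_idx)


# Hypothetical toy document with five tokens.
scores = [0.1, 0.7, 0.05, 0.9, 0.2]
mask = [0, 1, 0, 1, 1]
print(rationale_coverage(scores, mask, k=2))
```

Averaging this value over all documents would give one corpus-level number, which is the sense in which "average" is assumed here.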

LG Apr 1, 2025

Global explainability of a deep abstaining classifier

Sayera Dhaubhadel, Jamaludin Mohd-Yusof, Benjamin H. McMahon et al.

We present a global explainability method to characterize sources of errors in the histology prediction task of our real-world multitask convolutional neural network (MTCNN)-based deep abstaining classifier (DAC), for automated annotation of cancer pathology reports from NCI-SEER registries. Our classifier was trained and evaluated on 1.04 million hand-annotated samples and makes simultaneous predictions of cancer site, subsite, histology, laterality, and behavior for each report. The DAC framework enables the model to abstain on ambiguous reports and/or confusing classes to achieve a target accuracy on the retained (non-abstained) samples, but at the cost of decreased coverage. Requiring 97% accuracy on the histology task caused our model to retain only 22% of all samples, mostly the less ambiguous and common classes. Local explainability with the GradInp technique provided a computationally efficient way of obtaining contextual reasoning for thousands of individual predictions. Our method, involving dimensionality reduction of approximately 13000 aggregated local explanations, enabled global identification of sources of errors as hierarchical complexity among classes, label noise, insufficient information, and conflicting evidence. This suggests several strategies such as exclusion criteria, focused annotation, and reduced penalties for errors involving hierarchically related classes to iteratively improve our DAC in this complex real-world implementation.
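
The accuracy-versus-coverage trade-off described above can be illustrated with a small sketch. It does not implement the DAC itself, whose abstention mechanism is not detailed here. Instead, it swaps in a simple post-hoc confidence threshold as a stand-in, and all names and numbers are hypothetical.

```python
def accuracy_coverage(confidences, correct, threshold):
    """Return (accuracy on retained samples, coverage) for a confidence cutoff.

    A sample is retained (not abstained on) when its confidence >= threshold.
    This threshold rule is an assumed stand-in, not the DAC's mechanism.
    """
    retained = [ok for conf, ok in zip(confidences, correct) if conf >= threshold]
    coverage = len(retained) / len(confidences)
    accuracy = sum(retained) / len(retained) if retained else float("nan")
    return accuracy, coverage


# Hypothetical toy predictions: confidence and whether each was correct.
confidences = [0.99, 0.95, 0.90, 0.70, 0.60, 0.55]
correct = [True, True, True, False, True, False]

for threshold in (0.5, 0.8):
    acc, cov = accuracy_coverage(confidences, correct, threshold)
    print(f"threshold={threshold}: accuracy={acc:.2f}, coverage={cov:.2f}")
```

On this toy data, raising the cutoff increases accuracy on the retained samples while shrinking coverage, the same qualitative behavior as the 97% accuracy and 22% retention figures reported above.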

ML May 15, 2023

Topological Interpretability for Deep-Learning

Adam Spannaus, Heidi A. Hanson, Lynne Penberthy et al.

With the growing adoption of AI-based systems across everyday life, the need to understand their decision-making mechanisms is correspondingly increasing. The level at which we can trust the statistical inferences made from AI-based decision systems is an increasing concern, especially in high-risk systems such as criminal justice or medical diagnosis, where incorrect inferences may have tragic consequences. Despite their successes in providing solutions to problems involving real-world data, deep learning (DL) models cannot quantify the certainty of their predictions. These models are frequently quite confident, even when their solutions are incorrect. This work presents a method to infer prominent features in two DL classification models trained on clinical and non-clinical text by employing techniques from topological and geometric data analysis. We create a graph of a model's feature space and cluster the inputs into the graph's vertices by the similarity of features and prediction statistics. We then extract subgraphs demonstrating high-predictive accuracy for a given label. These subgraphs contain a wealth of information about features that the DL model has recognized as relevant to its decisions. We infer these features for a given label using a distance metric between probability measures, and demonstrate the stability of our method compared to the LIME and SHAP interpretability methods. This work establishes that we may gain insights into the decision mechanism of a DL model. This method allows us to ascertain if the model is making its decisions based on information germane to the problem or identifies extraneous patterns within the data.