3.9CVJun 1
Deep Learning for Remote Sensing to Improve Flood Inundation MappingYogesh Bhattarai, Vijay Chaudhary, Wai Lim Kim et al.
Flooding is the most pervasive natural disaster worldwide. Timely and accurate flood inundation mapping are essential for informing disaster risk management. Optical satellite missions provide high-resolution, multispectral observations critical for flood detection and inundation mapping. However, their operational utility is severely constrained by cloud cover during extreme precipitation events. Conventional cloud-removal techniques based on temporal compositing or interpolation often fail to capture inundation dynamics. In this study, we introduce a cloud-removal framework for flood imagery based on Denoising Diffusion Probabilistic Models, leveraging the Masked Diffusion Transformer architecture. The proposed approach exploits self-attention mechanisms to capture wider spatial context and employs masked token modeling to explicitly learn the reconstruction of cloud-obscured regions. Trained on multispectral Sentinel-2B flood scenes with realistic cloud patterns, the model generates cloud-free image realizations that preserve both visual fidelity and hydrological consistency. Reconstruction performance is evaluated using standard image quality metrics alongside flood-specific hydrological measures, demonstrating improved continuity of water bodies and preservation of spectral signatures critical for water detection indices. The results indicate that diffusion-based generative modeling offers a robust and physically consistent alternative for cloud removal in optical flood monitoring, enabling more reliable, continuous observations to support disaster risk management and flood-related decision making.
CLJul 21, 2025
From Queries to Criteria: Understanding How Astronomers Evaluate LLMsAlina Hyk, Kiera McCormick, Mian Zhong et al.
There is growing interest in leveraging LLMs to aid in astronomy and other scientific research, but benchmarks for LLM evaluation in general have not kept pace with the increasingly diverse ways that real people evaluate and use these models. In this study, we seek to improve evaluation procedures by building an understanding of how users evaluate LLMs. We focus on a particular use case: an LLM-powered retrieval-augmented generation bot for engaging with astronomical literature, which we deployed via Slack. Our inductive coding of 368 queries to the bot over four weeks and our follow-up interviews with 11 astronomers reveal how humans evaluated this system, including the types of questions asked and the criteria for judging responses. We synthesize our findings into concrete recommendations for building better benchmarks, which we then employ in constructing a sample benchmark for evaluating LLMs for astronomy. Overall, our work offers ways to improve LLM evaluation and ultimately usability, particularly for use in scientific research.
LGJun 15, 2021
Machine Learning for Postprocessing Ensemble Streamflow ForecastsSanjib Sharma, Ganesh Raj Ghimire, Ridwan Siddique
Skillful streamflow forecasts can inform decisions in various areas of water policy and management. We integrate numerical weather prediction ensembles, distributed hydrological model and machine learning to generate ensemble streamflow forecasts at medium-range lead times (1 - 7 days). We demonstrate a case study for machine learning applications in postprocessing ensemble streamflow forecasts in the Upper Susquehanna River basin in the eastern United States. Our results show that the machine learning postprocessor can improve streamflow forecasts relative to low complexity forecasts (e.g., climatological and temporal persistence) as well as standalone hydrometeorological modeling and neural network. The relative gain in forecast skill from postprocessor is generally higher at medium-range timescales compared to shorter lead times; high flows compared to low-moderate flows, and warm-season compared to cool ones. Overall, our results highlight the benefits of machine learning in many aspects for improving both the skill and reliability of streamflow forecasts.