CV · Feb 3, 2024 · Code
ScribFormer: Transformer Makes CNN Work Better for Scribble-based Medical Image Segmentation
Zihan Li, Yuan Zheng, Dandan Shan et al. · uw
Most recent scribble-supervised segmentation methods adopt a CNN framework with an encoder-decoder architecture. Despite its benefits, this framework generally captures only short-range feature dependencies, because convolutional layers have local receptive fields, which makes it difficult to learn global shape information from the limited information provided by scribble annotations. To address this issue, this paper proposes ScribFormer, a new CNN-Transformer hybrid solution for scribble-supervised medical image segmentation. The ScribFormer model has a triple-branch structure: a CNN branch, a Transformer branch, and an attention-guided class activation map (ACAM) branch. Specifically, the CNN branch collaborates with the Transformer branch to fuse the local features learned by the CNN with the global representations obtained from the Transformer, which can effectively overcome the limitations of existing scribble-supervised segmentation methods. Furthermore, the ACAM branch helps unify the shallow and deep convolution features to further improve the model's performance. Extensive experiments on two public datasets and one private dataset show that ScribFormer outperforms state-of-the-art scribble-supervised segmentation methods and achieves even better results than fully-supervised segmentation methods. The code is released at https://github.com/HUANGLIZI/ScribFormer.
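To make the "limited information provided by scribble annotations" concrete, here is a minimal sketch of a partial cross-entropy, a common way to train on scribbles by scoring only the annotated pixels. Whether ScribFormer uses exactly this form is not stated in the abstract; the function name, the `IGNORE` label, and the toy data are assumptions for illustration.

```python
import math

IGNORE = -1  # hypothetical label marking pixels that carry no scribble


def partial_cross_entropy(probs, labels, ignore=IGNORE):
    """Mean cross-entropy over scribble-labelled pixels only.

    probs:  per-pixel lists of class probabilities (each sums to 1)
    labels: per-pixel class indices, or `ignore` where unlabelled
    """
    total, count = 0.0, 0
    for p, y in zip(probs, labels):
        if y == ignore:
            continue  # unlabelled pixels contribute nothing to the loss
        total += -math.log(max(p[y], 1e-12))  # clamp to avoid log(0)
        count += 1
    return total / count if count else 0.0


# Four pixels, two classes; only the first two pixels are scribbled.
probs = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5], [0.6, 0.4]]
labels = [0, 1, IGNORE, IGNORE]
loss = partial_cross_entropy(probs, labels)
```

Because unlabelled pixels are skipped, the loss says nothing about most of the image, which is the gap that global context from a Transformer branch is meant to help fill.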
LG · Jan 6, 2025
Skillful High-Resolution Ensemble Precipitation Forecasting with an Integrated Deep Learning Framework
Shuangshuang He, Hongli Liang, Yuanting Zhang et al.
High-resolution precipitation forecasts are crucial for providing accurate weather prediction and supporting effective responses to extreme weather events. Traditional numerical models struggle with stochastic subgrid-scale processes, while recent deep learning models often produce blurry results. To address these challenges, we propose a physics-inspired deep learning framework for high-resolution (0.05° × 0.05°) ensemble precipitation forecasting. Trained on ERA5 and CMPA high-resolution precipitation datasets, the framework integrates deterministic and probabilistic components. The deterministic model, based on a 3D SwinTransformer, captures average precipitation at mesoscale resolution and incorporates strategies to enhance performance, particularly for moderate to heavy rainfall. The probabilistic model employs conditional diffusion in latent space to account for uncertainties in residual precipitation at convective scales. During inference, ensemble members are generated by repeatedly sampling latent variables, enabling the model to represent precipitation uncertainty. Our model significantly enhances spatial resolution and forecast accuracy. The rank histogram shows that the ensemble system is reliable and unbiased. In a case study of heavy precipitation in southern China, the model outputs align more closely with observed precipitation distributions than ERA5, demonstrating superior capability in capturing extreme precipitation events. Additionally, 5-day real-time forecasts show good performance in terms of CSI scores.
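The two verification tools named above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' evaluation code: the function names, the 10 mm threshold, and the toy values are hypothetical, and ties between members and observations (common for zero precipitation) are not randomised here as a careful evaluation would do.

```python
def csi(forecast, observed, threshold):
    """Critical Success Index: hits / (hits + misses + false alarms)."""
    hits = misses = false_alarms = 0
    for f, o in zip(forecast, observed):
        f_event, o_event = f >= threshold, o >= threshold
        if f_event and o_event:
            hits += 1
        elif o_event:
            misses += 1
        elif f_event:
            false_alarms += 1
    denom = hits + misses + false_alarms
    return hits / denom if denom else float("nan")


def rank_histogram(ensembles, observations):
    """Count where each observation ranks among its ensemble members.

    ensembles: one list of member values per forecast case
    Returns a list of len(members) + 1 bin counts.
    """
    n_members = len(ensembles[0])
    counts = [0] * (n_members + 1)
    for members, obs in zip(ensembles, observations):
        rank = sum(1 for m in members if m < obs)  # members below the obs
        counts[rank] += 1
    return counts


# Toy precipitation values in mm (illustrative only).
score = csi([0.0, 5.0, 12.0, 30.0, 1.0], [0.0, 11.0, 15.0, 2.0, 0.5], 10.0)
hist = rank_histogram([[1, 2, 3], [4, 5, 6], [0, 2, 9]], [0.5, 7, 3])
```

A roughly flat rank histogram over many cases is what "reliable and unbiased" refers to: a U shape suggests too little ensemble spread, and a sloped histogram suggests bias.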
LG · Nov 20, 2025
Physics-Guided Inductive Spatiotemporal Kriging for PM2.5 with Satellite Gradient Constraints
Shuo Wang, Mengfan Teng, Yun Cheng et al.
High-resolution mapping of fine particulate matter (PM2.5) is a cornerstone of sustainable urbanism but remains critically hindered by the spatial sparsity of ground monitoring networks. While traditional data-driven methods attempt to bridge this gap using satellite Aerosol Optical Depth (AOD), they often suffer from severe, non-random data missingness (e.g., due to cloud cover or nighttime) and inversion biases. To overcome these limitations, this study proposes the Spatiotemporal Physics-Guided Inference Network (SPIN), a novel framework designed for inductive spatiotemporal kriging. Unlike conventional approaches, SPIN synergistically integrates domain knowledge into deep learning by explicitly modeling physical advection and diffusion processes via parallel graph kernels. Crucially, we introduce a paradigm-shifting training strategy: rather than using error-prone AOD as a direct input, we repurpose it as a spatial gradient constraint within the loss function. This allows the model to learn structural pollution patterns from satellite data while remaining robust to data voids. Validated in the highly polluted Beijing-Tianjin-Hebei and Surrounding Areas (BTHSA), SPIN achieves a new state-of-the-art with a Mean Absolute Error (MAE) of 9.52 µg/m³, effectively generating continuous, physically plausible pollution fields even in unmonitored areas. This work provides a robust, low-cost, and all-weather solution for fine-grained environmental management.
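One way to read "AOD as a spatial gradient constraint" is a loss term that compares neighbour-to-neighbour differences of the predicted field with those of the AOD field, skipping any pair that touches a missing AOD cell. The sketch below follows that reading only; SPIN's actual loss is not given in the abstract, and the function name, the use of `None` for data voids, and the assumption that both grids are already scaled to comparable units are all hypothetical.

```python
def gradient_constraint_loss(pred, aod):
    """Mean squared mismatch between finite-difference gradients.

    pred: 2-D list of predicted values (rows x cols), assumed normalised
    aod:  2-D list of the same shape; cells are None where AOD is missing
    """
    rows, cols = len(pred), len(pred[0])
    total, count = 0.0, 0
    for i in range(rows):
        for j in range(cols):
            for di, dj in ((0, 1), (1, 0)):  # east and south neighbours
                ni, nj = i + di, j + dj
                if ni >= rows or nj >= cols:
                    continue
                if aod[i][j] is None or aod[ni][nj] is None:
                    continue  # data void: this pair adds no constraint
                d_pred = pred[ni][nj] - pred[i][j]
                d_aod = aod[ni][nj] - aod[i][j]
                total += (d_pred - d_aod) ** 2
                count += 1
    return total / count if count else 0.0


aod = [[0.0, 1.0], [None, 3.0]]        # one cloud-covered cell
pred_ok = [[5.0, 6.0], [7.0, 8.0]]     # same gradients, different offset
pred_bad = [[5.0, 6.0], [7.0, 10.0]]   # steeper gradient in one pair
loss_ok = gradient_constraint_loss(pred_ok, aod)
loss_bad = gradient_constraint_loss(pred_bad, aod)
```

Because only differences are compared, a constant offset in AOD does not affect the term, and missing cells simply reduce the number of constrained pairs rather than corrupting an input.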
LG · Sep 18, 2025
FlowCast-ODE: Continuous Hourly Weather Forecasting with Dynamic Flow Matching and ODE Solver
Shuangshuang He, Yuanting Zhang, Hongli Liang et al.
Data-driven hourly weather forecasting models often face the challenge of error accumulation in long-term predictions. The problem is exacerbated by non-physical temporal discontinuities present in widely-used training datasets such as ECMWF Reanalysis v5 (ERA5), which stem from its 12-hour assimilation cycle. Such artifacts lead hourly autoregressive models to learn spurious dynamics and rapidly accumulate errors. To address this, we introduce FlowCast-ODE, a novel framework that treats atmospheric evolution as a continuous flow to ensure temporal coherence. Our method employs dynamic flow matching to learn the instantaneous velocity field from data and an ordinary differential equation (ODE) solver to generate smooth and temporally continuous hourly predictions. By pre-training on 6-hour intervals to sidestep data discontinuities and fine-tuning on hourly data, FlowCast-ODE produces seamless forecasts for up to 120 hours with a single lightweight model. It achieves competitive or superior skill on key meteorological variables compared to baseline models, preserves fine-grained spatial details, and demonstrates strong performance in forecasting extreme events, such as tropical cyclone tracks.
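The inference side of this idea, integrating a velocity field with an ODE solver and reading off the state every hour, can be sketched as follows. This is a toy under stated assumptions: `velocity` is a hand-written stand-in for the learned network, the classical fourth-order Runge-Kutta scheme and the `substeps` parameter are illustrative choices, and nothing here reproduces the authors' flow matching training.

```python
import math


def velocity(state, t):
    # Hypothetical stand-in for a learned velocity field: relax toward zero.
    return [-0.1 * x for x in state]


def rk4_step(f, state, t, dt):
    """One classical fourth-order Runge-Kutta step for d(state)/dt = f."""
    k1 = f(state, t)
    k2 = f([s + 0.5 * dt * k for s, k in zip(state, k1)], t + 0.5 * dt)
    k3 = f([s + 0.5 * dt * k for s, k in zip(state, k2)], t + 0.5 * dt)
    k4 = f([s + dt * k for s, k in zip(state, k3)], t + dt)
    return [
        s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    ]


def hourly_rollout(f, state0, hours, substeps=4):
    """Integrate forward and record the state at every whole hour."""
    states = [list(state0)]
    state, t = list(state0), 0.0
    dt = 1.0 / substeps  # time is measured in hours
    for _ in range(hours):
        for _ in range(substeps):
            state = rk4_step(f, state, t, dt)
            t += dt
        states.append(state)
    return states


forecast = hourly_rollout(velocity, [1.0, 2.0], hours=120)
```

Since every hourly state comes from the same continuous trajectory, there is no separate model call per lead time whose errors could compound at the seams, which is the temporal coherence the abstract describes.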
LG · Mar 16, 2025
CNCast: Leveraging 3D Swin Transformer and DiT for Enhanced Regional Weather Forecasting
Hongli Liang, Yuanting Zhang, Qingye Meng et al.
This study introduces a cutting-edge regional weather forecasting model based on the SwinTransformer 3D architecture. This model is specifically designed to deliver precise hourly weather predictions ranging from 1 hour to 5 days, significantly improving the reliability and practicality of short-term weather forecasts. Our model has demonstrated generally superior performance when compared to Pangu, a well-established global model. The evaluation indicates that our model excels in predicting most weather variables, highlighting its potential as a more effective alternative in the field of limited area modeling. A noteworthy feature of this model is the integration of enhanced boundary conditions, inspired by traditional numerical weather prediction (NWP) techniques. This integration has substantially improved the model's predictive accuracy. Additionally, the model includes an innovative approach for diagnosing hourly total precipitation at a high spatial resolution of approximately 5 kilometers. This is achieved through a latent diffusion model, offering an alternative method for generating high-resolution precipitation data.