Anthony Yang

h-index125
2papers

2 Papers

50.8CVApr 3
Banana100: Breaking NR-IQA Metrics by 100 Iterative Image Replications with Nano Banana Pro

Kenan Tang, Praveen Arunshankar, Andong Hua et al.

The multi-step, iterative image editing capabilities of multi-modal agentic systems have transformed digital content creation. Although latest image editing models faithfully follow instructions and generate high-quality images in single-turn edits, we identify a critical weakness in multi-turn editing, which is the iterative degradation of image quality. As images are repeatedly edited, minor artifacts accumulate, rapidly leading to a severe accumulation of visible noise and a failure to follow simple editing instructions. To systematically study these failures, we introduce Banana100, a comprehensive dataset of 28,000 degraded images generated through 100 iterative editing steps, including diverse textures and image content. Alarmingly, image quality evaluators fail to detect the degradation. Among 21 popular no-reference image quality assessment (NR-IQA) metrics, none of them consistently assign lower scores to heavily degraded images than to clean ones. The dual failures of generators and evaluators may threaten the stability of future model training and the safety of deployed agentic systems, if the low-quality synthetic data generated by multi-turn edits escape quality filters. We release the full code and data to facilitate the development of more robust models, helping to mitigate the fragility of multi-modal agentic systems.

IVFeb 1, 2024
MRAnnotator: multi-Anatomy and many-Sequence MRI segmentation of 44 structures

Alexander Zhou, Zelong Liu, Andrew Tieu et al.

In this retrospective study, we annotated 44 structures on two datasets: an internal dataset of 1,518 MRI sequences from 843 patients at the Mount Sinai Health System, and an external dataset of 397 MRI sequences from 263 patients for benchmarking. The internal dataset trained the nnU-Net model MRAnnotator, which demonstrated strong generalizability on the external dataset. MRAnnotator outperformed existing models such as TotalSegmentator MRI and MRSegmentator on both datasets, achieving an overall average Dice score of 0.878 on the internal dataset and 0.875 on the external set. Model weights are available on GitHub, and the external test set can be shared upon request.