SeamEdit: A Black-Box VLM-Agnostic Pipeline for Large-Image Semantic Editing

arXiv:2606.13041v15.5

Predicted impact: top 86% in CV · last 90 days
Originality: Synthesis-oriented

AI Analysis

For practitioners needing to edit large images with closed-source VLMs, SeamEdit provides a practical solution to common failure modes, though it is an incremental engineering contribution.

SeamEdit introduces a training-free, model-agnostic pipeline for semantic editing of large images using black-box VLMs, addressing failure modes like seam artifacts and alignment drift. The method reduces seam visibility and supports editing of arbitrary tile regions, but no quantitative performance numbers are reported.

Semantic region editing for large images must satisfy two requirements at the same time: high generative quality and natural integration with surrounding content. Some related methods rely on white-box models and leave the strong generation capability of closed-source models underexplored. Directly applying closed-source models to tiled editing, however, introduces several failure modes: semantic deformation, canvas-level alignment drift, and visible seam artifacts. This paper presents SeamEdit, a training-free and model-agnostic pipeline that treats any VLM with inpainting capability as a black-box oracle. SeamEdit mitigates these issues through a five-stage post-hoc pipeline: overlay-based tile decomposition, black-box VLM inpainting, geometric and color-consistency correction, seam-risk-based multi-candidate ranking, and dynamic-programming curved seam fusion. The pipeline reduces seam visibility and supports semantic modification of arbitrary tile regions.
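
The abstract does not give the cost function or the seam constraints used in the final fusion stage, so the following is only a generic sketch of dynamic-programming seam selection, in the style of seam carving and image quilting, and not SeamEdit's implementation. It assumes a vertical overlap strip between two tiles, a precomputed per-pixel cost (for example, the absolute difference between the two tiles), and a seam that may shift by at most one column per row. The names `min_cost_seam` and `fuse_along_seam` are hypothetical.

```python
def min_cost_seam(cost):
    """Return one column index per row tracing the cheapest top-to-bottom seam.

    cost: list of equal-length rows of non-negative numbers, e.g. the per-pixel
    absolute difference between two tiles inside their overlap strip (assumed).
    """
    rows, cols = len(cost), len(cost[0])
    # acc[r][c] = cheapest total cost of any seam that ends at (r, c)
    acc = [list(cost[0])]
    # back[r - 1][c] = column used in row r - 1 by the best seam ending at (r, c)
    back = []
    for r in range(1, rows):
        prev = acc[-1]
        row_acc, row_back = [], []
        for c in range(cols):
            # Assumed constraint: the seam moves at most one column per row.
            candidates = range(max(0, c - 1), min(cols, c + 2))
            best = min(candidates, key=lambda k: prev[k])
            row_acc.append(cost[r][c] + prev[best])
            row_back.append(best)
        acc.append(row_acc)
        back.append(row_back)

    # Backtrack from the cheapest cell in the last row.
    c = min(range(cols), key=lambda k: acc[-1][k])
    seam = [c]
    for r in range(rows - 1, 0, -1):
        c = back[r - 1][c]
        seam.append(c)
    seam.reverse()
    return seam


def fuse_along_seam(left, right, seam):
    """Per row, keep `left` pixels before the seam column and `right` from it on."""
    return [l_row[:s] + r_row[s:] for l_row, r_row, s in zip(left, right, seam)]
```

Because the seam follows low-cost pixels rather than a straight line, the cut tends to pass where the two tiles already agree, which is the usual motivation for a curved seam over a fixed vertical boundary.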
