CV · Oct 31, 2025

Object-IR: Leveraging Object Consistency and Mesh Deformation for Self-Supervised Image Retargeting

arXiv:2510.27236v1 · h-index: 3 · Has Code · Pattern Recognition
Originality: Incremental advance
AI Analysis

Object-IR addresses the challenge of preserving object appearance during image retargeting, which matters for applications such as content adaptation. The contribution is incremental because it builds on existing mesh-based methods.

The paper tackles geometric distortion in semantically important regions during image retargeting. It proposes Object-IR, a self-supervised architecture that reformulates retargeting as a mesh warping optimization problem. The method achieves state-of-the-art performance on the RetargetMe benchmark, with an average inference time of 0.009s at 1024x683 resolution.

Eliminating geometric distortion in semantically important regions remains an intractable challenge in image retargeting. This paper presents Object-IR, a self-supervised architecture that reformulates image retargeting as a learning-based mesh warping optimization problem, where the mesh deformation is guided by object appearance consistency and geometric-preserving constraints. Given an input image and a target aspect ratio, we initialize a uniform rigid mesh at the output resolution and use a convolutional neural network to predict the motion of each mesh grid and obtain the deformed mesh. The retargeted result is generated by warping the input image according to the rigid mesh in the input image and the deformed mesh in the output resolution. To mitigate geometric distortion, we design a comprehensive objective function incorporating a) object-consistent loss to ensure that the important semantic objects retain their appearance, b) geometric-preserving loss to constrain simple scale transform of the important meshes, and c) boundary loss to enforce a clean rectangular output. Notably, our self-supervised paradigm eliminates the need for manually annotated retargeting datasets by deriving supervision directly from the input's geometric and semantic properties. Extensive evaluations on the RetargetMe benchmark demonstrate that our Object-IR achieves state-of-the-art performance, outperforming existing methods in quantitative metrics and subjective visual quality assessments. The framework efficiently processes arbitrary input resolutions (average inference time: 0.009s for 1024x683 resolution) while maintaining real-time performance on consumer-grade GPUs. The source code will soon be available at https://github.com/tlliao/Object-IR.
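The mesh setup and the boundary term described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' code: the function names (`uniform_mesh`, `deform`, `boundary_loss`), the L1 form of the penalty, and the mesh size are assumptions made for illustration, and the per-vertex offsets that the paper obtains from a convolutional network are replaced here by a hand-written array.

```python
import numpy as np

def uniform_mesh(width, height, cols, rows):
    """Vertices of a regular mesh as an array of shape (rows + 1, cols + 1, 2), stored as (x, y)."""
    xs = np.linspace(0.0, width, cols + 1)
    ys = np.linspace(0.0, height, rows + 1)
    gx, gy = np.meshgrid(xs, ys)
    return np.stack([gx, gy], axis=-1)

def deform(mesh, offsets):
    """Apply per-vertex motion; in the paper the offsets come from a CNN (not modelled here)."""
    return mesh + offsets

def boundary_loss(mesh, width, height):
    """Hypothetical L1 penalty that keeps border vertices on the target rectangle."""
    left = np.abs(mesh[:, 0, 0] - 0.0)         # x of the left column
    right = np.abs(mesh[:, -1, 0] - width)     # x of the right column
    top = np.abs(mesh[0, :, 1] - 0.0)          # y of the top row
    bottom = np.abs(mesh[-1, :, 1] - height)   # y of the bottom row
    return float(left.mean() + right.mean() + top.mean() + bottom.mean())

# Example: a 4x2-cell mesh at an (assumed) 8x4 output size.
mesh = uniform_mesh(8, 4, cols=4, rows=2)
offsets = np.zeros_like(mesh)
offsets[:, 0, 0] = 1.0                         # push the left border inward by one pixel
print(boundary_loss(mesh, 8, 4))               # undeformed mesh lies on the rectangle
print(boundary_loss(deform(mesh, offsets), 8, 4))
```

In this sketch the undeformed mesh incurs no penalty, while moving a border column off the rectangle edge raises the loss. The object-consistent and geometric-preserving terms are omitted because the abstract does not give their form.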
