Reference-Based Video Colorization with Spatiotemporal Correspondence
This work provides an incremental improvement in video colorization quality for users who want to colorize grayscale videos with a reference image.
This paper tackles color leakage and the emergence of averaged colors in reference-based video colorization by restricting color warping to regions that have temporal correspondence with the reference frame. It does so by propagating masks with both off-the-shelf instance tracking and a novel dense tracking method, which yields more faithful color propagation.
We propose a novel reference-based video colorization framework with spatiotemporal correspondence. Reference-based methods colorize grayscale frames by referring to a user-supplied color frame. Existing methods suffer from color leakage between objects and the emergence of average colors, both of which stem from non-local semantic correspondence in space. To address this issue, we warp colors only from regions of the reference frame that are restricted by correspondence in time. We propagate masks as temporal correspondences using two complementary tracking approaches: off-the-shelf instance tracking for high-performance segmentation, and a newly proposed dense tracking method that tracks various types of objects. By restricting color referencing to temporally related regions, our approach propagates faithful colors throughout the video. Experiments demonstrate that our method outperforms state-of-the-art methods both quantitatively and qualitatively.
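To make the idea of restricting color warping concrete, the following is a minimal sketch, not the authors' implementation. It assumes that colors are warped by softmax attention over cosine similarities between per-pixel features, and that the propagated masks are available as integer region ids for each pixel of the target and reference frames. The function name `masked_color_warp`, the `temperature` parameter, and the flattened array layout are all hypothetical choices made for illustration.

```python
import numpy as np

def masked_color_warp(feat_tgt, feat_ref, color_ref,
                      region_tgt, region_ref, temperature=0.01):
    """Warp reference colors to target pixels, restricted to matching regions.

    feat_tgt:   (N_t, d) target-frame features (hypothetical layout)
    feat_ref:   (N_r, d) reference-frame features
    color_ref:  (N_r, c) reference colors, e.g. the ab channels of Lab
    region_tgt: (N_t,) region id per target pixel (from a propagated mask)
    region_ref: (N_r,) region id per reference pixel
    Returns the warped colors (N_t, c) and a boolean array marking target
    pixels that had at least one reference pixel in the same region.
    """
    feat_tgt = np.asarray(feat_tgt, dtype=float)
    feat_ref = np.asarray(feat_ref, dtype=float)
    color_ref = np.asarray(color_ref, dtype=float)
    region_tgt = np.asarray(region_tgt)
    region_ref = np.asarray(region_ref)

    # Cosine similarity between every target pixel and every reference pixel.
    ft = feat_tgt / (np.linalg.norm(feat_tgt, axis=1, keepdims=True) + 1e-8)
    fr = feat_ref / (np.linalg.norm(feat_ref, axis=1, keepdims=True) + 1e-8)
    sim = (ft @ fr.T) / temperature                      # (N_t, N_r)

    # Only pairs that share a region id are allowed to exchange color.
    allowed = region_tgt[:, None] == region_ref[None, :]
    has_match = allowed.any(axis=1)

    # Masked, numerically stable softmax; disallowed pairs get zero weight.
    logits = np.where(allowed, sim, -np.inf)
    row_max = np.where(has_match, logits.max(axis=1), 0.0)[:, None]
    weights = np.exp(logits - row_max)
    denom = weights.sum(axis=1, keepdims=True)
    weights = weights / np.where(denom > 0, denom, 1.0)

    # Pixels without any allowed reference pixel receive all-zero color.
    return weights @ color_ref, has_match
```

In this sketch, two reference pixels with identical features but different colors produce an averaged color at the target when every pixel shares one region id, which mirrors the average-color problem described above. Once the region ids differ, each target pixel takes color only from its own region. How target pixels without any matching region should be colored is left open here.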