Context-Aware Input Orchestration for Video Inpainting
This work addresses video inpainting for mobile applications, but it appears incremental as it builds on existing methods by modifying input data composition.
The paper tackles the challenge of achieving high-quality video inpainting on mobile devices with limited processing power and memory. It optimizes memory usage by dynamically adjusting the composition of the input frames based on optical flow and mask changes, which improves quality across a range of content, including videos with rapid visual changes.
Traditional neural network-driven inpainting methods struggle to deliver high-quality results within the processing power and memory constraints of mobile devices. Our research introduces an approach that optimizes memory usage by altering the composition of the input data. Typically, video inpainting relies on a predetermined set of input frames, made up of neighboring and reference frames and often limited to five frames in total. We examine how varying the proportion of these two frame types affects the quality of the inpainted video. By dynamically adjusting the input frame composition based on optical flow and changes in the mask, we observe improved quality across a range of content, including videos with rapid changes in visual context.
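To make the idea of dynamic input composition concrete, the following is a minimal sketch, not the authors' implementation. It splits a fixed five-frame budget between neighboring and reference frames using two signals named in the abstract: optical flow magnitude and the change in the mask between consecutive frames. The function name `choose_frame_split`, the constant `motion_scale`, the use of 1 - IoU as the mask-change measure, and the rule that more change favors neighboring frames are all assumptions made for illustration; the abstract does not specify the mapping or its direction.

```python
import math


def choose_frame_split(flow, mask_prev, mask_curr, budget=5, motion_scale=8.0):
    """Split a fixed frame budget into (n_neighbor, n_reference).

    flow: iterable of (dx, dy) per-pixel displacements between two frames.
    mask_prev, mask_curr: equal-length sequences of booleans (flattened masks).
    Hypothetical heuristic: the split rule is assumed, not taken from the paper.
    """
    if budget < 2:
        raise ValueError("budget must allow at least one frame of each type")

    # Motion signal: mean flow magnitude, squashed into [0, 1).
    magnitudes = [math.hypot(dx, dy) for dx, dy in flow]
    mean_motion = sum(magnitudes) / len(magnitudes) if magnitudes else 0.0
    motion_score = mean_motion / (mean_motion + motion_scale)

    # Mask signal: 1 - IoU of consecutive masks (0 when the mask is unchanged).
    pairs = list(zip(mask_prev, mask_curr))
    union = sum(1 for a, b in pairs if a or b)
    inter = sum(1 for a, b in pairs if a and b)
    mask_score = 0.0 if union == 0 else 1.0 - inter / union

    # Assumed rule: the stronger signal decides how many slots go to
    # neighboring frames, while keeping at least one frame of each type.
    change = max(motion_score, mask_score)
    n_neighbor = 1 + round(change * (budget - 2))
    return n_neighbor, budget - n_neighbor
```

Under these assumptions, a static scene with an unchanged mask yields one neighboring frame and four reference frames, while a scene with large flow and a fully displaced mask yields four neighboring frames and one reference frame. The total stays at the fixed budget in every case, so the memory footprint of the input does not grow. In practice the flow and masks would be arrays produced by an optical flow estimator and a segmentation step; plain Python sequences are used here only to keep the sketch dependency-free.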