Towards Data-Driven Automatic Video Editing
This paper addresses the problem of automating video editing for content creators, but the contribution appears incremental, since it builds on existing methods such as ImageNet-trained networks and imitation learning.
The paper tackles automatic video editing, that is, selecting valuable footage and cutting it into a coherent story. It takes a data-driven approach in which a convolutional neural network extracts features and an imitation learning algorithm trains the editing controller. The resulting controller shows signs of having learned basic cinematography rules from a corpus of masterpieces.
Automatic video editing involves at least two steps: selecting the most valuable footage in terms of visual quality and the importance of the filmed action, and cutting that footage into a brief, coherent visual story that is interesting to watch. Both steps are implemented in a purely data-driven manner. Visual semantic and aesthetic features are extracted by an ImageNet-trained convolutional neural network, and the editing controller is trained by an imitation learning algorithm. As a result, at test time the controller shows signs of observing basic cinematography editing rules learned from a corpus of motion picture masterpieces.
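
To make the idea of training an editing controller by imitation more concrete, the sketch below uses behavioral cloning, the simplest form of imitation learning. The abstract does not say which imitation learning algorithm is used, so this is an illustrative stand-in and not the paper's method. The two-dimensional feature vectors, the keep/skip action set, and the names DEMONSTRATIONS, train_controller, and act are all hypothetical; in the described system the features would come from the ImageNet-trained network.

```python
import math

# Hypothetical expert demonstrations (assumption, not data from the paper).
# Each entry is ((visual_quality, action_importance), expert_action),
# where expert_action is 1 = keep the shot, 0 = skip it.
DEMONSTRATIONS = [
    ((0.9, 0.8), 1),
    ((0.8, 0.9), 1),
    ((0.7, 0.7), 1),
    ((0.2, 0.1), 0),
    ((0.1, 0.3), 0),
    ((0.3, 0.2), 0),
]


def sigmoid(z):
    """Logistic function mapping a score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def train_controller(demos, lr=1.0, epochs=5000):
    """Behavioral cloning: fit a logistic-regression policy that
    reproduces the expert's keep/skip decisions via gradient descent."""
    w = [0.0, 0.0]
    b = 0.0
    n = len(demos)
    for _ in range(epochs):
        grad_w = [0.0, 0.0]
        grad_b = 0.0
        for x, y in demos:
            p = sigmoid(w[0] * x[0] + w[1] * x[1] + b)
            err = p - y  # gradient of the log loss w.r.t. the score
            grad_w[0] += err * x[0]
            grad_w[1] += err * x[1]
            grad_b += err
        w[0] -= lr * grad_w[0] / n
        w[1] -= lr * grad_w[1] / n
        b -= lr * grad_b / n
    return w, b


def act(controller, features):
    """Return 1 (keep) or 0 (skip) for one shot's feature vector."""
    w, b = controller
    score = w[0] * features[0] + w[1] * features[1] + b
    return 1 if score > 0 else 0


controller = train_controller(DEMONSTRATIONS)
```

At test time the cloned controller is applied to shots it has not seen, for example act(controller, (0.85, 0.85)) for a sharp, eventful shot. A real system would replace the toy features with high-dimensional network activations and would likely need a sequential policy, since cutting decisions depend on the shots already chosen.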