CV NE Sep 21, 2023

Video Scene Location Recognition with Neural Networks

Lukáš Korel, Petr Pulc, Jiří Tumpach, Martin Holeňa

arXiv:2309.11928v1 1.5 h-index: 22

Originality Synthesis-oriented

AI Analysis

This paper addresses a domain-specific problem in video analysis, but it is incremental because it applies existing methods to a new dataset.

The paper tackles scene location recognition in videos from TV series using neural networks, testing various frame combination methods on a dataset from The Big Bang Theory and finding that only some approaches are suitable.

This paper provides an insight into the possibility of scene recognition from a video sequence with a small set of repeated shooting locations (such as in television series) using artificial neural networks. The basic idea of the presented approach is to select a set of frames from each scene, transform them by a pre-trained single-image pre-processing convolutional network, and classify the scene location with subsequent layers of the neural network. The considered networks have been tested and compared on a dataset obtained from The Big Bang Theory television series. We have investigated different neural network layers to combine individual frames, particularly AveragePooling, MaxPooling, Product, Flatten, LSTM, and Bidirectional LSTM layers. We have observed that only some of the approaches are suitable for the task at hand.
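
The non-recurrent frame-combination layers named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each selected frame has already been turned into a fixed-length feature vector by the pre-trained network, and the function names and feature values below are hypothetical. The LSTM and Bidirectional LSTM variants are left out because they depend on learned weights.

```python
from math import prod
from typing import Callable, Dict, List

# One feature vector per selected frame (assumed output of the pre-trained CNN).
Features = List[List[float]]


def average_pool(frames: Features) -> List[float]:
    """Element-wise mean across frames."""
    return [sum(col) / len(col) for col in zip(*frames)]


def max_pool(frames: Features) -> List[float]:
    """Element-wise maximum across frames."""
    return [max(col) for col in zip(*frames)]


def product_pool(frames: Features) -> List[float]:
    """Element-wise product across frames."""
    return [prod(col) for col in zip(*frames)]


def flatten(frames: Features) -> List[float]:
    """Concatenate all frame vectors, preserving frame order."""
    return [value for frame in frames for value in frame]


# Hypothetical registry of combiners, keyed by an illustrative name.
COMBINERS: Dict[str, Callable[[Features], List[float]]] = {
    "average": average_pool,
    "max": max_pool,
    "product": product_pool,
    "flatten": flatten,
}

# Hypothetical features: 3 frames from one scene, 2 features per frame.
frames = [[1.0, 2.0], [3.0, 0.5], [2.0, 4.0]]
combined = {name: combine(frames) for name, combine in COMBINERS.items()}
```

The pooling and product combiners return a vector of the same length as a single frame's features, whereas flattening returns a vector whose length grows with the number of frames. In every case the combined vector would then be passed to the classification layers that predict the scene location.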
