CV, NE | Sep 21, 2023

Video Scene Location Recognition with Neural Networks

arXiv:2309.11928v1 | h-index: 22
Originality: Synthesis-oriented
AI Analysis

This paper addresses a domain-specific problem in video analysis, but it is incremental: it applies existing methods to a new dataset.

The paper tackles scene location recognition in videos from TV series using neural networks, testing various frame combination methods on a dataset from The Big Bang Theory and finding that only some approaches are suitable.

This paper provides insight into the possibility of scene recognition from a video sequence with a small set of repeated shooting locations (such as in television series) using artificial neural networks. The basic idea of the presented approach is to select a set of frames from each scene, transform them with a pre-trained single-image pre-processing convolutional network, and classify the scene location with subsequent layers of the neural network. The considered networks have been tested and compared on a dataset obtained from The Big Bang Theory television series. We have investigated different neural network layers to combine individual frames, particularly AveragePooling, MaxPooling, Product, Flatten, LSTM, and Bidirectional LSTM layers. We have observed that only some of the approaches are suitable for the task at hand.
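
To make the frame-combination step concrete, here is a minimal sketch of the four weight-free combiners named in the abstract (average pooling, max pooling, product, flatten), applied to per-frame feature vectors. It is an illustration, not the authors' implementation: the function name `combine_frames` and the toy feature values are assumptions, the pre-trained convolutional network is replaced by hand-written vectors, and the LSTM and Bidirectional LSTM variants are left out because they have trainable weights.

```python
from math import prod


def combine_frames(features, method):
    """Combine per-frame feature vectors into a single scene-level vector.

    `features` is a list of equal-length lists, one per selected frame.
    `combine_frames` is a hypothetical helper written for this illustration.
    """
    if not features:
        raise ValueError("need at least one frame")
    # columns[j] holds feature j across all frames
    columns = list(zip(*features))
    if method == "average":
        return [sum(c) / len(c) for c in columns]
    if method == "max":
        return [max(c) for c in columns]
    if method == "product":
        return [prod(c) for c in columns]
    if method == "flatten":
        # Concatenate frames in order; output length grows with frame count
        return [x for frame in features for x in frame]
    raise ValueError(f"unknown method: {method}")


# Toy stand-in for CNN outputs: 3 frames, 2 features each (assumed values)
frames = [[1.0, 4.0], [2.0, 0.5], [3.0, 2.0]]
pooled_max = combine_frames(frames, "max")
pooled_product = combine_frames(frames, "product")
flattened = combine_frames(frames, "flatten")
```

Average, max, and product ignore the order of the frames and keep the output size fixed, whereas flatten preserves frame order at the cost of an output that scales with the number of frames.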

