Complex Sequential Understanding through the Awareness of Spatial and Temporal Concepts
This work addresses a limitation of current neural networks, which struggle to handle large-scale spatial representations over long-range sequences, and offers a novel method for sequential understanding in AI.
The paper tackles the problem of understanding sequential information by introducing a Semi-Coupled Structure (SCS) that decouples spatial and temporal concept learning, resulting in improved performance for tasks like object annotation, video action recognition, and meteorological radar echo prediction.
Understanding sequential information is a fundamental task for artificial intelligence. Current neural networks attempt to learn spatial and temporal information as a whole, which limits their ability to capture large-scale spatial representations over long-range sequences. Here, we introduce a new modeling strategy called Semi-Coupled Structure (SCS), which consists of deep neural networks that decouple the learning of complex spatial and temporal concepts. A Semi-Coupled Structure can learn to implicitly separate input information into independent parts and process each part separately. Experiments demonstrate that a Semi-Coupled Structure can successfully annotate the outline of an object in images sequentially and perform video action recognition. For sequence-to-sequence problems, a Semi-Coupled Structure can predict future meteorological radar echo images based on observed images. Taken together, our results demonstrate that a Semi-Coupled Structure has the capacity to improve the performance of LSTM-like models on large-scale sequential tasks.
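To make the idea of processing spatial and temporal information in separate parts more concrete, the following is a minimal toy sketch. It is not the SCS architecture itself, whose details are not given here. It assumes a hypothetical two-branch cell in which one branch sees only the current frame (spatial) and the other carries a recurrent state across frames (temporal). The names `ToySemiCoupledCell` and `run_sequence`, the plain tanh recurrence standing in for an LSTM-like unit, and the choice to concatenate the two branch outputs are all illustrative assumptions.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def rand_matrix(rows, cols, rng, scale=0.1):
    """Small random weight matrix; the initialization scheme is an assumption."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class ToySemiCoupledCell:
    """Hypothetical two-branch cell for illustration only (not the paper's model)."""

    def __init__(self, in_dim, hid_dim, seed=0):
        rng = random.Random(seed)
        self.hid_dim = hid_dim
        self.W_s = rand_matrix(hid_dim, in_dim, rng)   # spatial branch weights
        self.W_x = rand_matrix(hid_dim, in_dim, rng)   # temporal branch, input weights
        self.W_h = rand_matrix(hid_dim, hid_dim, rng)  # temporal branch, recurrent weights

    def step(self, x, h):
        # Spatial branch: depends only on the current frame x.
        s = [math.tanh(v) for v in matvec(self.W_s, x)]
        # Temporal branch: plain tanh recurrence over the previous state h.
        h_new = [math.tanh(a + b)
                 for a, b in zip(matvec(self.W_x, x), matvec(self.W_h, h))]
        # Combine the two parts by concatenation (an assumed design choice).
        return s + h_new, h_new


def run_sequence(cell, frames):
    """Run the cell over a list of flattened frames, starting from a zero state."""
    h = [0.0] * cell.hid_dim
    outputs = []
    for x in frames:
        out, h = cell.step(x, h)
        outputs.append(out)
    return outputs


if __name__ == "__main__":
    cell = ToySemiCoupledCell(in_dim=3, hid_dim=4)
    frames = [[1.0, 0.0, 0.5], [1.0, 0.0, 0.5]]  # two identical toy frames
    for t, out in enumerate(run_sequence(cell, frames)):
        print(t, [round(v, 4) for v in out])
```

In this sketch, feeding the same frame twice gives identical values in the spatial half of the output, while the temporal half changes because it depends on the state carried over from the previous step.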