Analysis of Latent-Space Motion for Collaborative Intelligence
This work offers insight into how motion propagates through DNNs, which could be useful to researchers working on collaborative intelligence applications that involve video compression or analysis.
This paper investigates the relationship between input video motion and the motion observed in the intermediate feature tensors of a deep neural network. The authors demonstrate that motion in each feature tensor channel is approximately a scaled version of the input motion, validating this finding through experiments with common motion models.
When the input to a deep neural network (DNN) is a video signal, a sequence of feature tensors is produced at the intermediate layers of the model. If neighboring frames of the input video are related through motion, a natural question is, "what is the relationship between the corresponding feature tensors?" By analyzing the effect of common DNN operations on optical flow, we show that the motion present in each channel of a feature tensor is approximately equal to a scaled version of the input motion. The analysis is validated through experiments utilizing common motion models.
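The scaling claim can be illustrated with a minimal sketch. It is not the authors' experimental setup. It assumes a single strided operation (2x2 average pooling with stride 2), a purely translational motion model with circular wrap-around, and a phase-correlation shift estimator. The helper names `avg_pool` and `estimate_shift` are hypothetical and chosen only for this illustration. Under these assumptions the shift measured after pooling is the input shift divided by the stride. For shifts that are not multiples of the stride, or for nonlinear layers, the relationship would hold only approximately.

```python
import numpy as np


def avg_pool(x, s):
    """Non-overlapping s x s average pooling (stride s) of a 2-D array."""
    h, w = x.shape
    x = x[: h - h % s, : w - w % s]  # crop so both sides divide evenly by s
    return x.reshape(h // s, s, w // s, s).mean(axis=(1, 3))


def estimate_shift(a, b):
    """Estimate the integer circular shift (dy, dx) that maps a to b
    using phase correlation."""
    cross = np.fft.fft2(b) * np.conj(np.fft.fft2(a))
    cross /= np.abs(cross) + 1e-12  # keep only the phase
    corr = np.fft.ifft2(cross).real
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    h, w = corr.shape
    # Map peak indices in the upper half of each axis to negative shifts.
    if dy > h // 2:
        dy -= h
    if dx > w // 2:
        dx -= w
    return int(dy), int(dx)


rng = np.random.default_rng(0)
frame0 = rng.standard_normal((64, 64))            # synthetic "video frame"
frame1 = np.roll(frame0, (4, 6), axis=(0, 1))     # input motion: 4 rows, 6 columns

feat0 = avg_pool(frame0, 2)                       # stand-in for one feature channel
feat1 = avg_pool(frame1, 2)

print("input motion:  ", estimate_shift(frame0, frame1))  # (4, 6)
print("feature motion:", estimate_shift(feat0, feat1))    # (2, 3), i.e. scaled by 1/2
```

Average pooling is used here because it is linear and its effect on a translation is easy to verify by hand. A convolution with stride 2 would be another natural choice for the same experiment.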