CV CL · Dec 17, 2018

Feature Fusion Effects of Tensor Product Representation on (De)Compositional Network for Caption Generation for Images

arXiv:1812.06624v1

Originality: Incremental advance

AI Analysis

This work addresses image captioning for AI applications, but it appears incremental as it builds on existing TPR methods.

The paper tackles the problem of improving image captioning by using a Tensor Product Representation (TPR) to better structure the relationship between visual features and language, and reports considerable improvement over the corresponding previous architectures.

Progress in image captioning is growing more complex as researchers try to generalize models and define the relationship between visual features and natural language. This work defines that relationship through a Tensor Product Representation (TPR), which generalizes the language-modeling scheme and structures linguistic attributes (those related to grammar and parts of speech) so that generated sentences are better structured and grammatically correct. TPR gives a better, unique representation and structuring of the feature space and enables better sentence composition from these representations. Many of the different ways of defining and improving TPRs are discussed, and their performance relative to traditional procedures and feature representations is evaluated for image captioning. The new models achieve considerable improvement over the corresponding previous architectures.
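For readers unfamiliar with the term, the general idea behind a tensor product representation can be sketched in a few lines. This is a minimal illustration of generic role-filler binding, not the architecture evaluated in the paper: the filler vectors stand in for word or feature embeddings, the role vectors stand in for structural slots such as grammatical positions, and the names `bind` and `unbind`, the dimensions, and the random data are all assumptions made for the example.

```python
import numpy as np

def bind(fillers, roles):
    """Sum of outer products filler_i (x) role_i.

    fillers: (n, d_f) array, one filler vector per row.
    roles:   (n, d_r) array, one role vector per row.
    Returns a (d_f, d_r) matrix holding the whole structure.
    """
    return np.einsum("nf,nr->fr", fillers, roles)

def unbind(tpr, role):
    """Recover the filler bound to `role`.

    Exact only when the role vectors are orthonormal.
    """
    return tpr @ role

rng = np.random.default_rng(0)

# Hypothetical data: 3 fillers of dimension 4 (e.g. word embeddings).
fillers = rng.normal(size=(3, 4))

# Orthonormal role vectors: rows of an orthogonal matrix from a QR factorization.
roles, _ = np.linalg.qr(rng.normal(size=(3, 3)))

tpr = bind(fillers, roles)
recovered = unbind(tpr, roles[1])
```

Because the roles are orthonormal, multiplying the bound matrix by a role vector cancels every other term in the sum and returns the filler attached to that role, up to floating-point error. With non-orthogonal roles the recovery is only approximate.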
