FineTag: Multi-attribute Classification at Fine-grained Level in Images
This work addresses fine-grained attribute extraction for computer vision applications, presenting an incremental improvement in parameter efficiency.
The paper tackles fine-grained multi-attribute classification in images by proposing an end-to-end bi-linear CNN with a pairwise ranking loss, matching or exceeding the performance of a baseline while using 40 times fewer parameters.
In this paper, we address the extraction of the fine-grained attributes of an instance as a `multi-attribute classification' problem. To this end, we propose an end-to-end architecture that adopts the bi-linear Convolutional Neural Network with a pairwise ranking loss. This is the first time such an architecture has been applied to the fine-grained attribute classification problem. We compare the proposed method with a competitive deep Convolutional Neural Network baseline. Extensive experiments show that the proposed method matches or outperforms the baseline with significantly fewer parameters ($40\times$ fewer). We demonstrate our approach on the CUB200 birds dataset, whose annotations are adapted in this work for multi-attribute classification at the fine-grained level.
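The two ingredients named above, bi-linear pooling and a pairwise ranking loss, can be illustrated with a minimal pure-Python sketch. It is not the paper's implementation: the function names `bilinear_pool` and `pairwise_ranking_loss` are hypothetical, the pooling assumes the commonly used form (sum of outer products over spatial locations, then signed square root and L2 normalization), and the loss assumes an unweighted hinge over every (present attribute, absent attribute) pair with a margin of 1.

```python
import math


def bilinear_pool(feats_a, feats_b):
    """Bi-linear pooling of two feature streams (assumed standard form).

    feats_a: list of L vectors of size C1 (one per spatial location)
    feats_b: list of L vectors of size C2 (same locations)
    Returns a flattened C1*C2 descriptor.
    """
    c1, c2 = len(feats_a[0]), len(feats_b[0])
    pooled = [0.0] * (c1 * c2)
    # Sum the outer product of the two streams over all locations.
    for a, b in zip(feats_a, feats_b):
        for i in range(c1):
            for j in range(c2):
                pooled[i * c2 + j] += a[i] * b[j]
    # Signed square root, then L2 normalization.
    pooled = [math.copysign(math.sqrt(abs(v)), v) for v in pooled]
    norm = math.sqrt(sum(v * v for v in pooled)) or 1.0
    return [v / norm for v in pooled]


def pairwise_ranking_loss(scores, labels, margin=1.0):
    """Hinge loss over (positive, negative) attribute pairs (assumed form).

    scores: one real-valued score per attribute
    labels: 1 if the attribute is present, 0 otherwise
    Penalizes every pair where a present attribute does not outscore
    an absent one by at least `margin`.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    return sum(max(0.0, margin - (p - n)) for p in pos for n in neg)


if __name__ == "__main__":
    # Toy example: 2 spatial locations, 2 channels per stream.
    descriptor = bilinear_pool([[1.0, 0.0], [0.0, 1.0]],
                               [[1.0, 0.0], [0.0, 1.0]])
    print(len(descriptor))
    # One present attribute scored below one of the two absent attributes.
    print(pairwise_ranking_loss([0.5, 1.0, -1.0], [1, 0, 0]))
```

In a full model the descriptor would feed a linear layer that produces one score per attribute, and the loss would be minimized over the training images; a ranking loss of this kind rewards present attributes for outscoring absent ones rather than treating each attribute as an independent binary decision.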