CV, LG · Sep 25, 2019

Beyond image classification: zooplankton identification with deep vector space embeddings

arXiv:1909.11380v1 · 2 citations
Originality: Incremental advance
AI Analysis

The paper addresses the challenge of ambiguous and overlapping classes in ecological data, offering researchers and institutions a more flexible alternative to traditional classification methods.

The paper tackles zooplankton identification by using a deep convolutional network to create vector embeddings of images. Classifications derived from the embedding reach accuracy comparable to that of a dedicated classifier, and the embedding also reveals structure in the data. The authors additionally evaluate classification performance on classes the system has not seen before.

Zooplankton images, like many other real-world data types, have intrinsic properties that make the design of effective classification systems difficult. For instance, the number of classes encountered in practical settings is potentially very large, and classes can be ambiguous or overlap. In addition, the choice of taxonomy often differs between researchers and between institutions. Although high accuracy has been achieved in benchmarks using standard classifier architectures, biases caused by an inflexible classification scheme can have profound effects when the output is used in ecosystem assessments and monitoring. Here, we propose using a deep convolutional network to construct a vector embedding of zooplankton images. The system maps (embeds) each image into a high-dimensional Euclidean space so that distances between vectors reflect semantic relationships between images. We show that the embedding can be used to derive classifications with accuracy comparable to that of a specific classifier, but that it simultaneously reveals important structures in the data. Furthermore, we apply the embedding to new classes previously unseen by the system, and evaluate its classification performance in such cases. Traditional neural network classifiers perform well when the classes are clearly defined a priori and have sufficiently large labeled data sets available. For practical cases in ecology, as well as in many other fields, this is not the case, and we argue that the vector embedding method presented here is a more appropriate approach.
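
To make the idea of deriving classifications from an embedding more concrete, here is a minimal sketch of one simple option: nearest-centroid classification under Euclidean distance. This is an illustration under stated assumptions, not necessarily the procedure used in the paper. The random vectors stand in for the output of a trained embedding network, and the class names and the helper functions `class_centroids` and `classify` are hypothetical.

```python
import numpy as np

def class_centroids(embeddings, labels):
    """Return the sorted class names and the mean embedding of each class."""
    labels = np.asarray(labels)
    classes = sorted(set(labels.tolist()))
    centroids = np.stack([embeddings[labels == c].mean(axis=0) for c in classes])
    return classes, centroids

def classify(queries, classes, centroids):
    """Assign each query vector to the class whose centroid is nearest (Euclidean)."""
    # Pairwise distances between queries and centroids, shape (n_queries, n_classes).
    dists = np.linalg.norm(queries[:, None, :] - centroids[None, :, :], axis=2)
    return [classes[i] for i in dists.argmin(axis=1)]

# Toy stand-in for embeddings; a real system would obtain these from the network.
rng = np.random.default_rng(0)
known = np.vstack([
    rng.normal(loc=0.0, scale=0.1, size=(5, 4)),   # hypothetical class "copepod"
    rng.normal(loc=1.0, scale=0.1, size=(5, 4)),   # hypothetical class "diatom"
])
known_labels = ["copepod"] * 5 + ["diatom"] * 5
classes, centroids = class_centroids(known, known_labels)

# A previously unseen class is handled by embedding a few labeled examples and
# recomputing the centroids; the embedding network itself is left unchanged.
unseen = rng.normal(loc=-1.0, scale=0.1, size=(3, 4))  # hypothetical class "krill"
all_classes, all_centroids = class_centroids(
    np.vstack([known, unseen]), known_labels + ["krill"] * 3
)

queries = np.array([[0.05, 0.0, 0.0, 0.0], [-0.9, -1.0, -1.1, -1.0]])
predictions = classify(queries, all_classes, all_centroids)
```

The design point this sketch tries to capture is that the label set lives outside the network: changing the taxonomy or adding a class only changes the reference vectors, whereas a conventional classifier would need its output layer retrained.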

