CV, Jun 7, 2020

How useful is Active Learning for Image-based Plant Phenotyping?

Koushik Nagasubramanian, Talukder Z. Jubery, Fateme Fotouhi Ardakani, Seyed Vahid Mirnezami, Asheesh K. Singh, Arti Singh, Soumik Sarkar, Baskar Ganapathysubramanian

arXiv:2006.04255v34.23 citations, Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses labeling inefficiencies for plant scientists and biologists, but it is incremental as it applies existing active learning methods to new datasets.

The study tackled the challenge of high labeling costs in image-based plant phenotyping by evaluating active learning methods to reduce the amount of labeled data needed for deep learning models, finding that active learning outperformed random sampling in classification performance on two datasets (soybean stresses and weed species).

Deep learning models have been successfully deployed for a diverse array of image-based plant phenotyping applications, including disease detection and classification. However, successful deployment of supervised deep learning models requires large amounts of labeled data, which is a significant challenge in plant science (and most biological) domains due to the inherent complexity of the data. Specifically, data annotation is costly, laborious, and time-consuming, and it requires domain expertise for phenotyping tasks, especially for diseases. To overcome this challenge, active learning algorithms have been proposed that reduce the amount of labeling needed by deep learning models to achieve good predictive performance. Active learning methods adaptively select samples to annotate using an acquisition function, with the goal of achieving maximum (classification) performance under a fixed labeling budget. We compare the performance of four different active learning methods, (1) Deep Bayesian Active Learning (DBAL), (2) Entropy, (3) Least Confidence, and (4) Coreset, against conventional random sampling-based annotation on two different image-based classification datasets. The first image dataset consists of soybean [Glycine max L. (Merr.)] leaves belonging to eight different soybean stresses and a healthy class, and the second consists of nine different weed species from the field. For a fixed labeling budget, we observed that the classification performance of deep learning models with active learning-based acquisition strategies is better than with random sampling-based acquisition for both datasets. The integration of active learning strategies for data annotation can help mitigate labeling challenges in plant science applications, particularly where deep domain knowledge is required.
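To make the idea of an acquisition function concrete, here is a minimal sketch of two of the uncertainty-based strategies named above, Entropy and Least Confidence, using their standard textbook definitions. It is not the authors' implementation. The function names, the `pool_probs` values, and the budget of 2 are hypothetical and chosen only for illustration. DBAL and Coreset are not shown.

```python
import math

def entropy_score(probs):
    """Predictive entropy of one softmax vector; higher means more uncertain."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def least_confidence_score(probs):
    """One minus the top class probability; higher means more uncertain."""
    return 1.0 - max(probs)

def select_batch(pool_probs, budget, score_fn):
    """Return indices of the `budget` highest-scoring unlabeled samples."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: score_fn(pool_probs[i]),
        reverse=True,
    )
    return ranked[:budget]

# Hypothetical softmax outputs for four unlabeled images over three classes.
pool_probs = [
    [0.98, 0.01, 0.01],  # confident prediction
    [0.34, 0.33, 0.33],  # nearly uniform, highly uncertain
    [0.60, 0.30, 0.10],
    [0.50, 0.49, 0.01],  # two classes almost tied
]

# Ask the annotator to label the two most uncertain images under each strategy.
picked_entropy = select_batch(pool_probs, 2, entropy_score)
picked_lc = select_batch(pool_probs, 2, least_confidence_score)
print(picked_entropy, picked_lc)
```

In a full active learning loop, the selected images would be labeled, added to the training set, and the model retrained before the next round of scoring. The two strategies can disagree: entropy considers the whole predicted distribution, while least confidence looks only at the top class probability.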

