Thijs Defraeye

CV
h-index: 45
5 papers
112 citations
Novelty: 37%
AI Score: 26

5 Papers

CV · Jul 10, 2022
Facilitated machine learning for image-based fruit quality assessment

Manuel Knott, Fernando Perez-Cruz, Thijs Defraeye

Image-based machine learning models can be used to make the sorting and grading of agricultural products more efficient. In many regions, implementing such systems can be difficult due to the lack of centralization and automation of postharvest supply chains. Stakeholders are often too small to specialize in machine learning, and large training data sets are unavailable. We propose a machine learning procedure for images based on pre-trained Vision Transformers. It is easier to implement than the current standard approach of training Convolutional Neural Networks (CNNs), as we do not (re-)train deep neural networks. We evaluate our approach on two data sets, one for apple defect detection and one for banana ripeness estimation. Our model achieves competitive classification accuracy, matching the best-performing CNN or falling less than one percent below it. At the same time, it requires three times fewer training samples to reach 90% accuracy.
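The idea of keeping a pre-trained network frozen and fitting only a lightweight classifier on its embeddings can be sketched as follows. This is a minimal illustration, not the paper's pipeline: the two-dimensional vectors are toy stand-ins for Vision Transformer embeddings, and the nearest-centroid classifier (`fit_centroids`, `predict`) is a hypothetical choice made here for brevity.

```python
from collections import defaultdict
from math import dist


def fit_centroids(embeddings, labels):
    """Average the frozen embeddings of each class into one centroid."""
    sums = {}
    counts = defaultdict(int)
    for vec, label in zip(embeddings, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}


def predict(centroids, vec):
    """Return the class whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(centroids[label], vec))


# Toy embeddings (assumption): in practice these would come from a frozen,
# pre-trained image encoder that is never re-trained.
train_x = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
train_y = ["healthy", "healthy", "defect", "defect"]

centroids = fit_centroids(train_x, train_y)
prediction = predict(centroids, [0.7, 0.3])
```

Because only the class centroids are estimated, there are no network weights to optimize, which is the kind of simplification that makes such a procedure accessible to small stakeholders.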

CV · Jul 16, 2024
A Closer Look at Benchmarking Self-Supervised Pre-training with Image Classification

Markus Marks, Manuel Knott, Neehar Kondapaneni et al.

Self-supervised learning (SSL) is a machine learning approach where the data itself provides supervision, eliminating the need for external labels. The model is forced to learn about the data structure or context by solving a pretext task. With SSL, models can learn from abundant and cheap unlabeled data, significantly reducing the cost of training models where labels are expensive or inaccessible. In Computer Vision, SSL is widely used as pre-training followed by a downstream task, such as supervised transfer, few-shot learning on smaller labeled data sets, and/or unsupervised clustering. Unfortunately, it is infeasible to evaluate SSL methods on all possible downstream tasks and objectively measure the quality of the learned representation. Instead, SSL methods are evaluated using in-domain evaluation protocols, such as fine-tuning, linear probing, and k-nearest neighbors (kNN). However, it is not well understood how well these evaluation protocols estimate the representation quality of a pre-trained model for different downstream tasks under different conditions, such as dataset, metric, and model architecture. We study how classification-based evaluation protocols for SSL correlate and how well they predict downstream performance on different dataset types. Our study includes eleven common image datasets and 26 models that were pre-trained with different SSL methods or have different model backbones. We find that in-domain linear/kNN probing protocols are, on average, the best general predictors for out-of-domain performance. We further investigate the importance of batch normalization and evaluate how robust correlations are for different kinds of dataset domain shifts. We challenge assumptions about the relationship between discriminative and generative self-supervised methods, finding that most of their performance differences can be explained by changes to model backbones.
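One way to quantify how well an in-domain protocol predicts out-of-domain performance is a rank correlation across models. The sketch below uses Spearman's rank correlation, which is an assumption; the abstract does not state which correlation measure the study uses. The accuracy values are toy numbers for four hypothetical models, not results from the paper.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward, averaging the ranks of ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    std_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on ranks."""
    return pearson(ranks(x), ranks(y))


# Toy accuracies (assumption), one entry per hypothetical pre-trained model.
in_domain_knn = [0.62, 0.70, 0.75, 0.81]
out_of_domain = [0.40, 0.48, 0.45, 0.60]

rho = spearman(in_domain_knn, out_of_domain)
```

A value of `rho` close to 1 would mean the in-domain protocol orders the models almost exactly as the out-of-domain task does, which is what makes a protocol a useful predictor.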

CV · Mar 28, 2022
Using Machine Learning to generate an open-access cropland map from satellite images time series in the Indian Himalayan Region

Danya Li, Joaquin Gajardo, Michele Volpi et al.

Crop maps are crucial for agricultural monitoring and food management and can additionally support domain-specific applications, such as setting up cold supply chain infrastructure in developing countries. Machine learning (ML) models, combined with freely available satellite imagery, can be used to produce cost-effective and high spatial-resolution crop maps. However, accessing ground truth data for supervised learning is especially challenging in developing countries due to factors such as smallholding and fragmented geography, which often results in a lack of crop type maps or even reliable cropland maps. Our area of interest for this study lies in Himachal Pradesh, India, where we aim to produce an open-access binary cropland map at 10-meter resolution for the Kullu, Shimla, and Mandi districts. To this end, we developed an ML pipeline that relies on Sentinel-2 satellite image time series. We investigated two pixel-based supervised classifiers, support vector machines (SVM) and random forest (RF), which are used to classify per-pixel time series for binary cropland mapping. The ground truth data used for training, validation, and testing was manually annotated from a combination of field survey reference points and visual interpretation of very high resolution (VHR) imagery. We trained and validated the models via spatial cross-validation to account for local spatial autocorrelation and selected the RF model due to its overall robustness and lower computational cost. We tested the generalization capability of the chosen model at the pixel level by computing the accuracy, recall, precision, and F1-score on hold-out test sets of each district, achieving an average accuracy for the RF (our best model) of 87%. We used this model to generate a cropland map for three districts of Himachal Pradesh, spanning 14,600 km², which improves the resolution and quality of existing public maps.
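Spatial cross-validation keeps nearby pixels in the same fold, so that spatial autocorrelation does not inflate validation scores. The sketch below shows one common variant, grid-block cross-validation. The block size, the round-robin fold assignment, and the function names `block_id` and `spatial_folds` are assumptions for illustration; the abstract does not describe the exact scheme used.

```python
from collections import defaultdict


def block_id(x, y, block_size):
    """Map a coordinate to the grid cell (block) that contains it."""
    return (int(x // block_size), int(y // block_size))


def spatial_folds(coords, block_size, n_folds):
    """Yield (train_idx, val_idx) pairs in which whole blocks are held out together."""
    blocks = defaultdict(list)
    for idx, (x, y) in enumerate(coords):
        blocks[block_id(x, y, block_size)].append(idx)

    block_keys = sorted(blocks)
    for fold in range(n_folds):
        # Every n_folds-th block goes to validation; the rest is used for training.
        val_keys = set(block_keys[fold::n_folds])
        val = [i for key in block_keys if key in val_keys for i in blocks[key]]
        train = [i for key in block_keys if key not in val_keys for i in blocks[key]]
        yield train, val


# Toy sample coordinates (assumption), in arbitrary map units.
coords = [(5, 5), (8, 2), (15, 5), (18, 9), (5, 15), (12, 18)]
folds = list(spatial_folds(coords, block_size=10, n_folds=2))
```

In each fold, no block contributes samples to both the training and the validation set, so the validation pixels are always spatially separated from the training pixels by at least a block boundary.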

CV · Nov 25, 2024 · Code
Weakly Supervised Panoptic Segmentation for Defect-Based Grading of Fresh Produce

Manuel Knott, Divinefavour Odion, Sameer Sontakke et al.

Visual inspection for defect grading in agricultural supply chains is crucial but traditionally labor-intensive and error-prone. Automated computer vision methods typically require extensively annotated datasets, which are often unavailable in decentralized supply chains. We address this challenge by evaluating the Segment Anything Model (SAM) to generate dense panoptic segmentation masks from sparse annotations. These dense predictions are then used to train a supervised panoptic segmentation model. Focusing on banana surface defects (bruises and scars), we validate our approach using 476 field images annotated with 1440 defects. While SAM-generated masks generally align with human annotations, substantially reducing annotation effort, we explicitly identify failure cases associated with specific defect sizes and shapes. Despite these limitations, our approach offers practical estimates of defect number and relative size from panoptic masks, underscoring the potential and current boundaries of foundation models for defect quantification in low-data agricultural scenarios. GitHub: https://github.com/manuelknott/banana-defect-segmentation
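Estimating defect number and relative size from a panoptic mask reduces to counting pixels per instance. The sketch below assumes a simplified mask encoding that is not taken from the paper or its repository: `0` marks background, `1` marks intact fruit surface, and every other integer is the ID of one defect instance. The function name `defect_stats` is likewise hypothetical.

```python
from collections import Counter

BACKGROUND, FRUIT = 0, 1  # assumed label encoding; other values are defect instance IDs


def defect_stats(mask):
    """Return the defect count and each defect's share of the total fruit area."""
    counts = Counter(label for row in mask for label in row)
    defect_pixels = {
        label: n for label, n in counts.items() if label not in (BACKGROUND, FRUIT)
    }
    # The fruit area covers both intact surface and defect pixels.
    fruit_area = counts[FRUIT] + sum(defect_pixels.values())
    if fruit_area == 0:
        return 0, {}
    relative = {label: n / fruit_area for label, n in defect_pixels.items()}
    return len(defect_pixels), relative


# Toy 4x5 mask (assumption) containing two defect instances, with IDs 2 and 3.
mask = [
    [0, 1, 1, 1, 0],
    [1, 2, 2, 1, 1],
    [1, 2, 1, 3, 1],
    [0, 1, 1, 1, 0],
]

n_defects, relative_size = defect_stats(mask)
```

Expressing size relative to the fruit area rather than in absolute pixels makes the estimate independent of camera distance and image resolution.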

CV · Dec 18, 2023
Evaluating the Role of Training Data Origin for Country-Scale Cropland Mapping in Data-Scarce Regions: A Case Study of Nigeria

Joaquin Gajardo, Michele Volpi, Daniel Onwude et al.

Cropland maps are essential for remote sensing-based agricultural monitoring, providing timely insights without extensive field surveys. Machine learning enables large-scale mapping but depends on geo-referenced ground-truth data, which is costly to collect, motivating the use of global datasets in data-scarce regions. A key challenge is understanding how the quantity, quality, and proximity of the training data to the target region influence model performance. We evaluate this in Nigeria, using 1,827 manually labelled samples covering the whole country, and subsets of the Geowiki dataset: Nigeria-only, regional (Nigeria and neighbouring countries), and global. We extract pixel-wise multi-source time series arrays from Sentinel-1, Sentinel-2, ERA5 climate, and a digital elevation model using Google Earth Engine, comparing Random Forests with LSTMs, including a lightweight multi-headed LSTM variant. Results show that local data significantly boosts performance, with accuracy gains of up to 0.246 (RF) and 0.178 (LSTM). Nigeria-only or regional data outperformed global data despite the smaller number of labels, with the exception of the multi-headed LSTM, which benefited from global data when local samples were absent. Sentinel-1, climate, and topographic data are critical data sources, with their removal reducing F1-score by up to 0.593. Addressing class imbalance also improved LSTM accuracy by up to 0.071. Our top-performing model (Nigeria-only LSTM) achieved an F1-score of 0.814 and accuracy of 0.842, matching the best global land cover product while offering stronger recall, which is critical for food security. We release code, data, maps, and an interactive web app to support future work.
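Accuracy, precision, recall, and F1-score for binary cropland mapping all follow from the four confusion-matrix counts. The sketch below uses toy labels (1 = cropland, 0 = non-cropland) that are not data from the study; `binary_metrics` is a hypothetical helper name.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels (assumption): three cropland pixels among eight, one missed, one false alarm.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]

metrics = binary_metrics(y_true, y_pred)
```

Recall measures the fraction of true cropland that the map captures, so a model with higher recall misses fewer cultivated areas, which is why the abstract singles it out for food security applications.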