CV · Sep 19, 2022 · Code
SOCRATES: A Stereo Camera Trap for Monitoring of Biodiversity
Timm Haucke, Hjalmar S. Kühl, Volker Steinhage
The development and application of modern technology is an essential basis for the efficient monitoring of species in natural habitats and landscapes, both to trace the development of ecosystems, species communities, and populations and to analyze the reasons for changes. For estimating animal abundance using methods such as camera trap distance sampling, spatial information of natural habitats in terms of 3D (three-dimensional) measurements is crucial. Additionally, 3D information improves the accuracy of animal detection using camera trapping. This study presents a novel approach to 3D camera trapping featuring highly optimized hardware and software. This approach employs stereo vision to infer 3D information of natural habitats and is designated as StereO CameRA Trap for monitoring of biodivErSity (SOCRATES). A comprehensive evaluation of SOCRATES shows not only a $3.23\%$ improvement in animal detection (bounding box $\text{mAP}_{75}$) but also its superior applicability for estimating animal abundance using camera trap distance sampling. The software and documentation of SOCRATES are provided at https://github.com/timmh/socrates
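As a rough illustration of how stereo vision yields metric 3D information, the sketch below applies the standard rectified-stereo relation Z = f * B / d (depth from focal length, baseline, and disparity). It is not taken from the SOCRATES code base; the function name and the calibration values are assumptions chosen only for illustration.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Metric depth for a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        # Non-positive disparity means no usable match; treat as infinitely far.
        return float("inf")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical calibration values (assumptions, not SOCRATES parameters).
FOCAL_LENGTH_PX = 800.0
BASELINE_M = 0.25

depths = [
    depth_from_disparity(d, FOCAL_LENGTH_PX, BASELINE_M)
    for d in (100.0, 40.0, 20.0)
]
print(depths)  # [2.0, 5.0, 10.0]
```

Larger disparities correspond to closer objects, which is why depth resolution degrades with distance for a fixed baseline.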
LG · Aug 26, 2022
Extreme Gradient Boosting for Yield Estimation compared with Deep Learning Approaches
Florian Huber, Artem Yushchenko, Benedikt Stratmann et al.
Accurate prediction of crop yield before harvest is of great importance for crop logistics, market planning, and food distribution around the world. Yield prediction requires monitoring of phenological and climatic characteristics over extended time periods to model the complex relations involved in crop development. Remote sensing images provided by various satellites orbiting the Earth are a cheap and reliable way to obtain data for yield prediction. The field of yield prediction is currently dominated by Deep Learning approaches. While the accuracies reached with those approaches are promising, the amounts of data required and their "black-box" nature can restrict the application of Deep Learning methods. These limitations can be overcome with a pipeline that processes remote sensing images into feature-based representations, which allow the use of Extreme Gradient Boosting (XGBoost) for yield prediction. A comparative evaluation of soybean yield prediction within the United States shows promising prediction accuracies compared to state-of-the-art yield prediction systems based on Deep Learning. Feature importances expose the near-infrared spectrum of light as an important feature within our models. The reported results hint at the capabilities of XGBoost for yield prediction and encourage future experiments with XGBoost for yield prediction on other crops in regions all around the world.
LG · Apr 14, 2023
Grouping Shapley Value Feature Importances of Random Forests for explainable Yield Prediction
Florian Huber, Hannes Engler, Anna Kicherer et al.
Explainability in yield prediction helps us fully explore the potential of machine learning models that are already able to achieve high accuracy for a variety of yield prediction scenarios. The data included for the prediction of yields are intricate and the models are often difficult to understand. However, understanding the models can be simplified by using natural groupings of the input features. Grouping can be achieved, for example, by the time the features are captured or by the sensor used to do so. The state-of-the-art for interpreting machine learning models is currently defined by the game-theoretic approach of Shapley values. To handle groups of features, the calculated Shapley values are typically added together, ignoring the theoretical limitations of this approach. We explain the concept of Shapley values directly computed for predefined groups of features and introduce an algorithm to compute them efficiently on tree structures. We provide a blueprint for designing swarm plots that combine many local explanations for global understanding. Extensive evaluation of two different yield prediction problems shows the worth of our approach and demonstrates how we can enable a better understanding of yield prediction models in the future, ultimately leading to mutual enrichment of research and application.
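To make the difference between summed per-feature Shapley values and Shapley values computed directly for groups concrete, here is a minimal brute-force sketch. It is not the efficient tree-based algorithm the abstract refers to; the toy value function, the feature names, and the grouping are all assumptions made for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values; `value` maps a frozenset of players to a number."""
    n = len(players)
    result = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Weight of coalitions of size k in the Shapley formula.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value(s | {p}) - value(s))
        result[p] = total
    return result


# Toy value function (assumption): a payoff of 6 only if all features are present.
def feature_value(features):
    return 6.0 if {"a", "b", "c"} <= set(features) else 0.0


# Hypothetical grouping, e.g. by sensor or by acquisition time.
GROUPS = {"G1": {"a", "b"}, "G2": {"c"}}


def group_value(group_coalition):
    """Value of a coalition of groups: the value of the union of their features."""
    features = set()
    for g in group_coalition:
        features |= GROUPS[g]
    return feature_value(features)


per_feature = shapley_values(["a", "b", "c"], feature_value)
summed_g1 = per_feature["a"] + per_feature["b"]  # adding per-feature values
per_group = shapley_values(["G1", "G2"], group_value)  # groups as players

print(round(summed_g1, 6), round(per_group["G1"], 6))
```

In this toy game each feature receives a Shapley value of 2, so summing gives 4 for group G1, whereas treating G1 and G2 as players gives 3 each. The two quantities answer different questions, which is the limitation of simple addition that the abstract points to.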
CV · Feb 9, 2022 · Code
Automated Distance Estimation for Wildlife Camera Trapping
Peter Johanns, Timm Haucke, Volker Steinhage
The ongoing biodiversity crisis calls for accurate estimation of animal density and abundance to identify sources of biodiversity decline and effectiveness of conservation interventions. Camera traps together with abundance estimation methods are often employed for this purpose. The necessary distances between camera and observed animals are traditionally derived in a laborious, fully manual or semi-automatic process. Both approaches require reference image material, which is both difficult to acquire and not available for existing datasets. We propose a fully automatic approach we call AUtomated DIstance esTimation (AUDIT) to estimate camera-to-animal distances. We leverage existing state-of-the-art relative monocular depth estimation and combine it with a novel alignment procedure to estimate metric distances. AUDIT is fully automated and requires neither the comparison of observations in camera trap imagery with reference images nor capturing of reference image material at all. AUDIT therefore relieves biologists and ecologists from a significant workload. We evaluate AUDIT on a zoo scenario dataset unseen during training where we achieve a mean absolute distance estimation error over all animal instances of only 0.9864 meters and mean relative error (REL) of 0.113. The code and usage instructions are available at https://github.com/PJ-cs/DistanceEstimationTracking
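The two error measures reported above can be sketched as follows. This assumes the common depth-estimation definitions (mean absolute error in meters, and REL as the mean of the absolute error divided by the true distance); the exact definitions used in the paper are not given here, and the distances below are made-up example values.

```python
def mean_absolute_error(predicted, true):
    """Mean of |predicted - true| over all animal instances."""
    return sum(abs(p - t) for p, t in zip(predicted, true)) / len(true)


def mean_relative_error(predicted, true):
    """Mean of |predicted - true| / true (assumed definition of REL)."""
    return sum(abs(p - t) / t for p, t in zip(predicted, true)) / len(true)


# Made-up camera-to-animal distances in meters, for illustration only.
true_distances = [2.0, 4.0, 8.0]
predicted_distances = [2.5, 3.0, 8.0]

mae = mean_absolute_error(predicted_distances, true_distances)
rel = mean_relative_error(predicted_distances, true_distances)
print(mae, round(rel, 4))
```

REL normalizes each error by the true distance, so a one-meter error on a nearby animal counts more than the same error on a distant one.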
CV · Feb 8, 2024
On Convolutional Vision Transformers for Yield Prediction
Alvin Inderka, Florian Huber, Volker Steinhage
While a variety of methods offer good yield prediction on histogrammed remote sensing data, vision Transformers are only sparsely represented in the literature. This study tests the Convolutional vision Transformer (CvT) as a representative of vision Transformers, which currently achieve state-of-the-art results in many other vision tasks. CvT combines some of the advantages of convolution with the dynamic attention and global context fusion of Transformers. It performs worse than widely tested methods such as XGBoost and CNNs, but shows that Transformers have the potential to improve yield prediction.
CV · May 10, 2021
Overcoming the Distance Estimation Bottleneck in Estimating Animal Abundance with Camera Traps
Timm Haucke, Hjalmar S. Kühl, Jacqueline Hoyer et al.
The biodiversity crisis is still accelerating, despite increasing efforts by the international community. Estimating animal abundance is of critical importance to assess, for example, the consequences of land-use change and invasive species on community composition, or the effectiveness of conservation interventions. Various approaches have been developed to estimate abundance of unmarked animal populations. Whereas these approaches differ in methodological details, they all require the estimation of the effective area surveyed in front of a camera trap. Until now, camera-to-animal distance measurements have been derived by laborious, manual, and subjective estimation methods. To overcome this distance estimation bottleneck, this study proposes an automated pipeline utilizing monocular depth estimation and depth image calibration methods. We are able to reduce the manual effort required by a factor greater than 21 and provide our system at https://timm.haucke.xyz/publications/distance-estimation-animal-abundance
NI · Feb 16, 2021
Automated Identification of Vulnerable Devices in Networks using Traffic Data and Deep Learning
Jakob Greis, Artem Yushchenko, Daniel Vogel et al.
Many IoT devices are vulnerable to attacks due to flawed security designs and missing mechanisms for firmware updates or patches that would eliminate the security vulnerabilities. Device-type identification combined with data from vulnerability databases can pinpoint vulnerable IoT devices in a network and can be used to constrain the communications of vulnerable devices to prevent damage. In this contribution, we present and evaluate two deep learning approaches to reliable IoT device-type identification, namely a recurrent and a convolutional network architecture. The two approaches achieve accuracies of 97% and 98%, respectively, and thereby outperform an up-to-date IoT device-type identification approach using hand-crafted fingerprint features, which obtains an accuracy of 82%. The runtime performance of both deep learning approaches outperforms the hand-crafted approach by three orders of magnitude. Finally, importance metrics explain the results of both deep learning approaches in terms of the utilization of the analyzed traffic data flow.
CV · Feb 10, 2021
Exploiting Depth Information for Wildlife Monitoring
Timm Haucke, Volker Steinhage
Camera traps are a proven tool in biology and specifically biodiversity research. However, camera traps including depth estimation are not widely deployed, despite providing valuable context about the scene and facilitating the automation of previously laborious manual ecological methods. In this study, we propose an automated camera trap-based approach to detect and identify animals using depth estimation. To detect and identify individual animals, we propose a novel method, D-Mask R-CNN, for instance segmentation, a deep learning-based technique to detect and delineate each distinct object of interest appearing in an image or a video clip. An experimental evaluation shows the benefit of the additional depth estimation in terms of improved average precision scores of the animal detection compared to the standard approach that relies only on the image information. This novel approach was also evaluated as a proof of concept in a zoo scenario using an RGB-D camera trap.
CV · Oct 18, 2020
Image-based Automated Species Identification: Can Virtual Data Augmentation Overcome Problems of Insufficient Sampling?
Morris Klasen, Dirk Ahrens, Jonas Eberle et al.
Automated species identification and delimitation is challenging, particularly in rare and thus often scarcely sampled species, which do not allow sufficient discrimination of infraspecific versus interspecific variation. Typical problems arising from either low or exaggerated interspecific morphological differentiation are best met by automated methods of machine learning that learn efficient and effective species identification from training samples. However, limited infraspecific sampling remains a key challenge in machine learning as well. In this study, we assessed whether a two-level data augmentation approach may help to overcome the problem of scarce training data in automated visual species identification. The first level of visual data augmentation applies classic approaches of data augmentation and the generation of fake images using a GAN approach. Descriptive feature vectors are derived from bottleneck features of a VGG-16 convolutional neural network (CNN) that are then reduced stepwise in dimensionality using Global Average Pooling and PCA to prevent overfitting. The second level of data augmentation employs synthetic additional sampling in feature space by an oversampling algorithm in vector space (SMOTE). Applied to two challenging datasets of scarab beetles (Coleoptera), our augmentation approach outperformed a non-augmented deep learning baseline approach as well as a traditional 2D morphometric approach (Procrustes analysis).
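The second augmentation level, oversampling in feature space, can be sketched in a few lines. The code below is a simplified SMOTE-style interpolation written for illustration, not the authors' implementation: it creates each synthetic sample on the line segment between a minority-class sample and one of its k nearest neighbours. The function name, parameters, and toy feature vectors are assumptions.

```python
import random


def smote_like_oversample(samples, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating towards near neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(samples)
        # k nearest neighbours of `base` among the other samples
        # (squared Euclidean distance).
        neighbours = sorted(
            (s for s in samples if s is not base),
            key=lambda s: sum((a - b) ** 2 for a, b in zip(base, s)),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation position in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Toy 2D feature vectors of one scarcely sampled class (made-up values).
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_like_oversample(minority, n_new=5)
print(len(new_points))  # 5
```

Because every synthetic point is a convex combination of two real samples, it stays inside the region already covered by the class, which is what makes this kind of oversampling comparatively conservative.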
CV · Nov 23, 2018
An Adaptive Approach for Automated Grapevine Phenotyping using VGG-based Convolutional Neural Networks
Jonatan Grimm, Katja Herzog, Florian Rist et al.
In (grapevine) breeding programs and research, periodic phenotyping and multi-year monitoring of different grapevine traits, like growth or yield, are needed, especially in the field. This demand implies objective, precise, and automated methods using sensors and adaptive software. This work presents a proof of concept analyzing RGB images of different growth stages of grapevines with the aim of detecting and quantifying promising plant organs that are related to yield. The input images are segmented by a Fully Convolutional Neural Network (FCN) into object and background pixels. The objects are plant organs like young shoots, pedicels, flower buds, or grapes, which are in principle suitable for yield estimation. In the ground truth of the training images, each object is separately annotated as a connected segment of object pixels, which enables end-to-end learning of the object features. Based on the CNN-based segmentation, the number of objects is determined by detecting and counting connected components of object pixels using region labeling. In an evaluation on six different data sets, the system achieves an IoU of up to 87.3% for the segmentation and an F1 score of up to 88.6% for the object detection.
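The counting step described above (connected components of object pixels via region labeling) can be illustrated with a small breadth-first labeling routine. This is a generic sketch, not the authors' code; the 4-connectivity choice, the function name, and the toy mask are assumptions.

```python
from collections import deque


def count_components(mask):
    """Count 4-connected regions of truthy pixels in a 2D binary mask."""
    if not mask:
        return 0
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                # New, unvisited object pixel: start a new region.
                count += 1
                seen[r][c] = True
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return count


# Toy segmentation output: 1 = object pixel, 0 = background.
segmentation = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1],
    [0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0],
]
print(count_components(segmentation))  # 3
```

Touching objects merge into one component under this scheme, so the count is only as good as the separation achieved by the segmentation.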
CV · Jul 19, 2018
Automated Phenotyping of Epicuticular Waxes of Grapevine Berries Using Light Separation and Convolutional Neural Networks
Pierre Barré, Katja Herzog, Rebecca Höfle et al.
In viticulture, the epicuticular wax as the outer layer of the berry skin is known as a trait that is correlated with resilience towards Botrytis bunch rot. Traditionally, this trait is classified using the OIV descriptor 227 (berry bloom) in a time-consuming way, resulting in subjective and error-prone phenotypic data. In the present study, an objective, fast, and sensor-based approach was developed to monitor berry bloom. From a technical point of view, it is known that the measurement of different illumination components conveys important information about observed object surfaces. A Mobile Light-Separation-Lab is proposed in order to capture illumination-separated images of grapevine berries for phenotyping the distribution of epicuticular waxes (berry bloom). For image analysis, an efficient convolutional neural network approach is used to derive the uniformity and intactness of waxes on berries. Method validation over six grapevine cultivars shows accuracies of up to 97.3%. In addition, electrical impedance of the cuticle and its epicuticular waxes (described as an indicator for the thickness of berry skin and its permeability) was correlated to the detected proportion of waxes with $r=0.76$. This novel, fast, and non-invasive phenotyping approach facilitates enlarged screenings within grapevine breeding material and genetic repositories regarding berry bloom characteristics and their impact on resilience towards Botrytis bunch rot.
CV · Jul 10, 2018
Efficient identification, localization and quantification of grapevine inflorescences in unprepared field images using Fully Convolutional Networks
Robert Rudolph, Katja Herzog, Reinhard Töpfer et al.
Yield prediction is one of the most important tasks in grapevine breeding and vineyard management. Commonly, yield is estimated manually right before harvest by extrapolation, which is mostly labor-intensive, destructive, and inaccurate. In the present study, an automated image-based workflow was developed that quantifies inflorescences and single flowers in unprepared field images of grapevines, i.e. no artificial background or light was applied. It is a novel approach for non-invasive, inexpensive, and objective high-throughput phenotyping. First, image regions depicting inflorescences were identified and localized. This was done by segmenting the images into the classes "inflorescence" and "non-inflorescence" using a Fully Convolutional Network (FCN). Efficient image segmentation is the most challenging step because of the small geometry and dense distribution of flowers (several hundred flowers per inflorescence), the similar color of all plant organs in the foreground and background, and the fact that only approximately 5% of an image shows inflorescences. The trained FCN achieved a mean Intersection Over Union (IOU) of 87.6% on the test data set. Finally, individual flowers were extracted from the "inflorescence" areas using the Circular Hough Transform. The flower extraction achieved a recall of 80.3% and a precision of 70.7% using the segmentation derived by the trained FCN model. In summary, the presented approach is a promising strategy for predicting yield potential automatically in the earliest stage of grapevine development, and it is applicable for objective monitoring and evaluation of breeding material, genetic repositories, or commercial vineyards.
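For readers unfamiliar with the reported metrics, the sketch below shows how Intersection over Union for a binary segmentation and precision and recall for detected flowers are commonly computed. It uses the standard textbook definitions; the masks and detection counts are made-up values, not data from the study.

```python
def intersection_over_union(predicted, truth):
    """IoU of two binary masks given as flat sequences of 0/1 values."""
    intersection = sum(1 for p, t in zip(predicted, truth) if p and t)
    union = sum(1 for p, t in zip(predicted, truth) if p or t)
    # Two empty masks agree perfectly by convention.
    return intersection / union if union else 1.0


def precision_recall(true_pos, false_pos, false_neg):
    """Precision = TP / (TP + FP), recall = TP / (TP + FN)."""
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall


# Made-up flattened masks: 1 = "inflorescence", 0 = "non-inflorescence".
predicted_mask = [1, 1, 1, 0, 0, 0]
truth_mask = [0, 1, 1, 1, 0, 0]
print(intersection_over_union(predicted_mask, truth_mask))  # 0.5

# Made-up detection counts: 6 correct flowers, 2 spurious, 4 missed.
precision, recall = precision_recall(true_pos=6, false_pos=2, false_neg=4)
print(precision, recall)
```

A recall higher than precision, as reported in the abstract, means that most flowers are found but a notable share of the detections are spurious.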
CV · May 10, 2018
Multi-View Semantic Labeling of 3D Point Clouds for Automated Plant Phenotyping
Bernhard Japes, Jennifer Mack, Florian Rist et al.
Semantic labeling of 3D point clouds is important for the derivation of 3D models from real-world scenarios in several economic fields such as the building industry, facility management, town planning, or heritage conservation. In contrast to these most common applications, we describe in this study the semantic labeling of 3D point clouds derived from plant organs by high-precision scanning. Our approach is optimized for the task of plant phenotyping with its very specific challenges and employs a deep learning framework. We report important experiences concerning detailed parameter initialization and optimization techniques. By evaluating our approach on challenging datasets, we achieve state-of-the-art results without the difficult and time-consuming feature engineering that is necessary in traditional approaches to semantic labeling.