LGAug 1, 2022
A Real-time Edge-AI System for Reef SurveysYang Li, Jiajun Liu, Brano Kusy et al.
Crown-of-Thorn Starfish (COTS) outbreaks are a major cause of coral loss on the Great Barrier Reef (GBR) and substantial surveillance and control programs are ongoing to manage COTS populations to ecologically sustainable levels. In this paper, we present a comprehensive real-time machine learning-based underwater data collection and curation system on edge devices for COTS monitoring. In particular, we leverage the power of deep learning-based object detection techniques, and propose a resource-efficient COTS detector that performs detection inferences on the edge device to assist marine experts with COTS identification during the data collection phase. The preliminary results show that several strategies for improving computational efficiency (e.g., batch-wise processing, frame skipping, model input size) can be combined to run the proposed detection model on edge hardware with low resource consumption and low information loss.
CVMay 14, 2022
Monitoring of Pigmented Skin Lesions Using 3D Whole Body ImagingDavid Ahmedt-Aristizabal, Chuong Nguyen, Lachlan Tychsen-Smith et al.
Advanced artificial intelligence and machine learning have great potential to redefine how skin lesions are detected, mapped, tracked and documented. Here, We propose a 3D whole-body imaging system known as 3DSkin-mapper to enable automated detection, evaluation and mapping of skin lesions. A modular camera rig arranged in a cylindrical configuration was designed to automatically capture images of the entire skin surface of a subject synchronously from multiple angles. Based on the images, we developed algorithms for 3D model reconstruction, data processing and skin lesion detection and tracking based on deep convolutional neural networks. We also introduced a customised, user-friendly, and adaptable interface that enables individuals to interactively visualise, manipulate, and annotate the images. The proposed system is developed for skin lesion screening, the focus of this paper is to introduce the system instead of clinical study. Using synthetic and real images we demonstrate the effectiveness of the proposed system by providing multiple views of a target skin lesion, enabling further 3D geometry analysis and longitudinal tracking. It takes only a few seconds to capture the entire skin surface, and about half an hour to process and analyse the images. Our experiments show that the proposed system allow fast and easy whole body 3D imaging. It can be used by dermatological clinics to conduct skin screening, detect and track skin lesions over time, identify suspicious lesions, and document pigmented lesions. The system can potentially save clinicians time and effort significantly. The 3D imaging and analysis has the potential to change the paradigm of whole body photography with many applications in skin diseases, including inflammatory and pigmentary disorders.
CVNov 1, 2017Code
Improving Object Localization with Fitness NMS and Bounded IoU LossLachlan Tychsen-Smith, Lars Petersson
We demonstrate that many detection methods are designed to identify only a sufficently accurate bounding box, rather than the best available one. To address this issue we propose a simple and fast modification to the existing methods called Fitness NMS. This method is tested with the DeNet model and obtains a significantly improved MAP at greater localization accuracies without a loss in evaluation rate, and can be used in conjunction with Soft NMS for additional improvements. Next we derive a novel bounding box regression loss based on a set of IoU upper bounds that better matches the goal of IoU maximization while still providing good convergence properties. Following these novelties we investigate RoI clustering schemes for improving evaluation rates for the DeNet wide model variants and provide an analysis of localization performance at various input image dimensions. We obtain a MAP of 33.6%@79Hz and 41.8%@5Hz for MSCOCO and a Titan X (Maxwell). Source code available from: https://github.com/lachlants/denet
CVMar 30, 2017Code
DeNet: Scalable Real-time Object Detection with Directed Sparse SamplingLachlan Tychsen-Smith, Lars Petersson
We define the object detection from imagery problem as estimating a very large but extremely sparse bounding box dependent probability distribution. Subsequently we identify a sparse distribution estimation scheme, Directed Sparse Sampling, and employ it in a single end-to-end CNN based detection model. This methodology extends and formalizes previous state-of-the-art detection models with an additional emphasis on high evaluation rates and reduced manual engineering. We introduce two novelties, a corner based region-of-interest estimator and a deconvolution based CNN model. The resulting model is scene adaptive, does not require manually defined reference bounding boxes and produces highly competitive results on MSCOCO, Pascal VOC 2007 and Pascal VOC 2012 with real-time evaluation rates. Further analysis suggests our model performs particularly well when finegrained object localization is desirable. We argue that this advantage stems from the significantly larger set of available regions-of-interest relative to other methods. Source-code is available from: https://github.com/lachlants/denet
CVFeb 26, 2022
Continuous Human Action Recognition for Human-Machine Interaction: A ReviewHarshala Gammulle, David Ahmedt-Aristizabal, Simon Denman et al.
With advances in data-driven machine learning research, a wide variety of prediction models have been proposed to capture spatio-temporal features for the analysis of video streams. Recognising actions and detecting action transitions within an input video are challenging but necessary tasks for applications that require real-time human-machine interaction. By reviewing a large body of recent related work in the literature, we thoroughly analyse, explain and compare action segmentation methods and provide details on the feature extraction and learning strategies that are used on most state-of-the-art methods. We cover the impact of the performance of object detection and tracking techniques on human action segmentation methodologies. We investigate the application of such models to real-world scenarios and discuss several limitations and key research directions towards improving interpretability, generalisation, optimisation and deployment.
CVNov 29, 2021
The CSIRO Crown-of-Thorn Starfish Detection DatasetJiajun Liu, Brano Kusy, Ross Marchant et al.
Crown-of-Thorn Starfish (COTS) outbreaks are a major cause of coral loss on the Great Barrier Reef (GBR) and substantial surveillance and control programs are underway in an attempt to manage COTS populations to ecologically sustainable levels. We release a large-scale, annotated underwater image dataset from a COTS outbreak area on the GBR, to encourage research on Machine Learning and AI-driven technologies to improve the detection, monitoring, and management of COTS populations at reef scale. The dataset is released and hosted in a Kaggle competition that challenges the international Machine Learning community with the task of COTS detection from these underwater images.
ROApr 19, 2021
Heterogeneous Ground and Air Platforms, Homogeneous Sensing: Team CSIRO Data61's Approach to the DARPA Subterranean ChallengeNicolas Hudson, Fletcher Talbot, Mark Cox et al.
Heterogeneous teams of robots, leveraging a balance between autonomy and human interaction, bring powerful capabilities to the problem of exploring dangerous, unstructured subterranean environments. Here we describe the solution developed by Team CSIRO Data61, consisting of CSIRO, Emesent and Georgia Tech, during the DARPA Subterranean Challenge. These presented systems were fielded in the Tunnel Circuit in August 2019, the Urban Circuit in February 2020, and in our own Cave event, conducted in September 2020. A unique capability of the fielded team is the homogeneous sensing of the platforms utilised, which is leveraged to obtain a decentralised multi-agent SLAM solution on each platform (both ground agents and UAVs) using peer-to-peer communications. This enabled a shift in focus from constructing a pervasive communications network to relying on multi-agent autonomy, motivated by experiences in early circuit events. These experiences also showed the surprising capability of rugged tracked platforms for challenging terrain, which in turn led to the heterogeneous team structure based on a BIA5 OzBot Titan ground robot and an Emesent Hovermap UAV, supplemented by smaller tracked or legged ground robots. The ground agents use a common CatPack perception module, which allowed reuse of the perception and autonomy stack across all ground agents with minimal adaptation.