Toon Goedemé

CV
17papers
865citations
Novelty37%
AI Score37

17 Papers

CVDec 16, 2022
Person Detection Using an Ultra Low-resolution Thermal Imager on a Low-cost MCU

Maarten Vandersteegen, Wouter Reusen, Kristof Van Beeck et al.

Detecting persons in images or video with neural networks is a well-studied subject in literature. However, such works usually assume the availability of a camera of decent resolution and a high-performance processor or GPU to run the detection algorithm, which significantly increases the cost of a complete detection system. However, many applications require low-cost solutions, composed of cheap sensors and simple microcontrollers. In this paper, we demonstrate that even on such hardware we are not condemned to simple classic image processing techniques. We propose a novel ultra-lightweight CNN-based person detector that processes thermal video from a low-cost 32x24 pixel static imager. Trained and compressed on our own recorded dataset, our model achieves up to 91.62% accuracy (F1-score), has less than 10k parameters, and runs as fast as 87ms and 46ms on low-cost microcontrollers STM32F407 and STM32F746, respectively.

CVJul 9, 2020Code
Building Robust Industrial Applicable Object Detection Models Using Transfer Learning and Single Pass Deep Learning Architectures

Steven Puttemans, Timothy Callemein, Toon Goedemé

The uprising trend of deep learning in computer vision and artificial intelligence can simply not be ignored. On the most diverse tasks, from recognition and detection to segmentation, deep learning is able to obtain state-of-the-art results, reaching top notch performance. In this paper we explore how deep convolutional neural networks dedicated to the task of object detection can improve our industrial-oriented object detection pipelines, using state-of-the-art open source deep learning frameworks, like Darknet. By using a deep learning architecture that integrates region proposals, classification and probability estimation in a single run, we aim at obtaining real-time performance. We focus on reducing the needed amount of training data drastically by exploring transfer learning, while still maintaining a high average precision. Furthermore we apply these algorithms to two industrially relevant applications, one being the detection of promotion boards in eye tracking data and the other detecting and recognizing packages of warehouse products for augmented advertisements.

CVDec 17, 2025
Eyes on the Grass: Biodiversity-Increasing Robotic Mowing Using Deep Visual Embeddings

Lars Beckers, Arno Waes, Aaron Van Campenhout et al.

This paper presents a robotic mowing framework that actively enhances garden biodiversity through visual perception and adaptive decision-making. Unlike passive rewilding approaches, the proposed system uses deep feature-space analysis to identify and preserve visually diverse vegetation patches in camera images by selectively deactivating the mower blades. A ResNet50 network pretrained on PlantNet300K provides ecologically meaningful embeddings, from which a global deviation metric estimates biodiversity without species-level supervision. These estimates drive a selective mowing algorithm that dynamically alternates between mowing and conservation behavior. The system was implemented on a modified commercial robotic mower and validated both in a controlled mock-up lawn and on real garden datasets. Results demonstrate a strong correlation between embedding-space dispersion and expert biodiversity assessment, confirming the feasibility of deep visual diversity as a proxy for ecological richness and the effectiveness of the proposed mowing decision approach. Widespread adoption of such systems will turn ecologically worthless, monocultural lawns into vibrant, valuable biotopes that boost urban biodiversity.

CVOct 12, 2021
Weakly-Supervised Semantic Segmentation by Learning Label Uncertainty

Robby Neven, Davy Neven, Bert De Brabandere et al.

Since the rise of deep learning, many computer vision tasks have seen significant advancements. However, the downside of deep learning is that it is very data-hungry. Especially for segmentation problems, training a deep neural net requires dense supervision in the form of pixel-perfect image labels, which are very costly. In this paper, we present a new loss function to train a segmentation network with only a small subset of pixel-perfect labels, but take the advantage of weakly-annotated training samples in the form of cheap bounding-box labels. Unlike recent works which make use of box-to-mask proposal generators, our loss trains the network to learn a label uncertainty within the bounding-box, which can be leveraged to perform online bootstrapping (i.e. transforming the boxes to segmentation masks), while training the network. We evaluated our method on binary segmentation tasks, as well as a multi-class segmentation task (CityScapes vehicles and persons). We trained each task on a dataset comprised of only 18% pixel-perfect and 82% bounding-box labels, and compared the results to a baseline model trained on a completely pixel-perfect dataset. For the binary segmentation tasks, our method achieves an IoU score which is ~98.33% as good as our baseline model, while for the multi-class task, our method is 97.12% as good as our baseline model (77.5 vs. 79.8 mIoU).

CVSep 21, 2020
Feed-Forward On-Edge Fine-tuning Using Static Synthetic Gradient Modules

Robby Neven, Marian Verhelst, Tinne Tuytelaars et al.

Training deep learning models on embedded devices is typically avoided since this requires more memory, computation and power over inference. In this work, we focus on lowering the amount of memory needed for storing all activations, which are required during the backward pass to compute the gradients. Instead, during the forward pass, static Synthetic Gradient Modules (SGMs) predict gradients for each layer. This allows training the model in a feed-forward manner without having to store all activations. We tested our method on a robot grasping scenario where a robot needs to learn to grasp new objects given only a single demonstration. By first training the SGMs in a meta-learning manner on a set of common objects, during fine-tuning, the SGMs provided the model with accurate gradients to successfully learn to grasp new objects. We have shown that our method has comparable results to using standard backpropagation.

CVJul 9, 2020
Real-time Embedded Person Detection and Tracking for Shopping Behaviour Analysis

Robin Schrijvers, Steven Puttemans, Timothy Callemein et al.

Shopping behaviour analysis through counting and tracking of people in shop-like environments offers valuable information for store operators and provides key insights in the stores layout (e.g. frequently visited spots). Instead of using extra staff for this, automated on-premise solutions are preferred. These automated systems should be cost-effective, preferably on lightweight embedded hardware, work in very challenging situations (e.g. handling occlusions) and preferably work real-time. We solve this challenge by implementing a real-time TensorRT optimized YOLOv3-based pedestrian detector, on a Jetson TX2 hardware platform. By combining the detector with a sparse optical flow tracker we assign a unique ID to each customer and tackle the problem of loosing partially occluded customers. Our detector-tracker based solution achieves an average precision of 81.59% at a processing speed of 10 FPS. Besides valuable statistics, heat maps of frequently visited spots are extracted and used as an overlay on the video stream.

CVJul 9, 2020
Anyone here? Smart embedded low-resolution omnidirectional video sensor to measure room occupancy

Timothy Callemein, Kristof Van Beeck, Toon Goedemé

In this paper, we present a room occupancy sensing solution with unique properties: (i) It is based on an omnidirectional vision camera, capturing rich scene info over a wide angle, enabling to count the number of people in a room and even their position. (ii) Although it uses a camera-input, no privacy issues arise because its extremely low image resolution, rendering people unrecognisable. (iii) The neural network inference is running entirely on a low-cost processing platform embedded in the sensor, reducing the privacy risk even further. (iv) Limited manual data annotation is needed, because of the self-training scheme we propose. Such a smart room occupancy rate sensor can be used in e.g. meeting rooms and flex-desks. Indeed, by encouraging flex-desking, the required office space can be reduced significantly. In some cases, however, a flex-desk that has been reserved remains unoccupied without an update in the reservation system. A similar problem occurs with meeting rooms, which are often under-occupied. By optimising the occupancy rate a huge reduction in costs can be achieved. Therefore, in this paper, we develop such system which determines the number of people present in office flex-desks and meeting rooms. Using an omnidirectional camera mounted in the ceiling, combined with a person detector, the company can intelligently update the reservation system based on the measured occupancy. Next to the optimisation and embedded implementation of such a self-training omnidirectional people detection algorithm, in this work we propose a novel approach that combines spatial and temporal image data, improving performance of our system on extreme low-resolution images.

CVJul 9, 2020
How low can you go? Privacy-preserving people detection with an omni-directional camera

Timothy Callemein, Kristof Van Beeck, Toon Goedemé

In this work, we use a ceiling-mounted omni-directional camera to detect people in a room. This can be used as a sensor to measure the occupancy of meeting rooms and count the amount of flex-desk working spaces available. If these devices can be integrated in an embedded low-power sensor, it would form an ideal extension of automated room reservation systems in office environments. The main challenge we target here is ensuring the privacy of the people filmed. The approach we propose is going to extremely low image resolutions, such that it is impossible to recognise people or read potentially confidential documents. Therefore, we retrained a single-shot low-resolution person detection network with automatically generated ground truth. In this paper, we prove the functionality of this approach and explore how low we can go in resolution, to determine the optimal trade-off between recognition accuracy and privacy preservation. Because of the low resolution, the result is a lightweight network that can potentially be deployed on embedded hardware. Such embedded implementation enables the development of a decentralised smart camera which only outputs the required meta-data (i.e. the number of persons in the meeting room).

CVJul 9, 2020
Automated analysis of eye-tracker-based human-human interaction studies

Timothy Callemein, Kristof Van Beeck, Geert Brône et al.

Mobile eye-tracking systems have been available for about a decade now and are becoming increasingly popular in different fields of application, including marketing, sociology, usability studies and linguistics. While the user-friendliness and ergonomics of the hardware are developing at a rapid pace, the software for the analysis of mobile eye-tracking data in some points still lacks robustness and functionality. With this paper, we investigate which state-of-the-art computer vision algorithms may be used to automate the post-analysis of mobile eye-tracking data. For the case study in this paper, we focus on mobile eye-tracker recordings made during human-human face-to-face interactions. We compared two recent publicly available frameworks (YOLOv2 and OpenPose) to relate the gaze location generated by the eye-tracker to the head and hands visible in the scene camera data. In this paper we will show that the use of this single-pipeline framework provides robust results, which are both more accurate and faster than previous work in the field. Moreover, our approach does not rely on manual interventions during this process.

CVJul 9, 2020
The autonomous hidden camera crew

Timothy Callemein, Wiebe Van Ranst, Toon Goedemé

Reality TV shows that follow people in their day-to-day lives are not a new concept. However, the traditional methods used in the industry require a lot of manual labour and need the presence of at least one physical camera man. Because of this, the subjects tend to behave differently when they are aware of being recorded. This paper will present an approach to follow people in their day-to-day lives, for long periods of time (months to years), while being as unobtrusive as possible. To do this, we use unmanned cinematographically-aware cameras hidden in people's houses. Our contribution in this paper is twofold: First, we create a system to limit the amount of recorded data by intelligently controlling a video switch matrix, in combination with a multi-channel recorder. Second, we create a virtual camera man by controlling a PTZ camera to automatically make cinematographically pleasing shots. Throughout this paper, we worked closely with a real camera crew. This enabled us to compare the results of our system to the work of trained professionals.

CVApr 18, 2019
Fooling automated surveillance cameras: adversarial patches to attack person detection

Simen Thys, Wiebe Van Ranst, Toon Goedemé

Adversarial attacks on machine learning models have seen increasing interest in the past years. By making only subtle changes to the input of a convolutional neural network, the output of the network can be swayed to output a completely different result. The first attacks did this by changing pixel values of an input image slightly to fool a classifier to output the wrong class. Other approaches have tried to learn "patches" that can be applied to an object to fool detectors and classifiers. Some of these approaches have also shown that these attacks are feasible in the real-world, i.e. by modifying an object and filming it with a video camera. However, all of these approaches target classes that contain almost no intra-class variety (e.g. stop signs). The known structure of the object is then used to generate an adversarial patch on top of it. In this paper, we present an approach to generate adversarial patches to targets with lots of intra-class variety, namely persons. The goal is to generate a patch that is able successfully hide a person from a person detector. An attack that could for instance be used maliciously to circumvent surveillance systems, intruders can sneak around undetected by holding a small cardboard plate in front of their body aimed towards the surveillance camera. From our results we can see that our system is able significantly lower the accuracy of a person detector. Our approach also functions well in real-life scenarios where the patch is filmed by a camera. To the best of our knowledge we are the first to attempt this kind of attack on targets with a high level of intra-class variety like persons.

CVJul 23, 2017
Team Applied Robotics: A closer look at our robotic picking system

Wim Abbeloos, Fabian Gouwens, Simon Jansen et al.

This paper describes the vision based robotic picking system that was developed by our team, Team Applied Robotics, for the Amazon Picking Challenge 2016. This competition challenged teams to develop a robotic system that is able to pick a large variety of products from a shelve or a tote. We discuss the design considerations and our strategy, the high resolution 3D vision system, the use of a combination of texture and shape-based object detection algorithms, the robot path planning and object manipulators that were developed.

CVDec 7, 2016
Exploring the potential of combining time of flight and thermal infrared cameras for person detection

Wim Abbeloos, Toon Goedemé

Combining new, low-cost thermal infrared and time-of-flight range sensors provides new opportunities. In this position paper we explore the possibilities of combining these sensors and using their fused data for person detection. The proposed calibration approach for this sensor combination differs from the traditional stereo camera calibration in two fundamental ways. A first distinction is that the spectral sensitivity of the two sensors differs significantly. In fact, there is no sensitivity range overlap at all. A second distinction is that their resolution is typically very low, which requires special attention. We assume a situation in which the sensors' relative position is known, but their orientation is unknown. In addition, some of the typical measurement errors are discussed, and methods to compensate for them are proposed. We discuss how the fused data could allow increased accuracy and robustness without the need for complex algorithms requiring large amounts of computational power and training data.

CVDec 7, 2016
Process Monitoring of Extrusion Based 3D Printing via Laser Scanning

Matthias Faes, Wim Abbeloos, Frederik Vogeler et al.

Extrusion based 3D Printing (E3DP) is an Additive Manufacturing (AM) technique that extrudes thermoplastic polymer in order to build up components using a layerwise approach. Hereby, AM typically requires long production times in comparison to mass production processes such as Injection Molding. Failures during the AM process are often only noticed after build completion and frequently lead to part rejection because of dimensional inaccuracy or lack of mechanical performance, resulting in an important loss of time and material. A solution to improve the accuracy and robustness of a manufacturing technology is the integration of sensors to monitor and control process state-variables online. In this way, errors can be rapidly detected and possibly compensated at an early stage. To achieve this, we integrated a modular 2D laser triangulation scanner into an E3DP machine and analyzed feedback signals. A 2D laser triangulation scanner was selected here owing to the very compact size, achievable accuracy and the possibility of capturing geometrical 3D data. Thus, our implemented system is able to provide both quantitative and qualitative information. Also, in this work, first steps towards the development of a quality control loop for E3DP processes are presented and opportunities are discussed.

CVDec 7, 2016
Embedded Line Scan Image Sensors: The Low Cost Alternative for High Speed Imaging

Stef Van Wolputte, Wim Abbeloos, Stijn Helsen et al.

In this paper we propose a low-cost high-speed imaging line scan system. We replace an expensive industrial line scan camera and illumination with a custom-built set-up of cheap off-the-shelf components, yielding a measurement system with comparative quality while costing about 20 times less. We use a low-cost linear (1D) image sensor, cheap optics including a LED-based or LASER-based lighting and an embedded platform to process the images. A step-by-step method to design such a custom high speed imaging system and select proper components is proposed. Simulations allowing to predict the final image quality to be obtained by the set-up has been developed. Finally, we applied our method in a lab, closely representing the real-life cases. Our results shows that our simulations are very accurate and that our low-cost line scan set-up acquired image quality compared to the high-end commercial vision system, for a fraction of the price.

CVDec 7, 2016
Fusion of Range and Thermal Images for Person Detection

Wim Abbeloos, Toon Goedemé

Detecting people in images is a challenging problem. Differences in pose, clothing and lighting, along with other factors, cause a lot of variation in their appearance. To overcome these issues, we propose a system based on fused range and thermal infrared images. These measurements show considerably less variation and provide more meaningful information. We provide a brief introduction to the sensor technology used and propose a calibration method. Several data fusion algorithms are compared and their performance is assessed on a simulated data set. The results of initial experiments on real data are analyzed and the measurement errors and the challenges they present are discussed. The resulting fused data are used to efficiently detect people in a fixed camera set-up. The system is extended to include person tracking.

CVDec 5, 2016
Point Pair Feature based Object Detection for Random Bin Picking

Wim Abbeloos, Toon Goedemé

Point pair features are a popular representation for free form 3D object detection and pose estimation. In this paper, their performance in an industrial random bin picking context is investigated. A new method to generate representative synthetic datasets is proposed. This allows to investigate the influence of a high degree of clutter and the presence of self similar features, which are typical to our application. We provide an overview of solutions proposed in literature and discuss their strengths and weaknesses. A simple heuristic method to drastically reduce the computational complexity is introduced, which results in improved robustness, speed and accuracy compared to the naive approach.