Pedro Machado

CV
h-index 19
18 papers
89 citations
Novelty 29%
AI Score 34

18 Papers

CV Nov 21, 2022
Benchmarking Edge Computing Devices for Grape Bunches and Trunks Detection using Accelerated Object Detection Single Shot MultiBox Deep Learning Models

Sandro Costa Magalhães, Filipe Neves Santos, Pedro Machado et al.

Purpose: Visual perception enables robots to perceive the environment. Visual data is processed using computer vision algorithms that are usually time-expensive and require powerful devices to run in real time, which is unfeasible for open-field robots with limited energy. This work benchmarks the real-time object detection performance of heterogeneous platforms across three architectures: embedded GPU (Graphics Processing Unit, such as the NVIDIA Jetson Nano 2 GB and 4 GB, and the NVIDIA Jetson TX2), TPU (Tensor Processing Unit, such as the Coral Dev Board TPU), and DPU (Deep Learning Processor Unit, such as in the AMD-Xilinx ZCU104 Development Board and the AMD-Xilinx Kria KV260 Starter Kit). Method: The authors used RetinaNet ResNet-50 fine-tuned on the natural VineSet dataset. The trained model was then converted and compiled into target-specific hardware formats to improve execution efficiency. Conclusions and Results: The platforms were assessed in terms of the evaluation metrics and efficiency (inference time). GPUs were the slowest devices, running at 3 FPS to 5 FPS, and Field Programmable Gate Arrays (FPGAs) were the fastest, running at 14 FPS to 25 FPS. The TPU offered no meaningful efficiency advantage, performing similarly to the NVIDIA Jetson TX2. The TPU and GPU are the most power-efficient, consuming about 5 W. The differences in the evaluation metrics across devices are negligible, with an F1 of about 70% and a mean Average Precision (mAP) of about 60%.
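As a rough illustration of the two quantities reported above (frames per second and F1), the following Python sketch times an arbitrary inference callable and computes F1 from precision and recall. The names `measure_fps`, `f1_score`, and the `warmup` parameter are hypothetical; this is not the authors' benchmarking code, and the stand-in workload is not a real detector.

```python
import time


def measure_fps(infer, frames, warmup=2):
    """Return average frames per second of `infer` over `frames`.

    `infer` is any callable taking one frame (hypothetical stand-in for a
    compiled detector); the first `warmup` frames are run untimed.
    """
    for frame in frames[:warmup]:
        infer(frame)
    start = time.perf_counter()
    for frame in frames:
        infer(frame)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Dummy workload standing in for one inference call.
    fps = measure_fps(lambda f: sum(range(10_000)), list(range(50)))
    print(f"{fps:.1f} FPS")
```

Timing a whole batch of frames, rather than a single call, averages out per-call jitter, which matters on embedded devices where clock scaling can vary between runs.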

RO Jul 25, 2022
Exploiting High Quality Tactile Sensors for Simplified Grasping

Pedro Machado, T. M. McGinnity

Robots are expected to grasp a wide range of objects varying in shape, weight or material type. Providing robots with tactile capabilities similar to humans is thus essential for applications involving human-to-robot or robot-to-robot interactions, particularly in those situations where a robot is expected to grasp and manipulate complex objects not previously encountered. A critical aspect of successful object grasping and manipulation is the use of high-quality fingertips equipped with multiple high-performance sensors, distributed appropriately across a specific contact surface. In this paper, we present a detailed analysis of the use of two different types of commercially available robotic fingertips (BioTac and WTS-FT), each of which is equipped with multiple sensors distributed across the fingertip's contact surface. We further demonstrate that, due to the high performance of the fingertips, a complex adaptive grasping algorithm is not required for grasping everyday objects. We conclude that a simple algorithm based on a proportional controller will suffice for many grasping applications, provided the relevant fingertips exhibit high sensitivity. In a quantified assessment, we also demonstrate that, due in part to the sensor distribution, the BioTac-based fingertip performs better than the WTS-FT device in enabling lifting of loads up to 850 g, and that the simple proportional controller can adapt the grasp even when the object is exposed to significant external vibrational challenges.
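To make the idea of a proportional grasp controller concrete, here is a minimal Python sketch. It assumes a toy linear contact model in which force grows in proportion to finger closure; the gain `kp`, the `stiffness` value, and both function names are illustrative assumptions and are not taken from the paper.

```python
def proportional_grasp_step(measured_force, target_force, kp=0.1, max_step=1.0):
    """Return a finger-position increment proportional to the force error."""
    error = target_force - measured_force
    step = kp * error
    # Clamp the step so a large error cannot slam the finger shut.
    return max(-max_step, min(max_step, step))


def simulate_grasp(target_force=2.0, stiffness=4.0, steps=50):
    """Toy simulation: contact force = stiffness * finger closure (assumed)."""
    position = 0.0
    for _ in range(steps):
        force = stiffness * position  # hypothetical linear contact model
        position += proportional_grasp_step(force, target_force)
    return stiffness * position


if __name__ == "__main__":
    print(f"final contact force: {simulate_grasp():.4f}")
```

In this toy model the loop converges on the target force because each step removes a fixed fraction of the remaining error; a real fingertip would add sensor noise and nonlinear contact behaviour.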

LG Feb 22, 2023
Mitigating Adversarial Attacks in Deepfake Detection: An Exploration of Perturbation and AI Techniques

Saminder Dhesi, Laura Fontes, Pedro Machado et al.

Deep learning constitutes a pivotal component within the realm of machine learning, offering remarkable capabilities in tasks ranging from image recognition to natural language processing. However, this very strength also renders deep learning models susceptible to adversarial examples, a phenomenon pervasive across a diverse array of applications. These adversarial examples are characterized by subtle perturbations artfully injected into clean images or videos, thereby causing deep learning algorithms to misclassify or produce erroneous outputs. This susceptibility extends beyond the confines of digital domains, as adversarial examples can also be strategically designed to target human cognition, leading to the creation of deceptive media, such as deepfakes. Deepfakes, in particular, have emerged as a potent tool to manipulate public opinion and tarnish the reputations of public figures, underscoring the urgent need to address the security and ethical implications associated with adversarial examples. This article delves into the multifaceted world of adversarial examples, elucidating the underlying principles behind their capacity to deceive deep learning algorithms. We explore the various manifestations of this phenomenon, from their insidious role in compromising model reliability to their impact in shaping the contemporary landscape of disinformation and misinformation. To illustrate progress in combating adversarial examples, we showcase the development of a tailored Convolutional Neural Network (CNN) designed explicitly to detect deepfakes, a pivotal step towards enhancing model robustness in the face of adversarial threats. Impressively, this custom CNN has achieved a precision rate of 76.2% on the DFDC dataset.

AR Jul 13, 2022
Estimating the Power Consumption of Heterogeneous Devices when performing AI Inference

Pedro Machado, Ivica Matic, Francisco de Lemos et al.

Modern-day life is driven by electronic devices connected to the internet. The emerging research field of the Internet-of-Things (IoT) has become popular, just as there has been a steady increase in the number of connected devices. Since many of these devices are utilised to perform computer vision (CV) tasks, it is essential to understand their power consumption against performance. We report the power consumption profile and analysis of the NVIDIA Jetson Nano board while performing object classification. We present an extensive analysis of power consumption per frame and throughput in frames per second (fps) using YOLOv5 models. The results show that YOLOv5n outperforms the other YOLOv5 variants in terms of throughput (12.34 fps) and power consumption (0.154 mWh/frame).
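The relation between average power, throughput, and energy per frame can be sketched as follows. This assumes the reported figures are averages over a steady-state run, which the abstract does not state; the function names are hypothetical. Under that assumption, the two reported numbers would correspond to an average draw of roughly 6.8 W, a value derived here and not given in the abstract.

```python
def energy_per_frame_mwh(avg_power_w, fps):
    """Energy per frame in mWh, given average power (W) and throughput (fps)."""
    seconds_per_frame = 1.0 / fps
    energy_wh = avg_power_w * seconds_per_frame / 3600.0  # W*s -> Wh
    return energy_wh * 1000.0  # Wh -> mWh


def implied_power_w(energy_mwh_per_frame, fps):
    """Average power (W) implied by an energy-per-frame figure and fps."""
    return energy_mwh_per_frame / 1000.0 * 3600.0 * fps


if __name__ == "__main__":
    power = implied_power_w(0.154, 12.34)
    print(f"implied average power: {power:.2f} W")
```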

CV Jul 6, 2022
Deep Learning approach for Classifying Trusses and Runners of Strawberries

Jakub Pomykala, Francisco de Lemos, Isibor Kennedy Ihianle et al.

The use of artificial intelligence in the agricultural sector has been growing at a rapid rate to automate farming activities. Emergent farming technologies focus on mapping and classification of plants, fruits, diseases, and soil types. Although assisted harvesting and pruning applications using deep learning algorithms are in the early development stages, there is a demand for solutions to automate such processes. This paper proposes the use of Deep Learning for the classification of trusses and runners of strawberry plants using semantic segmentation and dataset augmentation. The proposed approach uses noise (i.e. Gaussian, Speckle, Poisson and Salt-and-Pepper) to artificially augment the dataset, compensating for the low number of data samples and increasing the overall classification performance. The results are evaluated using the mean average of precision, recall and F1 score. The proposed approach achieved 91%, 95% and 92% on precision, recall and F1 score, respectively, for truss detection using the ResNet101 with dataset augmentation utilising Salt-and-Pepper noise; and 83%, 53% and 65% on precision, recall and F1 score, respectively, for truss detection using the ResNet50 with dataset augmentation utilising Poisson noise.
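The four noise types named above can be illustrated with a short NumPy sketch. It assumes images are float arrays in [0, 1] with shape (H, W, 3); the noise strengths (`sigma`, `peak`, `amount`) and function names are illustrative choices, not the settings used in the paper.

```python
import numpy as np


def add_gaussian(img, rng, sigma=0.05):
    """Additive zero-mean Gaussian noise."""
    return np.clip(img + rng.normal(0.0, sigma, img.shape), 0.0, 1.0)


def add_speckle(img, rng, sigma=0.1):
    """Multiplicative (speckle) noise: img + img * n."""
    return np.clip(img + img * rng.normal(0.0, sigma, img.shape), 0.0, 1.0)


def add_poisson(img, rng, peak=255.0):
    """Poisson (shot) noise: sample counts with mean img * peak."""
    return np.clip(rng.poisson(img * peak) / peak, 0.0, 1.0)


def add_salt_and_pepper(img, rng, amount=0.02):
    """Set a random fraction of pixels to pure black or pure white."""
    out = img.copy()
    mask = rng.random(img.shape[:2])
    out[mask < amount / 2] = 0.0      # pepper
    out[mask > 1 - amount / 2] = 1.0  # salt
    return out


def augment(img, seed=0):
    """Return one noisy copy of `img` per noise type."""
    rng = np.random.default_rng(seed)
    return {
        "gaussian": add_gaussian(img, rng),
        "speckle": add_speckle(img, rng),
        "poisson": add_poisson(img, rng),
        "salt_and_pepper": add_salt_and_pepper(img, rng),
    }
```

For semantic segmentation, the label mask would be left unchanged for each noisy copy, since these transforms alter pixel intensities but not geometry.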

CV Jul 6, 2022
Real-Time Gesture Recognition with Virtual Glove Markers

Finlay McKinnon, David Ada Adama, Pedro Machado et al.

Because gestures are a universal, natural form of non-verbal communication that allows for effective communication between humans, gesture recognition technology has been steadily developing over the past few decades. Many different strategies have been presented in research articles on gesture recognition to create an effective system for sending non-verbal natural communication information to computers, using both physical sensors and computer vision. Hyper-accurate real-time systems, on the other hand, have only recently begun to occupy the study field, with each adopting a range of methodologies due to past limits such as usability, cost, speed, and accuracy. A real-time computer vision-based human-computer interaction tool for gesture recognition applications that acts as a natural user interface is proposed. Virtual glove markers on users' hands will be created and used as input to a deep learning model for the real-time recognition of gestures. The results obtained show that the proposed system would be effective in real-time applications including social interaction through telepresence and rehabilitation.

CR Jan 15, 2023
Secure Video Streaming Using Dedicated Hardware

Nicholas Murray-Hill, Laura Fontes, Pedro Machado et al.

Purpose: The purpose of this article is to present a system that enhances the security, efficiency, and reconfigurability of an Internet-of-Things (IoT) system used for surveillance and monitoring. Methods: A Multi-Processor System-On-Chip (MPSoC) composed of a Central Processing Unit (CPU) and a Field-Programmable Gate Array (FPGA) is proposed for increasing the security and the frame rate of a smart IoT edge device. The private encryption key is safely embedded in the FPGA unit to avoid being exposed in the Random Access Memory (RAM). This allows the edge device to securely store and authenticate the key, protecting the data transmitted from the same Integrated Circuit (IC). Additionally, the edge device can simultaneously publish and route a camera stream using a lightweight communication protocol, achieving a frame rate of 14 frames per second (fps). The performance of the MPSoC is compared to an NVIDIA Jetson Nano (NJN) and a Raspberry Pi 4 (RPI4). The RPI4 is the most cost-effective solution but has a lower frame rate; the NJN is the fastest, achieving a higher frame rate, but is not secure; and the MPSoC is the optimal solution because it offers a balanced frame rate and never exposes the secure key in memory. Results: The proposed system successfully addresses the challenges of security, scalability, and efficiency in an IoT system used for surveillance and monitoring. The encryption key is securely stored and authenticated, and the edge device is able to simultaneously publish and route a camera stream of high-definition images at 14 fps.

CR Nov 9, 2025
SteganoSNN: SNN-Based Audio-in-Image Steganography with Encryption

Biswajit Kumar Sahoo, Pedro Machado, Isibor Kennedy Ihianle et al.

Secure data hiding remains a fundamental challenge in digital communication, requiring a careful balance between computational efficiency and perceptual transparency. The balance between security and performance is increasingly fragile with the emergence of generative AI systems capable of autonomously generating and optimising sophisticated cryptanalysis and steganalysis algorithms, thereby accelerating the exposure of vulnerabilities in conventional data-hiding schemes. This work introduces SteganoSNN, a neuromorphic steganographic framework that exploits spiking neural networks (SNNs) to achieve secure, low-power, and high-capacity multimedia data hiding. Digitised audio samples are converted into spike trains using leaky integrate-and-fire (LIF) neurons, encrypted via a modulo-based mapping scheme, and embedded into the least significant bits of RGBA image channels using a dithering mechanism to minimise perceptual distortion. Implemented in Python using NEST and realised on a PYNQ-Z2 FPGA, SteganoSNN attains real-time operation with an embedding capacity of 8 bits per pixel. Experimental evaluations on the DIV2K 2017 dataset demonstrate image fidelity between 40.4 dB and 41.35 dB in PSNR and SSIM values consistently above 0.97, surpassing SteganoGAN in computational efficiency and robustness. SteganoSNN establishes a foundation for neuromorphic steganography, enabling secure, energy-efficient communication for Edge-AI, IoT, and biomedical applications.
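The least-significant-bit embedding step can be sketched in NumPy as below. This is only the plain LSB-replacement part: it omits the spike encoding, the modulo-based encryption, and the dithering described in the abstract. Using 2 bits in each of the 4 RGBA channels is an assumption chosen to match the reported capacity of 8 bits per pixel, and the function names are hypothetical.

```python
import numpy as np


def embed_lsb(cover_rgba, payload, bits_per_channel=2):
    """Hide `payload` (bytes) in the low bits of a uint8 (H, W, 4) image."""
    flat = cover_rgba.reshape(-1).copy()
    bits = np.unpackbits(np.frombuffer(payload, dtype=np.uint8))
    pad = (-len(bits)) % bits_per_channel
    bits = np.concatenate([bits, np.zeros(pad, dtype=np.uint8)])
    # Pack each group of bits (MSB first) into one small integer per channel.
    weights = 1 << np.arange(bits_per_channel - 1, -1, -1)
    values = (bits.reshape(-1, bits_per_channel) * weights).sum(axis=1)
    values = values.astype(np.uint8)
    if len(values) > len(flat):
        raise ValueError("payload too large for cover image")
    keep_mask = np.uint8(0xFF ^ ((1 << bits_per_channel) - 1))
    flat[: len(values)] = (flat[: len(values)] & keep_mask) | values
    return flat.reshape(cover_rgba.shape)


def extract_lsb(stego_rgba, n_bytes, bits_per_channel=2):
    """Recover `n_bytes` of payload written by `embed_lsb`."""
    flat = stego_rgba.reshape(-1)
    n_values = (n_bytes * 8 + bits_per_channel - 1) // bits_per_channel
    values = flat[:n_values] & np.uint8((1 << bits_per_channel) - 1)
    shifts = np.arange(bits_per_channel - 1, -1, -1)
    bits = ((values[:, None] >> shifts) & 1).astype(np.uint8).reshape(-1)
    return np.packbits(bits[: n_bytes * 8]).tobytes()
```

With 2 bits per channel, no channel value changes by more than 3 out of 255, which is why LSB schemes of this kind can keep perceptual distortion low.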

CV Aug 23, 2023
Computational models of object motion detectors accelerated using FPGA technology

Pedro Machado

This PhD research introduces three key contributions in the domain of object motion detection. (1) Multi-Hierarchical Spiking Neural Network (MHSNN): a specialized four-layer Spiking Neural Network (SNN) architecture inspired by vertebrate retinas. Trained on custom lab-generated images, it exhibited a 6.75% detection error for horizontal and vertical movements. While non-scalable, MHSNN laid the foundation for further advancements. (2) Hybrid Sensitive Motion Detector (HSMD): enhances Dynamic Background Subtraction (DBS) using a tailored three-layer SNN, stabilizing foreground data to improve object motion detection. Evaluated on standard datasets, HSMD outperformed OpenCV-based methods, excelling in four categories across eight metrics. It maintained real-time processing (13.82-13.92 fps) on a high-performance computer but showed room for hardware optimisation. (3) Neuromorphic Hybrid Sensitive Motion Detector (NeuroHSMD): building upon HSMD, this adaptation implemented the SNN component on dedicated hardware (FPGA). OpenCL simplified FPGA design and enabled portability. NeuroHSMD demonstrated an 82% speedup over HSMD, achieving 28.06-28.71 fps on the CDnet2012 and CDnet2014 datasets. These contributions collectively represent significant advancements in object motion detection, from a biologically inspired neural network design to an optimized hardware implementation that outperforms existing methods in accuracy and processing speed.

RO May 12, 2024
WeedScout: Real-Time Autonomous blackgrass Classification and Mapping using dedicated hardware

Matthew Gazzard, Helen Hicks, Isibor Kennedy Ihianle et al.

Blackgrass (Alopecurus myosuroides) is a competitive weed that has wide-ranging impacts on food security by reducing crop yields and increasing cultivation costs. In addition to the financial burden on agriculture, the application of herbicides as a preventive measure against blackgrass can negatively affect access to clean water and sanitation. The WeedScout project introduces Real-Time Autonomous Black-Grass Classification and Mapping (RT-ABGCM), a cutting-edge solution tailored for real-time detection of blackgrass for precision weed management practices. Leveraging Artificial Intelligence (AI) algorithms, the system processes live image feeds, infers blackgrass density, and covers two stages of maturation. The research investigates the deployment of You Only Look Once (YOLO) models, specifically the streamlined YOLOv8 and YOLO-NAS, accelerated at the edge with the NVIDIA Jetson Nano (NJN). By optimising inference speed and model performance, the project advances the integration of AI into agricultural practices, offering potential solutions to challenges such as herbicide resistance and environmental impact. Additionally, two datasets and model weights are made available to the research community, facilitating further advancements in weed detection and precision farming technologies.

CV Jul 23, 2025
Bearded Dragon Activity Recognition Pipeline: An AI-Based Approach to Behavioural Monitoring

Arsen Yermukan, Pedro Machado, Feliciano Domingos et al.

Traditional monitoring of bearded dragon (Pogona vitticeps) behaviour is time-consuming and prone to errors. This project introduces an automated system for real-time video analysis, using You Only Look Once (YOLO) object detection models to identify two key behaviours: basking and hunting. We trained five YOLO variants (v5, v7, v8, v11, v12) on a custom, publicly available dataset of 1200 images, encompassing bearded dragons (600), heating lamps (500), and crickets (100). YOLOv8s was selected as the optimal model due to its superior balance of accuracy (mAP@0.5:0.95 = 0.855) and speed. The system processes video footage by extracting per-frame object coordinates, applying temporal interpolation for continuity, and using rule-based logic to classify specific behaviours. Basking detection proved reliable. However, hunting detection was less accurate, primarily due to weak cricket detection (mAP@0.5 = 0.392). Future improvements will focus on enhancing cricket detection through expanded datasets or specialised small-object detectors. This automated system offers a scalable solution for monitoring reptile behaviour in controlled environments, significantly improving research efficiency and data quality.
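The interpolation and rule-based stages described above might look roughly like the following sketch. The abstract does not specify the interpolation scheme or the rules, so linear interpolation of box centres and a distance-to-lamp threshold are assumptions made here for illustration; the function names and `max_dist` are hypothetical.

```python
import numpy as np


def interpolate_centres(frames, centres, n_frames):
    """Fill gaps between detections by linear interpolation.

    frames  : increasing frame indices where the object was detected
    centres : (x, y) box centre for each of those frames
    """
    centres = np.asarray(centres, dtype=float)
    all_frames = np.arange(n_frames)
    x = np.interp(all_frames, frames, centres[:, 0])
    y = np.interp(all_frames, frames, centres[:, 1])
    return np.stack([x, y], axis=1)


def basking_flags(dragon_xy, lamp_xy, max_dist):
    """Per-frame flag: True when the dragon is within max_dist of the lamp."""
    offsets = dragon_xy - np.asarray(lamp_xy, dtype=float)
    return np.linalg.norm(offsets, axis=1) < max_dist


if __name__ == "__main__":
    track = interpolate_centres([0, 4], [(0, 0), (40, 0)], n_frames=5)
    print(basking_flags(track, lamp_xy=(0, 0), max_dist=25.0))
```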

CV May 24, 2024
Enhancing Pollinator Conservation towards Agriculture 4.0: Monitoring of Bees through Object Recognition

Ajay John Alex, Chloe M. Barnes, Pedro Machado et al.

In an era of rapid climate change and its adverse effects on food production, technological intervention to monitor pollinators is of paramount importance for environmental conservation and global food security. The survival of the human species depends on the conservation of pollinators. This article explores the use of Computer Vision and Object Recognition to autonomously track and report bee behaviour from images. A novel dataset of 9664 images containing bees is extracted from video streams and annotated with bounding boxes. With training, validation and testing sets (6722, 1915, and 997 images, respectively), the results of the COCO-based YOLO model fine-tuning approaches show that YOLOv5m is the most effective approach in terms of recognition accuracy. However, YOLOv5s was shown to be the best choice for real-time bee detection, with an average processing and inference time of 5.1 ms per video frame at the cost of slightly lower recognition ability. The trained model is then packaged within an explainable AI interface, which converts detection events into timestamped reports and charts, with the aim of facilitating use by non-technical users, such as expert stakeholders from the apiculture industry, towards informing responsible consumption and production.

CV Dec 21, 2023
UDEEP: Edge-based Computer Vision for In-Situ Underwater Crayfish and Plastic Detection

Dennis Monari, Jack Larkin, Pedro Machado et al.

Invasive signal crayfish have a detrimental impact on ecosystems. They spread the fungal-type crayfish plague disease (Aphanomyces astaci) that is lethal to the native white-clawed crayfish, the only native crayfish species in Britain. Invasive signal crayfish burrow extensively, causing habitat destruction, erosion of river banks and adverse changes in water quality, while also competing with native species for resources and leading to declines in native populations. Moreover, pollution exacerbates the vulnerability of white-clawed crayfish, with their populations declining by over 90% in certain English counties, making them highly susceptible to extinction. To safeguard aquatic ecosystems, it is imperative to address the challenges posed by invasive species and discarded plastics in the United Kingdom's river ecosystems. The UDEEP platform can play a crucial role in environmental monitoring by performing on-the-fly classification of signal crayfish and plastic debris while leveraging the efficacy of AI, IoT devices and the power of edge computing (i.e., NJN). By providing accurate data on the presence, spread and abundance of these species, the UDEEP platform can contribute to monitoring efforts and aid in mitigating the spread of invasive species.

NE Dec 12, 2021
NeuroHSMD: Neuromorphic Hybrid Spiking Motion Detector

Pedro Machado, Joao Filipe Ferreira, Andreas Oikonomou et al.

Vertebrate retinas are highly efficient at processing visual tasks such as detecting moving objects, which are trivial for them yet remain a complex challenge for modern computers. In vertebrates, the detection of object motion is performed by specialised retinal cells named Object Motion Sensitive Ganglion Cells (OMS-GC). OMS-GC process continuous visual signals and generate spike patterns that are post-processed by the Visual Cortex. Our previous Hybrid Sensitive Motion Detector (HSMD) algorithm was the first hybrid algorithm to enhance Background Subtraction (BS) algorithms with a customised 3-layer Spiking Neural Network (SNN) that generates OMS-GC spiking-like responses. In this work, we present a Neuromorphic Hybrid Sensitive Motion Detector (NeuroHSMD) algorithm that accelerates our HSMD algorithm using Field-Programmable Gate Arrays (FPGAs). The NeuroHSMD was compared against the HSMD algorithm, using the same 2012 Change Detection (CDnet2012) and 2014 Change Detection (CDnet2014) benchmark datasets. When tested against the CDnet2012 and CDnet2014 datasets, NeuroHSMD performs object motion detection at 720x480 at 28.06 Frames Per Second (fps) and 720x480 at 28.71 fps, respectively, with no degradation of quality. Moreover, the NeuroHSMD proposed in this paper was implemented entirely in Open Computing Language (OpenCL) and is therefore easily replicated on other devices such as Graphics Processing Units (GPUs) and clusters of Central Processing Units (CPUs).

RO Sep 9, 2021
Object recognition for robotics from tactile time series data utilising different neural network architectures

Wolfgang Bottcher, Pedro Machado, Nikesh Lama et al.

Robots need to exploit high-quality information on grasped objects to interact with the physical environment. Haptic data can therefore be used to supplement the visual modality. This paper investigates the use of Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) neural network architectures for object classification on spatio-temporal tactile grasping data. Furthermore, we compared these methods using data from two different fingertip sensors (namely the BioTac SP and WTS-FT) in the same physical setup, allowing for a realistic comparison across methods and sensors for the same tactile object classification dataset. Additionally, we propose a way to create more training examples from the recorded data. The results show that the proposed method improves the maximum accuracy from 82.4% (BioTac SP fingertips) and 90.7% (WTS-FT fingertips) with complete time-series data to about 94% for both sensor types.

CV Sep 9, 2021
HSMD: An object motion detection algorithm using a Hybrid Spiking Neural Network Architecture

Pedro Machado, Andreas Oikonomou, Joao Filipe Ferreira et al.

The detection of moving objects is a trivial task performed by vertebrate retinas, yet a complex computer vision task. Object-motion-sensitive ganglion cells (OMS-GC) are specialised cells in the retina that sense moving objects. OMS-GC take continuous signals as input and produce spike patterns as output, which are transmitted to the Visual Cortex via the optic nerve. The Hybrid Sensitive Motion Detector (HSMD) algorithm proposed in this work enhances the GSOC dynamic background subtraction (DBS) algorithm with a customised 3-layer spiking neural network (SNN) that outputs spiking responses akin to the OMS-GC. The algorithm was compared against existing background subtraction (BS) approaches available in the OpenCV library, specifically on the 2012 change detection (CDnet2012) and the 2014 change detection (CDnet2014) benchmark datasets. The results show that the HSMD ranked first overall among the competing approaches and performed better than all the other algorithms on four of the categories across all eight test metrics. Furthermore, the HSMD proposed in this paper is the first to use an SNN to enhance an existing state-of-the-art DBS (GSOC) algorithm, and the results demonstrate that the SNN provides near real-time performance in realistic applications.
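The idea of turning a continuous input signal into a spike pattern can be illustrated with a generic leaky integrate-and-fire (LIF) neuron. The abstract does not state which neuron model or parameters HSMD uses, so the LIF model, the Euler update, and every parameter value below are assumptions for illustration only.

```python
import numpy as np


def lif_spikes(current, dt=1.0, tau=10.0, v_rest=0.0, v_thresh=1.0, v_reset=0.0):
    """Simulate one LIF neuron with forward-Euler steps.

    current : 1-D sequence of input values, one per time step
    returns : boolean array, True where the neuron fired
    """
    v = v_rest
    spikes = np.zeros(len(current), dtype=bool)
    for t, i_t in enumerate(current):
        # Leak towards v_rest while integrating the input.
        v += dt / tau * (-(v - v_rest) + i_t)
        if v >= v_thresh:
            spikes[t] = True
            v = v_reset  # reset the membrane potential after a spike
    return spikes


if __name__ == "__main__":
    weak = lif_spikes(np.full(100, 0.5))    # stays below threshold
    strong = lif_spikes(np.full(100, 2.0))  # fires periodically
    print(weak.sum(), strong.sum())
```

A stronger input drives the membrane potential to threshold faster, so the firing rate encodes the input amplitude; inputs that settle below the threshold produce no spikes at all.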

RO Nov 7, 2020
Strawberry Detection Using a Heterogeneous Multi-Processor Platform

Samuel Brandenburg, Pedro Machado, Nikesh Lama et al.

Over the last few years, the number of precision farming projects has increased, specifically in harvesting robots, many of which have made continued progress from identifying crops to grasping the desired fruit or vegetable. One of the most common issues found in precision farming projects is that successful application is heavily dependent not just on identifying the fruit but also on ensuring that localisation allows for accurate navigation. These issues become significant factors when the robot is not operating in a prearranged environment, or when vegetation becomes too thick, covering the crop. Moreover, running a state-of-the-art deep learning algorithm on an embedded platform is also very challenging, most of the time resulting in low frame rates. This paper proposes using the You Only Look Once version 3 (YOLOv3) Convolutional Neural Network (CNN) in combination with image processing techniques for precision farming robots targeting strawberry detection, accelerated on a heterogeneous multiprocessor platform. The results show a fivefold acceleration when implemented on a Field-Programmable Gate Array (FPGA) compared with the same algorithm running on the processor side, with an accuracy of 78.3% over the test set comprised of 146 images.

NE Sep 18, 2019
NatCSNN: A Convolutional Spiking Neural Network for recognition of objects extracted from natural images

Pedro Machado, Georgina Cosma, T. M McGinnity

Biological image processing is performed by complex neural networks composed of thousands of neurons interconnected via thousands of synapses, some of which are excitatory and others inhibitory. Spiking neural models are distinguished from classical neurons by being biologically plausible and exhibiting the same dynamics as those observed in biological neurons. This paper proposes a Natural Convolutional Neural Network (NatCSNN), which is a 3-layer bio-inspired Convolutional Spiking Neural Network (CSNN), for classifying objects extracted from natural images. A two-stage training algorithm is proposed using unsupervised Spike Timing Dependent Plasticity (STDP) learning (phase 1) and ReSuMe supervised learning (phase 2). The NatCSNN was trained and tested on the CIFAR-10 dataset and achieved an average testing accuracy of 84.7%, which is an improvement over the 2-layer neural networks previously applied to this dataset.
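For readers unfamiliar with STDP, the sketch below shows a common pair-based form of the rule, in which a synapse is strengthened when the presynaptic spike precedes the postsynaptic spike and weakened otherwise. The abstract does not give the exact STDP variant or constants used in NatCSNN, so the exponential window and all parameter values here are assumptions.

```python
import math


def stdp_delta_w(dt_ms, a_plus=0.01, a_minus=0.012, tau_plus=20.0, tau_minus=20.0):
    """Weight change for one spike pair, with dt_ms = t_post - t_pre."""
    if dt_ms > 0:
        # Pre before post: potentiation, decaying with the time gap.
        return a_plus * math.exp(-dt_ms / tau_plus)
    if dt_ms < 0:
        # Post before pre: depression, decaying with the time gap.
        return -a_minus * math.exp(dt_ms / tau_minus)
    return 0.0


def update_weight(w, dt_ms, w_min=0.0, w_max=1.0):
    """Apply the STDP change and keep the weight within [w_min, w_max]."""
    return min(w_max, max(w_min, w + stdp_delta_w(dt_ms)))


if __name__ == "__main__":
    for dt in (-40.0, -5.0, 5.0, 40.0):
        print(dt, round(stdp_delta_w(dt), 5))
```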