Isibor Kennedy Ihianle

h-index9

16papers

397citations

Novelty31%

AI Score43

Ranked #56,512 of 194,257 authors (top 29%)#19,566 in CV (top 33%)

16 Papers

10.7AIApr 20Code

Training and Agentic Inference Strategies for LLM-based Manim Animation Generation

Ravidu Suien Rammuni Silva, Ahmad Lotfi, Isibor Kennedy Ihianle et al.

Generating programmatic animation using libraries such as Manim presents unique challenges for Large Language Models (LLMs), requiring spatial reasoning, temporal sequencing, and familiarity with domain-specific APIs that are underrepresented in general pre-training data. A systematic study of how training and inference strategies interact in this setting is lacking in current research. This study introduces ManimTrainer, a training pipeline that combines Supervised Fine-tuning (SFT) with Reinforcement Learning (RL) based Group Relative Policy Optimisation (GRPO) using a unified reward signal that fuses code and visual assessment signals, and ManimAgent, an inference pipeline featuring Renderer-in-the-loop (RITL) and API documentation-augmented RITL (RITL-DOC) strategies. Using these techniques, this study presents the first unified training and inference study for text-to-code-to-video transformation with Manim. It evaluates 17 open-source sub-30B LLMs across nine combinations of training and inference strategies using ManimBench. Results show that SFT generally improves code quality, while GRPO enhances visual outputs and increases the models' responsiveness to extrinsic signals during self-correction at inference time. The Qwen 3 Coder 30B model with GRPO and RITL-DOC achieved the highest overall performance, with a 94% Render Success Rate (RSR) and 85.7% Visual Similarity (VS) to reference videos, surpassing the baseline GPT-4.1 model by +3 percentage points in VS. Additionally, the analysis shows that the correlation between code and visual metrics strengthens with SFT and GRPO but weakens with inference-time enhancements, highlighting the complementary roles of training and agentic inference strategies in Manim animation generation.

CYJun 12Code

LessonBench-V1: A Benchmark Dataset for Evaluating AI Lesson Generation Agents

Ravidu Suien Rammuni Silva, Ahmad Lotfi, Isibor Kennedy Ihianle et al.

Large Language Model (LLM) based AI educational content generation systems are increasingly being developed, yet no standardised benchmark exists to systematically evaluate them. This study introduces LessonBench-V1, a benchmark dataset comprising 647 human-written lessons paired with LLM-based reverse-engineered lesson plans across 240 STEM topics spanning mathematics, physics, chemistry, and computer science. The lessons are drawn from 97 trusted open sources, including LibreTexts, Brilliant.org and GeeksForGeeks. Each lesson plan is human-reviewed and produced through a pedagogically grounded methodology that synthesises Bloom's Taxonomy, Gagné's Events, Merrill's First Principles, and the 5E Instructional Model. The lesson plans capture 3,620 learning objectives with pedagogical metadata, enabling systematic, reproducible evaluation of lesson-generation AI agents and supporting further research. The study further proposes a three-dimensional evaluation pipeline for use with the dataset.

1.4CVJul 22, 2022

Taguchi based Design of Sequential Convolution Neural Network for Classification of Defective Fasteners

Manjeet Kaur, Krishan Kumar Chauhan, Tanya Aggarwal et al.

Fasteners play a critical role in securing various parts of machinery. Deformations such as dents, cracks, and scratches on the surface of fasteners are caused by material properties and incorrect handling of equipment during production processes. As a result, quality control is required to ensure safe and reliable operations. The existing defect inspection method relies on manual examination, which consumes a significant amount of time, money, and other resources; also, accuracy cannot be guaranteed due to human error. Automatic defect detection systems have proven impactful over the manual inspection technique for defect analysis. However, computational techniques such as convolutional neural networks (CNN) and deep learning-based approaches are evolutionary methods. By carefully selecting the design parameter values, the full potential of CNN can be realised. Using Taguchi-based design of experiments and analysis, an attempt has been made to develop a robust automatic system in this study. The dataset used to train the system has been created manually for M14 size nuts having two labeled classes: Defective and Non-defective. There are a total of 264 images in the dataset. The proposed sequential CNN comes up with a 96.3% validation accuracy, 0.277 validation loss at 0.001 learning rate.

5.3LGFeb 22, 2023

Mitigating Adversarial Attacks in Deepfake Detection: An Exploration of Perturbation and AI Techniques

Saminder Dhesi, Laura Fontes, Pedro Machado et al.

Deep learning constitutes a pivotal component within the realm of machine learning, offering remarkable capabilities in tasks ranging from image recognition to natural language processing. However, this very strength also renders deep learning models susceptible to adversarial examples, a phenomenon pervasive across a diverse array of applications. These adversarial examples are characterized by subtle perturbations artfully injected into clean images or videos, thereby causing deep learning algorithms to misclassify or produce erroneous outputs. This susceptibility extends beyond the confines of digital domains, as adversarial examples can also be strategically designed to target human cognition, leading to the creation of deceptive media, such as deepfakes. Deepfakes, in particular, have emerged as a potent tool to manipulate public opinion and tarnish the reputations of public figures, underscoring the urgent need to address the security and ethical implications associated with adversarial examples. This article delves into the multifaceted world of adversarial examples, elucidating the underlying principles behind their capacity to deceive deep learning algorithms. We explore the various manifestations of this phenomenon, from their insidious role in compromising model reliability to their impact in shaping the contemporary landscape of disinformation and misinformation. To illustrate progress in combating adversarial examples, we showcase the development of a tailored Convolutional Neural Network (CNN) designed explicitly to detect deepfakes, a pivotal step towards enhancing model robustness in the face of adversarial threats. Impressively, this custom CNN has achieved a precision rate of 76.2% on the DFDC dataset.

2.3ARJul 13, 2022

Estimating the Power Consumption of Heterogeneous Devices when performing AI Inference

Pedro Machado, Ivica Matic, Francisco de Lemos et al.

Modern-day life is driven by electronic devices connected to the internet. The emerging research field of the Internet-of-Things (IoT) has become popular, just as there has been a steady increase in the number of connected devices. Since many of these devices are utilised to perform CV tasks, it is essential to understand their power consumption against performance. We report the power consumption profile and analysis of the NVIDIA Jetson Nano board while performing object classification. The authors present an extensive analysis regarding power consumption per frame and the output in frames per second using YOLOv5 models. The results show that the YOLOv5n outperforms other YOLOV5 variants in terms of throughput (i.e. 12.34 fps) and low power consumption (i.e. 0.154 mWh/frame).

2.6CVJul 6, 2022

Deep Learning approach for Classifying Trusses and Runners of Strawberries

Jakub Pomykala, Francisco de Lemos, Isibor Kennedy Ihianle et al.

The use of artificial intelligence in the agricultural sector has been growing at a rapid rate to automate farming activities. Emergent farming technologies focus on mapping and classification of plants, fruits, diseases, and soil types. Although, assisted harvesting and pruning applications using deep learning algorithms are in the early development stages, there is a demand for solutions to automate such processes. This paper proposes the use of Deep Learning for the classification of trusses and runners of strawberry plants using semantic segmentation and dataset augmentation. The proposed approach is based on the use of noises (i.e. Gaussian, Speckle, Poisson and Salt-and-Pepper) to artificially augment the dataset and compensate the low number of data samples and increase the overall classification performance. The results are evaluated using mean average of precision, recall and F1 score. The proposed approach achieved 91%, 95% and 92% on precision, recall and F1 score, respectively, for truss detection using the ResNet101 with dataset augmentation utilising Salt-and-Pepper noise; and 83%, 53% and 65% on precision, recall and F1 score, respectively, for truss detection using the ResNet50 with dataset augmentation utilising Poisson noise.

1.4CVJul 6, 2022

Real-Time Gesture Recognition with Virtual Glove Markers

Finlay McKinnon, David Ada Adama, Pedro Machado et al.

Due to the universal non-verbal natural communication approach that allows for effective communication between humans, gesture recognition technology has been steadily developing over the previous few decades. Many different strategies have been presented in research articles based on gesture recognition to try to create an effective system to send non-verbal natural communication information to computers, using both physical sensors and computer vision. Hyper accurate real-time systems, on the other hand, have only recently began to occupy the study field, with each adopting a range of methodologies due to past limits such as usability, cost, speed, and accuracy. A real-time computer vision-based human-computer interaction tool for gesture recognition applications that acts as a natural user interface is proposed. Virtual glove markers on users hands will be created and used as input to a deep learning model for the real-time recognition of gestures. The results obtained show that the proposed system would be effective in real-time applications including social interaction through telepresence and rehabilitation.

2.6LGSep 12, 2024

Enhanced Online Grooming Detection Employing Context Determination and Message-Level Analysis

Jake Street, Isibor Ihianle, Funminiyi Olajide et al.

Online Grooming (OG) is a prevalent threat facing predominately children online, with groomers using deceptive methods to prey on the vulnerability of children on social media/messaging platforms. These attacks can have severe psychological and physical impacts, including a tendency towards revictimization. Current technical measures are inadequate, especially with the advent of end-to-end encryption which hampers message monitoring. Existing solutions focus on the signature analysis of child abuse media, which does not effectively address real-time OG detection. This paper proposes that OG attacks are complex, requiring the identification of specific communication patterns between adults and children. It introduces a novel approach leveraging advanced models such as BERT and RoBERTa for Message-Level Analysis and a Context Determination approach for classifying actor interactions, including the introduction of Actor Significance Thresholds and Message Significance Thresholds. The proposed method aims to enhance accuracy and robustness in detecting OG by considering the dynamic and multi-faceted nature of these attacks. Cross-dataset experiments evaluate the robustness and versatility of our approach. This paper's contributions include improved detection methodologies and the potential for application in various scenarios, addressing gaps in current literature and practices.

2.3CRJan 15, 2023

Secure Video Streaming Using Dedicated Hardware

Nicholas Murray-Hill, Laura Fontes, Pedro Machado et al.

Purpose: The purpose of this article is to present a system that enhances the security, efficiency, and reconfigurability of an Internet-of-Things (IoT) system used for surveillance and monitoring. Methods: A Multi-Processor System-On-Chip (MPSoC) composed of Central Processor Unit (CPU) and Field-Programmable Gate Array (FPGA) is proposed for increasing the security and the frame rate of a smart IoT edge device. The private encryption key is safely embedded in the FPGA unit to avoid being exposed in the Random Access Memory (RAM). This allows the edge device to securely store and authenticate the key, protecting the data transmitted from the same Integrated Circuit (IC). Additionally, the edge device can simultaneously publish and route a camera stream using a lightweight communication protocol, achieving a frame rate of 14 frames per Second (fps). The performance of the MPSoC is compared to a NVIDIA Jetson Nano (NJN) and a Raspberry Pi 4 (RPI4) and it is found that the RPI4 is the most cost-effective solution but with lower frame rate, the NJN is the fastest because it can achieve higher frame-rate but it is not secure, and the MPSoC is the optimal solution because it offers a balanced frame rate and it is secure because it never exposes the secure key into the memory. Results: The proposed system successfully addresses the challenges of security, scalability, and efficiency in an IoT system used for surveillance and monitoring. The encryption key is securely stored and authenticated, and the edge device is able to simultaneously publish and route a camera stream feed high-definition images at 14 fps.

3.6CRNov 9, 2025

SteganoSNN: SNN-Based Audio-in-Image Steganography with Encryption

Biswajit Kumar Sahoo, Pedro Machado, Isibor Kennedy Ihianle et al.

Secure data hiding remains a fundamental challenge in digital communication, requiring a careful balance between computational efficiency and perceptual transparency. The balance between security and performance is increasingly fragile with the emergence of generative AI systems capable of autonomously generating and optimising sophisticated cryptanalysis and steganalysis algorithms, thereby accelerating the exposure of vulnerabilities in conventional data-hiding schemes. This work introduces SteganoSNN, a neuromorphic steganographic framework that exploits spiking neural networks (SNNs) to achieve secure, low-power, and high-capacity multimedia data hiding. Digitised audio samples are converted into spike trains using leaky integrate-and-fire (LIF) neurons, encrypted via a modulo-based mapping scheme, and embedded into the least significant bits of RGBA image channels using a dithering mechanism to minimise perceptual distortion. Implemented in Python using NEST and realised on a PYNQ-Z2 FPGA, SteganoSNN attains real-time operation with an embedding capacity of 8 bits per pixel. Experimental evaluations on the DIV2K 2017 dataset demonstrate image fidelity between 40.4 dB and 41.35 dB in PSNR and SSIM values consistently above 0.97, surpassing SteganoGAN in computational efficiency and robustness. SteganoSNN establishes a foundation for neuromorphic steganography, enabling secure, energy-efficient communication for Edge-AI, IoT, and biomedical applications.

5.4SDJun 22

EmotionAI: A Privacy-Preserving Computational Intelligence Pipeline for Speech-Emotion-Grounded Conversational Analysis

Wai Laam Mak, Isibor Kennedy Ihianle, Pedro Machado

Reviewing recorded interviews for affective cues such as composure, hesitation and agitation is slow and subjective, and cloud services that could automate it require sensitive audio to leave the device. EmotionAI is a fully local Computational Intelligence (CI) pipeline that couples Speech Emotion Recognition (SER) with generative reasoning. Speaker diarisation, Whisper Automatic Speech Recognition (ASR) and a wav2vec2 emotion classifier produce per-segment affective evidence, which is then passed to an adversarial three-model local Large Language Model (LLM) panel for timestamp-grounded and citation-constrained question answering. Zero-shot evaluation on the RAVDESS four-class English subset (n = 672) exposes cross-corpus fragility rather than classifier superiority: the deployed classifier scores 48.8% accuracy, above random (24.9%) and majority (28.6%) baselines but below an in-domain MFCC + logistic-regression comparator (71.0%). The complete pipeline runs in a mean 157 s on CPU (real-time factor approximately 1.33) with zero external calls. The contribution is not state-of-the-art SER but an auditable, privacy-preserving integration of imperfect affective evidence into grounded conversational analysis, together with an honest empirical account of where cross-corpus transfer and human-centred validation still fall short.

5.8AIDec 2, 2024Code

ArtBrain: An Explainable end-to-end Toolkit for Classification and Attribution of AI-Generated Art and Style

Ravidu Suien Rammuni Silva, Ahmad Lotfi, Isibor Kennedy Ihianle et al.

Recently, the quality of artworks generated using Artificial Intelligence (AI) has increased significantly, resulting in growing difficulties in detecting synthetic artworks. However, limited studies have been conducted on identifying the authenticity of synthetic artworks and their source. This paper introduces AI-ArtBench, a dataset featuring 185,015 artistic images across 10 art styles. It includes 125,015 AI-generated images and 60,000 pieces of human-created artwork. This paper also outlines a method to accurately detect AI-generated images and trace them to their source model. This work proposes a novel Convolutional Neural Network model based on the ConvNeXt model called AttentionConvNeXt. AttentionConvNeXt was implemented and trained to differentiate between the source of the artwork and its style with an F1-Score of 0.869. The accuracy of attribution to the generative model reaches 0.999. To combine the scientific contributions arising from this study, a web-based application named ArtBrain was developed to enable both technical and non-technical users to interact with the model. Finally, this study presents the results of an Artistic Turing Test conducted with 50 participants. The findings reveal that humans could identify AI-generated images with an accuracy of approximately 58%, while the model itself achieved a significantly higher accuracy of around 99%.

4.1ROMay 12, 2024

WeedScout: Real-Time Autonomous blackgrass Classification and Mapping using dedicated hardware

Matthew Gazzard, Helen Hicks, Isibor Kennedy Ihianle et al.

Blackgrass (Alopecurus myosuroides) is a competitive weed that has wide-ranging impacts on food security by reducing crop yields and increasing cultivation costs. In addition to the financial burden on agriculture, the application of herbicides as a preventive to blackgrass can negatively affect access to clean water and sanitation. The WeedScout project introduces a Real-Rime Autonomous Black-Grass Classification and Mapping (RT-ABGCM), a cutting-edge solution tailored for real-time detection of blackgrass, for precision weed management practices. Leveraging Artificial Intelligence (AI) algorithms, the system processes live image feeds, infers blackgrass density, and covers two stages of maturation. The research investigates the deployment of You Only Look Once (YOLO) models, specifically the streamlined YOLOv8 and YOLO-NAS, accelerated at the edge with the NVIDIA Jetson Nano (NJN). By optimising inference speed and model performance, the project advances the integration of AI into agricultural practices, offering potential solutions to challenges such as herbicide resistance and environmental impact. Additionally, two datasets and model weights are made available to the research community, facilitating further advancements in weed detection and precision farming technologies.

3.6CVJul 23, 2025

Bearded Dragon Activity Recognition Pipeline: An AI-Based Approach to Behavioural Monitoring

Arsen Yermukan, Pedro Machado, Feliciano Domingos et al.

Traditional monitoring of bearded dragon (Pogona Viticeps) behaviour is time-consuming and prone to errors. This project introduces an automated system for real-time video analysis, using You Only Look Once (YOLO) object detection models to identify two key behaviours: basking and hunting. We trained five YOLO variants (v5, v7, v8, v11, v12) on a custom, publicly available dataset of 1200 images, encompassing bearded dragons (600), heating lamps (500), and crickets (100). YOLOv8s was selected as the optimal model due to its superior balance of accuracy (mAP@0.5:0.95 = 0.855) and speed. The system processes video footage by extracting per-frame object coordinates, applying temporal interpolation for continuity, and using rule-based logic to classify specific behaviours. Basking detection proved reliable. However, hunting detection was less accurate, primarily due to weak cricket detection (mAP@0.5 = 0.392). Future improvements will focus on enhancing cricket detection through expanded datasets or specialised small-object detectors. This automated system offers a scalable solution for monitoring reptile behaviour in controlled environments, significantly improving research efficiency and data quality.

2.0CVMay 24, 2024Code

Enhancing Pollinator Conservation towards Agriculture 4.0: Monitoring of Bees through Object Recognition

Ajay John Alex, Chloe M. Barnes, Pedro Machado et al.

In an era of rapid climate change and its adverse effects on food production, technological intervention to monitor pollinator conservation is of paramount importance for environmental monitoring and conservation for global food security. The survival of the human species depends on the conservation of pollinators. This article explores the use of Computer Vision and Object Recognition to autonomously track and report bee behaviour from images. A novel dataset of 9664 images containing bees is extracted from video streams and annotated with bounding boxes. With training, validation and testing sets (6722, 1915, and 997 images, respectively), the results of the COCO-based YOLO model fine-tuning approaches show that YOLOv5m is the most effective approach in terms of recognition accuracy. However, YOLOv5s was shown to be the most optimal for real-time bee detection with an average processing and inference time of 5.1ms per video frame at the cost of slightly lower ability. The trained model is then packaged within an explainable AI interface, which converts detection events into timestamped reports and charts, with the aim of facilitating use by non-technical users such as expert stakeholders from the apiculture industry towards informing responsible consumption and production.

2.0CVDec 21, 2023

UDEEP: Edge-based Computer Vision for In-Situ Underwater Crayfish and Plastic Detection

Dennis Monari, Jack Larkin, Pedro Machado et al.

Invasive signal crayfish have a detrimental impact on ecosystems. They spread the fungal-type crayfish plague disease (Aphanomyces astaci) that is lethal to the native white clawed crayfish, the only native crayfish species in Britain. Invasive signal crayfish extensively burrow, causing habitat destruction, erosion of river banks and adverse changes in water quality, while also competing with native species for resources and leading to declines in native populations. Moreover, pollution exacerbates the vulnerability of White-clawed crayfish, with their populations declining by over 90% in certain English counties, making them highly susceptible to extinction. To safeguard aquatic ecosystems, it is imperative to address the challenges posed by invasive species and discarded plastics in the United Kingdom's river ecosystem's. The UDEEP platform can play a crucial role in environmental monitoring by performing on-the-fly classification of Signal crayfish and plastic debris while leveraging the efficacy of AI, IoT devices and the power of edge computing (i.e., NJN). By providing accurate data on the presence, spread and abundance of these species, the UDEEP platform can contribute to monitoring efforts and aid in mitigating the spread of invasive species.