Ravidu Suien Rammuni Silva

h-index: 3

Papers: 4

Citations: 19

Novelty: 52%

AI Score: 49

Ranked #24,867 of 194,257 authors (top 13%); #1,294 in AI (top 10%)

4 Papers

10.7 | AI | Apr 20 | Code

Training and Agentic Inference Strategies for LLM-based Manim Animation Generation

Ravidu Suien Rammuni Silva, Ahmad Lotfi, Isibor Kennedy Ihianle et al.

Generating programmatic animation using libraries such as Manim presents unique challenges for Large Language Models (LLMs), requiring spatial reasoning, temporal sequencing, and familiarity with domain-specific APIs that are underrepresented in general pre-training data. A systematic study of how training and inference strategies interact in this setting is lacking in current research. This study introduces ManimTrainer, a training pipeline that combines Supervised Fine-tuning (SFT) with Reinforcement Learning (RL) based Group Relative Policy Optimisation (GRPO) using a unified reward signal that fuses code and visual assessment signals, and ManimAgent, an inference pipeline featuring Renderer-in-the-loop (RITL) and API documentation-augmented RITL (RITL-DOC) strategies. Using these techniques, this study presents the first unified training and inference study for text-to-code-to-video transformation with Manim. It evaluates 17 open-source sub-30B LLMs across nine combinations of training and inference strategies using ManimBench. Results show that SFT generally improves code quality, while GRPO enhances visual outputs and increases the models' responsiveness to extrinsic signals during self-correction at inference time. The Qwen 3 Coder 30B model with GRPO and RITL-DOC achieved the highest overall performance, with a 94% Render Success Rate (RSR) and 85.7% Visual Similarity (VS) to reference videos, surpassing the baseline GPT-4.1 model by +3 percentage points in VS. Additionally, the analysis shows that the correlation between code and visual metrics strengthens with SFT and GRPO but weakens with inference-time enhancements, highlighting the complementary roles of training and agentic inference strategies in Manim animation generation.
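The abstract above describes GRPO training driven by a single reward that fuses code and visual signals. The sketch below illustrates only the group-relative step that gives GRPO its name: several completions are sampled for one prompt, and each reward is standardised against the group's mean and standard deviation. The function `fused_reward`, its equal weighting, and the sample scores are hypothetical placeholders; the abstract does not state how ManimTrainer combines its signals.

```python
from statistics import mean, pstdev


def fused_reward(code_score, visual_score, code_weight=0.5):
    """Hypothetical fusion: a weighted sum of two scores in [0, 1].

    The real reward used by ManimTrainer is not given in the abstract.
    """
    return code_weight * code_score + (1.0 - code_weight) * visual_score


def group_relative_advantages(rewards, eps=1e-8):
    """Standardise each reward against its own sampling group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation of the group
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled completions for one prompt: (code score, visual score).
# The third entry stands for a script that failed to render.
samples = [(0.9, 0.8), (0.6, 0.4), (0.0, 0.0), (0.9, 0.4)]
rewards = [fused_reward(c, v) for c, v in samples]
advantages = group_relative_advantages(rewards)

for reward, advantage in zip(rewards, advantages):
    print(f"reward={reward:.2f}  advantage={advantage:+.2f}")
```

Because advantages are computed within the group, a completion is rewarded for being better than its siblings rather than for clearing an absolute threshold, which removes the need for a separate value model.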

CY | Jun 12 | Code

LessonBench-V1: A Benchmark Dataset for Evaluating AI Lesson Generation Agents

Ravidu Suien Rammuni Silva, Ahmad Lotfi, Isibor Kennedy Ihianle et al.

Large Language Model (LLM) based AI educational content generation systems are increasingly being developed, yet no standardised benchmark exists to systematically evaluate them. This study introduces LessonBench-V1, a benchmark dataset comprising 647 human-written lessons paired with LLM-based reverse-engineered lesson plans across 240 STEM topics spanning mathematics, physics, chemistry, and computer science. The lessons are drawn from 97 trusted open sources, including LibreTexts, Brilliant.org and GeeksForGeeks. Each lesson plan is human-reviewed and produced through a pedagogically grounded methodology that synthesises Bloom's Taxonomy, Gagné's Events, Merrill's First Principles, and the 5E Instructional Model. The lesson plans capture 3,620 learning objectives with pedagogical metadata, enabling systematic, reproducible evaluation of lesson-generation AI agents and supporting further research. The study further proposes a three-dimensional evaluation pipeline for use with the dataset.

2.8 | CV | Dec 10, 2023 | Code

FM-G-CAM: A Holistic Approach for Explainable AI in Computer Vision

Ravidu Suien Rammuni Silva, Jordan J. Bird

Explainability is an aspect of modern AI that is vital for impact and usability in the real world. The main objective of this paper is to emphasise the need to understand the predictions of Computer Vision models, specifically Convolutional Neural Network (CNN) based models. Existing methods of explaining CNN predictions are mostly based on Gradient-weighted Class Activation Maps (Grad-CAM) and solely focus on a single target class. We show that from the point of the target class selection, we make an assumption on the prediction process, hence neglecting a large portion of the predictor CNN model's thinking process. In this paper, we present an exhaustive methodology called Fused Multi-class Gradient-weighted Class Activation Map (FM-G-CAM) that considers multiple top predicted classes, which provides a holistic explanation of the predictor CNN's thinking rationale. We also provide a detailed and comprehensive mathematical and algorithmic description of our method. Furthermore, along with a concise comparison of existing methods, we compare FM-G-CAM with Grad-CAM, highlighting its benefits through real-world practical use cases. Finally, we present an open-source Python library with FM-G-CAM implementation to conveniently generate saliency maps for CNN-based model predictions.
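To make the idea of fusing several class activation maps concrete, here is a minimal pure-Python sketch. It computes a standard Grad-CAM map per class (channel weights are the spatial mean of the gradients), then keeps at each pixel only the class with the strongest activation and rescales by the global peak. The per-pixel winner rule and the peak normalisation are assumptions made for illustration; the abstract does not specify the exact fusion or normalisation used by FM-G-CAM, and the function names here are hypothetical, not the API of the authors' library.

```python
def grad_cam_map(activations, gradients):
    """Grad-CAM for one class.

    activations, gradients: lists of K channels, each an H x W nested list.
    """
    height, width = len(activations[0]), len(activations[0][0])
    # Channel weight = spatial mean of that channel's gradient.
    weights = [sum(map(sum, g)) / (height * width) for g in gradients]
    # Weighted sum over channels, followed by ReLU.
    return [
        [max(0.0, sum(w * a[y][x] for w, a in zip(weights, activations)))
         for x in range(width)]
        for y in range(height)
    ]


def fuse_class_maps(class_maps):
    """Keep, per pixel, only the strongest class (assumed fusion rule)."""
    height, width = len(class_maps[0]), len(class_maps[0][0])
    fused = [[[0.0] * width for _ in range(height)] for _ in class_maps]
    peak = 0.0
    for y in range(height):
        for x in range(width):
            values = [m[y][x] for m in class_maps]
            winner = max(range(len(values)), key=values.__getitem__)
            fused[winner][y][x] = values[winner]
            peak = max(peak, values[winner])
    if peak > 0:
        # Rescale so the strongest pixel across all classes equals 1.
        fused = [[[v / peak for v in row] for row in m] for m in fused]
    return fused


# Toy input: two feature channels on a 2 x 2 grid, two candidate classes.
activations = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 2.0], [2.0, 0.0]]]
grads_class_a = [[[1.0, 1.0], [1.0, 1.0]], [[0.0, 0.0], [0.0, 0.0]]]
grads_class_b = [[[0.0, 0.0], [0.0, 0.0]], [[1.0, 1.0], [1.0, 1.0]]]

maps = [grad_cam_map(activations, g) for g in (grads_class_a, grads_class_b)]
fused = fuse_class_maps(maps)
```

In this toy case class A owns the diagonal pixels and class B the off-diagonal ones, so each class map can be drawn in its own colour on a single saliency image.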

5.8 | AI | Dec 2, 2024 | Code

ArtBrain: An Explainable end-to-end Toolkit for Classification and Attribution of AI-Generated Art and Style

Ravidu Suien Rammuni Silva, Ahmad Lotfi, Isibor Kennedy Ihianle et al.

Recently, the quality of artworks generated using Artificial Intelligence (AI) has increased significantly, resulting in growing difficulties in detecting synthetic artworks. However, limited studies have been conducted on identifying the authenticity of synthetic artworks and their source. This paper introduces AI-ArtBench, a dataset featuring 185,015 artistic images across 10 art styles. It includes 125,015 AI-generated images and 60,000 pieces of human-created artwork. This paper also outlines a method to accurately detect AI-generated images and trace them to their source model. This work proposes a novel Convolutional Neural Network model based on the ConvNeXt model called AttentionConvNeXt. AttentionConvNeXt was implemented and trained to differentiate between the source of the artwork and its style with an F1-Score of 0.869. The accuracy of attribution to the generative model reaches 0.999. To combine the scientific contributions arising from this study, a web-based application named ArtBrain was developed to enable both technical and non-technical users to interact with the model. Finally, this study presents the results of an Artistic Turing Test conducted with 50 participants. The findings reveal that humans could identify AI-generated images with an accuracy of approximately 58%, while the model itself achieved a significantly higher accuracy of around 99%.