Rajitha de Silva

CV
h-index 80
12 papers
119 citations
Novelty 42%
AI Score 51

12 Papers

CV Sep 9, 2022
Deep learning-based Crop Row Detection for Infield Navigation of Agri-Robots

Rajitha de Silva, Grzegorz Cielniak, Gang Wang et al.

Autonomous navigation in agricultural environments is challenged by the varying field conditions that arise in arable fields. State-of-the-art solutions for autonomous navigation in such environments require expensive hardware such as RTK-GNSS. This paper presents a robust crop row detection algorithm that withstands such field variations using inexpensive cameras. Existing datasets for crop row detection do not represent all the possible field variations. A dataset of sugar beet images was created representing 11 field variations comprising multiple growth stages, light levels, varying weed densities, curved crop rows and discontinuous crop rows. The proposed pipeline segments the crop rows using a deep learning-based method and uses the predicted segmentation mask to extract the central crop row with a novel central crop row selection algorithm. The novel crop row detection algorithm was tested for crop row detection performance and for its capability of visual servoing along a crop row. The visual servoing-based navigation was tested in a realistic simulation scenario with real ground and plant textures. Our algorithm demonstrated robust vision-based crop row detection in challenging field conditions, outperforming the baseline.

CV Sep 28, 2022
Vision based Crop Row Navigation under Varying Field Conditions in Arable Fields

Rajitha de Silva, Grzegorz Cielniak, Junfeng Gao

Accurate crop row detection is often challenged by the varying field conditions present in real-world arable fields. Traditional colour-based segmentation is unable to cater for all such variations. The lack of comprehensive datasets in agricultural environments prevents researchers from developing robust segmentation models to detect crop rows. We present a dataset for crop row detection with 11 field variations from sugar beet and maize crops. We also present a novel crop row detection algorithm for visual servoing in crop row fields. Our algorithm can detect crop rows under varying field conditions such as curved crop rows, weed presence, discontinuities, growth stages, tramlines, shadows and light levels. Our method uses only RGB images from a front-mounted camera on a Husky robot to predict crop rows. Our method outperformed the classic colour-based crop row detection baseline. Dense weed presence within the inter-row space and discontinuities in crop rows were the most challenging field conditions for our crop row detection algorithm. Our method can detect the end of the crop row and navigate the robot towards the headland area when it reaches the end of the crop row.
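As an illustration of the kind of colour-based segmentation the abstract above treats as a baseline, here is a minimal sketch using the Excess Green (ExG) index on normalised RGB values. The abstract does not say which colour index the baseline uses, so ExG, the function name excess_green_mask and the threshold value are all assumptions made for this example only.

```python
def excess_green_mask(image, threshold=0.1):
    """Return a binary vegetation mask from an RGB image.

    image: list of rows, each row a list of (r, g, b) tuples in 0-255.
    threshold: hypothetical cut-off on the ExG index (not from the paper).
    """
    mask = []
    for row in image:
        mask_row = []
        for r, g, b in row:
            total = r + g + b
            if total == 0:
                # Pure black pixel: no colour information, treat as background.
                mask_row.append(0)
                continue
            # Chromatic coordinates remove most of the brightness variation.
            rn, gn, bn = r / total, g / total, b / total
            exg = 2 * gn - rn - bn
            mask_row.append(1 if exg > threshold else 0)
        mask.append(mask_row)
    return mask


# Toy 2x3 image: brownish soil on the sides, a green crop row in the middle.
image = [
    [(120, 90, 60), (40, 160, 50), (110, 85, 65)],
    [(115, 95, 70), (35, 150, 45), (125, 100, 80)],
]
mask = excess_green_mask(image)
print(mask)
```

A fixed threshold like this responds to any green pixel, crop or weed, which is consistent with the abstract's remark that colour-based segmentation cannot cater for all field variations.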

CV Apr 4, 2022
Towards Infield Navigation: leveraging simulated data for crop row detection

Rajitha de Silva, Grzegorz Cielniak, Junfeng Gao

Agricultural datasets for crop row detection are often limited by their small number of images. This prevents researchers from developing deep learning-based models for precision agriculture tasks involving crop row detection. We suggest combining small real-world datasets with additional data generated in simulation to reach crop row detection performance similar to that of a model trained on a large real-world dataset. Our method could reach the performance of a deep learning-based crop row detection model trained with real-world data while using 60% less labelled real-world data. Our model performed well against field variations such as shadows, sunlight and growth stages. We introduce an automated pipeline to generate labelled images for crop row detection in the simulation domain. An extensive comparison analyses the contribution of simulated data towards robust crop row detection in various real-world field scenarios.

CV May 24
Semantics-Guided Multimodal Masked Autoencoder Pretraining for 3D BEV Object Detection

Prabuddhi Wariyapperuma, Rajitha de Silva, Marc Hanheide et al.

Accurate 3D bird's-eye view (BEV) object detection is essential for autonomous driving, and depends strongly on effective multimodal representations from complementary sensors such as cameras and LiDAR. Multimodal masked autoencoders have shown strong potential for learning such representations for downstream 3D BEV object detection. However, existing methods typically apply uniform random masking to camera and LiDAR inputs, treating all regions equally, and learn representations only through masked reconstruction. We propose a semantics-guided multimodal masked autoencoder framework that introduces semantic information during pretraining through two separate components: (i) semantics-guided LiDAR voxel masking, which preserves semantically important LiDAR regions more strongly, and (ii) an auxiliary point-wise LiDAR semantic decoder branch that injects semantic guidance in addition to reconstruction. On BEVFusion 3D object detection, our semantics-guided pretraining strategy improves performance on the nuScenes mini validation set compared to the standard UniM2AE baseline: semantics-guided LiDAR voxel masking yields +1.49% mean Average Precision (mAP) and +1.66% nuScenes Detection Score (NDS), while decoder-side point semantic supervision yields +1.39% mAP and +3.22% NDS over the baseline.
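The idea of semantics-guided masking, as opposed to uniform random masking, can be sketched as weighted sampling in which semantically important voxels are less likely to be masked. The abstract does not describe the sampling scheme, so everything below is an assumption for illustration: the per-voxel importance scores, the function name semantic_guided_mask, the mask ratio, and the use of Efraimidis-Spirakis weighted sampling without replacement.

```python
import random


def semantic_guided_mask(importance, mask_ratio=0.7, seed=0):
    """Pick voxel indices to mask, favouring low-importance voxels.

    importance: list of hypothetical semantic scores in [0, 1], one per voxel.
    mask_ratio: fraction of voxels to mask (assumed value, not from the paper).
    Returns the sorted list of masked voxel indices.
    """
    rng = random.Random(seed)
    n_mask = int(round(mask_ratio * len(importance)))
    eps = 1e-6  # keeps the weight positive for voxels with importance 1.0
    keyed = []
    for idx, score in enumerate(importance):
        # Masking weight is high when semantic importance is low.
        weight = (1.0 - score) + eps
        # Efraimidis-Spirakis: draw u in [0, 1) and use u^(1/w) as the key.
        key = rng.random() ** (1.0 / weight)
        keyed.append((key, idx))
    # The n_mask largest keys form a weighted sample without replacement.
    keyed.sort(reverse=True)
    return sorted(idx for _, idx in keyed[:n_mask])


# Three highly important voxels (e.g. on objects) and seven background voxels.
importance = [1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
masked = semantic_guided_mask(importance, mask_ratio=0.5)
print(masked)
```

Setting every importance score to the same value reduces this to uniform random masking, the behaviour the abstract attributes to existing methods.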

RO Jun 9, 2023
Leaving the Lines Behind: Vision-Based Crop Row Exit for Agricultural Robot Navigation

Rajitha de Silva, Grzegorz Cielniak, Junfeng Gao

The use of purely vision-based solutions for row switching is not well explored in existing vision-based crop row navigation frameworks. This method uses only RGB images for local feature matching-based visual feedback to exit the crop row. Depth images were used at the crop row end to estimate the navigation distance within the headland. The algorithm was tested on diverse headland areas with soil and vegetation. The proposed method could reach the end of the crop row and then navigate into the headland, completely leaving behind the crop row, with an error margin of 50 cm.

CV May 22
Calibration-Informative Region Selection for Online LiDAR–Camera Calibration in Agricultural Environments

Rajitha de Silva, Grzegorz Cielniak

Reliable multi-modal calibration requires identifying which observations truly constrain the extrinsic parameters and which ones mainly add noise or ambiguity. In this paper, we propose a support-map-driven approach to multi-modal calibration that decouples four functional blocks: initial calibration, cross-modal residual extraction, support-map estimation, and support-aware refinement. We instantiate this formulation for online LiDAR–camera calibration using MDPCalib, a target-less LiDAR–camera calibration method based on motion and deep point correspondences, and CMRNext, a dense LiDAR–camera matching model that predicts optical-flow-like image-plane residuals. The key contribution is a dense calibration support map that aggregates cross-modal agreement over aligned observations and highlights where calibration evidence is consistently reliable. Across the Bacchus Long-Term (BLT) dataset and KITTI, we show that calibration evidence is spatially and semantically non-uniform, indicating that some semantic regions provide stronger cues for calibration than others. On KITTI, support-guided refinement improves the calibration performance with better translation accuracy while rotational gains remain limited.

RO Sep 21, 2023
A Vision-Based Navigation System for Arable Fields

Rajitha de Silva, Grzegorz Cielniak, Junfeng Gao

Vision-based navigation systems in arable fields are an underexplored area in agricultural robot navigation. Vision systems deployed in arable fields face challenges such as fluctuating weed density, varying illumination levels, growth stages and crop row irregularities. Current solutions are often crop-specific and aim to address a limited set of individual conditions such as illumination or weed density. Moreover, the scarcity of comprehensive datasets hinders the development of generalised machine learning systems for navigating these fields. This paper proposes a suite of deep learning-based perception algorithms using affordable vision sensors for vision-based navigation in arable fields. Initially, a comprehensive dataset that captures the intricacies of multiple crop seasons, various crop types, and a range of field variations was compiled. Next, this study develops robust infield perception algorithms capable of accurately detecting crop rows under diverse conditions such as different growth stages, weed density, and varying illumination. Further, it investigates the integration of crop row following with vision-based crop row switching for efficient field-scale navigation. The proposed infield navigation system was tested in commercial arable fields, traversing a total distance of 4.5 km with average heading and cross-track errors of 1.24° and 3.32 cm respectively.
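The two metrics reported at the end of the abstract above, heading error and cross-track error, can be computed for a single robot pose against a straight reference row as in the sketch below. The abstract does not give the paper's exact definitions, so this uses the common geometric ones; the function names, the sign convention and the sample values are assumptions for illustration.

```python
import math


def heading_error_deg(robot_heading_deg, row_heading_deg):
    """Signed heading difference wrapped to the interval [-180, 180) degrees."""
    return (robot_heading_deg - row_heading_deg + 180.0) % 360.0 - 180.0


def cross_track_error(robot_xy, row_start_xy, row_end_xy):
    """Signed perpendicular distance from the robot to the row line.

    Positive when the robot is to the left of the row direction
    (start -> end); units follow the input coordinates.
    """
    (px, py), (ax, ay), (bx, by) = robot_xy, row_start_xy, row_end_xy
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0.0:
        raise ValueError("row_start_xy and row_end_xy must differ")
    # 2D cross product of the row direction with the start-to-robot vector.
    return (dx * (py - ay) - dy * (px - ax)) / length


# Robot 3 cm to the left of a row running along the x axis, heading 359 deg
# while the row heading is 1 deg.
cte_m = cross_track_error((5.0, 0.03), (0.0, 0.0), (10.0, 0.0))
he_deg = heading_error_deg(359.0, 1.0)
print(round(cte_m, 4), he_deg)
```

Averaging the absolute values of these quantities over a traversal gives summary figures of the kind quoted in the abstract.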

RO Mar 16
Perception-Aware Autonomous Exploration in Feature-Limited Environments

Moji Shi, Rajitha de Silva, Hang Yu et al.

Autonomous exploration in unknown environments typically relies on onboard state estimation for localisation and mapping. Existing exploration methods primarily maximise coverage efficiency, but often overlook that visual-inertial odometry (VIO) performance strongly depends on the availability of robust visual features. As a result, exploration policies can drive a robot into feature-sparse regions where tracking degrades, leading to odometry drift, corrupted maps, and mission failure. We propose a hierarchical perception-aware exploration framework for a stereo-equipped unmanned aerial vehicle (UAV) that explicitly couples exploration progress with feature observability. Our approach (i) associates each candidate frontier with an expected feature quality using a global feature map, and prioritises visually informative subgoals, and (ii) optimises a continuous yaw trajectory along the planned motion to maintain stable feature tracks. We evaluate our method in simulation across environments with varying texture levels and in real-world indoor experiments with largely textureless walls. Compared to baselines that ignore feature quality and/or do not optimise continuous yaw, our method maintains more reliable feature tracking, reduces odometry drift, and achieves on average 30% higher coverage before the odometry error exceeds specified thresholds.
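Step (i) of the approach above, prioritising frontiers by expected feature quality as well as exploration progress, can be sketched as a weighted score over candidate frontiers. The abstract does not give the actual utility function, so the weighted sum, the field names info_gain and feature_quality, and the weight value are hypothetical choices for this example.

```python
def select_frontier(frontiers, feature_weight=0.5):
    """Return the frontier with the best combined score.

    frontiers: list of dicts with hypothetical keys 'id', 'info_gain' and
    'feature_quality', the latter two normalised to [0, 1].
    feature_weight: assumed trade-off; 0.0 ignores feature quality entirely.
    """
    def score(frontier):
        return ((1.0 - feature_weight) * frontier["info_gain"]
                + feature_weight * frontier["feature_quality"])

    return max(frontiers, key=score)


frontiers = [
    # Large unexplored area behind a textureless wall.
    {"id": "A", "info_gain": 0.9, "feature_quality": 0.1},
    # Slightly smaller area, but rich in trackable visual features.
    {"id": "B", "info_gain": 0.7, "feature_quality": 0.8},
]
coverage_only = select_frontier(frontiers, feature_weight=0.0)
perception_aware = select_frontier(frontiers, feature_weight=0.5)
print(coverage_only["id"], perception_aware["id"])
```

With the weight at zero the selection behaves like a coverage-only baseline and picks frontier A; with feature quality included it prefers frontier B, where tracking is less likely to degrade.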

RO Mar 11
Semantic Landmark Particle Filter for Robot Localisation in Vineyards

Rajitha de Silva, Jonathan Cox, James R. Heselden et al.

Reliable localisation in vineyards is hindered by row-level perceptual aliasing: parallel crop rows produce nearly identical LiDAR observations, causing geometry-only and vision-based SLAM systems to converge towards incorrect corridors, particularly during headland transitions. We present a Semantic Landmark Particle Filter (SLPF) that integrates trunk and pole landmark detections with 2D LiDAR within a probabilistic localisation framework. Detected trunks are converted into semantic walls, forming structural row boundaries embedded in the measurement model to improve discrimination between adjacent rows. GNSS is incorporated as a lightweight prior that stabilises localisation when semantic observations are sparse. Field experiments in a 10-row vineyard demonstrate consistent improvements over geometry-only (AMCL), vision-based (RTAB-Map), and GNSS baselines. Compared to AMCL, SLPF reduces Absolute Pose Error by 22% and 65% across two traversal directions; relative to a NoisyGNSS baseline, APE decreases by 65% and 61%. Row correctness improves from 0.67 to 0.73, while mean cross-track error decreases from 1.40 m to 1.26 m. These results show that embedding row-level structural semantics within the measurement model enables robust localisation in highly repetitive outdoor agricultural environments.

CV Mar 11, 2025
Keypoint Semantic Integration for Improved Feature Matching in Outdoor Agricultural Environments

Rajitha de Silva, Jonathan Cox, Marija Popovic et al.

Robust robot navigation in outdoor environments requires accurate perception systems capable of handling visual challenges such as repetitive structures and changing appearances. Visual feature matching is crucial to vision-based pipelines but remains particularly challenging in natural outdoor settings due to perceptual aliasing. We address this issue in vineyards, where repetitive vine trunks and other natural elements generate ambiguous descriptors that hinder reliable feature matching. We hypothesise that semantic information tied to keypoint positions can alleviate perceptual aliasing by enhancing keypoint descriptor distinctiveness. To this end, we introduce a keypoint semantic integration technique that improves the descriptors in semantically meaningful regions within the image, enabling more accurate differentiation even among visually similar local features. We validate this approach in two vineyard perception tasks: (i) relative pose estimation and (ii) visual localisation. Across all tested keypoint types and descriptors, our method improves matching accuracy by 12.6%, demonstrating its effectiveness over multiple months in challenging vineyard conditions.

RO Sep 22, 2025
Semantic-Aware Particle Filter for Reliable Vineyard Robot Localisation

Rajitha de Silva, Jonathan Cox, James R. Heselden et al.

Accurate localisation is critical for mobile robots in structured outdoor environments, yet LiDAR-based methods often fail in vineyards due to repetitive row geometry and perceptual aliasing. We propose a semantic particle filter that incorporates stable object-level detections, specifically vine trunks and support poles, into the likelihood estimation process. Detected landmarks are projected into a bird's-eye view and fused with LiDAR scans to generate semantic observations. A key innovation is the use of semantic walls, which connect adjacent landmarks into pseudo-structural constraints that mitigate row aliasing. To maintain global consistency in headland regions where semantics are sparse, we introduce a noisy GPS prior that adaptively supports the filter. Experiments in a real vineyard demonstrate that our approach maintains localisation within the correct row, recovers from deviations where AMCL fails, and outperforms vision-based SLAM methods such as RTAB-Map.

CV Sep 16, 2021
Towards agricultural autonomy: crop row detection under varying field conditions using deep learning

Rajitha de Silva, Grzegorz Cielniak, Junfeng Gao

This paper presents a novel metric to evaluate the robustness of deep learning-based semantic segmentation approaches for crop row detection under different field conditions encountered by a field robot. A dataset with ten main categories encountered under various field conditions was used for testing. The effect of these conditions on the angular accuracy of crop row detection was compared. A deep convolutional encoder-decoder network is implemented to predict crop row masks from RGB input images. The predicted mask is then passed to a post-processing algorithm to extract the crop rows. When evaluated with the novel metric, the deep learning model was found to be robust against shadows and growth stages of the crop, while performance was reduced under direct sunlight, increasing weed density, tramlines and discontinuities in crop rows.