CV · Feb 4

How to rewrite the stars: Mapping your orchard over time through constellations of fruits

arXiv:2602.04722v1
h-index: 26
Originality: Highly original
AI Analysis

This work addresses a critical bottleneck in precision agriculture, for farmers and agricultural robots alike, by enabling scalable, automated fruit growth tracking without reliance on fixed camera positions or GPS.

The paper tackles the problem of matching individual fruits across videos taken at different times to track growth, which is essential for predicting yield but previously unsolved without manual or rigid assumptions. The proposed method uses constellations of 3D centroids and a descriptor for sparse point clouds, enabling successful fruit matching, orchard mapping, and camera pose estimation for applications like autonomous robot navigation.

Following crop growth through the vegetative cycle allows farmers to predict fruit setting and yield at early stages, but it is a laborious and non-scalable task when performed by a human who has to measure fruit sizes manually with a caliper or dendrometers. In recent years, computer vision has been used to automate several tasks in precision agriculture, such as detecting and counting fruits and estimating their size. However, the fundamental problem of matching the exact same fruits from one video, collected on a given date, to the fruits visible in another video, collected on a later date, remains unsolved, even though it is needed to track fruit growth over time. A few attempts have been made, but they either assume that the camera always starts from the same known position and that there are sufficiently distinct features to match, or they rely on other sources of data such as GPS. Here we propose a new paradigm to tackle this problem, based on constellations of 3D centroids, and introduce a descriptor for very sparse 3D point clouds that can be used to match fruits across videos. Matching constellations instead of individual fruits is key to dealing with non-rigidity, occlusions and challenging imagery with few distinct visual features to track. The results show that the proposed method can be used to match fruits across videos and through time, and also to build an orchard map and later use it to estimate the camera pose in 6DoF, thus providing a basis for applications such as autonomous robot navigation in the orchard and selective fruit picking.
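
To make the constellation idea concrete, here is a minimal sketch of matching fruits by their spatial context rather than their appearance. It is not the descriptor proposed in the paper, whose details the abstract does not give. As a stand-in assumption, each 3D centroid is described by the sorted distances to its k nearest neighbours, which do not change when the camera starts from a different position. The names `constellation_descriptor` and `match_constellations`, the greedy matching strategy, and the toy coordinates are all hypothetical. The sketch also ignores fruit growth, occlusions and non-rigid branch motion, which the paper's method is designed to handle.

```python
import math


def constellation_descriptor(points, index, k):
    """Sorted distances from points[index] to its k nearest neighbours.

    Distances are unchanged by rotation and translation, so the descriptor
    does not depend on where the camera started (illustrative assumption).
    """
    p = points[index]
    dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != index)
    return dists[:k]


def match_constellations(points_a, points_b, k=3):
    """Greedy one-to-one matching of centroids by descriptor similarity."""
    # Keep descriptors the same length even for very small point sets.
    k = min(k, len(points_a) - 1, len(points_b) - 1)
    desc_a = [constellation_descriptor(points_a, i, k) for i in range(len(points_a))]
    desc_b = [constellation_descriptor(points_b, j, k) for j in range(len(points_b))]
    # Rank every candidate pair by how similar the two descriptors are.
    candidates = sorted(
        (math.dist(da, db), i, j)
        for i, da in enumerate(desc_a)
        for j, db in enumerate(desc_b)
    )
    matches, used_a, used_b = {}, set(), set()
    for _, i, j in candidates:
        if i not in used_a and j not in used_b:
            matches[i] = j
            used_a.add(i)
            used_b.add(j)
    return matches


def rigid_transform(point, angle_rad, translation):
    """Rotate a point about the z axis, then translate it."""
    x, y, z = point
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    tx, ty, tz = translation
    return (c * x - s * y + tx, s * x + c * y + ty, z + tz)


# Toy fruit centroids from an "earlier" video (made-up coordinates, metres).
earlier = [
    (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0),
    (0.0, 0.0, 3.0), (1.5, 2.5, 0.5), (2.0, 1.0, 4.0),
]
# The "later" video sees the same fruits from another pose, in another order.
order = [3, 0, 5, 1, 4, 2]
later = [rigid_transform(earlier[i], math.radians(30), (5.0, -2.0, 1.0)) for i in order]

matches = match_constellations(earlier, later)
print(matches)  # maps each index in `earlier` to its index in `later`
```

A greedy assignment keeps the sketch dependency-free; an optimal assignment solver or a geometric consistency check over whole constellations would be a natural next step when some fruits are missing from one of the videos.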

