Rodrigo Ibata

CV
h-index: 13
Papers: 6
Citations: 141
Novelty: 69%
AI Score: 35

6 Papers

IM · Mar 6, 2023
Deep symbolic regression for physics guided by units constraints: toward the automated discovery of physical laws

Wassim Tenachi, Rodrigo Ibata, Foivos I. Diakogiannis

Symbolic regression is the study of algorithms that automate the search for analytic expressions that fit data. While recent advances in deep learning have generated renewed interest in such approaches, the development of symbolic regression methods has not been focused on physics, where we have important additional constraints due to the units associated with our data. Here we present $Φ$-SO, a Physical Symbolic Optimization framework for recovering analytical symbolic expressions from physics data using deep reinforcement learning techniques by learning units constraints. Our system is built, from the ground up, to propose solutions where the physical units are consistent by construction. This is useful not only for eliminating physically impossible solutions, but also because the "grammatical" rules of dimensional analysis greatly restrict the freedom of the equation generator, thus vastly improving performance. The algorithm can be used to fit noiseless data, which can be useful for instance when attempting to derive an analytical property of a physical model, and it can also be used to obtain analytical approximations to noisy data. We test our machinery on a standard benchmark of equations from the Feynman Lectures on Physics and other physics textbooks, achieving state-of-the-art performance in the presence of noise (exceeding 0.1%) and showing that it is robust even in the presence of substantial (10%) noise. We showcase its abilities on a panel of examples from astrophysics.
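The idea of rejecting expressions whose units do not match can be illustrated with a minimal sketch. This is not the $Φ$-SO implementation: the abstract describes constraints enforced during generation, whereas this example only checks a finished expression tree. The variable table, the tuple encoding of expressions, and the names `units_of` and `is_consistent` are all assumptions made for illustration.

```python
# Illustrative dimensional-analysis check (hypothetical names throughout).
# Units are exponent vectors over (length, time, mass).
from typing import Tuple, Union

Units = Tuple[int, ...]
Expr = Union[str, tuple]  # a variable name, or (operator, left, right)

# Assumed variable table for a toy free-fall problem.
VARIABLE_UNITS = {
    "x": (1, 0, 0),   # position, m
    "t": (0, 1, 0),   # time, s
    "v": (1, -1, 0),  # velocity, m/s
    "g": (1, -2, 0),  # acceleration, m/s^2
}


def units_of(expr: Expr) -> Units:
    """Return the units of expr; raise ValueError if it is inconsistent."""
    if isinstance(expr, str):
        return VARIABLE_UNITS[expr]
    op, left, right = expr
    lu, ru = units_of(left), units_of(right)
    if op in ("+", "-"):
        # Addition and subtraction require identical units.
        if lu != ru:
            raise ValueError(f"cannot apply {op!r} to units {lu} and {ru}")
        return lu
    if op == "*":
        return tuple(a + b for a, b in zip(lu, ru))
    if op == "/":
        return tuple(a - b for a, b in zip(lu, ru))
    raise ValueError(f"unknown operator {op!r}")


def is_consistent(expr: Expr, target: Units) -> bool:
    """True if expr is dimensionally valid and carries the target units."""
    try:
        return units_of(expr) == target
    except ValueError:
        return False


# v*t + g*t*t has units of length; v + g*t*t mixes m/s with m.
valid = ("+", ("*", "v", "t"), ("*", "g", ("*", "t", "t")))
invalid = ("+", "v", ("*", "g", ("*", "t", "t")))

print(is_consistent(valid, VARIABLE_UNITS["x"]))
print(is_consistent(invalid, VARIABLE_UNITS["x"]))
```

A generator that applies the same rules token by token, instead of after the fact, never proposes the second expression at all, which is the reduction in search space the abstract refers to.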

CV · Sep 20, 2024 · Code
Tackling fluffy clouds: robust field boundary delineation across global agricultural landscapes with Sentinel-1 and Sentinel-2 Time Series

Foivos I. Diakogiannis, Zheng-Shu Zhou, Jeff Wang et al.

Accurate delineation of agricultural field boundaries is essential for effective crop monitoring and resource management. However, competing methodologies often face significant challenges, particularly in their reliance on extensive manual efforts for cloud-free data curation and limited adaptability to diverse global conditions. In this paper, we introduce PTAViT3D, a deep learning architecture specifically designed for processing three-dimensional time series of satellite imagery from either Sentinel-1 (S1) or Sentinel-2 (S2). Additionally, we present PTAViT3D-CA, an extension of the PTAViT3D model incorporating cross-attention mechanisms to fuse S1 and S2 datasets, enhancing robustness in cloud-contaminated scenarios. The proposed methods leverage spatio-temporal correlations through a memory-efficient 3D Vision Transformer architecture, facilitating accurate boundary delineation directly from raw, cloud-contaminated imagery. We comprehensively validate our models through extensive testing on various datasets, including Australia's ePaddocks - CSIRO's national agricultural field boundary product - alongside public benchmarks Fields-of-the-World, PASTIS, and AI4SmallFarms. Our results consistently demonstrate state-of-the-art performance, highlighting excellent global transferability and robustness. Crucially, our approach significantly simplifies data preparation workflows by reliably processing cloud-affected imagery, thereby offering strong adaptability across diverse agricultural environments. Our code and models are publicly available at https://github.com/feevos/tfcl.

CV · Oct 12, 2023
SSG2: A new modelling paradigm for semantic segmentation

Foivos I. Diakogiannis, Suzanne Furby, Peter Caccetta et al.

State-of-the-art models in semantic segmentation primarily operate on single, static images, generating corresponding segmentation masks. This one-shot approach leaves little room for error correction, as the models lack the capability to integrate multiple observations for enhanced accuracy. Inspired by work on semantic change detection, we address this limitation by introducing a methodology that leverages a sequence of observables generated for each static input image. By adding this "temporal" dimension, we exploit strong signal correlations between successive observations in the sequence to reduce error rates. Our framework, dubbed SSG2 (Semantic Segmentation Generation 2), employs a dual-encoder, single-decoder base network augmented with a sequence model. The base model learns to predict the set intersection, union, and difference of labels from dual-input images. Given a fixed target input image and a set of support images, the sequence model builds the predicted mask of the target by synthesizing the partial views from each sequence step and filtering out noise. We evaluate SSG2 across three diverse datasets: UrbanMonitor, featuring orthoimage tiles from Darwin, Australia with five spectral bands and 0.2m spatial resolution; ISPRS Potsdam, which includes true orthophoto images with multiple spectral bands and a 5cm ground sampling distance; and ISIC2018, a medical dataset focused on skin lesion segmentation, particularly melanoma. The SSG2 model demonstrates rapid convergence within the first few tens of epochs and significantly outperforms UNet-like baseline models with the same number of gradient updates. However, the addition of the temporal dimension results in an increased memory footprint. While this could be a limitation, it is offset by the advent of higher-memory GPUs and coding optimizations.
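The set-based targets mentioned in the abstract can be made concrete with a small sketch on binary masks. It only shows how intersection, union, and difference are computed for a pair of label masks and why they carry enough information to recover the target. The function names, the nested-list mask encoding, and the choice of "target minus support" as the difference are assumptions; the abstract does not specify these details, and no network is involved here.

```python
# Illustrative set operations on binary label masks (hypothetical names).
# Masks are nested lists of 0/1 values with identical shapes.

def set_targets(target, support):
    """Return pixel-wise (intersection, union, difference) of two masks.

    The difference is taken as target minus support, which is an
    assumption of this sketch.
    """
    pairs = [list(zip(rt, rs)) for rt, rs in zip(target, support)]
    inter = [[t & s for t, s in row] for row in pairs]
    union = [[t | s for t, s in row] for row in pairs]
    diff = [[t & (1 - s) for t, s in row] for row in pairs]
    return inter, union, diff


def rebuild_target(inter, diff):
    """Recover the target mask from the identity A = (A and B) or (A minus B)."""
    return [[i | d for i, d in zip(ri, rd)] for ri, rd in zip(inter, diff)]


target = [[1, 1, 0],
          [0, 1, 0]]
support = [[0, 1, 1],
           [0, 1, 0]]

inter, union, diff = set_targets(target, support)
print(inter)
print(union)
print(diff)
print(rebuild_target(inter, diff) == target)
```

With predicted rather than exact masks, each support image yields a slightly different reconstruction of the same target, which is the kind of redundancy a sequence model can aggregate to filter out noise.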

LG · Dec 4, 2023
Class Symbolic Regression: Gotta Fit 'Em All

Wassim Tenachi, Rodrigo Ibata, Thibaut L. François et al.

We introduce 'Class Symbolic Regression' (Class SR), a first framework for automatically finding a single analytical functional form that accurately fits multiple datasets - each realization being governed by its own (possibly) unique set of fitting parameters. This hierarchical framework leverages the common constraint that all the members of a single class of physical phenomena follow a common governing law. Our approach extends the capabilities of our earlier Physical Symbolic Optimization ($Φ$-SO) framework for Symbolic Regression, which integrates dimensional analysis constraints and deep reinforcement learning for unsupervised symbolic analytical function discovery from data. Additionally, we introduce the first Class SR benchmark, comprising a series of synthetic physical challenges specifically designed to evaluate such algorithms. We demonstrate the efficacy of our novel approach by applying it to these benchmark challenges and showcase its practical utility for astrophysics by successfully extracting an analytic galaxy potential from a set of simulated orbits approximating stellar streams.
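The core idea of one shared functional form with per-dataset parameters can be sketched in a few lines. This is a toy illustration under strong assumptions, not the Class SR algorithm: the candidate forms are fixed by hand instead of being generated, each form has a single scale parameter fitted in closed form by least squares, and the data are synthetic. All names (`fit_scale`, `class_score`, `candidates`) are hypothetical.

```python
# Toy illustration: score candidate functional forms across several datasets,
# letting each dataset have its own fitted parameter (hypothetical names).

def fit_scale(basis, xs, ys):
    """Least-squares a for y ~ a * basis(x); return (a, sum of squared residuals)."""
    num = sum(basis(x) * y for x, y in zip(xs, ys))
    den = sum(basis(x) ** 2 for x in xs)
    a = num / den
    sse = sum((y - a * basis(x)) ** 2 for x, y in zip(xs, ys))
    return a, sse


def class_score(basis, datasets):
    """Fit one shared form to every dataset; return (parameters, total residual)."""
    fits = [fit_scale(basis, xs, ys) for xs, ys in datasets]
    return [a for a, _ in fits], sum(sse for _, sse in fits)


# Synthetic class: every realization follows y = a * x**2 with its own a.
xs = [1.0, 2.0, 3.0, 4.0]
datasets = [(xs, [a_true * x ** 2 for x in xs]) for a_true in (0.5, 2.0, 3.0)]

# Hand-picked candidate forms; a real search would generate these.
candidates = {"a*x": lambda x: x, "a*x**2": lambda x: x * x}

scores = {name: class_score(f, datasets) for name, f in candidates.items()}
best = min(scores, key=lambda name: scores[name][1])
print(best, scores[best][0])
```

Summing the residual over all datasets is what ties the realizations together: a form that fits one dataset well but fails on the others is penalized, even though the fitted parameters are free to differ.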

LG · Dec 6, 2023
Physical Symbolic Optimization

Wassim Tenachi, Rodrigo Ibata, Foivos I. Diakogiannis

We present a framework for constraining the automatic sequential generation of equations to obey the rules of dimensional analysis by construction. Combining this approach with reinforcement learning, we built $Φ$-SO, a Physical Symbolic Optimization method for recovering analytical functions from physical data leveraging units constraints. Our symbolic regression algorithm achieves state-of-the-art results in contexts in which variables and constants have known physical units, outperforming all other methods on SRBench's Feynman benchmark in the presence of noise (exceeding 0.1%) and showing resilience even in the presence of significant (10%) levels of noise.

GA · May 26, 2023
An end-to-end strategy for recovering a free-form potential from a snapshot of stellar coordinates

Wassim Tenachi, Rodrigo Ibata, Foivos I. Diakogiannis

New large observational surveys such as Gaia are leading us into an era of data abundance, offering unprecedented opportunities to discover new physical laws through the power of machine learning. Here we present an end-to-end strategy for recovering a free-form analytical potential from a mere snapshot of stellar positions and velocities. First we show how auto-differentiation can be used to capture an agnostic map of the gravitational potential and its underlying dark matter distribution in the form of a neural network. However, in the context of physics, neural networks are both a plague and a blessing, as they are extremely flexible for modeling physical systems but largely consist of non-interpretable black boxes. Therefore, in addition, we show how a complementary symbolic regression approach can be used to open up this neural network into a physically meaningful expression. We demonstrate our strategy by recovering the potential of a toy isochrone system.
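To show what auto-differentiation provides in this setting, the sketch below differentiates a potential with forward-mode dual numbers to obtain its radial gradient. It departs from the paper in two labeled ways: the potential is the closed-form isochrone model, Φ(r) = -GM / (b + sqrt(b² + r²)), standing in for a neural network, and the tiny `Dual` class is a hypothetical stand-in for a real auto-differentiation library. The units GM = b = 1 are also an assumption.

```python
# Forward-mode auto-differentiation with dual numbers (hypothetical helper class),
# applied to the analytic isochrone potential in place of a neural network.
import math


class Dual:
    """Number val + der*eps with eps**2 == 0; der carries the derivative."""

    def __init__(self, val, der=0.0):
        self.val, self.der = val, der

    @staticmethod
    def lift(x):
        return x if isinstance(x, Dual) else Dual(float(x))

    def __add__(self, other):
        o = Dual.lift(other)
        return Dual(self.val + o.val, self.der + o.der)

    __radd__ = __add__

    def __mul__(self, other):
        o = Dual.lift(other)
        return Dual(self.val * o.val, self.der * o.val + self.val * o.der)

    __rmul__ = __mul__

    def __truediv__(self, other):
        o = Dual.lift(other)
        der = (self.der * o.val - self.val * o.der) / o.val ** 2
        return Dual(self.val / o.val, der)

    def __rtruediv__(self, other):
        return Dual.lift(other) / self


def sqrt(x):
    """Square root of a Dual, propagating the derivative."""
    root = math.sqrt(x.val)
    return Dual(root, x.der / (2.0 * root))


def isochrone_potential(r, GM=1.0, b=1.0):
    """Isochrone potential; r is a Dual so the derivative comes along."""
    return -GM / (b + sqrt(b * b + r * r))


def potential_and_gradient(r):
    """Return (Phi(r), dPhi/dr) by seeding the derivative of r with 1."""
    out = isochrone_potential(Dual(r, 1.0))
    return out.val, out.der


phi, dphi_dr = potential_and_gradient(math.sqrt(3.0))
print(phi, dphi_dr)
```

The radial acceleration is the negative of the returned gradient. Replacing `isochrone_potential` with a differentiable learned model is the step the abstract describes; the derivative machinery stays the same in spirit.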