CV · Dec 12, 2025
Super-Resolved Canopy Height Mapping from Sentinel-2 Time Series Using LiDAR HD Reference Data across Metropolitan France
Ekaterina Kalinicheva, Florian Helen, Stéphane Mermoz et al.
Fine-scale forest monitoring is essential for understanding canopy structure and its dynamics, which are key indicators of carbon stocks, biodiversity, and forest health. Deep learning is particularly effective for this task, as it integrates spectral, temporal, and spatial signals that jointly reflect the canopy structure. To address this need, we introduce THREASURE-Net, a novel end-to-end framework for Tree Height Regression And Super-Resolution. The model is trained on Sentinel-2 time series using reference height metrics derived from LiDAR HD data at multiple spatial resolutions over Metropolitan France to produce annual height maps. We evaluate three model variants, producing tree-height predictions at 2.5 m, 5 m, and 10 m resolution. THREASURE-Net does not rely on any pretrained model nor on reference very high resolution optical imagery to train its super-resolution module; instead, it learns solely from LiDAR-derived height information. Our approach outperforms existing state-of-the-art methods based on Sentinel data and is competitive with methods based on very high resolution imagery. It can be deployed to generate high-precision annual canopy-height maps, achieving mean absolute errors of 2.62 m, 2.72 m, and 2.88 m at 2.5 m, 5 m, and 10 m resolution, respectively. These results highlight the potential of THREASURE-Net for scalable and cost-effective structural monitoring of temperate forests using only freely available satellite data. The source code for THREASURE-Net is available at: https://github.com/Global-Earth-Observation/threasure-net.
IV · Aug 5, 2024
Tree species classification at the pixel-level using deep learning and multispectral time series in an imbalanced context
Florian Mouret, David Morin, Milena Planells et al.
This paper investigates tree species classification using Sentinel-2 multispectral satellite image time-series. Despite their critical importance for many applications, tree species maps are often unavailable, outdated, or inaccurate for large areas. The interest of using remote sensing time series to produce these maps has been highlighted in many studies. However, many methods proposed in the literature still rely on a standard classification algorithm, usually the Random Forest (RF) algorithm with vegetation indices. This study shows that the use of deep learning models can lead to a significant improvement in classification results, especially in an imbalanced context where the RF algorithm tends to favor the majority class. In our use case in the center of France with 10 tree species, we obtain an overall accuracy (OA) around 95% and an F1-macro score around 80% using three different benchmark deep learning architectures. In contrast, using the RF algorithm yields an OA of 93% and an F1 of 60%, indicating that the minority classes are not classified with sufficient accuracy. Therefore, the proposed framework is a strong baseline that can be easily implemented in most scenarios, even with a limited amount of reference data. Our results highlight that a standard multilayer perceptron can be competitive when equipped with batch normalization and a sufficient number of parameters. Other architectures (convolutional or attention-based) can also achieve strong results when tuned properly. Furthermore, our results show that DL models are naturally robust to imbalanced data, although similar results can be obtained using dedicated techniques.
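The gap between OA and F1-macro reported above is easier to see on a toy case. The following is a minimal sketch in plain Python, not the paper's evaluation code: the class names, the class counts, and the "always predict the majority class" classifier are all hypothetical and chosen only to show how OA can stay high while the macro-averaged F1 collapses.

```python
def overall_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the reference label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (every class counts equally)."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # A class that is never predicted correctly gets an F1 of 0.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Hypothetical imbalanced toy set: 95 "oak" pixels, 5 "fir" pixels.
y_true = ["oak"] * 95 + ["fir"] * 5
# Hypothetical degenerate classifier that always predicts the majority class.
y_pred = ["oak"] * 100

oa = overall_accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
print(f"OA = {oa:.3f}, F1-macro = {f1:.3f}")
```

Because F1-macro averages over classes rather than over samples, the missed minority class pulls the score down to roughly half, even though 95% of the samples are labelled correctly.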
ML · Jan 28, 2022
A Robust and Flexible EM Algorithm for Mixtures of Elliptical Distributions with Missing Data
Florian Mouret, Alexandre Hippert-Ferrer, Frédéric Pascal et al.
This paper tackles the problem of missing data imputation for noisy and non-Gaussian data. A classical imputation method, the Expectation Maximization (EM) algorithm for Gaussian mixture models, has shown interesting properties when compared to other popular approaches such as those based on k-nearest neighbors or on multiple imputations by chained equations. However, Gaussian mixture models are known to be non-robust to heterogeneous data, which can lead to poor estimation performance when the data is contaminated by outliers or follows non-Gaussian distributions. To overcome this issue, a new EM algorithm is investigated for mixtures of elliptical distributions with the property of handling potential missing data. This paper shows that this problem reduces to the estimation of a mixture of Angular Gaussian distributions under generic assumptions (i.e., each sample is drawn from a mixture of elliptical distributions, which is possibly different from one sample to another). In that case, the complete-data likelihood associated with mixtures of elliptical distributions is well adapted to the EM framework with missing data thanks to its conditional distribution, which is shown to be a multivariate $t$-distribution. Experimental results on synthetic data demonstrate that the proposed algorithm is robust to outliers and can be used with non-Gaussian data. Furthermore, experiments conducted on real-world datasets show that this algorithm is very competitive when compared to other classical imputation methods.
ML · Oct 22, 2021
Reconstruction of Sentinel-2 Time Series Using Robust Gaussian Mixture Models -- Application to the Detection of Anomalous Crop Development in wheat and rapeseed crops
Florian Mouret, Mohanad Albughdadi, Sylvie Duthoit et al.
Missing data is a recurrent problem in remote sensing, mainly due to cloud coverage for multispectral images and acquisition problems. This can be a critical issue for crop monitoring, especially for applications relying on machine learning techniques, which generally assume that the feature matrix does not have missing values. This paper proposes a Gaussian Mixture Model (GMM) for the reconstruction of parcel-level features extracted from multispectral images. A robust version of the GMM is also investigated, since datasets can be contaminated by inaccurate samples or features (e.g., wrong crop type reported, inaccurate boundaries, undetected clouds, etc.). Additional features extracted from Synthetic Aperture Radar (SAR) images using Sentinel-1 data are also used to provide complementary information and improve the imputations. The robust GMM investigated in this work assigns reduced weights to the outliers during the estimation of the GMM parameters, which improves the final reconstruction. These weights are computed at each step of an Expectation-Maximization (EM) algorithm by using outlier scores provided by the isolation forest algorithm. Experimental validation is conducted on rapeseed and wheat parcels located in the Beauce region (France). Overall, we show that the GMM imputation method outperforms other reconstruction strategies. A mean absolute error (MAE) of 0.013 (resp. 0.019) is obtained for the imputation of the median Normalized Difference Vegetation Index (NDVI) of the rapeseed (resp. wheat) parcels. Other indicators (e.g., Normalized Difference Water Index) and statistics (for instance the interquartile range, which captures the heterogeneity of the indicator within the parcel) are reconstructed at the same time with good accuracy. In a dataset contaminated by irrelevant samples, using the robust GMM is recommended since the standard GMM imputation can lead to inaccurate imputed values.
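To make the idea of GMM-based imputation concrete, here is a minimal sketch of one common way to fill a missing feature once a GMM has been fitted: the conditional expectation of the missing feature given the observed one. It is an illustration under simplifying assumptions (two features, already-known mixture parameters, no robust weighting), not the paper's implementation. The function names and every parameter value are hypothetical.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


def impute_second_feature(x1, components):
    """Impute a missing second feature from an observed first feature.

    Each component is a dict with:
      "weight": mixing proportion,
      "mean":   (m1, m2),
      "cov":    (s11, s12, s22), the entries of the 2x2 covariance matrix.
    """
    # Responsibilities are computed from the observed feature only.
    likelihoods = [
        c["weight"] * normal_pdf(x1, c["mean"][0], c["cov"][0]) for c in components
    ]
    total = sum(likelihoods)
    responsibilities = [lik / total for lik in likelihoods]

    # Per-component conditional mean: m2 + s12 / s11 * (x1 - m1).
    conditional_means = [
        c["mean"][1] + c["cov"][1] / c["cov"][0] * (x1 - c["mean"][0])
        for c in components
    ]
    # The imputed value is the responsibility-weighted conditional mean.
    return sum(r * m for r, m in zip(responsibilities, conditional_means))


# Hypothetical mixture over (median NDVI at date 1, median NDVI at date 2).
gmm = [
    {"weight": 0.7, "mean": (0.60, 0.75), "cov": (0.010, 0.006, 0.012)},
    {"weight": 0.3, "mean": (0.30, 0.35), "cov": (0.020, 0.004, 0.015)},
]
imputed = impute_second_feature(0.55, gmm)
print(f"imputed median NDVI at date 2: {imputed:.3f}")
```

In a robust variant along the lines described above, the mixture parameters themselves would be estimated with outliers down-weighted; the imputation step shown here would stay the same.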
IV · Apr 17, 2020
Outlier detection at the parcel-level in wheat and rapeseed crops using multispectral and SAR time series
Florian Mouret, Mohanad Albughdadi, Sylvie Duthoit et al.
This paper studies the detection of anomalous crop development at the parcel-level based on an unsupervised outlier detection technique. The experimental validation is conducted on rapeseed and wheat parcels located in Beauce (France). The proposed methodology consists of four sequential steps: 1) preprocessing of synthetic aperture radar (SAR) and multispectral images acquired using Sentinel-1 and Sentinel-2 satellites, 2) extraction of SAR and multispectral pixel-level features, 3) computation of parcel-level features using zonal statistics and 4) outlier detection. The different types of anomalies that can affect the studied crops are analyzed and described. The different factors that can influence the outlier detection results are investigated, with particular attention devoted to the synergy between Sentinel-1 and Sentinel-2 data. Overall, the best performance is obtained when jointly using a selection of Sentinel-1 and Sentinel-2 features with the isolation forest algorithm. The selected features are VV and VH backscattering coefficients for Sentinel-1 and 5 Vegetation Indexes for Sentinel-2 (among others, the Normalized Difference Vegetation Index and two variants of the Normalized Difference Water Index). When using these features with an outlier ratio of 10%, the percentage of detected true positives (i.e., crop anomalies) is equal to 94.1% for rapeseed parcels and 95.5% for wheat parcels.
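Steps 2 and 3 of the methodology (pixel-level features, then parcel-level zonal statistics) can be sketched in a few lines. This is a minimal illustration in plain Python, assuming each parcel is already available as a list of (near-infrared, red) reflectance pairs. The parcel identifiers, reflectance values, and function names are hypothetical, and the choice of median and interquartile range as the zonal statistics is an assumption made for the example.

```python
import statistics


def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel.

    Assumes nir + red is non-zero.
    """
    return (nir - red) / (nir + red)


def parcel_features(pixels):
    """Zonal statistics of NDVI over the pixels of one parcel.

    `pixels` is a list of (nir, red) reflectance pairs with at least two entries.
    """
    values = sorted(ndvi(nir, red) for nir, red in pixels)
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {"median": statistics.median(values), "iqr": q3 - q1}


# Hypothetical parcels: one fairly homogeneous, one heterogeneous.
parcels = {
    "parcel_a": [(0.50, 0.10), (0.52, 0.09), (0.49, 0.11), (0.51, 0.10)],
    "parcel_b": [(0.50, 0.10), (0.30, 0.20), (0.25, 0.22), (0.48, 0.12)],
}
features = {pid: parcel_features(px) for pid, px in parcels.items()}
for pid, feats in features.items():
    print(pid, {name: round(value, 3) for name, value in feats.items()})
```

The resulting per-parcel feature vectors are the kind of input that would then be passed to the outlier detection step; a larger interquartile range, as in the second hypothetical parcel, indicates more heterogeneity inside the parcel.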