LG Mar 15, 2023
Forecasting Intraday Power Output by a Set of PV Systems using Recurrent Neural Networks and Physical Covariates
Pierrick Bruneau, David Fiorelli, Christian Braun et al.
Accurate intraday forecasts of the power output of photovoltaic (PV) systems are critical to improving the operation of energy distribution grids. We describe a neural autoregressive model that aims to perform such intraday forecasts. We build upon a physical, deterministic PV performance model, the output of which is used as covariates in the neural model. In addition, our application data relates to a geographically distributed set of PV systems. We address all PV sites with a single neural model, which embeds the information about the PV site in specific covariates. We use a scale-free approach which relies on the explicit modeling of seasonal effects. Our proposal repurposes a model initially used in the retail sector and introduces a novel truncated Gaussian output distribution. An ablation study and a comparison to alternative architectures from the literature show that the components of the best-performing proposed model variant work synergistically to reach a skill score of 15.72% with respect to the physical model, which is used as a baseline.
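As a reading aid for the skill score quoted above, here is a minimal sketch assuming the common definition of a skill score as the relative reduction of an error metric with respect to a baseline. The abstract does not state which error metric is used, so the function name `skill_score` and the numeric values below are hypothetical and serve only to illustrate the arithmetic.

```python
def skill_score(model_error: float, baseline_error: float) -> float:
    """Relative improvement over a baseline, for an error metric where lower is better.

    Assumed definition: 1 - model_error / baseline_error.
    """
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return 1.0 - model_error / baseline_error


# Hypothetical error values, chosen only to illustrate the computation.
baseline_error = 1.0   # error of the physical model (baseline)
model_error = 0.8428   # error of the neural model

print(f"skill score: {100 * skill_score(model_error, baseline_error):.2f}%")
```

Under this definition, a positive score means the model has a lower error than the baseline, and a score of 0 means no improvement.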
IM Nov 17, 2023
Astronomical Images Quality Assessment with Automated Machine Learning
Olivier Parisot, Pierrick Bruneau, Patrik Hitzelberger
Electronically Assisted Astronomy consists of capturing deep sky images with a digital camera coupled to a telescope to display views of celestial objects that would have been invisible through direct observation. This practice generates a large quantity of data, which may then be enhanced with dedicated image editing software after observation sessions. In this study, we show how Image Quality Assessment can be useful for automatically rating astronomical images, and we also develop a dedicated model using Automated Machine Learning.
LG Jan 25, 2022
Cold Start Active Learning Strategies in the Context of Imbalanced Classification
Etienne Brangbour, Pierrick Bruneau, Thomas Tamisier et al.
We present novel active learning strategies dedicated to providing a solution to the cold start stage, i.e. initializing the classification of a large set of data with no attached labels. Moreover, the proposed strategies are designed to handle an imbalanced context in which random selection is highly inefficient. Specifically, our active learning iterations address label scarcity and imbalance using element scores, combining information extracted from a clustering structure with a label propagation model. The strategy is illustrated by a case study on annotating Twitter content w.r.t. testimonies of a real flood event. We show that our method effectively copes with class imbalance by boosting the recall of samples from the minority class.
LG Dec 7, 2020
Computing flood probabilities using Twitter: application to the Houston urban area during Harvey
Etienne Brangbour, Pierrick Bruneau, Stéphane Marchand-Maillet et al.
In this paper, we investigate the conversion of a Twitter corpus into geo-referenced raster cells holding the probability that the associated geographical areas are flooded. We describe a baseline approach that combines a density ratio function, aggregation using a spatio-temporal Gaussian kernel function, and TF-IDF textual features. The features are transformed to probabilities using a logistic regression model. The described method is evaluated on a corpus collected after the floods that followed Hurricane Harvey in the Houston urban area in August-September 2017. The baseline reaches an F1 score of 68%. We highlight research directions likely to improve these initial results.
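Two of the steps named above, kernel aggregation and the logistic mapping to a probability, can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the separable kernel form, the bandwidths `sigma_s` and `sigma_t`, the tuple layout of a tweet, and all function names are hypothetical, and the density ratio function and the TF-IDF extraction are left out.

```python
import math


def st_gaussian_weight(dx, dy, dt, sigma_s=1.0, sigma_t=1.0):
    """Spatio-temporal Gaussian kernel weight (assumed separable form).

    dx, dy: spatial offsets; dt: temporal offset; sigma_s, sigma_t: bandwidths.
    """
    spatial = (dx * dx + dy * dy) / (2.0 * sigma_s ** 2)
    temporal = (dt * dt) / (2.0 * sigma_t ** 2)
    return math.exp(-(spatial + temporal))


def aggregate_features(cell, tweets, sigma_s=1.0, sigma_t=1.0):
    """Kernel-weighted average of tweet feature vectors around a raster cell.

    cell: (x, y, t); tweets: list of (x, y, t, features), where features
    stands in for a textual feature vector such as TF-IDF weights.
    """
    cx, cy, ct = cell
    dim = len(tweets[0][3]) if tweets else 0
    acc = [0.0] * dim
    total = 0.0
    for x, y, t, feats in tweets:
        w = st_gaussian_weight(x - cx, y - cy, t - ct, sigma_s, sigma_t)
        total += w
        for i, f in enumerate(feats):
            acc[i] += w * f
    if total == 0.0:
        return acc  # no tweet carries any weight for this cell
    return [a / total for a in acc]


def logistic(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def flood_probability(features, weights, bias=0.0):
    """Map aggregated features to a probability with a logistic model.

    weights and bias would come from a fitted logistic regression.
    """
    z = bias + sum(w * f for w, f in zip(weights, features))
    return logistic(z)


# Hypothetical toy data: two tweets near a cell, two features each.
tweets = [(0.1, 0.0, 0.0, [0.9, 0.1]), (2.0, 2.0, 5.0, [0.0, 1.0])]
cell_features = aggregate_features((0.0, 0.0, 0.0), tweets)
print(flood_probability(cell_features, weights=[2.0, -1.0], bias=-0.5))
```

In this sketch, tweets that are far from the cell in space or time contribute almost nothing to its feature vector, which is the role a kernel aggregation step plays before classification.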
IR Mar 12, 2019
Extracting localized information from a Twitter corpus for flood prevention
Etienne Brangbour, Pierrick Bruneau, Stéphane Marchand-Maillet et al.
In this paper, we discuss the collection of a corpus associated with tropical storm Harvey, as well as its analysis from both spatial and topical perspectives. From the spatial perspective, our goal is to obtain an initial estimate of the quality and precision of the geographical information featured in the collected corpus. From the topical perspective, we discuss the representation of Twitter posts and strategies to process an initially unlabeled corpus of tweets.
NE Jan 10, 2018
Data-driven forecasting of solar irradiance
Pierrick Bruneau, Philippe Pinheiro, Yoann Didry
This paper describes a flexible approach to short-term prediction of meteorological variables. In particular, we focus on the prediction of the solar irradiance one hour ahead, a task that has high practical value when optimizing solar energy resources. As Défi EGC 2018 provides us with time series data for multiple sensors (e.g. solar irradiance, temperature, hygrometry), recorded every minute for two years at 5 geographical sites on La Réunion island, we test the value of using recently observed data as input for prediction models, as well as the performance of models across sites. After describing our data cleaning and normalization process, we combine a variable selection step based on AutoRegressive Integrated Moving Average (ARIMA) models with general-purpose regression techniques such as neural networks and regression trees.