Ioannis Psarros

DS
h-index4
4papers
11citations
Novelty60%
AI Score43

4 Papers

SPJan 30, 2023
Data-driven soiling detection in PV modules

Alexandros Kalimeris, Ioannis Psarros, Giorgos Giannopoulos et al.

Soiling is the accumulation of dirt in solar panels which leads to a decreasing trend in solar energy yield and may be the cause of vast revenue losses. The effect of soiling can be reduced by washing the panels, which is, however, a procedure of non-negligible cost. Moreover, soiling monitoring systems are often unreliable or very costly. We study the problem of estimating the soiling ratio in photo-voltaic (PV) modules, i.e., the ratio of the real power output to the power output that would be produced if solar panels were clean. A key advantage of our algorithms is that they estimate soiling, without needing to train on labelled data, i.e., periods of explicitly monitoring the soiling in each park, and without relying on generic analytical formulas which do not take into account the peculiarities of each installation. We consider as input a time series comprising a minimum set of measurements, that are available to most PV park operators. Our experimental evaluation shows that we significantly outperform current state-of-the-art methods for estimating soiling ratio.

37.4DSMar 25
Near Linear Time Approximation Schemes for Clustering of Partially Doubling Metrics

Anne Driemel, Jan Höckendorff, Ioannis Psarros et al.

Given a finite metric space $(X\cup Y, \mathbf{d})$ the $k$-median problem is to find a set of $k$ centers $C\subseteq Y$ that minimizes $\sum_{p\in X} \min_{c\in C} \mathbf{d}(p,c)$. In general metrics, the best polynomial time algorithm computes a $(2+ε)$-approximation for arbitrary $ε>0$ (Cohen-Addad et al. STOC 2025). However, if the metric is doubling, a near linear time $(1+ε)$-approximation algorithm is known (Cohen-Addad et al. J. ACM 2021). We show that the $(1+ε)$-approximation algorithm can be generalized to the case when either $X$ or $Y$ has bounded doubling dimension (but the other set not). The case when $X$ is doubling is motivated by the assumption that even though $X$ is part of a high-dimensional space, it may be that it is close to a low-dimensional structure. The case when $Y$ is doubling is motivated by specific clustering problems where the centers are low-dimensional. Specifically, our work in this setting implies the first near linear time approximation algorithm for the $(k,\ell)$-median problem under discrete Fréchet distance when $\ell$ is constant. We further introduce a novel complexity reduction for time series of real values that leads to a similar result for the case of discrete Fréchet distance. In order to solve the case when $Y$ has a bounded doubling dimension, we introduce a dimension reduction that replaces points from $X$ by sets of points in $Y$. To solve the case when $X$ has a bounded doubling dimension, we generalize Talwar's decomposition (Talwar STOC 2004) to our setting. The running time of our algorithms is $2^{2^t} \tilde O(n+m)$ where $t=O(\mathrm{ddim} \log \frac{\mathrm{ddim}}ε)$ and where $\mathrm{ddim}$ is the doubling dimension of $X$ (resp.\ $Y$). The results also extend to the metric facility location problem.

24.9CGMar 12
Space-Efficient Approximate Spherical Range Counting in High Dimensions

Andreas Kalavas, Ioannis Psarros

We study the following range searching problem in high-dimensional Euclidean spaces: given a finite set $P\subset \mathbb{R}^d$, where each $p\in P$ is assigned a weight $w_p$, and radius $r>0$, we need to preprocess $P$ into a data structure such that when a new query point $q\in \mathbb{R}^d$ arrives, the data structure reports the cumulative weight of points of $P$ within Euclidean distance $r$ from $q$. Solving the problem exactly seems to require space usage that is exponential to the dimension, a phenomenon known as the curse of dimensionality. Thus, we focus on approximate solutions where points up to $(1+\varepsilon)r$ away from $q$ may be taken into account, where $\varepsilon>0$ is an input parameter known during preprocessing. We build a data structure with near-linear space usage, and query time in $n^{1-Θ(\varepsilon^4/\log(1/\varepsilon))}+t_q^{\varrho}\cdot n^{1-\varrho}$, for some $\varrho=Θ(\varepsilon^2)$, where $t_q$ is the number of points of $P$ in the ambiguity zone, i.e., at distance between $r$ and $(1+\varepsilon)r$ from the query $q$. To the best of our knowledge, this is the first data structure with efficient space usage (subquadratic or near-linear for any $\varepsilon>0$) and query time that remains sublinear for any sublinear $t_q$. We supplement our worst-case bounds with a query-driven preprocessing algorithm to build data structures that are well-adapted to the query distribution.

DSFeb 19, 2025
A Query-Driven Approach to Space-Efficient Range Searching

Dimitris Fotakis, Andreas Kalavas, Ioannis Psarros

We initiate a study of a query-driven approach to designing partition trees for range-searching problems. Our model assumes that a data structure is to be built for an unknown query distribution that we can access through a sampling oracle, and must be selected such that it optimizes a meaningful performance parameter on expectation. Our first contribution is to show that a near-linear sample of queries allows the construction of a partition tree with a near-optimal expected number of nodes visited during querying. We enhance this approach by treating node processing as a classification problem, leveraging fast classifiers like shallow neural networks to obtain experimentally efficient query times. Our second contribution is to develop partition trees using sparse geometric separators. Our preprocessing algorithm, based on a sample of queries, builds a balanced tree with nodes associated with separators that minimize query stabs on expectation; this yields both fast processing of each node and a small number of visited nodes, significantly reducing query time.