Simone Moretti

2papers

2 Papers

57.6DSMar 24
Dynamic k-center clustering with lifetimes

Simone Moretti, Paolo Pellizzoni, Andrea Pietracaprina et al.

The $k$-center problem is a fundamental clustering variant with applications in learning systems and data summarization. In several real-world scenarios, the dataset to be clustered is not static, but evolves over time, as new data points arrive and old ones become stale. To account for dynamicity, the $k$-center problem has been mainly studied under the sliding window setting, where only the $N$ most recent points are considered non-stale, or the fully dynamic setting, where arbitrary sequences of point arrivals and deletions without prior notice may occur. In this paper, we introduce the dynamic setting with lifetimes, which bridges the two aforementioned classical settings by still allowing arbitrary arrivals and deletions, but making the deletion time of each point known upon its arrival. Under this new setting, we devise a deterministic $(2+\varepsilon)$-approximation algorithm with $\tilde{O}(k/\varepsilon)$ amortized update time and memory usage linear in the number of currently active points. Moreover, we develop a deterministic $(6+\varepsilon)$-approximation algorithm that, under tame update sequences, has $\tilde{O}(k/\varepsilon)$ worst-case update time and heavily sublinear working memory.

20.3DSMar 20
Dimensionality Reduction on Complex Vector Spaces for Euclidean Distance with Dynamic Weights

Simone Moretti, Paolo Pellizzoni, Francesco Silvestri

The weighted Euclidean norm $\|x\|_w$ of a vector $x\in \mathbb{R}^d$ with weights $w\in \mathbb{R}^d$ is the Euclidean norm where the contribution of each dimension is scaled by a given weight. Approaches to dimensionality reduction that satisfy the Johnson-Lindenstrauss (JL) lemma can be easily adapted to the weighted Euclidean distance if weights are known and fixed: it suffices to scale each dimension of the input vectors according to the weights, and then apply any standard approach. However, this is not the case when weights are unknown during the dimensionality reduction or might dynamically change. In this paper, we address this issue by providing a linear function that maps vectors into a smaller complex vector space and allows to retrieve a JL-like estimate for the weighted Euclidean distance once weights are revealed. Our results are based on the decomposition of the complex dimensionality reduction into several Rademacher chaos random variables, which are studied using novel concentration inequalities for sums of independent Rademacher chaoses.