LG, Mar 15
Unlearning-based sliding window for continual learning under concept drift
Michal Wozniak, Marek Klonowski, Maciej Maczynski et al.
Traditional machine learning assumes a stationary data distribution, yet many real-world applications operate on nonstationary streams in which the underlying concept evolves over time. This problem can also be viewed as task-free continual learning under concept drift, where a model must adapt sequentially without explicit task identities or task boundaries. In such settings, effective learning requires both rapid adaptation to new data and forgetting of outdated information. A common solution is based on a sliding window, but this approach is often computationally demanding because the model must be repeatedly retrained from scratch on the most recent data. We propose a different perspective based on machine unlearning. Instead of rebuilding the model each time the active window changes, we remove the influence of outdated samples using unlearning and then update the model with newly observed data. This enables efficient, targeted forgetting while preserving adaptation to evolving distributions. To the best of our knowledge, this is the first work to connect machine unlearning with concept drift mitigation for task-free continual learning. Empirical results on image stream classification across multiple drift scenarios demonstrate that the proposed approach offers a competitive and computationally efficient alternative to standard sliding-window retraining. Our implementation can be found at https://anonymous.4open.science/r/MUNDataStream-60F3.
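The window update described in the abstract (unlearn the sample that leaves the window, then learn the one that enters) can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a nearest-centroid classifier, chosen only because its per-class sums make exact unlearning trivial, and the names `CentroidClassifier` and `UnlearningWindow` are hypothetical.

```python
from collections import deque


class CentroidClassifier:
    """Toy nearest-centroid model (an assumption, not the paper's model).

    It stores per-class feature sums and counts, so the contribution of a
    single sample can be added or removed exactly.
    """

    def __init__(self, dim):
        self.dim = dim
        self.sums = {}    # label -> per-feature sum
        self.counts = {}  # label -> number of samples

    def learn(self, x, y):
        s = self.sums.setdefault(y, [0.0] * self.dim)
        for i, v in enumerate(x):
            s[i] += v
        self.counts[y] = self.counts.get(y, 0) + 1

    def unlearn(self, x, y):
        # Subtract the sample's contribution; drop the class if it empties.
        s = self.sums[y]
        for i, v in enumerate(x):
            s[i] -= v
        self.counts[y] -= 1
        if self.counts[y] == 0:
            del self.sums[y], self.counts[y]

    def predict(self, x):
        def sq_dist(label):
            n = self.counts[label]
            return sum((v - s / n) ** 2 for v, s in zip(x, self.sums[label]))
        return min(self.counts, key=sq_dist)


class UnlearningWindow:
    """Sliding window that unlearns the expired sample instead of retraining."""

    def __init__(self, model, size):
        self.model = model
        self.size = size
        self.window = deque()

    def update(self, x, y):
        if len(self.window) == self.size:
            old_x, old_y = self.window.popleft()
            self.model.unlearn(old_x, old_y)  # forget the outdated sample
        self.window.append((x, y))
        self.model.learn(x, y)                # adapt to the new sample


# Usage: the labels swap halfway through the stream (a sudden drift).
stream = UnlearningWindow(CentroidClassifier(dim=1), size=4)
for x, y in [([0.0], "a"), ([10.0], "b")] * 2:
    stream.update(x, y)
print(stream.model.predict([1.0]))  # "a" under the old concept
for x, y in [([0.0], "b"), ([10.0], "a")] * 2:
    stream.update(x, y)
print(stream.model.predict([1.0]))  # "b" once the window holds only new data
```

Each update costs one unlearn and one learn step, regardless of the window size. For models without such sufficient statistics, unlearning is generally approximate, which is the harder setting the abstract targets.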
CL, May 8, 2025
Unpacking Robustness in Inflectional Languages: Adversarial Evaluation and Mechanistic Insights
Paweł Walkowiak, Marek Klonowski, Marcin Oleksy et al.
Various techniques are used to generate adversarial examples, including methods such as TextBugger, which introduce minor, hardly visible perturbations to words that lead to changes in model behaviour. Another class of techniques substitutes words with their synonyms in a way that preserves the text's meaning but alters its predicted class, with TextFooler being a prominent example of such attacks. Most adversarial example generation methods are developed and evaluated primarily on non-inflectional languages, typically English. In this work, we evaluate and explain how adversarial attacks perform in inflectional languages. To explain the impact of inflection on model behaviour and its robustness under attack, we designed a novel protocol inspired by mechanistic interpretability, based on the Edge Attribution Patching (EAP) method. The proposed evaluation protocol relies on parallel task-specific corpora that include both inflected and syncretic variants of texts in two languages -- Polish and English. To analyse the models and explain the relationship between inflection and adversarial robustness, we create a new benchmark based on the task-oriented dataset MultiEmo, enabling the identification of inflection-related elements of circuits within the model and the analysis of their behaviour under attack.
LG, Jul 15, 2025
How to Protect Models against Adversarial Unlearning?
Patryk Jasiorski, Marek Klonowski, Michał Woźniak
AI models need to be unlearned to fulfill the requirements of legal acts such as the AI Act or GDPR, and also to remove toxic content, reduce bias, eliminate the impact of malicious instances, or respond to changes in the data distribution in which a model operates. Unfortunately, removing knowledge may cause undesirable side effects, such as a deterioration in model performance. In this paper, we investigate the problem of adversarial unlearning, where a malicious party intentionally sends unlearning requests to degrade the model's performance as much as possible. We show that this phenomenon and the adversary's capabilities depend on many factors, primarily on the backbone model itself and on the strategy for, and limitations on, selecting data to be unlearned. The main result of this work is a new method of protecting model performance from these side effects, whether the unlearning results from spontaneous processes or from adversarial actions.
CR, Mar 25, 2020
Probabilistic Counters for Privacy Preserving Data Aggregation
Dominik Bojko, Krzysztof Grining, Marek Klonowski
Probabilistic counters are well-known tools often used for space-efficient set cardinality estimation. In this paper, we investigate probabilistic counters from the perspective of preserving privacy. We use the standard, rigid differential privacy notion. The intuition is that probabilistic counters do not reveal too much information about individuals but provide only general information about the population. Therefore, they can be used safely without violating the privacy of individuals. However, it turns out that providing a precise, formal analysis of the privacy parameters of probabilistic counters is surprisingly difficult and requires advanced techniques and a very careful approach. We demonstrate that probabilistic counters can be used as a privacy protection mechanism without extra randomization. Namely, the inherent randomization from the protocol is sufficient for protecting privacy, even if the probabilistic counter is used multiple times. In particular, we present a specific privacy-preserving data aggregation protocol based on the Morris Counter and the MaxGeo Counter. Some of the presented results are devoted to counters that have not been investigated so far from the perspective of privacy protection. Others improve on previous results. We show how our results can be used to perform distributed surveys and compare the properties of counter-based solutions with those of the standard Laplace method.
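For readers unfamiliar with the two counters named above, the following is a minimal sketch of their textbook forms, not the paper's aggregation protocol and with no privacy analysis attached. The class names, the `rng` parameter, and the crude `2 ** max_value` estimate for the MaxGeo counter are assumptions made for illustration.

```python
import random


class MorrisCounter:
    """Approximate counter: stores c, roughly log2 of the number of events."""

    def __init__(self, rng=None):
        self.c = 0
        self.rng = rng or random.Random()

    def increment(self):
        # Advance the stored value with probability 2^(-c).
        if self.rng.random() < 2.0 ** -self.c:
            self.c += 1

    def estimate(self):
        # 2^c - 1 is the standard unbiased estimate of the event count.
        return 2 ** self.c - 1


class MaxGeoCounter:
    """Keeps the maximum of independent Geometric(1/2) draws, one per event."""

    def __init__(self, rng=None):
        self.max_value = 0
        self.rng = rng or random.Random()

    def increment(self):
        g = 1
        while self.rng.random() < 0.5:  # count trials until the first failure
            g += 1
        self.max_value = max(self.max_value, g)

    def estimate(self):
        # Crude order-of-magnitude estimate (assumption for illustration);
        # practical estimators add bias correction and averaging.
        return 2 ** self.max_value


# Usage: both counters observe the same 10,000 events.
morris, maxgeo = MorrisCounter(), MaxGeoCounter()
for _ in range(10_000):
    morris.increment()
    maxgeo.increment()
print(morris.c, morris.estimate(), maxgeo.max_value, maxgeo.estimate())
```

In both cases the stored state is a small integer that many different input sizes can produce, which is the inherent randomization the abstract refers to. A single counter has high variance, so the printed estimates can differ substantially from 10,000.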
CR, May 25, 2016
Towards Extending Noiseless Privacy -- Dependent Data and More Practical Approach
Krzysztof Grining, Marek Klonowski
In 2011 Bhaskar et al. pointed out that in many cases one can ensure a sufficient level of privacy without adding noise by utilizing adversarial uncertainty. Informally speaking, this observation comes from the fact that if at least a part of the data is randomized from the adversary's point of view, it can be effectively used for hiding other values. So far, the treatment of this idea in the literature has been mostly asymptotic, which has greatly limited its adoption in real-life scenarios. In this paper we aim to make the concept of utilizing adversarial uncertainty not only an interesting theoretical idea but also a practically useful technique, complementary to differential privacy, which is the state-of-the-art definition of privacy. This requires non-asymptotic privacy guarantees and a more realistic approach to the randomness inherently present in the data and to the adversary's knowledge. We extend the concept proposed by Bhaskar et al. and present results for a wider class of data. In particular, we cover data sets that are dependent. We also introduce a rigorous adversarial model. Moreover, in contrast to most previous papers in this field, we give detailed (non-asymptotic) results, motivated by practical considerations. This required a modified approach and more subtle mathematical tools, including Stein's method, which, to the best of our knowledge, had not been used in privacy research before. Apart from that, we show how to combine adversarial uncertainty with the differential privacy approach and explore the synergy between them, enhancing the privacy already present in the data itself by adding a small amount of noise.
CR, Feb 12, 2016
Practical Fault-Tolerant Data Aggregation
Krzysztof Grining, Marek Klonowski, Piotr Syga
At Financial Cryptography 2012, Chan et al. presented a novel privacy-preserving, fault-tolerant data aggregation protocol. Compared to previous work, their scheme guaranteed provable privacy of individuals and could work even if some number of users refused to participate. In our paper we demonstrate that, despite its merits, their method provides unacceptably low accuracy of aggregated data for a wide range of assumed parameters and cannot be used in the majority of real-life systems. To show this, we use both precise analytic and experimental methods. Additionally, we present a precise data aggregation protocol that provides a provable level of security even in the face of massive node failures. Moreover, the protocol requires significantly less computation (it makes only limited use of heavy cryptography) than most currently known fault-tolerant aggregation protocols and offers better security guarantees, which makes it suitable for systems with limited resources (including sensor networks). To obtain our result, however, we relax the model and allow some limited communication between the nodes.