ITJan 24, 2022
Insertion and Deletion Correction in Polymer-based Data StorageAnisha Banerjee, Antonia Wachter-Zeh, Eitan Yaakobi
Synthetic polymer-based storage seems to be a particularly promising candidate that could help to cope with the ever-increasing demand for archival storage requirements. It involves designing molecules of distinct masses to represent the respective bits $\{0,1\}$, followed by the synthesis of a polymer of molecular units that reflects the order of bits in the information string. Reading out the stored data requires the use of a tandem mass spectrometer, that fragments the polymer into shorter substrings and provides their corresponding masses, from which the \emph{composition}, i.e. the number of $1$s and $0$s in the concerned substring can be inferred. Prior works have dealt with the problem of unique string reconstruction from the set of all possible compositions, called \emph{composition multiset}. This was accomplished either by determining which string lengths always allow unique reconstruction, or by formulating coding constraints to facilitate the same for all string lengths. Additionally, error-correcting schemes to deal with substitution errors caused by imprecise fragmentation during the readout process, have also been suggested. This work builds on this research by generalizing previously considered error models, mainly confined to substitution of compositions. To this end, we define new error models that consider insertions of spurious compositions and deletions of existing ones, thereby corrupting the composition multiset. We analyze if the reconstruction codebook proposed by Pattabiraman \emph{et al.} is indeed robust to such errors, and if not, propose new coding constraints to remedy this.
LGJun 25, 2023
Private Aggregation in Hierarchical Wireless Federated Learning with Partial and Full CollusionMaximilian Egger, Christoph Hofmeister, Antonia Wachter-Zeh et al.
In federated learning, a federator coordinates the training of a model, e.g., a neural network, on privately owned data held by several participating clients. The gradient descent algorithm, a well-known and popular iterative optimization procedure, is run to train the model. Every client computes partial gradients based on their local data and sends them to the federator, which aggregates the results and updates the model. Privacy of the clients' data is a major concern. In fact, it is shown that observing the partial gradients can be enough to reveal the clients' data. Existing literature focuses on private aggregation schemes that tackle the privacy problem in federated learning in settings where all users are connected to each other and to the federator. In this paper, we consider a hierarchical wireless system architecture in which the clients are connected to base stations; the base stations are connected to the federator either directly or through relays. We examine settings with and without relays, and derive fundamental limits on the communication cost under information-theoretic privacy with different collusion assumptions. We introduce suitable private aggregation schemes tailored for these settings whose communication costs are multiplicative factors away from the derived bounds.
ITDec 8, 2022
Secure Over-the-Air Computation using Zero-Forced Artificial NoiseLuis Maßny, Antonia Wachter-Zeh
Over-the-air computation has the potential to increase the communication-efficiency of data-dependent distributed wireless systems, but is vulnerable to eavesdropping. We consider over-the-air computation over block-fading additive white Gaussian noise channels in the presence of a passive eavesdropper. The goal is to design a secure over-the-air computation scheme. We propose a scheme that achieves MSE-security against the eavesdropper by employing zero-forced artificial noise, while keeping the distortion at the legitimate receiver small. In contrast to former approaches, the security does not depend on external helper nodes to jam the eavesdropper's received signal. We thoroughly design the system parameters of the scheme, propose an artificial noise design that harnesses unused transmit power for security, and give an explicit construction rule. Our design approach is applicable in both cases, if the eavesdropper's channel coefficients are known and if they are unknown in the signal design. Simulations demonstrate the performance, and show that our noise design outperforms other methods.
LGJul 16, 2024
Self-Regulating Random Walks for Resilient Decentralized Learning on GraphsMaximilian Egger, Rawad Bitar, Ghadir Ayache et al.
Consider the setting of multiple random walks (RWs) on a graph executing a certain computational task. For instance, in decentralized learning via RWs, a model is updated at each iteration based on the local data of the visited node and then passed to a randomly chosen neighbor. RWs can fail due to node or link failures. The goal is to maintain a desired number of RWs to ensure failure resilience. Achieving this is challenging due to the lack of a central entity to track which RWs have failed to replace them with new ones by forking (duplicating) surviving ones. Without duplications, the number of RWs will eventually go to zero, causing a catastrophic failure of the system. We propose two decentralized algorithms called DecAFork and DecAFork+ that can maintain the number of RWs in the graph around a desired value even in the presence of arbitrary RW failures. Nodes continuously estimate the number of surviving RWs by estimating their return time distribution and fork the RWs when failures are likely to happen. DecAFork+ additionally allows terminations to avoid overloading the network by forking too many RWs. We present extensive numerical simulations that show the performance of DecAFork and DecAFork+ regarding fast detection and reaction to failures compared to a baseline, and establish theoretical guarantees on the performance of both algorithms.
ITSep 25, 2023
Sequential Decoding of Convolutional Codes for Synchronization ErrorsAnisha Banerjee, Andreas Lenz, Antonia Wachter-Zeh
Sequential decoding, commonly applied to substitution channels, is a sub-optimal alternative to Viterbi decoding with significantly reduced memory costs. In this work, a sequential decoder for convolutional codes over channels that are prone to insertion, deletion, and substitution errors, is described and analyzed. Our decoder expands the code trellis by a new channel-state variable, called drift state, as proposed by Davey and MacKay. A suitable decoding metric on that trellis for sequential decoding is derived, generalizing the original Fano metric. The decoder is also extended to facilitate the simultaneous decoding of multiple received sequences that arise from a single transmitted sequence. Under low-noise environments, our decoding approach reduces the decoding complexity by a couple orders of magnitude in comparison to Viterbi's algorithm, albeit at slightly higher bit error rates. An analytical method to determine the computational cutoff rate is also suggested. This analysis is supported with numerical evaluations of bit error rates and computational complexity, which are compared with respect to optimal Viterbi decoding.
ITJul 16, 2024
Scalable and Reliable Over-the-Air Federated Edge LearningMaximilian Egger, Christoph Hofmeister, Cem Kaya et al.
Federated edge learning (FEEL) has emerged as a core paradigm for large-scale optimization. However, FEEL still suffers from a communication bottleneck due to the transmission of high-dimensional model updates from the clients to the federator. Over-the-air computation (AirComp) leverages the additive property of multiple-access channels by aggregating the clients' updates over the channel to save communication resources. While analog uncoded transmission can benefit from the increased signal-to-noise ratio (SNR) due to the simultaneous transmission of many clients, potential errors may severely harm the learning process for small SNRs. To alleviate this problem, channel coding approaches were recently proposed for AirComp in FEEL. However, their error-correction capability degrades with an increasing number of clients. We propose a digital lattice-based code construction with constant error-correction capabilities in the number of clients, and compare to nested-lattice codes, well-known for their optimal rate and power efficiency in the point-to-point AWGN channel.
LGJan 31, 2025
BICompFL: Stochastic Federated Learning with Bi-Directional CompressionMaximilian Egger, Rawad Bitar, Antonia Wachter-Zeh et al.
We address the prominent communication bottleneck in federated learning (FL). We specifically consider stochastic FL, in which models or compressed model updates are specified by distributions rather than deterministic parameters. Stochastic FL offers a principled approach to compression, and has been shown to reduce the communication load under perfect downlink transmission from the federator to the clients. However, in practice, both the uplink and downlink communications are constrained. We show that bi-directional compression for stochastic FL has inherent challenges, which we address by introducing BICompFL. Our BICompFL is experimentally shown to reduce the communication cost by an order of magnitude compared to multiple benchmarks, while maintaining state-of-the-art accuracies. Theoretically, we study the communication cost of BICompFL through a new analysis of an importance-sampling based technique, which exposes the interplay between uplink and downlink communication costs.
38.8ITApr 1
SynDe: Syndrome-guided Decoding of Raw Nanopore ReadsAnisha Banerjee, Roman Sokolovskii, Thomas Heinis et al.
Nanopore sequencing technology remains highly error-prone, making efficient error correction essential in DNA-based data storage. Prior work addressed high error rates using convolutional codes with their decoder coupled with the basecaller, but such approaches only accommodate a limited number of code classes and incur significant decoding complexity. To overcome these limitations, we propose two algorithms: PrimerSeeker, which efficiently detects primer sequences in raw nanopore sequencing reads, and SynDe, a decoder that operates on the same raw reads and supports any linear error correction code with a low-complexity graphical representation. PrimerSeeker provides primer location estimates close to those of existing approaches while being better suited for real-time primer detection during sequencing. SynDe performs well with convolutional codes augmented with periodic markers, often approaching or exceeding the performance of existing algorithms with a lower time complexity. Remarkably, the confidence scores produced by SynDe reliably identify which of its outputs should be discarded.
CRMay 11, 2025
Source Anonymity for Private Random Walk Decentralized LearningMaximilian Egger, Svenja Lage, Rawad Bitar et al.
This paper considers random walk-based decentralized learning, where at each iteration of the learning process, one user updates the model and sends it to a randomly chosen neighbor until a convergence criterion is met. Preserving data privacy is a central concern and open problem in decentralized learning. We propose a privacy-preserving algorithm based on public-key cryptography and anonymization. In this algorithm, the user updates the model and encrypts the result using a distant user's public key. The encrypted result is then transmitted through the network with the goal of reaching that specific user. The key idea is to hide the source's identity so that, when the destination user decrypts the result, it does not know who the source was. The challenge is to design a network-dependent probability distribution (at the source) over the potential destinations such that, from the receiver's perspective, all users have a similar likelihood of being the source. We introduce the problem and construct a scheme that provides anonymity with theoretical guarantees. We focus on random regular graphs to establish rigorous guarantees.
LGMay 9, 2023
FedGT: Identification of Malicious Clients in Federated Learning with Secure AggregationMarvin Xhemrishi, Johan Östman, Antonia Wachter-Zeh et al.
We propose FedGT, a novel framework for identifying malicious clients in federated learning with secure aggregation. Inspired by group testing, the framework leverages overlapping groups of clients to identify the presence of malicious clients in the groups via a decoding operation. The clients identified as malicious are then removed from the model training, which is performed over the remaining clients. By choosing the size, number, and overlap between groups, FedGT strikes a balance between privacy and security. Specifically, the server learns the aggregated model of the clients in each group - vanilla federated learning and secure aggregation correspond to the extreme cases of FedGT with group size equal to one and the total number of clients, respectively. The effectiveness of FedGT is demonstrated through extensive experiments on the MNIST, CIFAR-10, and ISIC2019 datasets in a cross-silo setting under different data-poisoning attacks. These experiments showcase FedGT's ability to identify malicious clients, resulting in high model utility. We further show that FedGT significantly outperforms the private robust aggregation approach based on the geometric median recently proposed by Pillutla et al. in multiple settings.
ITFeb 28, 2022
Computational Code-Based Privacy in Coded Federated LearningMarvin Xhemrishi, Alexandre Graell i Amat, Eirik Rosnes et al.
We propose a privacy-preserving federated learning (FL) scheme that is resilient against straggling devices. An adaptive scenario is suggested where the slower devices share their data with the faster ones and do not participate in the learning process. The proposed scheme employs code-based cryptography to ensure \emph{computational} privacy of the private data, i.e., no device with bounded computational power can obtain information about the other devices' data in feasible time. For a scenario with 25 devices, the proposed scheme achieves a speed-up of 4.7 and 4 for 92 and 128 bits security, respectively, for an accuracy of 95\% on the MNIST dataset compared with conventional mini-batch FL.
ITFeb 16, 2022
Cost-Efficient Distributed Learning via Combinatorial Multi-Armed BanditsMaximilian Egger, Rawad Bitar, Antonia Wachter-Zeh et al.
We consider the distributed SGD problem, where a main node distributes gradient calculations among $n$ workers. By assigning tasks to all the workers and waiting only for the $k$ fastest ones, the main node can trade-off the algorithm's error with its runtime by gradually increasing $k$ as the algorithm evolves. However, this strategy, referred to as adaptive $k$-sync, neglects the cost of unused computations and of communicating models to workers that reveal a straggling behavior. We propose a cost-efficient scheme that assigns tasks only to $k$ workers, and gradually increases $k$. We introduce the use of a combinatorial multi-armed bandit model to learn which workers are the fastest while assigning gradient calculations. Assuming workers with exponentially distributed response times parameterized by different means, we give empirical and theoretical guarantees on the regret of our strategy, i.e., the extra time spent to learn the mean response times of the workers. Furthermore, we propose and analyze a strategy applicable to a large class of response time distributions. Compared to adaptive $k$-sync, our scheme achieves significantly lower errors with the same computational efforts and less downlink communication while being inferior in terms of speed.
ITDec 4, 2021
Analysis of Communication Channels Related to Physical Unclonable FunctionsGeorg Maringer, Marvin Xhemrishi, Sven Puchinger et al.
Cryptographic algorithms rely on the secrecy of their corresponding keys. On embedded systems with standard CMOS chips, where secure permanent memory such as flash is not available as a key storage, the secret key can be derived from Physical Unclonable Functions (PUFs) that make use of minuscule manufacturing variations of, for instance, SRAM cells. Since PUFs are affected by environmental changes, the reliable reproduction of the PUF key requires error correction. For silicon PUFs with binary output, errors occur in the form of bitflips within the PUFs response. Modelling the channel as a Binary Symmetric Channel (BSC) with fixed crossover probability $p$ is only a first-order approximation of the real behavior of the PUF response. We propose a more realistic channel model, refered to as the Varying Binary Symmetric Channel (VBSC), which takes into account that the reliability of different PUF response bits may not be equal. We investigate its channel capacity for various scenarios which differ in the channel state information (CSI) present at encoder and decoder. We compare the capacity results for the VBSC for the different CSI cases with reference to the distribution of the bitflip probability according a work by Maes et al.
ITSep 21, 2020
On Software Implementation of Gabidulin DecodersJohannes Kunz, Julian Renner, Georg Maringer et al.
This work compares the performance of software implementations of different Gabidulin decoders. The parameter sets used within the comparison stem from their applications in recently proposed cryptographic schemes. The complexity analysis of the decoders is recalled, counting the occurrence of each operation within the respective decoders. It is shown that knowing the number of operations may be misleading when comparing different algorithms as the run-time of the implementation depends on the instruction set of the device on which the algorithm is executed.
ITSep 18, 2020
Information- and Coding-Theoretic Analysis of the RLWE ChannelGeorg Maringer, Sven Puchinger, Antonia Wachter-Zeh
Several cryptosystems based on the \emph{Ring Learning with Errors} (RLWE) problem have been proposed within the NIST post-quantum cryptography standardization process, e.g., NewHope. Furthermore, there are systems like Kyber which are based on the closely related MLWE assumption. Both previously mentioned schemes result in a non-zero decryption failure rate (DFR). The combination of encryption and decryption for these kinds of algorithms can be interpreted as data transmission over a noisy channel. To the best of our knowledge this paper is the first work that analyzes the capacity of this channel. We show how to modify the encryption schemes such that the input alphabets of the corresponding channels are increased. In particular, we present lower bounds on their capacities which show that the transmission rate can be significantly increased compared to standard proposals in the literature. Furthermore, under the common assumption of stochastically independent coefficient failures, we give lower bounds on achievable rates based on both the Gilbert-Varshamov bound and concrete code constructions using BCH codes. By means of our constructions, we can either increase the total bitrate (by a factor of $1.84$ for Kyber and by factor of $7$ for NewHope) while guaranteeing the same DFR or for the same bitrate, we can significantly reduce the DFR for all schemes considered in this work (e.g., for NewHope from $2^{-216}$ to $2^{-12769}$).
ITJan 14, 2020
Low-Rank Parity-Check Codes over the Ring of Integers Modulo a Prime PowerJulian Renner, Sven Puchinger, Antonia Wachter-Zeh et al.
We define and analyze low-rank parity-check (LRPC) codes over extension rings of the finite chain ring $\mathbb{Z}_{p^r}$, where $p$ is a prime and $r$ is a positive integer. LRPC codes have originally been proposed by Gaborit et al.(2013) over finite fields for cryptographic applications. The adaption to finite rings is inspired by a recent paper by Kamche et al. (2019), which constructed Gabidulin codes over finite principle ideal rings with applications to space-time codes and network coding. We give a decoding algorithm based on simple linear-algebraic operations. Further, we derive an upper bound on the failure probability of the decoder. The upper bound is valid for errors whose rank is equal to the free rank.
ITNov 29, 2019
Randomized Decoding of Gabidulin Codes Beyond the Unique Decoding RadiusJulian Renner, Thomas Jerkovits, Hannes Bartz et al.
We address the problem of decoding Gabidulin codes beyond their unique error-correction radius. The complexity of this problem is of importance to assess the security of some rank-metric code-based cryptosystems. We propose an approach that introduces row or column erasures to decrease the rank of the error in order to use any proper polynomial-time Gabidulin code error-erasure decoding algorithm. This approach improves on generic rank-metric decoders by an exponential factor.
ITJan 29, 2019
Interleaving Loidreau's Rank-Metric CryptosystemJulian Renner, Sven Puchinger, Antonia Wachter-Zeh
We propose and analyze an interleaved variant of Loidreau's rank-metric cryptosystem based on rank multipliers. We analyze and adapt several attacks on the system, propose design rules, and study weak keys. Finding secure instances requires near-MRD rank-metric codes which are not investigated in the literature. Thus, we propose a random code construction that makes use of the fact that short random codes over large fields are MRD with high probability. We derive an upper bound on the decryption failure rate and give example parameters for potential key size reduction.
CRDec 12, 2018
LIGA: A Cryptosystem Based on the Hardness of Rank-Metric List and Interleaved DecodingJulian Renner, Sven Puchinger, Antonia Wachter-Zeh
We propose the new rank-metric code-based cryptosystem LIGA which is based on the hardness of list decoding and interleaved decoding of Gabidulin codes. LIGA is an improved variant of the Faure-Loidreau (FL) system, which was broken in a structural attack by Gaborit, Otmani, and Talé Kalachi (GOT, 2018). We keep the FL encryption and decryption algorithms, but modify the insecure key generation algorithm. Our crucial observation is that the GOT attack is equivalent to decoding an interleaved Gabidulin code. The new key generation algorithm constructs public keys for which all polynomial-time interleaved decoders fail---hence LIGA resists the GOT attack. We also prove that the public-key encryption version of LIGA is IND-CPA secure in the standard model and the KEM version is IND-CCA2 secure in the random oracle model, both under hardness assumptions of formally defined problems related to list decoding and interleaved decoding of Gabidulin codes. We propose and analyze various exponential-time attacks on these problems, calculate their work factors, and compare the resulting parameters to NIST proposals. The strengths of LIGA are short ciphertext sizes and (relatively) small key sizes. Further, LIGA guarantees correct decryption and has no decryption failure rate. It is not based on hiding the structure of a code. Since there are efficient and constant-time algorithms for encoding and decoding Gabidulin codes, timing attacks on the encryption and decryption algorithms can be easily prevented.
ITSep 9, 2018
A Public-Key Cryptosystem from Interleaved Goppa CodesMolka Elleuch, Antonia Wachter-Zeh, Alexander Zeh
In this paper, a code-based public-key cryptosystem based on interleaved Goppa codes is presented. The scheme is based on encrypting several ciphertexts with the same Goppa code and adding a burst error to them. Possible attacks are outlined and the key size of several choices of parameters is compared to those of known schemes for the same security level. For example, for security level 128 bits, we obtain a key size of 696 Kbits whereas the classical McEliece scheme based on Goppa codes using list decoding requires a key size of 1935 Kbits.
ITJun 26, 2018
Twisted Gabidulin Codes in the GPT CryptosystemSven Puchinger, Julian Renner, Antonia Wachter-Zeh
In this paper, we investigate twisted Gabidulin codes in the GPT code-based public-key cryptosystem. We show that Overbeck's attack is not feasible for a subfamily of twisted Gabidulin codes. The resulting key sizes are significantly lower than in the original McEliece system and also slightly smaller than in Loidreau's unbroken GPT variant.
CRJan 11, 2018
Repairing the Faure-Loidreau Public-Key CryptosystemAntonia Wachter-Zeh, Sven Puchinger, Julian Renner
A repair of the Faure-Loidreau (FL) public-key code-based cryptosystem is proposed. The FL cryptosystem is based on the hardness of list decoding Gabidulin codes which are special rank-metric codes. We prove that the recent structural attack on the system by Gaborit et al. is equivalent to decoding an interleaved Gabidulin code. Since all known polynomial-time decoders for these codes fail for a large constructive class of error patterns, we are able to construct public keys that resist the attack. It is also shown that all other known attacks fail for our repair and parameter choices. Compared to other code-based cryptosystems, we obtain significantly smaller key sizes for the same security level.