ITMay 21
Improved Torn Paper Coding via Local AlignmentJunsheng Liu, Netanel Raviv
In the torn paper channel, a transmitted codeword is broken at random locations into fragments that arrive at the decoder in an unordered manner. A central theoretical challenge within this model is global alignment -- the task of determining each fragment's original position -- in order to faithfully reconstruct the entire codeword. Prior work by Shomorony and Vahid introduced an interleaved-pilot scheme that successfully achieved a vanishing error probability. However, their alignment strategy relies heavily on global statistics, requiring fragments to exceed a minimum length and effectively discarding many shorter ones as erasures, which results in rates significantly below capacity. To address this gap, we propose an improved coding scheme that achieves a provable rate increase through a novel approach we call \textit{local alignment}. This approach identifies global alignment bits within each fragment using only local information, allowing the decoder to determine the positions of fragments that are shorter than those used in previous work. Consequently, the decoder can extract information from a much larger fraction of the channel output than in previous work, yielding significantly higher rates. Furthermore, we extend our analysis to torn paper coding with lost pieces (TPC-LP), a generalized model that accounts for length-dependent fragment deletion. For a class of TPC-LP channels that delete all fragments below a logarithmic length threshold while allowing arbitrary length-dependent deletion probabilities for longer fragments, we show that the proposed local alignment strategy achieves an arbitrarily small additive gap to capacity as the threshold increases.
ITMay 7, 2024
Explicit Formula for Partial Information DecompositionAobo Lyu, Andrew Clark, Netanel Raviv
Mutual information between two random variables is a well-studied notion, whose understanding is fairly complete. Mutual information between one random variable and a pair of other random variables, however, is a far more involved notion. Specifically, Shannon's mutual information does not capture fine-grained interactions between those three variables, resulting in limited insights in complex systems. To capture these fine-grained interactions, in 2010 Williams and Beer proposed to decompose this mutual information to information atoms, called unique, redundant, and synergistic, and proposed several operational axioms that these atoms must satisfy. In spite of numerous efforts, a general formula which satisfies these axioms has yet to be found. Inspired by Judea Pearl's do-calculus, we resolve this open problem by introducing the do-operation, an operation over the variable system which sets a certain marginal to a desired value, which is distinct from any existing approaches. Using this operation, we provide the first explicit formula for calculating the information atoms so that Williams and Beer's axioms are satisfied, as well as additional properties from subsequent studies in the field.
ITApr 4
Structural Impossibility of Antichain-Lattice Partial Information DecompositionAobo Lyu, Andrew Clark, Netanel Raviv
Partial Information Decomposition (PID) represents multivariate mutual information via antichain-lattice that aims to specify which source groups can recover which informational components of a target. For three or more sources, widely desired PID axioms become mutually incompatible. This is often treated as an axiomatic tuning issue. This paper argues that the obstruction is representational, rooted in the antichain indexing itself, so that purely axiomatic adjustments within an antichain-lattice structure cannot resolve it in general. We first introduce System Information Decomposition (SID) for the special target-free three-variable setting, obtaining a self-consistent entropy decomposition with an operational redundancy definition. More fundamentally, we then show that for general multivariate PID, there is no universal rule that recovers the decomposed mutual information from the antichain-indexed information atoms. In particular, two systems can share identical atoms regardless of any axioms while having different mutual information. These results reveal the limits of antichain-lattice and motivate relation-based foundations for multivariate information measures.
LGNov 15, 2023
Gram-Schmidt Methods for Unsupervised Feature Extraction and SelectionBahram Yaghooti, Netanel Raviv, Bruno Sinopoli
Feature extraction and selection in the presence of nonlinear dependencies among the data is a fundamental challenge in unsupervised learning. We propose using a Gram-Schmidt (GS) type orthogonalization process over function spaces to detect and map out such dependencies. Specifically, by applying the GS process over some family of functions, we construct a series of covariance matrices that can either be used to identify new large-variance directions, or to remove those dependencies from known directions. In the former case, we provide information-theoretic guarantees in terms of entropy reduction. In the latter, we provide precise conditions by which the chosen function family eliminates existing redundancy in the data. Each approach provides both a feature extraction and a feature selection algorithm. Our feature extraction methods are linear, and can be seen as natural generalization of principal component analysis (PCA). We provide experimental results for synthetic and real-world benchmark datasets which show superior performance over state-of-the-art (linear) feature extraction and selection algorithms. Surprisingly, our linear feature extraction algorithms are comparable and often outperform several important nonlinear feature extraction methods such as autoencoders, kernel PCA, and UMAP. Furthermore, one of our feature selection algorithms strictly generalizes a recent Fourier-based feature selection mechanism (Heidari et al., IEEE Transactions on Information Theory, 2022), yet at significantly reduced complexity.
ITFeb 11
Multivariate Partial Information Decomposition: Constructions, Inconsistencies, and Alternative MeasuresAobo Lyu, Andrew Clark, Netanel Raviv
While mutual information effectively quantifies dependence between two variables, it does not by itself reveal the complex, fine-grained interactions among variables, i.e., how multiple sources contribute redundantly, uniquely, or synergistically to a target in multivariate settings. The Partial Information Decomposition (PID) framework was introduced to address this by decomposing the mutual information between a set of source variables and a target variable into fine-grained information atoms such as redundant, unique, and synergistic components. In this work, we review the axiomatic system and desired properties of the PID framework and make three main contributions. First, we resolve the two-source PID case by providing explicit closed-form formulas for all information atoms that satisfy the full set of axioms and desirable properties. Second, we prove that for three or more sources, PID suffers from fundamental inconsistencies: we review the known three-variable counterexample where the sum of atoms exceeds the total information, and extend it to a comprehensive impossibility theorem showing that no lattice-based decomposition can be consistent for all subsets when the number of sources exceeds three. Finally, we deviate from the PID lattice approach to avoid its inconsistencies, and present explicit measures of multivariate unique and synergistic information. Our proposed measures, which rely on new systems of random variables that eliminate higher-order dependencies, satisfy key axioms such as additivity and continuity, provide a robust theoretical explanation of high-order relations, and show strong numerical performance in comprehensive experiments on the Ising model. Our findings highlight the need for a new framework for studying multivariate information decomposition.
ITNov 3, 2025
Efficient Vector Symbolic Architectures from Histogram RecoveryZirui Deng, Netanel Raviv
Vector symbolic architectures (VSAs) are a family of information representation techniques which enable composition, i.e., creating complex information structures from atomic vectors via binding and superposition, and have recently found wide ranging applications in various neurosymbolic artificial intelligence (AI) systems. Recently, Raviv proposed the use of random linear codes in VSAs, suggesting that their subcode structure enables efficient binding, while preserving the quasi-orthogonality that is necessary for neural processing. Yet, random linear codes are difficult to decode under noise, which severely limits the resulting VSA's ability to support recovery, i.e., the retrieval of information objects and their attributes from a noisy compositional representation. In this work we bridge this gap by utilizing coding theoretic tools. First, we argue that the concatenation of Reed-Solomon and Hadamard codes is suitable for VSA, due to the mutual quasi-orthogonality of the resulting codewords (a folklore result). Second, we show that recovery of the resulting compositional representations can be done by solving a problem we call histogram recovery. In histogram recovery, a collection of $N$ histograms over a finite field is given as input, and one must find a collection of Reed-Solomon codewords of length $N$ whose entry-wise symbol frequencies obey those histograms. We present an optimal solution to the histogram recovery problem by using algorithms related to list-decoding, and analyze the resulting noise resilience. Our results give rise to a noise-resilient VSA with formal guarantees regarding efficient encoding, quasi-orthogonality, and recovery, without relying on any heuristics or training, and while operating at improved parameters relative to similar solutions such as the Hadamard code.
ITMay 11
Closed-Form Gaussian Estimators for Multi-Source Partial Information DecompositionAobo Lyu, Andrew Clark, Netanel Raviv
Computing multi-source partial information decomposition (PID) for continuous data is hard: existing closed-form Gaussian estimators are restricted to two source variables, while continuous arbitrary-source estimators are typically learning-based and do not provide closed-form expressions. To address this, we develop closed-form Gaussian estimators for multi-source PID. We provide two-source redundancy, multi-source unique information, the K-th order synergistic effect from source subsets of size K, and the total synergistic effect. The estimators are derived from the conditional-independence-based information measures introduced in our earlier work, under which every quantity reduces to a log-determinant expression in covariance blocks of the system. The resulting estimator is plug-in consistent, affine invariant, source-permutation symmetric, and additive over independent systems. We validate it on a controlled Gaussian benchmark, evaluate its computational efficiency against baselines, and confirm its numerical stability in finite-sample regimes. To our knowledge, this is the first covariance-based closed-form estimator that provides multi-source continuous PID measures.
CRJun 14, 2021
Multivariate Public Key Cryptosystem from Sidon SpacesNetanel Raviv, Ben Langton, Itzhak Tamo
A Sidon space is a subspace of an extension field over a base field in which the product of any two elements can be factored uniquely, up to constants. This paper proposes a new public-key cryptosystem of the multivariate type which is based on Sidon spaces, and has the potential to remain secure even if quantum supremacy is attained. This system, whose security relies on the hardness of the well-known MinRank problem, is shown to be resilient to several straightforward algebraic attacks. In particular, it is proved that the two popular attacks on the MinRank problem, the kernel attack, and the minor attack, succeed only with exponentially small probability. The system is implemented in software, and its hardness is demonstrated experimentally.
LGJun 8, 2021
Enhancing Robustness of Neural Networks through Fourier StabilizationNetanel Raviv, Aidan Kelley, Michael Guo et al.
Despite the considerable success of neural networks in security settings such as malware detection, such models have proved vulnerable to evasion attacks, in which attackers make slight changes to inputs (e.g., malware) to bypass detection. We propose a novel approach, \emph{Fourier stabilization}, for designing evasion-robust neural networks with binary inputs. This approach, which is complementary to other forms of defense, replaces the weights of individual neurons with robust analogs derived using Fourier analytic tools. The choice of which neurons to stabilize in a neural network is then a combinatorial optimization problem, and we propose several methods for approximately solving it. We provide a formal bound on the per-neuron drop in accuracy due to Fourier stabilization, and experimentally demonstrate the effectiveness of the proposed approach in boosting robustness of neural networks in several detection settings. Moreover, we show that our approach effectively composes with adversarial training.
CROct 30, 2020
Low Latency Cross-Shard Transactions in Coded BlockchainCanran Wang, Netanel Raviv
Although blockchain, the supporting technology of Bitcoin and various cryptocurrencies, has offered a potentially effective framework for numerous applications, it still suffers from the adverse affects of the impossibility triangle. Performance, security, and decentralization of blockchains normally do not scale simultaneously with the number of participants in the network. The recent introduction of error correcting codes in sharded blockchain by Li et al. partially settles this trilemma, boosting throughput without compromising security and decentralization. In this paper, we improve the coded sharding scheme in three ways. First, we propose a novel 2-Dimensional Sharding strategy, which inherently supports cross-shard transactions, alleviating the need for complicated inter-shard communication protocols. Second, we employ distributed storage techniques in the propagation of blocks, improving latency under restricted bandwidth. Finally, we incorporate polynomial cryptographic primitives of low degree, which brings coded blockchain techniques into the realm of feasible real-world parameters.
LGApr 22, 2020
CodNN -- Robust Neural Networks From Coded ClassificationNetanel Raviv, Siddharth Jain, Pulakesh Upadhyaya et al.
Deep Neural Networks (DNNs) are a revolutionary force in the ongoing information revolution, and yet their intrinsic properties remain a mystery. In particular, it is widely known that DNNs are highly sensitive to noise, whether adversarial or random. This poses a fundamental challenge for hardware implementations of DNNs, and for their deployment in critical applications such as autonomous driving. In this paper we construct robust DNNs via error correcting codes. By our approach, either the data or internal layers of the DNN are coded with error correcting codes, and successful computation under noise is guaranteed. Since DNNs can be seen as a layered concatenation of classification tasks, our research begins with the core task of classifying noisy coded inputs, and progresses towards robust DNNs. We focus on binary data and linear codes. Our main result is that the prevalent parity code can guarantee robustness for a large family of DNNs, which includes the recently popularized binarized neural networks. Further, we show that the coded classification problem has a deep connection to Fourier analysis of Boolean functions. In contrast to existing solutions in the literature, our results do not rely on altering the training process of the DNN, and provide mathematically rigorous guarantees rather than experimental evidence.
LGJan 9, 2020
What is the Value of Data? On Mathematical Methods for Data Quality EstimationNetanel Raviv, Siddharth Jain, Jehoshua Bruck
Data is one of the most important assets of the information age, and its societal impact is undisputed. Yet, rigorous methods of assessing the quality of data are lacking. In this paper, we propose a formal definition for the quality of a given dataset. We assess a dataset's quality by a quantity we call the expected diameter, which measures the expected disagreement between two randomly chosen hypotheses that explain it, and has recently found applications in active learning. We focus on Boolean hyperplanes, and utilize a collection of Fourier analytic, algebraic, and probabilistic methods to come up with theoretical guarantees and practical solutions for the computation of the expected diameter. We also study the behaviour of the expected diameter on algebraically structured datasets, conduct experiments that validate this notion of quality, and demonstrate the feasibility of our techniques.
ITDec 10, 2018
Private Polynomial Computation from Lagrange EncodingNetanel Raviv, David A. Karpuk
Private computation is a generalization of private information retrieval, in which a user is able to compute a function on a distributed dataset without revealing the identity of that function to the servers. In this paper it is shown that Lagrange encoding, a powerful technique for encoding Reed-Solomon codes, enables private computation in many cases of interest. In particular, we present a scheme that enables private computation of polynomials of any degree on Lagrange encoded data, while being robust to Byzantine and straggling servers, and to servers colluding to attempt to deduce the identities of the functions to be evaluated. Moreover, incorporating ideas from the well-known Shamir secret sharing scheme allows the data itself to be concealed from the servers as well. Our results extend private computation to high degree polynomials and to data-privacy, and reveal a tight connection between private computation and coded computation.
ITDec 4, 2018
Private Information Retrieval in Graph Based Replication SystemsNetanel Raviv, Itzhak Tamo, Eitan Yaakobi
In a Private Information Retrieval (PIR) protocol, a user can download a file from a database without revealing the identity of the file to each individual server. A PIR protocol is called $t$-private if the identity of the file remains concealed even if $t$ of the servers collude. Graph based replication is a simple technique, which is prevalent in both theory and practice, for achieving erasure robustness in storage systems. In this technique each file is replicated on two or more storage servers, giving rise to a (hyper-)graph structure. In this paper we study private information retrieval protocols in graph based replication systems. The main interest of this work is maximizing the parameter $t$, and in particular, understanding the structure of the colluding sets which emerge in a given graph. Our main contribution is a $2$-replication scheme which guarantees perfect privacy from acyclic sets in the graph, and guarantees partial-privacy in the presence of cycles. Furthermore, by providing an upper bound, it is shown that the PIR rate of this scheme is at most a factor of two from its optimal value for an important family of graphs. Lastly, we extend our results to larger replication factors and to graph-based coding, which is a similar technique with smaller storage overhead and larger PIR rate.
ITJun 4, 2018
Lagrange Coded Computing: Optimal Design for Resiliency, Security and PrivacyQian Yu, Songze Li, Netanel Raviv et al.
We consider a scenario involving computations over a massive dataset stored distributedly across multiple workers, which is at the core of distributed learning algorithms. We propose Lagrange Coded Computing (LCC), a new framework to simultaneously provide (1) resiliency against stragglers that may prolong computations; (2) security against Byzantine (or malicious) workers that deliberately modify the computation for their benefit; and (3) (information-theoretic) privacy of the dataset amidst possible collusion of workers. LCC, which leverages the well-known Lagrange polynomial to create computation redundancy in a novel coded form across workers, can be applied to any computation scenario in which the function of interest is an arbitrary multivariate polynomial of the input dataset, hence covering many computations of interest in machine learning. LCC significantly generalizes prior works to go beyond linear computations. It also enables secure and private computing in distributed settings, improving the computation and communication efficiency of the state-of-the-art. Furthermore, we prove the optimality of LCC by showing that it achieves the optimal tradeoff between resiliency, security, and privacy, i.e., in terms of tolerating the maximum number of stragglers and adversaries, and providing data privacy against the maximum number of colluding workers. Finally, we show via experiments on Amazon EC2 that LCC speeds up the conventional uncoded implementation of distributed least-squares linear regression by up to $13.43\times$, and also achieves a $2.36\times$-$12.65\times$ speedup over the state-of-the-art straggler mitigation strategies.
ITJul 12, 2017
Gradient Coding from Cyclic MDS Codes and Expander GraphsNetanel Raviv, Itzhak Tamo, Rashish Tandon et al.
Gradient coding is a technique for straggler mitigation in distributed learning. In this paper we design novel gradient codes using tools from classical coding theory, namely, cyclic MDS codes, which compare favorably with existing solutions, both in the applicable range of parameters and in the complexity of the involved algorithms. Second, we introduce an approximate variant of the gradient coding problem, in which we settle for approximate gradient computation instead of the exact one. This approach enables graceful degradation, i.e., the $\ell_2$ error of the approximate gradient is a decreasing function of the number of stragglers. Our main result is that normalized adjacency matrices of expander graphs yield excellent approximate gradient codes, which enable significantly less computation compared to exact gradient coding, and guarantee faster convergence than trivial solutions under standard assumptions. We experimentally test our approach on Amazon EC2, and show that the generalization error of approximate gradient coding is very close to the full gradient while requiring significantly less computation from the workers.