Bulut Kuskonmaz

IT · Feb 2, 2022

Investigation of Alternative Measures for Mutual Information

Bulut Kuskonmaz, Jaron Skovsted Gundersen, Rafal Wisniewski

Mutual information $I(X;Y)$ is a useful quantity in information theory for measuring how much information the random variable $Y$ holds about the random variable $X$. One way to define the mutual information is to compare the joint distribution of $X$ and $Y$ with the product of the marginals through the KL-divergence. If the two distributions are close to each other, there will be almost no leakage of $X$ from $Y$, since the two variables are close to being independent. In the discrete setting, the mutual information has the nice interpretation of how many bits $Y$ reveals about $X$, and if $I(X;Y)=H(X)$ (the Shannon entropy of $X$) then $X$ is completely revealed. However, in the continuous case we do not have the same reasoning. For instance, the mutual information can be infinite in the continuous case. This fact motivates us to try different metrics or divergences to define the mutual information. In this paper, we evaluate different metrics and divergences, such as the Kullback-Leibler (KL) divergence, Wasserstein distance, Jensen-Shannon divergence and total variation distance, to form alternatives to the mutual information in the continuous case. We deploy different methods to estimate or bound these metrics and divergences and evaluate their performance.
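The idea of comparing the joint distribution with the product of the marginals can be illustrated on a small discrete example. The sketch below is not taken from the paper: it assumes a finite joint probability table, and the function names (`marginals`, `kl`, `dependence_measures`, `entropy`) are hypothetical. It computes the KL-based mutual information in bits alongside the Jensen-Shannon divergence and the total variation distance between the same two distributions. The Wasserstein distance is left out because it additionally requires a ground metric on the outcomes.

```python
import math

def marginals(joint):
    """Return the marginals p(x) and p(y) of a joint table joint[x][y]."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    return px, py

def kl(p, q):
    """KL divergence D(p || q) in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def entropy(p):
    """Shannon entropy in bits."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def dependence_measures(joint):
    """Compare the joint distribution with the product of its marginals."""
    px, py = marginals(joint)
    p = [v for row in joint for v in row]   # joint, flattened row by row
    q = [a * b for a in px for b in py]     # product of marginals, same order
    m = [(a + b) / 2 for a, b in zip(p, q)] # mixture used by Jensen-Shannon
    return {
        "mutual_information": kl(p, q),
        "jensen_shannon": 0.5 * kl(p, m) + 0.5 * kl(q, m),
        "total_variation": 0.5 * sum(abs(a - b) for a, b in zip(p, q)),
    }

# Y is a copy of a fair bit X: the mutual information equals H(X).
copy_joint = [[0.5, 0.0], [0.0, 0.5]]
# X and Y are independent fair bits: every measure is zero.
indep_joint = [[0.25, 0.25], [0.25, 0.25]]

if __name__ == "__main__":
    print(dependence_measures(copy_joint))
    print(dependence_measures(indep_joint))
```

In the copy example the mutual information equals the entropy of $X$, matching the statement that $X$ is then completely revealed. Unlike the KL-based quantity, the Jensen-Shannon divergence (at most 1 bit) and the total variation distance (at most 1) are always bounded.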

LG · Sep 23, 2021

Secure PAC Bayesian Regression via Real Shamir Secret Sharing

Jaron Skovsted Gundersen, Bulut Kuskonmaz, Rafael Wisniewski

A common approach in system identification and machine learning is to generate a model from training data that predicts the test data instances as accurately as possible. Nonetheless, concerns about data privacy are increasingly raised, but not always addressed. We present a secure protocol for learning a linear model, relying on a recently described technique called real number secret sharing. We take as our starting point the PAC Bayesian bounds and deduce a closed form for the model parameters which depends on the data and the prior from the PAC Bayesian bounds. To obtain the model parameters one needs to solve a linear system. However, we consider the situation where several parties hold different data instances and are not willing to give up the privacy of their data. Hence, we suggest using real number secret sharing and multiparty computation to share the data and solve the linear regression in a secure way without violating the privacy of the data. We suggest two methods, a secure inverse method and a secure Gaussian elimination method, and compare them at the end. The benefit of using secret sharing directly on real numbers is reflected in the simplicity of the protocols and the number of rounds needed. However, this comes with the drawback that a share might leak a small amount of information, but in our analysis we argue that the leakage is small.
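A rough sketch of the underlying idea, Shamir-style sharing with real-valued polynomial coefficients, is given below. It is a deliberate simplification and not the protocol from the paper: parties share only their local sums, the aggregated normal equations are reconstructed in the clear, and an ordinary least-squares line is fitted rather than the PAC Bayesian closed form. Neither the secure inverse nor the secure Gaussian elimination method is implemented. All function names, the Gaussian coefficient distribution and the `scale` parameter are assumptions made for illustration.

```python
import random

def share(secret, n_parties, threshold, scale=100.0, rng=random):
    """Split a real number into n shares using a random real polynomial.

    The polynomial has degree `threshold` and constant term `secret`;
    party j (j = 1..n) receives its value at x = j.
    """
    coeffs = [secret] + [rng.gauss(0.0, scale) for _ in range(threshold)]
    return [sum(c * x**k for k, c in enumerate(coeffs))
            for x in range(1, n_parties + 1)]

def reconstruct(shares):
    """Recover the constant term by Lagrange interpolation at x = 0."""
    xs = range(1, len(shares) + 1)
    total = 0.0
    for xi, yi in zip(xs, shares):
        weight = 1.0
        for xj in xs:
            if xj != xi:
                weight *= xj / (xj - xi)
        total += yi * weight
    return total

def secure_sum(values, threshold=1, rng=random):
    """Sum one private value per party without revealing the individual values."""
    n = len(values)
    held = [0.0] * n  # held[j]: sum of the shares that party j received
    for v in values:
        for j, s in enumerate(share(v, n, threshold, rng=rng)):
            held[j] += s  # sharing is linear, so shares can be added locally
    return reconstruct(held)

def local_stats(points):
    """Local entries of X^T X and X^T y for the model y = a*x + b."""
    return [sum(x * x for x, _ in points), sum(x for x, _ in points),
            float(len(points)),
            sum(x * y for x, y in points), sum(y for _, y in points)]

def fit_line(datasets, rng=random):
    """Fit y = a*x + b from the securely aggregated normal equations."""
    stats = [local_stats(d) for d in datasets]
    sxx, sx, cnt, sxy, sy = (secure_sum([s[k] for s in stats], rng=rng)
                             for k in range(5))
    det = sxx * cnt - sx * sx  # determinant of the 2x2 system
    return (sxy * cnt - sx * sy) / det, (sxx * sy - sx * sxy) / det

if __name__ == "__main__":
    # Three parties hold disjoint samples of the line y = 2x + 1.
    parties = [[(0.0, 1.0), (1.0, 3.0)], [(2.0, 5.0)], [(3.0, 7.0), (4.0, 9.0)]]
    print(fit_line(parties, rng=random.Random(0)))
```

Because the coefficients are drawn from a real-valued distribution rather than uniformly from a finite field, a single share is not perfectly independent of the secret. This is the small leakage the abstract refers to.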
