GM · Nov 9, 2013
Logarithms and Square Roots of Real Matrices
Jean Gallier
In these notes, I consider the problem of finding the logarithm or the square root of a real matrix. It is known that for every real n x n matrix, A, if no real eigenvalue of A is negative or zero, then A has a real logarithm, that is, there is a real matrix, X, such that e^X = A. Furthermore, if the eigenvalues, xi, of X satisfy the property -pi < Im(xi) < pi, then X is unique. It is also known that under the same condition every real n x n matrix, A, has a real square root, that is, there is a real matrix, X, such that X^2 = A. Moreover, if the eigenvalues, rho e^{i theta}, of X satisfy the condition -pi/2 < theta < pi/2, then X is unique. These theorems are the theoretical basis for various numerical methods for exponentiating a matrix or for computing its logarithm using a method known as scaling and squaring (resp. inverse scaling and squaring). Such methods play an important role in the log-Euclidean framework due to Arsigny, Fillard, Pennec and Ayache and its applications to medical imaging. Actually, there is a necessary and sufficient condition for a real matrix to have a real logarithm (or a real square root), but it is fairly subtle as it involves the parity of the number of Jordan blocks associated with negative eigenvalues. As far as I know, with the exception of Higham's recent book, proofs of these results are scattered in the literature and it is not easy to locate them. Moreover, Higham's excellent book assumes a certain level of background in linear algebra that readers interested in the topics of this paper may not possess, so I feel that a more elementary presentation might be a valuable supplement to Higham. In these notes, I present a unified exposition of these results and give more direct proofs of some of them using the Real Jordan Form.
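The existence statements above can be checked numerically on a small case. The sketch below is an illustration only, not the method of the notes: it takes a 2 x 2 rotation matrix (which has no real eigenvalue at all), proposes a real logarithm and a real square root by hand, and verifies them. The helper names `expm_taylor` and `rotation` and the angle `theta = 2.0` are assumptions made for this example, and the truncated Taylor series is a stand-in for a proper scaling-and-squaring routine.

```python
import numpy as np

def expm_taylor(X, terms=40):
    """Approximate e^X by a truncated Taylor series (adequate for small norms only)."""
    result = np.eye(X.shape[0])
    term = np.eye(X.shape[0])
    for k in range(1, terms):
        term = term @ X / k          # term now holds X^k / k!
        result = result + term
    return result

def rotation(theta):
    """2 x 2 rotation by angle theta."""
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

theta = 2.0                          # assumed angle, chosen with -pi < theta < pi
A = rotation(theta)                  # eigenvalues e^{+i theta}, e^{-i theta}

# Candidate real logarithm: theta times the 90-degree rotation generator.
X_log = theta * np.array([[0.0, -1.0],
                          [1.0,  0.0]])
# Candidate real square root: rotation by half the angle.
X_sqrt = rotation(theta / 2)

log_ok = np.allclose(expm_taylor(X_log), A)      # e^X = A
sqrt_ok = np.allclose(X_sqrt @ X_sqrt, A)        # X^2 = A
log_eigs = np.linalg.eigvals(X_log)              # +i theta and -i theta

print(log_ok, sqrt_ok, log_eigs)
```

Here the eigenvalues of `X_log` have imaginary parts of magnitude `theta`, inside (-pi, pi), and the eigenvalues of `X_sqrt` have arguments of magnitude `theta / 2`, inside (-pi/2, pi/2), so both candidates satisfy the uniqueness conditions quoted in the abstract.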
CL · Jan 20, 2016
Semantic Word Clusters Using Signed Normalized Graph Cuts
João Sedoc, Jean Gallier, Lyle Ungar et al.
Vector space representations of words capture many aspects of word similarity, but such methods tend to produce vector spaces in which antonyms (as well as synonyms) are close to each other. We present a new signed spectral normalized graph cut algorithm, signed clustering, that overlays existing thesauri upon distributionally derived vector representations of words, so that antonym relationships between word pairs are represented by negative weights. Our signed clustering algorithm produces clusters of words which simultaneously capture distributional and synonym relations. We evaluate these clusters against the SimLex-999 dataset (Hill et al., 2014) of human judgments of word pair similarities, and also show the benefit of using our clusters to predict the sentiment of a given text.
LG · Jan 18, 2016
Spectral Theory of Unsigned and Signed Graphs. Applications to Graph Clustering: a Survey
Jean Gallier
This is a survey of the method of graph cuts and its applications to graph clustering of weighted unsigned and signed graphs. I provide a fairly thorough treatment of the method of normalized graph cuts, a deeply original method due to Shi and Malik, including complete proofs. The main thrust of this paper is the method of normalized cuts. I give a detailed account for K = 2 clusters, and also for K > 2 clusters, based on the work of Yu and Shi. I also show how both graph drawing and normalized cut K-clustering can be easily generalized to handle signed graphs, which are weighted graphs in which the weight matrix W may have negative coefficients. Intuitively, negative coefficients indicate distance or dissimilarity. The solution is to replace the degree matrix by the matrix in which absolute values of the weights are used, and to replace the Laplacian by the Laplacian with the new degree matrix of absolute values. As far as I know, the generalization of K-way normalized clustering to signed graphs is new. Finally, I show how the method of ratio cuts, in which a cut is normalized by the size of the cluster rather than its volume, is just a special case of normalized cuts.
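The signed-graph construction described above (degrees built from absolute values of the weights, and a Laplacian built from that degree matrix) can be sketched on a toy graph. The 3-node weight matrix below is an invented example, not data from the survey, and the variable names `D_bar` and `L_bar` are assumptions chosen for this illustration.

```python
import numpy as np

# Toy signed graph (assumed example): nodes 0 and 1 are similar (+1),
# and both are dissimilar to node 2 (-1).
W = np.array([[ 0.0,  1.0, -1.0],
              [ 1.0,  0.0, -1.0],
              [-1.0, -1.0,  0.0]])

# Signed degree matrix: row sums of the absolute values of the weights.
D_bar = np.diag(np.abs(W).sum(axis=1))

# Signed Laplacian: the usual D - W, but with the absolute-value degrees.
L_bar = D_bar - W

# eigh returns eigenvalues in ascending order for a symmetric matrix.
eigvals, eigvecs = np.linalg.eigh(L_bar)
signs = np.sign(eigvecs[:, 0])       # sign pattern of the bottom eigenvector

print(eigvals)
print(signs)
```

In this example the signed Laplacian is positive semidefinite and its smallest eigenvalue is zero, because the graph can be split into {0, 1} and {2} with every negative edge crossing the split. The sign pattern of the corresponding eigenvector recovers that split, which is the intuition behind using such eigenvectors for clustering.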
CV · Nov 11, 2013
Notes on Elementary Spectral Graph Theory. Applications to Graph Clustering Using Normalized Cuts
Jean Gallier
These are notes on the method of normalized graph cuts and its applications to graph clustering. I provide a fairly thorough treatment of this deeply original method due to Shi and Malik, including complete proofs. I include the necessary background on graphs and graph Laplacians. I then explain in detail how the eigenvectors of the graph Laplacian can be used to draw a graph. This is an attractive application of graph Laplacians. The main thrust of this paper is the method of normalized cuts. I give a detailed account for K = 2 clusters, and also for K > 2 clusters, based on the work of Yu and Shi. Three points that do not appear to have been clearly articulated before are elaborated: 1. The solutions of the main optimization problem should be viewed as tuples in the K-fold cartesian product of projective space RP^{N-1}. 2. When K > 2, the solutions of the relaxed problem should be viewed as elements of the Grassmannian G(K,N). 3. Two possible Riemannian distances are available to compare the closeness of solutions: (a) The distance on (RP^{N-1})^K. (b) The distance on the Grassmannian. I also clarify what should be the necessary and sufficient conditions for a matrix to represent a partition of the vertices of a graph to be clustered.
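For K = 2, the relaxed normalized-cut problem is commonly solved with the second-smallest eigenvector of the symmetric normalized Laplacian. The sketch below follows that standard recipe on an invented 6-node graph (two triangles joined by one weak edge). The graph, the bridge weight 0.1, and the choice to threshold the eigenvector at zero are all assumptions of this example, not details taken from the notes.

```python
import numpy as np

# Assumed toy graph: two triangles {0,1,2} and {3,4,5}, weakly linked by edge 2-3.
W = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
    W[i, j] = W[j, i] = 1.0
W[2, 3] = W[3, 2] = 0.1

d = W.sum(axis=1)                        # node degrees (volumes)
L = np.diag(d) - W                       # unnormalized graph Laplacian
D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
L_sym = D_inv_sqrt @ L @ D_inv_sqrt      # symmetric normalized Laplacian

# Eigenvalues come back in ascending order; column 1 is the second smallest.
eigvals, eigvecs = np.linalg.eigh(L_sym)

# Map back to a solution of the generalized problem L y = lambda D y.
y = D_inv_sqrt @ eigvecs[:, 1]

# Simple rounding step (an assumption): split the nodes by the sign of y.
labels = y > 0

print(eigvals[:2])
print(labels)
```

The overall sign of an eigenvector is arbitrary, so which triangle receives `True` may vary. Only the grouping is meaningful, which fits the remark in the abstract that solutions are best viewed as points of a projective space.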
NA · Nov 9, 2013
Remarks on the Cayley Representation of Orthogonal Matrices and on Perturbing the Diagonal of a Matrix to Make it Invertible
Jean Gallier
This note contains two remarks. The first remark concerns the extension of the well-known Cayley representation of rotation matrices by skew symmetric matrices to rotation matrices admitting -1 as an eigenvalue and then to all orthogonal matrices. We review a method due to Hermann Weyl and another method involving multiplication by a diagonal matrix whose entries are +1 or -1. The second remark has to do with ways of flipping the signs of the entries of a diagonal matrix, C, with nonzero diagonal entries, obtaining a new matrix, E, so that E + A is invertible, where A is any given matrix (invertible or not).
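The classical Cayley representation that the first remark extends can be illustrated numerically. The sketch below assumes the convention R = (I - B)(I + B)^{-1} for a skew-symmetric matrix B (some texts use the transposed convention), and the particular entries of B are invented for the example. It covers only rotations without the eigenvalue -1, not the extensions discussed in the note.

```python
import numpy as np

# Assumed skew-symmetric matrix (B^T = -B); the entries are arbitrary.
B = np.array([[ 0.0, -0.3,  0.5],
              [ 0.3,  0.0, -0.2],
              [-0.5,  0.2,  0.0]])
I = np.eye(3)

# Cayley transform: I + B is invertible because the eigenvalues of B
# are purely imaginary, so -1 is never an eigenvalue of B.
R = (I - B) @ np.linalg.inv(I + B)

is_orthogonal = np.allclose(R.T @ R, I)
det_R = np.linalg.det(R)

# The inverse transform recovers B, provided -1 is not an eigenvalue of R.
B_back = (I - R) @ np.linalg.inv(I + R)

print(is_orthogonal, det_R)
print(np.allclose(B_back, B))
```

The inverse step fails exactly when I + R is singular, that is, when R has -1 as an eigenvalue. This is the case the first remark of the note sets out to handle.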