LGApr 22, 2022
Quantum Semi-Supervised Kernel LearningSeyran Saeedi, Aliakbar Panahi, Tom Arodz
Quantum computing leverages quantum effects to build algorithms that are faster then their classical variants. In machine learning, for a given model architecture, the speed of training the model is typically determined by the size of the training dataset. Thus, quantum machine learning methods have the potential to facilitate learning using extremely large datasets. While the availability of data for training machine learning models is steadily increasing, oftentimes it is much easier to collect feature vectors that to obtain the corresponding labels. One of the approaches for addressing this issue is to use semi-supervised learning, which leverages not only the labeled samples, but also unlabeled feature vectors. Here, we present a quantum machine learning algorithm for training Semi-Supervised Kernel Support Vector Machines. The algorithm uses recent advances in quantum sample-based Hamiltonian simulation to extend the existing Quantum LS-SVM algorithm to handle the semi-supervised term in the loss. Through a theoretical study of the algorithm's computational complexity, we show that it maintains the same speedup as the fully-supervised Quantum LS-SVM.
LGNov 12, 2019
word2ket: Space-efficient Word Embeddings inspired by Quantum EntanglementAliakbar Panahi, Seyran Saeedi, Tom Arodz
Deep learning natural language processing models often use vector word embeddings, such as word2vec or GloVe, to represent words. A discrete sequence of words can be much more easily integrated with downstream neural layers if it is represented as a sequence of continuous vectors. Also, semantic relationships between words, learned from a text corpus, can be encoded in the relative configurations of the embedding vectors. However, storing and accessing embedding vectors for all words in a dictionary requires large amount of space, and may stain systems with limited GPU memory. Here, we used approaches inspired by quantum computing to propose two related methods, {\em word2ket} and {\em word2ketXS}, for storing word embedding matrix during training and inference in a highly efficient way. Our approach achieves a hundred-fold or more reduction in the space required to store the embeddings with almost no relative drop in accuracy in practical natural language processing tasks.
LGOct 18, 2019
Differentiable Combinatorial Losses through Generalized Gradients of Linear ProgramsXi Gao, Han Zhang, Aliakbar Panahi et al.
When samples have internal structure, we often see a mismatch between the objective optimized during training and the model's goal during inference. For example, in sequence-to-sequence modeling we are interested in high-quality translated sentences, but training typically uses maximum likelihood at the word level. The natural training-time loss would involve a combinatorial problem -- dynamic programming-based global sequence alignment -- but solutions to combinatorial problems are not differentiable with respect to their input parameters, so surrogate, differentiable losses are used instead. Here, we show how to perform gradient descent over combinatorial optimization algorithms that involve continuous parameters, for example edge weights, and can be efficiently expressed as linear programs. We demonstrate usefulness of gradient descent over combinatorial optimization in sequence-to-sequence modeling using differentiable encoder-decoder architecture with softmax or Gumbel-softmax, and in image classification in a weakly supervised setting where instead of the correct class for each photo, only groups of photos labeled with correct but unordered set of classes are available during training.
LGJul 30, 2019
Approximation Capabilities of Neural ODEs and Invertible Residual NetworksHan Zhang, Xi Gao, Jacob Unterman et al.
Neural ODEs and i-ResNet are recently proposed methods for enforcing invertibility of residual neural models. Having a generic technique for constructing invertible models can open new avenues for advances in learning systems, but so far the question of whether Neural ODEs and i-ResNets can model any continuous invertible function remained unresolved. Here, we show that both of these models are limited in their approximation capabilities. We then prove that any homeomorphism on a $p$-dimensional Euclidean space can be approximated by a Neural ODE operating on a $2p$-dimensional Euclidean space, and a similar result for i-ResNets. We conclude by showing that capping a Neural ODE or an i-ResNet with a single linear layer is sufficient to turn the model into a universal approximator for non-invertible continuous functions.
LGFeb 5, 2019
Quantum Sparse Support Vector MachinesSeyran Saeedi, Tom Arodz
We analyze the computational complexity of Quantum Sparse Support Vector Machine, a linear classifier that minimizes the hinge loss and the $L_1$ norm of the feature weights vector and relies on a quantum linear programming solver instead of a classical solver. Sparse SVM leads to sparse models that use only a small fraction of the input features in making decisions, and is especially useful when the total number of features, $p$, approaches or exceeds the number of training samples, $m$. We prove a $Ω(m)$ worst-case lower bound for computational complexity of any quantum training algorithm relying on black-box access to training samples; quantum sparse SVM has at least linear worst-case complexity. However, we prove that there are realistic scenarios in which a sparse linear classifier is expected to have high accuracy, and can be trained in sublinear time in terms of both the number of training samples and the number of features.