MLJun 11, 2020
Pairwise Supervision Can Provably Elicit a Decision BoundaryHan Bao, Takuya Shimada, Liyuan Xu et al.
Similarity learning is a general problem to elicit useful representations by predicting the relationship between a pair of patterns. This problem is related to various important preprocessing tasks such as metric learning, kernel learning, and contrastive learning. A classifier built upon the representations is expected to perform well in downstream classification; however, little theory has been given in literature so far and thereby the relationship between similarity and classification has remained elusive. Therefore, we tackle a fundamental question: can similarity information provably leads a model to perform well in downstream classification? In this paper, we reveal that a product-type formulation of similarity learning is strongly related to an objective of binary classification. We further show that these two different problems are explicitly connected by an excess risk bound. Consequently, our results elucidate that similarity learning is capable of solving binary classification by directly eliciting a decision boundary.
LGJun 20, 2019
Data Interpolating Prediction: Alternative Interpretation of MixupTakuya Shimada, Shoichiro Yamaguchi, Kohei Hayashi et al.
Data augmentation by mixing samples, such as Mixup, has widely been used typically for classification tasks. However, this strategy is not always effective due to the gap between augmented samples for training and original samples for testing. This gap may prevent a classifier from learning the optimal decision boundary and increase the generalization error. To overcome this problem, we propose an alternative framework called Data Interpolating Prediction (DIP). Unlike common data augmentations, we encapsulate the sample-mixing process in the hypothesis class of a classifier so that train and test samples are treated equally. We derive the generalization bound and show that DIP helps to reduce the original Rademacher complexity. Also, we empirically demonstrate that DIP can outperform existing Mixup.
LGApr 26, 2019
Classification from Pairwise Similarities/Dissimilarities and Unlabeled Data via Empirical Risk MinimizationTakuya Shimada, Han Bao, Issei Sato et al.
Pairwise similarities and dissimilarities between data points might be easier to obtain than fully labeled data in real-world classification problems, e.g., in privacy-aware situations. To handle such pairwise information, an empirical risk minimization approach has been proposed, giving an unbiased estimator of the classification risk that can be computed only from pairwise similarities and unlabeled data. However, this direction cannot handle pairwise dissimilarities so far. On the other hand, semi-supervised clustering is one of the methods which can use both similarities and dissimilarities. Nevertheless, they typically require strong geometrical assumptions on the data distribution such as the manifold assumption, which may deteriorate the performance. In this paper, we derive an unbiased risk estimator which can handle all of similarities/dissimilarities and unlabeled data. We theoretically establish estimation error bounds and experimentally demonstrate the practical usefulness of our empirical risk minimization method.