Eliav Buchnik

h-index6

3papers

21citations

Novelty55%

AI Score24

Ranked #173,152 of 194,257 authors (top 89%)#37,533 in LG (top 93%)

3 Papers

1.2LGMay 31, 2020

Graph Learning with Loss-Guided Training

Eliav Buchnik, Edith Cohen

Classically, ML models trained with stochastic gradient descent (SGD) are designed to minimize the average loss per example and use a distribution of training examples that remains {\em static} in the course of training. Research in recent years demonstrated, empirically and theoretically, that significant acceleration is possible by methods that dynamically adjust the training distribution in the course of training so that training is more focused on examples with higher loss. We explore {\em loss-guided training} in a new domain of node embedding methods pioneered by {\sc DeepWalk}. These methods work with implicit and large set of positive training examples that are generated using random walks on the input graph and therefore are not amenable for typical example selection methods. We propose computationally efficient methods that allow for loss-guided training in this framework. Our empirical evaluation on a rich collection of datasets shows significant acceleration over the baseline static methods, both in terms of total training performed and overall computation.

2.2LGMar 14, 2018

Eliav Buchnik, Edith Cohen, Avinatan Hassidim et al.

Optimization of machine learning models is commonly performed through stochastic gradient updates on randomly ordered training examples. This practice means that sub-epochs comprise of independent random samples of the training data that may not preserve informative structure present in the full data. We hypothesize that the training can be more effective with {\em self-similar} arrangements that potentially allow each epoch to provide benefits of multiple ones. We study this for "matrix factorization" -- the common task of learning metric embeddings of entities such as queries, videos, or words from example pairwise associations. We construct arrangements that preserve the weighted Jaccard similarities of rows and columns and experimentally observe training acceleration of 3\%-37\% on synthetic and recommendation datasets. Principled arrangements of training examples emerge as a novel and potentially powerful enhancement to SGD that merits further exploration.

7.3LGMar 7, 2017

Bootstrapped Graph Diffusions: Exposing the Power of Nonlinearity

Eliav Buchnik, Edith Cohen

Graph-based semi-supervised learning (SSL) algorithms predict labels for all nodes based on provided labels of a small set of seed nodes. Classic methods capture the graph structure through some underlying diffusion process that propagates through the graph edges. Spectral diffusion, which includes personalized page rank and label propagation, propagates through random walks. Social diffusion propagates through shortest paths. A common ground to these diffusions is their {\em linearity}, which does not distinguish between contributions of few "strong" relations and many "weak" relations. Recently, non-linear methods such as node embeddings and graph convolutional networks (GCN) demonstrated a large gain in quality for SSL tasks. These methods introduce multiple components and greatly vary on how the graph structure, seed label information, and other features are used. We aim here to study the contribution of non-linearity, as an isolated ingredient, to the performance gain. To do so, we place classic linear graph diffusions in a self-training framework. Surprisingly, we observe that SSL using the resulting {\em bootstrapped diffusions} not only significantly improves over the respective non-bootstrapped baselines but also outperform state-of-the-art non-linear SSL methods. Moreover, since the self-training wrapper retains the scalability of the base method, we obtain both higher quality and better scalability.