LGFeb 16, 2021
Constructing Multiclass Classifiers using Binary Classifiers Under Log-LossAssaf Ben-Yishai, Or Ordentlich
The construction of multiclass classifiers from binary elements is studied in this paper, and performance is quantified by the regret, defined with respect to the Bayes optimal log-loss. We discuss two known methods. The first is one vs. all (OVA), for which we prove that the multiclass regret is upper bounded by the sum of binary regrets of the constituent classifiers. The second is hierarchical classification, based on a binary tree. For this method we prove that the multiclass regret is exactly a weighted sum of constituent binary regrets where the weighing is determined by the tree structure. We also introduce a leverage-hierarchical classification method, which potentially yields smaller log-loss and regret. The advantages of these classification methods are demonstrated by simulation on both synthetic and real-life datasets.
ITAug 4, 2020
Simple Modulo can Significantly Outperform Deep Learning-based DeepcodeAssaf Ben-Yishai, Ofer Shayevitz
Deepcode (H.Kim et al.2018) is a recently suggested Deep Learning-based scheme for communication over the AWGN channel with noisy feedback, claimed to be superior to all previous schemes in the literature. Deepcode's use of nonlinear coding (via Deep Learning) has been inspired by known shortcomings (Y.-H. Kim et al 2007) of linear feedback schemes. In 2014, we presented a nonlinear feedback coding scheme based on a combination of the classical SK scheme and modulo-arithmetic, using a small number of elementary operations without any type of neural network. This Modulo-SK scheme has been omitted from the performance comparisons made in the Deepcode paper, due to its use of common randomness (dither), and in a later version since it was incorrectly interpreted as a variable-length coding scheme. However, the dither in Modulo-SK was used only for the standard purpose of tractable performance analysis, and is not required in practice. In this short note, we show that a fully-deterministic Modulo-SK (without dithering) can outperform Deepcode. For example, to attain an error probability of 10^(-4) at rate 1/3 Modulo-SK requires 3dB less feedback SNR than Deepcode. To attain an error probability of 10^(-6) with noiseless feedback, Deepcode requires 150 rounds of communication, whereas Modulo-SK requires only 15 rounds, even if the feedback is noisy (with 27dB SNR). We further address the numerical stability issues of the original SK scheme reported in the Deepcode paper, and explain how they can be avoided. We augment this report with an online-available, fully-functional Matlab simulation for both the classical and Modulo-SK schemes. Finally, note that Modulo-SK is by no means claimed to be the best possible solution; in particular, using deep learning in conjunction with modulo-arithmetic might lead to better designs, and remains a fascinating direction for future research.