Active Learning at the ImageNet ScaleZeyad Ali Sami Emam, Hong-Min Chu, Ping-Yeh Chiang et al.
Active learning (AL) algorithms aim to identify an optimal subset of data for annotation, such that deep neural networks (DNN) can achieve better performance when trained on this labeled subset. AL is especially impactful in industrial scale settings where data labeling costs are high and practitioners use every tool at their disposal to improve model performance. The recent success of self-supervised pretraining (SSP) highlights the importance of harnessing abundant unlabeled data to boost model performance. By combining AL with SSP, we can make use of unlabeled data while simultaneously labeling and training on particularly informative samples. In this work, we study a combination of AL and SSP on ImageNet. We find that performance on small toy datasets -- the typical benchmark setting in the literature -- is not representative of performance on ImageNet due to the class imbalanced samples selected by an active learner. Among the existing baselines we test, popular AL algorithms across a variety of small and large scale settings fail to outperform random sampling. To remedy the class-imbalance problem, we propose Balanced Selection (BASE), a simple, scalable AL algorithm that outperforms random sampling consistently by selecting more balanced samples for annotation than existing methods. Our code is available at: https://github.com/zeyademam/active_learning .
9.0LGJul 26, 2020
WrapNet: Neural Net Inference with Ultra-Low-Resolution ArithmeticRenkun Ni, Hong-min Chu, Oscar Castañeda et al.
Low-resolution neural networks represent both weights and activations with few bits, drastically reducing the multiplication complexity. Nonetheless, these products are accumulated using high-resolution (typically 32-bit) additions, an operation that dominates the arithmetic complexity of inference when using extreme quantization (e.g., binary weights). To further optimize inference, we propose a method that adapts neural networks to use low-resolution (8-bit) additions in the accumulators, achieving classification accuracy comparable to their 32-bit counterparts. We achieve resilience to low-resolution accumulation by inserting a cyclic activation layer, as well as an overflow penalty regularizer. We demonstrate the efficacy of our approach on both software and hardware platforms.
Deep Learning with a Rethinking Structure for Multi-label ClassificationYao-Yuan Yang, Yi-An Lin, Hong-Min Chu et al.
Multi-label classification (MLC) is an important class of machine learning problems that come with a wide spectrum of applications, each demanding a possibly different evaluation criterion. When solving the MLC problems, we generally expect the learning algorithm to take the hidden correlation of the labels into account to improve the prediction performance. Extracting the hidden correlation is generally a challenging task. In this work, we propose a novel deep learning framework to better extract the hidden correlation with the help of the memory structure within recurrent neural networks. The memory stores the temporary guesses on the labels and effectively allows the framework to rethink about the goodness and correlation of the guesses before making the final prediction. Furthermore, the rethinking process makes it easy to adapt to different evaluation criteria to match real-world application needs. In particular, the framework can be trained in an end-to-end style with respect to any given MLC evaluation criteria. The end-to-end design can be seamlessly combined with other deep learning techniques to conquer challenging MLC problems like image tagging. Experimental results across many real-world data sets justify that the rethinking framework indeed improves MLC performance across different evaluation criteria and leads to superior performance over state-of-the-art MLC algorithms.
12.4LGAug 2, 2016
Can Active Learning Experience Be Transferred?Hong-Min Chu, Hsuan-Tien Lin
Active learning is an important machine learning problem in reducing the human labeling effort. Current active learning strategies are designed from human knowledge, and are applied on each dataset in an immutable manner. In other words, experience about the usefulness of strategies cannot be updated and transferred to improve active learning on other datasets. This paper initiates a pioneering study on whether active learning experience can be transferred. We first propose a novel active learning model that linearly aggregates existing strategies. The linear weights can then be used to represent the active learning experience. We equip the model with the popular linear upper- confidence-bound (LinUCB) algorithm for contextual bandit to update the weights. Finally, we extend our model to transfer the experience across datasets with the technique of biased regularization. Empirical studies demonstrate that the learned experience not only is competitive with existing strategies on most single datasets, but also can be transferred across datasets to improve the performance on future learning tasks.