Multi-Granularity Regularized Re-Balancing for Class Incremental Learning
This addresses data imbalance issues in incremental learning for applications like fault diagnosis, but it is incremental as it builds on existing re-balancing methods.
The paper tackles catastrophic forgetting in class incremental learning by proposing Multi-Granularity Regularized re-Balancing (MGRB), an assumption-agnostic method that uses re-balancing and a novel regularization term based on class hierarchies to improve learning of both old and new classes, with experimental validation on public and real-world datasets.
Deep learning models suffer from catastrophic forgetting when learning new tasks incrementally. Incremental learning has been proposed to retain the knowledge of old classes while learning to identify new classes. A typical approach is to use a few exemplars to avoid forgetting old knowledge. In such a scenario, data imbalance between old and new classes is a key issue that leads to performance degradation of the model. Several strategies have been designed to rectify the bias towards the new classes due to data imbalance. However, they heavily rely on the assumptions of the bias relation between old and new classes. Therefore, they are not suitable for complex real-world applications. In this study, we propose an assumption-agnostic method, Multi-Granularity Regularized re-Balancing (MGRB), to address this problem. Re-balancing methods are used to alleviate the influence of data imbalance; however, we empirically discover that they would under-fit new classes. To this end, we further design a novel multi-granularity regularization term that enables the model to consider the correlations of classes in addition to re-balancing the data. A class hierarchy is first constructed by grouping the semantically or visually similar classes. The multi-granularity regularization then transforms the one-hot label vector into a continuous label distribution, which reflects the relations between the target class and other classes based on the constructed class hierarchy. Thus, the model can learn the inter-class relational information, which helps enhance the learning of both old and new classes. Experimental results on both public datasets and a real-world fault diagnosis dataset verify the effectiveness of the proposed method.