CV LG Dec 1, 2022

Rethinking Two Consensuses of the Transferability in Deep Learning

Yixiong Chen, Jingxian Li, Chris Ding, Li Liu

arXiv:2212.00399v1 7.34 citations h-index: 72

Originality: Incremental advance

AI Analysis

This work provides new insights for researchers in deep learning by refining understanding of transferability, though it is incremental as it builds on existing consensuses.

The paper revisits two established consensuses on transferability in deep transfer learning by proposing a method to measure it. It finds that, in addition to the domain gap, a larger amount and greater diversity of downstream data reduce transferability, and that lower layers are not the most transferable because they are domain-sensitive.

Deep transfer learning (DTL) has formed a long-term quest toward enabling deep neural networks (DNNs) to reuse historical experiences as efficiently as humans. This ability is named knowledge transferability. A commonly used paradigm for DTL is to first learn general knowledge (pre-training) and then reuse it (fine-tuning) for a specific target task. There are two consensuses on the transferability of pre-trained DNNs: (1) a larger domain gap between pre-training and downstream data brings lower transferability; (2) transferability gradually decreases from lower layers (near input) to higher layers (near output). However, these consensuses were drawn mainly from experiments on natural images, which limits their scope of application. This work aims to study and complement them from a broader perspective by proposing a method to measure the transferability of pre-trained DNN parameters. Our experiments on twelve diverse image classification datasets reach conclusions similar to the previous consensuses. More importantly, two new findings are presented: (1) in addition to the domain gap, a larger data amount and greater dataset diversity in the downstream target task also inhibit transferability; (2) although the lower layers learn basic image features, they are usually not the most transferable layers because of their domain sensitivity.
