MLPs Compass: What is learned when MLPs are combined with PLMs?
This work addresses how well PLMs capture linguistic structure and whether that capture can be improved. The contribution is incremental, since it builds on known MLP capabilities.
The paper investigates whether adding Multilayer Perceptrons (MLPs) to pre-trained language models (PLMs) such as BERT improves their ability to capture linguistic structure. Experiments on 10 probing tasks across three linguistic levels indicate that MLPs do enhance this ability.
While Transformer-based pre-trained language models (PLMs) and their variants exhibit strong semantic representation capabilities, understanding the information gain contributed by additional components added to PLMs remains an open question. Motivated by recent work showing that Multilayer Perceptron (MLP) modules achieve robust structural capture capabilities, even outperforming Graph Neural Networks (GNNs), this paper aims to quantify whether simple MLPs can further enhance the already potent ability of PLMs to capture linguistic information. Specifically, we design a simple yet effective probing framework that adds MLP components to the BERT architecture, and we conduct extensive experiments encompassing 10 probing tasks spanning three distinct linguistic levels. The experimental results demonstrate that MLPs can indeed enhance PLMs' comprehension of linguistic structure. Our research provides interpretable and valuable insights into crafting PLM variants that use MLPs for tasks emphasizing diverse linguistic structures.
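The abstract does not spell out the probing architecture, so the following is only a minimal sketch of the general idea: a frozen sentence representation is passed either directly to a linear probing classifier or first through a small MLP block. Everything here is an assumption made for illustration, including the class name `MLPProbe`, the two-layer GELU block, the dimensions, and the random vector that stands in for a BERT hidden state. It is not the authors' implementation.

```python
import math
import random


def linear(x, weights, bias):
    """Affine map for one vector; weights is a list of rows (out_dim x in_dim)."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def gelu(values):
    """Tanh approximation of the GELU activation, applied elementwise."""
    c = math.sqrt(2.0 / math.pi)
    return [0.5 * a * (1.0 + math.tanh(c * (a + 0.044715 * a ** 3))) for a in values]


def init_linear(rng, in_dim, out_dim, scale=0.1):
    """Small uniform random weights and zero bias (illustrative initialisation)."""
    weights = [[rng.uniform(-scale, scale) for _ in range(in_dim)] for _ in range(out_dim)]
    return weights, [0.0] * out_dim


class MLPProbe:
    """Hypothetical probe: frozen sentence vector -> optional MLP block -> linear classifier."""

    def __init__(self, hidden_dim, mlp_dim, num_classes, use_mlp=True, seed=0):
        rng = random.Random(seed)
        self.use_mlp = use_mlp
        # Assumed MLP block: hidden_dim -> mlp_dim -> hidden_dim.
        self.w1, self.b1 = init_linear(rng, hidden_dim, mlp_dim)
        self.w2, self.b2 = init_linear(rng, mlp_dim, hidden_dim)
        # Linear probing classifier on top.
        self.wc, self.bc = init_linear(rng, hidden_dim, num_classes)

    def forward(self, sentence_vec):
        """Return one unnormalised score per probing-task class."""
        h = sentence_vec
        if self.use_mlp:
            h = linear(gelu(linear(h, self.w1, self.b1)), self.w2, self.b2)
        return linear(h, self.wc, self.bc)


# Stand-in for a frozen PLM sentence vector (random values, not real BERT output).
rng = random.Random(1)
sentence_vec = [rng.gauss(0.0, 1.0) for _ in range(8)]

with_mlp = MLPProbe(hidden_dim=8, mlp_dim=16, num_classes=3, use_mlp=True)
without_mlp = MLPProbe(hidden_dim=8, mlp_dim=16, num_classes=3, use_mlp=False)

print("scores with MLP block:   ", with_mlp.forward(sentence_vec))
print("scores without MLP block:", without_mlp.forward(sentence_vec))
```

The sketch covers only the forward pass with untrained weights. In an actual probing comparison, both variants would be trained on each probing task and their accuracies compared, so that any difference can be attributed to the added MLP block.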