Parameter sharing between dependency parsers for related languages
This work addresses efficient multilingual parsing for related languages. It offers a linguistically motivated solution with practical gains, although its contribution is an incremental refinement of parameter-sharing strategies.
The paper addresses the question of which parameters to share between neural dependency parsers for related languages. It finds that sharing transition classifier parameters consistently improves performance, whereas the benefit of sharing word and character LSTM parameters varies. Based on this result, it proposes an architecture with tunable sharing that achieves significant improvements over monolingual baselines.
Previous work has suggested that parameter sharing between transition-based neural dependency parsers for related languages can lead to better performance, but there is no consensus on which parameters to share. We present an evaluation of 27 different parameter sharing strategies across 10 languages, representing five pairs of related languages, each pair from a different language family. We find that sharing transition classifier parameters always helps, whereas the usefulness of sharing word and/or character LSTM parameters varies. Based on this result, we propose an architecture where the transition classifier is shared, and the sharing of word and character parameters is controlled by a parameter that can be tuned on validation data. This model is linguistically motivated and obtains significant improvements over a monolingually trained baseline. We also find that sharing transition classifier parameters helps when training a parser on unrelated language pairs, but that in this case sharing too many parameters does not help.
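The proposed architecture can be pictured as a mapping from each parser component to either a shared or a language-specific parameter set. The following is a minimal sketch of that idea, not the authors' implementation. The names `SharingConfig`, `parameter_keys`, `candidate_configs`, and `select_config` are hypothetical, and the tunable sharing parameter is modeled here as two boolean flags, which is an assumption: the abstract does not specify its form.

```python
from dataclasses import dataclass
from itertools import product

# Component names are illustrative labels, not identifiers from the paper.
COMPONENTS = ("char_lstm", "word_lstm", "classifier")


@dataclass(frozen=True)
class SharingConfig:
    """Hypothetical sharing configuration for a pair of languages."""
    share_char: bool
    share_word: bool
    # The transition classifier is always shared in the proposed model.
    share_classifier: bool = True


def parameter_keys(lang, config):
    """Return, for each component, the name of the parameter set `lang` uses."""
    flags = {
        "char_lstm": config.share_char,
        "word_lstm": config.share_word,
        "classifier": config.share_classifier,
    }
    return {
        name: f"{name}/shared" if flags[name] else f"{name}/{lang}"
        for name in COMPONENTS
    }


def candidate_configs():
    """Enumerate the settings of the (assumed) tunable sharing parameter."""
    return [SharingConfig(c, w) for c, w in product((False, True), repeat=2)]


def select_config(dev_scores):
    """Pick the configuration with the highest validation score.

    `dev_scores` maps SharingConfig -> score measured on validation data.
    """
    return max(dev_scores, key=dev_scores.get)
```

In use, one would train a parser for each entry of `candidate_configs()`, record its validation score, and pass the resulting dictionary to `select_config`. Two languages then share a component exactly when `parameter_keys` returns the same key for both.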