Explicit Learning and the LLM in Machine Translation
This work addresses the problem of enabling LLMs to learn low-resource languages from grammar books, which benefits language preservation and translation. The contribution is incremental: it builds on prior work, using controlled experiments.
The study investigated whether large language models (LLMs) can learn new languages from grammar book explanations, a process termed "explicit learning." It found that LLMs have a measurable capacity for this, though the capacity decreases as linguistic complexity increases. Supervised fine-tuning improved performance but struggled to generalize to novel or complex features.
This study explores an LLM's ability to learn new languages using explanations found in a grammar book, a process we term "explicit learning." To rigorously assess this ability, we design controlled translation experiments between English and constructed languages generated, through specific cryptographic means, from Latin or French. Contrary to previous studies, our results demonstrate that LLMs do possess a measurable capacity for explicit learning. This ability, however, diminishes as the complexity of the linguistic phenomena to be learned increases. Supervised fine-tuning on ad hoc chains of thought significantly enhances LLM performance but struggles to generalize to typologically novel or more complex linguistic features. These findings point to the need for more diverse training sets and alternative fine-tuning strategies to further improve explicit learning by LLMs, benefiting low-resource languages typically described in grammar books but lacking extensive corpora.
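As a rough illustration of how a constructed language might be derived from a natural one, the sketch below applies a seeded monoalphabetic substitution cipher to a Latin sentence. The abstract does not specify which cryptographic means the authors use, so the cipher choice, the function names (`make_cipher`, `encipher`, `invert`), and the example sentence are all assumptions made for illustration, not the study's actual procedure.

```python
import random
import string


def make_cipher(seed: int) -> dict[str, str]:
    """Build a seeded one-to-one mapping over lowercase ASCII letters.

    Assumption: a plain substitution cipher stands in for the paper's
    unspecified "cryptographic means".
    """
    rng = random.Random(seed)
    letters = list(string.ascii_lowercase)
    shuffled = letters[:]
    rng.shuffle(shuffled)
    return dict(zip(letters, shuffled))


def encipher(text: str, cipher: dict[str, str]) -> str:
    """Apply the mapping letter by letter, keeping case and non-letters."""
    out = []
    for ch in text:
        # Characters outside the mapping (spaces, punctuation, accents)
        # pass through unchanged.
        mapped = cipher.get(ch.lower(), ch)
        out.append(mapped.upper() if ch.isupper() else mapped)
    return "".join(out)


def invert(cipher: dict[str, str]) -> dict[str, str]:
    """Return the reverse mapping, used to recover the source text."""
    return {v: k for k, v in cipher.items()}


if __name__ == "__main__":
    cipher = make_cipher(seed=0)
    source = "Puella rosam amat."
    constructed = encipher(source, cipher)
    print(constructed)
    print(encipher(constructed, invert(cipher)))  # recovers the source
```

Because the mapping is a bijection on letters, word boundaries, word order, and morphological regularities of the source language survive the transformation while the surface forms become unfamiliar. That property is what would make such a language useful for a controlled test of learning from grammatical explanations rather than from memorized vocabulary.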