Matryoshka Pilot: Learning to Drive Black-Box LLMs with LLMs
This work addresses the challenge of making black-box LLMs more controllable and capable for users who need advanced reasoning and planning, though the contribution is incremental because it builds on existing adaptation methods.
The paper tackles the problem of enhancing black-box large language models (LLMs) for complex tasks without access to their parameters by introducing Matryoshka Pilot, a lightweight controller that decomposes tasks into intermediate outputs, resulting in improved performance on diverse long-horizon tasks.
Despite the impressive generative abilities of black-box large language models (LLMs), their inherent opacity hinders further advancements in capabilities such as reasoning, planning, and personalization. Existing works aim to enhance LLM capabilities via domain-specific adaptation, which requires additional training on accessible model parameters, an infeasible option for black-box LLMs. To address this challenge, we introduce Matryoshka Pilot (M-Pilot), a lightweight white-box LLM controller that guides a large-scale black-box LLM generator by decomposing complex tasks into a series of intermediate outputs. Specifically, we consider the black-box LLM as an environment, with M-Pilot serving as a policy that provides intermediate guidance through prompts to drive the black-box LLM. M-Pilot is trained to steer the outputs of the black-box LLM toward alignment with preferences during iterative interaction, which enables controllable multi-turn generation and self-improvement in optimizing intermediate guidance. Empirical evaluations on diverse tasks demonstrate that our method effectively enhances the capabilities of black-box LLMs in complex, long-horizon tasks. Our code is publicly available at: https://github.com/lichangh20/Matryoshka.
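The controller-as-policy, black-box-as-environment framing can be illustrated with a minimal sketch. This is not the authors' implementation: the function `drive_black_box`, the prompt template, the `max_turns` parameter, and the two toy stand-ins are all hypothetical names introduced here for illustration. In practice the controller would be a small white-box LLM and the black box would be an API call to a large model; preference-based training of the controller is not shown.

```python
from typing import Callable, List, Optional, Tuple

# One interaction turn: (guidance emitted by the controller, output of the black box).
Turn = Tuple[str, str]


def drive_black_box(
    task: str,
    controller: Callable[[str, List[Turn]], Optional[str]],
    black_box: Callable[[str], str],
    max_turns: int = 3,
) -> List[Turn]:
    """Hypothetical multi-turn loop: the controller acts as a policy and the
    black-box model acts as the environment it interacts with."""
    history: List[Turn] = []
    for _ in range(max_turns):
        # The policy picks the next piece of intermediate guidance,
        # conditioned on the task and on all previous turns.
        guidance = controller(task, history)
        if guidance is None:  # the controller signals it has nothing to add
            break
        # The guidance reaches the black box only through its prompt
        # (assumed template; the paper's actual prompts may differ).
        prompt = f"Task: {task}\nGuidance: {guidance}"
        history.append((guidance, black_box(prompt)))
    return history


# Toy stand-ins (assumptions) so the sketch runs without any model access.
def toy_controller(task: str, history: List[Turn]) -> Optional[str]:
    steps = ["restate the problem", "solve step by step"]
    return steps[len(history)] if len(history) < len(steps) else None


def toy_black_box(prompt: str) -> str:
    # Echo the last prompt line in place of a real generation.
    return "echo: " + prompt.splitlines()[-1]


if __name__ == "__main__":
    for guidance, output in drive_black_box("add 2 and 3", toy_controller, toy_black_box):
        print(guidance, "->", output)
```

The returned history of (guidance, output) pairs is the kind of interaction record from which preferences over guidance could later be derived, though how M-Pilot constructs and uses such preferences is specified in the paper rather than here.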