Affordances Enable Partial World Modeling with LLMs
This work addresses inefficient search in robotics by extracting partial world models from large pre-trained models, an incremental improvement over using such models as full world models.
The paper tackles the inefficiency of using large language models as full world models by proposing partial world models based on affordances; in tabletop robotics tasks, these reduce the search branching factor and achieve higher rewards than full models.
Full models of the world require complex knowledge of immense detail. While pre-trained large models have been hypothesized to contain similar knowledge due to extensive pre-training on vast amounts of internet-scale data, using them directly in a search procedure is inefficient and inaccurate. Conversely, partial models focus on making high-quality predictions for a subset of states and actions: those linked through affordances that achieve user intents~\citep{khetarpal2020can}. Can we posit large models as partial world models? We provide a formal answer to this question, proving that agents achieving task-agnostic, language-conditioned intents necessarily possess predictive partial world models informed by affordances. In the multi-task setting, we introduce distribution-robust affordances and show that partial models can be extracted to significantly improve search efficiency. Empirical evaluations in tabletop robotics tasks demonstrate that our affordance-aware partial models reduce the search branching factor and achieve higher rewards compared to full world models.
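To make the branching-factor claim concrete, the following is a minimal sketch on a toy tabletop domain, not the method evaluated in the paper. Everything in it is an assumption introduced for illustration: the two-object world, the hand-written `step` transition model, and the hand-written `afforded` predicate, which stands in for the affordances that would be extracted from a large model. The same breadth-first search is run twice, once querying the model for every action and once only for afforded actions.

```python
from collections import deque

# Hypothetical toy domain (not from the paper): two objects, two locations.
OBJECTS = ("block", "cup")
LOCATIONS = ("table", "bin")
# Every syntactically valid action, whether or not it is afforded in a state.
ACTIONS = [("pick", o) for o in OBJECTS] + [("place", l) for l in LOCATIONS]

# State = (held object or None, sorted tuple of (object, location) pairs).
START = (None, (("block", "table"), ("cup", "table")))


def step(state, action):
    """Hand-written transition model; actions without effect return the state unchanged."""
    holding, locs = state
    locs = dict(locs)
    kind, arg = action
    if kind == "pick" and holding is None and arg in locs:
        del locs[arg]
        return (arg, tuple(sorted(locs.items())))
    if kind == "place" and holding is not None:
        locs[holding] = arg
        return (None, tuple(sorted(locs.items())))
    return state


def afforded(state, action):
    """Hand-written stand-in for an affordance predicate (assumed, not learned)."""
    holding, locs = state
    kind, arg = action
    if kind == "pick":
        return holding is None and arg in dict(locs)
    return holding is not None  # "place" needs something in the gripper


def goal(state):
    """Intent: the block ends up in the bin."""
    return dict(state[1]).get("block") == "bin"


def search(start, goal_fn, use_affordances):
    """Breadth-first search; returns (plan length, number of model queries)."""
    frontier = deque([(start, 0)])
    seen = {start}
    queries = 0
    while frontier:
        state, depth = frontier.popleft()
        if goal_fn(state):
            return depth, queries
        if use_affordances:
            actions = [a for a in ACTIONS if afforded(state, a)]
        else:
            actions = ACTIONS
        for action in actions:
            queries += 1  # one call to the (expensive) world model
            nxt = step(state, action)
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    return None, queries


full = search(START, goal, use_affordances=False)
partial = search(START, goal, use_affordances=True)
print("full model:   ", full)
print("partial model:", partial)
```

In this toy setting both searches find a plan of the same length, while the affordance-filtered search issues fewer model queries because non-afforded actions are never expanded. When each query is a call to a large model, that reduction is the source of the efficiency gain the abstract describes.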