Scaling-Aware Adapter for Structure-Grounded LLM Reasoning
This work addresses modality-specific bottlenecks in biomolecular structure reasoning and is aimed at researchers in computational biology and AI, though as an architectural improvement over existing methods it appears incremental.
The paper tackles structural hallucinations and inflexible modality fusion in LLM-based biomolecular structure reasoning. It introduces Cuttlefish, a unified all-atom LLM that adaptively scales the number of structural tokens and grounds reasoning in geometric cues; the authors report superior performance on diverse benchmarks.
Large language models (LLMs) are enabling reasoning over biomolecular structures, yet existing methods remain modality-specific and typically compress structural inputs through sequence-based tokenization or fixed-length query connectors. Such architectures either omit the geometric grounding needed to mitigate structural hallucinations or impose inflexible modality-fusion bottlenecks that both over-compress and poorly allocate structural tokens, which hinders generalized all-atom reasoning. We introduce Cuttlefish, a unified all-atom LLM that grounds language reasoning in geometric cues while scaling modality tokens with structural complexity. First, Scaling-Aware Patching uses an instruction-conditioned gating mechanism to generate variable-size patches over structural graphs, adaptively scaling the query token budget with structural complexity to mitigate fixed-length connector bottlenecks. Second, the Geometry Grounding Adapter refines these adaptive tokens via cross-attention to modality embeddings and injects the resulting modality tokens into the LLM, exposing explicit geometric cues to reduce structural hallucination. Experiments across diverse all-atom benchmarks demonstrate that Cuttlefish achieves superior performance in heterogeneous structure-grounded reasoning. Code is available at the project repository.
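To make the two components more concrete, the following is a minimal NumPy sketch of the general idea, not the authors' implementation. It assumes the structure is given as an ordered list of node embeddings (a simplification of a structural graph), that the gate is a sigmoid of a scaled dot product between each node and an instruction embedding, and that patches are mean-pooled. The function names scaling_aware_patches and geometry_grounding, the threshold, and the pooling rule are all hypothetical choices made for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def scaling_aware_patches(node_emb, instr_emb, threshold=0.5):
    """Hypothetical instruction-conditioned patching.

    node_emb:  (N, d) node embeddings in a fixed order (assumed).
    instr_emb: (d,)   instruction embedding (assumed).
    Returns (P, d) patch tokens and the (N,) patch id of every node,
    where P varies with how many gates fire.
    """
    d = node_emb.shape[1]
    # One gate per node, conditioned on the instruction.
    gates = sigmoid(node_emb @ instr_emb / np.sqrt(d))
    # A node whose gate exceeds the threshold opens a new patch;
    # the first node always opens one so every node belongs to a patch.
    starts = gates > threshold
    starts[0] = True
    patch_ids = np.cumsum(starts) - 1
    n_patches = int(patch_ids[-1]) + 1
    # Mean-pool the node embeddings inside each patch.
    tokens = np.zeros((n_patches, d))
    np.add.at(tokens, patch_ids, node_emb)
    counts = np.bincount(patch_ids, minlength=n_patches)
    tokens /= counts[:, None]
    return tokens, patch_ids


def geometry_grounding(queries, modality_emb):
    """Single-head cross-attention without learned projections (assumed).

    queries:      (P, d) patch tokens.
    modality_emb: (N, d) embeddings the tokens attend to.
    Returns (P, d) refined tokens (attention output plus residual).
    """
    d = queries.shape[1]
    attn = softmax(queries @ modality_emb.T / np.sqrt(d), axis=-1)
    return queries + attn @ modality_emb


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    nodes = rng.normal(size=(50, 16))
    instruction = rng.normal(size=16)
    tokens, ids = scaling_aware_patches(nodes, instruction)
    refined = geometry_grounding(tokens, nodes)
    print(tokens.shape, refined.shape)
```

In this sketch the number of tokens passed on is not fixed in advance: it equals the number of patches the gate opens, so it changes with both the structure and the instruction. A real system would use learned projections, graph-aware patch boundaries, and multi-head attention.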