Ren Tao

h-index1
2papers

2 Papers

74.8LGMay 6
Conceptors for Semantic Steering

Ilias Triantafyllopoulos, Young-Min Cho, Ren Tao et al.

Activation-based steering provides control of LLM behavior at inference time, but the dominant paradigm reduces each concept to a single direction whose geometry is left largely unexamined. Rather than selecting a single steering direction, we use conceptors: soft projection matrices estimated from activations pooled across both poles of a bipolar concept, which preserve the concept's full multidimensional subspace. A geometric analysis shows the bipolar subspace strictly subsumes the single-vector baseline. We further show that the conceptor quota provides a parameter-free layer-selection diagnostic, predicting concept separability with Pearson correlations up to r=0.96 across three instruction-tuned models and three semantic dimensions. Beyond selection, conceptors admit a closed-form Boolean algebra (AND, OR, NOT): we evaluate conceptor compositionality on thematically related sub-concepts. Across a systematic five-axis design-space evaluation, conceptors match or outperform additive baselines at layers where concept subspaces are multi-dimensional while producing substantially fewer degenerate outputs. Conceptor steering is a geometrically principled, compositional, and practically safer alternative to single-direction steering from a limited number of contrastive pairs.

LGAug 21, 2025
An Efficient Open World Environment for Multi-Agent Social Learning

Eric Ye, Ren Tao, Natasha Jaques

Many challenges remain before AI agents can be deployed in real-world environments. However, one virtue of such environments is that they are inherently multi-agent and contain human experts. Using advanced social intelligence in such an environment can help an AI agent learn adaptive skills and behaviors that a known expert exhibits. While social intelligence could accelerate training, it is currently difficult to study due to the lack of open-ended multi-agent environments. In this work, we present an environment in which multiple self-interested agents can pursue complex and independent goals, reflective of real world challenges. This environment will enable research into the development of socially intelligent AI agents in open-ended multi-agent settings, where agents may be implicitly incentivized to cooperate to defeat common enemies, build and share tools, and achieve long horizon goals. In this work, we investigate the impact on agent performance due to social learning in the presence of experts and implicit cooperation such as emergent collaborative tool use, and whether agents can benefit from either cooperation or competition in this environment.