NE NC

Feb 4, 2014

Short-term plasticity as cause-effect hypothesis testing in distal reward learning

arXiv:1402.0710v5

31 citations

Originality: Incremental advance

AI Analysis

This paper addresses the plasticity-stability dilemma in distal reward learning in AI or neural systems, offering an incremental solution through an interpretation of the role of short-term plasticity.

The paper tackles the problem of learning cause-effect relationships from ambiguous, delayed sensory-motor signals. It proposes a model in which short-term plasticity tests hypotheses, which are consolidated into long-term memory only when they consistently predict rewards. The model aims to preserve existing network topologies, and learning improves because exploration is biased toward actions that previously preceded rewards.

Asynchrony, overlaps and delays in sensory-motor signals introduce ambiguity as to which stimuli, actions, and rewards are causally related. Only the repetition of reward episodes helps distinguish true cause-effect relationships from coincidental occurrences. In the model proposed here, a novel plasticity rule employs short and long-term changes to evaluate hypotheses on cause-effect relationships. Transient weights represent hypotheses that are consolidated in long-term memory only when they consistently predict or cause future rewards. The main objective of the model is to preserve existing network topologies when learning with ambiguous information flows. Learning is also improved by biasing the exploration of the stimulus-response space towards actions that in the past occurred before rewards. The model indicates under which conditions beliefs can be consolidated in long-term memory, suggests a solution to the plasticity-stability dilemma, and proposes an interpretation of the role of short-term plasticity.
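
The consolidation idea in the abstract can be illustrated with a toy sketch. This is not the paper's plasticity rule: the update form, the `Synapse` class, and every parameter (`gain`, `decay`, `threshold`, `step`) are assumptions chosen only to show how a decaying transient weight might separate a pairing that repeatedly precedes reward from one that does so only occasionally.

```python
from dataclasses import dataclass


@dataclass
class Synapse:
    short_term: float = 0.0  # transient weight: the hypothesis under test
    long_term: float = 0.0   # consolidated weight


def update(syn, preceded_reward, gain=0.3, decay=0.8, threshold=1.0, step=0.1):
    """Apply one episode to a synapse (illustrative rule, hypothetical parameters)."""
    # The transient weight decays every episode, so isolated coincidences fade.
    syn.short_term *= decay
    # Activity that preceded a reward strengthens the hypothesis.
    if preceded_reward:
        syn.short_term += gain
    # Consolidate only while the hypothesis is strongly supported.
    if syn.short_term > threshold:
        syn.long_term += step
    return syn


def run(episodes, **params):
    """Feed a sequence of booleans (did this pairing precede a reward?)."""
    syn = Synapse()
    for preceded_reward in episodes:
        update(syn, preceded_reward, **params)
    return syn


# A pairing that precedes reward in every episode versus one in four episodes.
consistent = run([True] * 20)
coincidental = run([i % 4 == 0 for i in range(20)])
print(consistent.long_term, coincidental.long_term)
```

With these assumed values, the transient weight of the occasional pairing decays faster than it accumulates and never crosses the threshold, so its long-term weight stays untouched. That mirrors, in a very reduced form, the stated goal of preserving existing network topologies under ambiguous input.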
