LIFE -- an energy efficient advanced continual learning agentic AI framework for frontier systems
For HPC operators, LIFE offers a more adaptive and energy-efficient approach to network management, but the contribution is incremental as it combines existing ideas.
The paper proposes LIFE, an agentic AI framework for energy-efficient continual learning in HPC systems. It combines an orchestrator, Agentic Context Engineering, a memory system, and information lattice learning to enable self-evolving network management. As a grounding example, it detects and mitigates latency spikes for critical microservices in a Kubernetes-like cluster.
The rapid advancement of AI has changed the character of HPC usage, including dimensioning, provisioning, and execution. Not only has energy demand been amplified, but today's rudimentary continual learning capabilities also limit the ability of AI to manage HPC systems effectively. This paper reviews emerging directions beyond monolithic transformers, emphasizing agentic AI and brain-inspired architectures as complementary paths toward sustainable, adaptive systems. We propose LIFE, a reasoning and Learning framework that is Incremental, Flexible, and Energy-efficient, implemented as an agent-centric system rather than a single monolithic model. LIFE uniquely combines four components to realize self-evolving network management and operations in HPC systems: an orchestrator, Agentic Context Engineering, a novel memory system, and information lattice learning. LIFE can also generalize to a variety of orthogonal use cases. We ground LIFE in a specific closed-loop HPC operations example: detecting and mitigating latency spikes experienced by critical microservices running on a Kubernetes-like cluster.
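To make the closed-loop example concrete, the following is a minimal sketch of a detect-and-mitigate loop for latency spikes. It is not the LIFE implementation: the class name `LatencySpikeLoop`, the rolling z-score detector, the thresholds, and the `scale_up` action are all hypothetical choices made for illustration, and a real deployment would issue the mitigation through the cluster's own control API.

```python
from collections import deque
from statistics import mean, stdev


class LatencySpikeLoop:
    """Hypothetical closed loop: observe latency, detect a spike, record a mitigation."""

    def __init__(self, window=20, threshold=3.0, min_samples=5, min_sigma=1.0):
        self.history = deque(maxlen=window)  # recent non-spike latencies (ms)
        self.threshold = threshold           # z-score above which a sample is a spike
        self.min_samples = min_samples       # warm-up size before detection starts
        self.min_sigma = min_sigma           # floor on the std dev (ms) to avoid tiny denominators
        self.actions = []                    # mitigations chosen so far

    def is_spike(self, latency_ms):
        # Do not flag anything until a baseline exists.
        if len(self.history) < self.min_samples:
            return False
        mu = mean(self.history)
        sigma = max(stdev(self.history), self.min_sigma)
        return (latency_ms - mu) / sigma > self.threshold

    def step(self, service, latency_ms):
        """Process one observation; return True if a mitigation was triggered."""
        spike = self.is_spike(latency_ms)
        if spike:
            # Assumed mitigation: request more capacity for the affected service.
            self.actions.append(
                {"service": service, "action": "scale_up", "latency_ms": latency_ms}
            )
        else:
            # Only normal samples update the baseline, so spikes do not mask later spikes.
            self.history.append(latency_ms)
        return spike


if __name__ == "__main__":
    loop = LatencySpikeLoop()
    for sample in [10, 11, 9, 10, 12, 10, 11, 100, 12]:
        if loop.step("checkout", sample):
            print("mitigating:", loop.actions[-1])
```

Keeping spike samples out of the baseline is a deliberate design choice in this sketch: it stops a sustained incident from being absorbed into the "normal" window. An agentic system as described in the abstract would presumably replace the fixed threshold and single action with learned, context-dependent decisions.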