Davide Di Gioia

AI
4papers
3citations
Novelty48%
AI Score45

4 Papers

AIApr 1
Cognitive Friction: A Decision-Theoretic Framework for Bounded Deliberation in Tool-Using Agents

Davide Di Gioia

Autonomous tool-using agents operating in networked environments must decide which information source to query and when to stop querying and act. Without principled bounds on information-acquisition costs, unconstrained agents exhibit systematic failure modes: excessive tool use under congestion, prolonged deliberation under time decay, and brittle behavior under ambiguous evidence. We propose the Triadic Cognitive Architecture (TCA), a unified decision-theoretic framework that formalizes these failure modes through the concept of Cognitive Friction. By synthesizing nonlinear filtering theory, congestion-dependent cost dynamics, and HJB optimal stopping, we model deliberation as a stochastic control problem over a joint belief-congestion state space, where information acquisition is explicitly priced by tool-dependent signal quality and live network load. Rather than relying on arbitrary heuristic stop-tokens or fixed query budgets, TCA derives an HJB-inspired stopping boundary and instantiates a computable rollout-based approximation of belief-dependent value-of-information with a net-utility halting condition. We validate the framework on two controlled simulation environments, the Emergency Medical Diagnostic Grid (EMDG) and the Network Security Triage Grid (NSTG), designed to isolate key decision-theoretic quantities under reproducible conditions. TCA reduces time-to-action while improving resource outcomes without degrading accuracy: over greedy baselines, TCA gains 36 viability points in EMDG and 33 integrity points in NSTG. Ablations confirm joint optimization of selection and stopping is essential; stopping rules alone recover at most 4 viability points. A sensitivity sweep over alpha, beta, lambda_S shows stable accuracy and interpretable tradeoffs; an empirical sweep over eta in {0, 0.1, 0.3, 0.5} confirms eta=0 is optimal on EMDG trajectories under high temporal urgency.

AIMar 30
Entropic Claim Resolution: Uncertainty-Driven Evidence Selection for RAG

Davide Di Gioia

Current Retrieval-Augmented Generation (RAG) systems predominantly rely on relevance-based dense retrieval, sequentially fetching documents to maximize semantic similarity with the query. However, in knowledge-intensive and real-world scenarios characterized by conflicting evidence or fundamental query ambiguity, relevance alone is insufficient for resolving epistemic uncertainty. We introduce Entropic Claim Resolution (ECR), a novel inference-time algorithm that reframes RAG reasoning as entropy minimization over competing semantic answer hypotheses. Unlike action-driven agentic frameworks (e.g., ReAct) or fixed-pipeline RAG architectures, ECR sequentially selects atomic evidence claims by maximizing Expected Entropy Reduction (EER), a decision-theoretic criterion for the value of information. The process dynamically terminates when the system reaches a mathematically defined state of epistemic sufficiency (H <= epsilon, subject to epistemic coherence). We integrate ECR into a production-grade multi-strategy retrieval pipeline (CSGR++) and analyze its theoretical properties. Our framework provides a rigorous foundation for uncertainty-aware evidence selection, shifting the paradigm from retrieving what is most relevant to retrieving what is most discriminative.

AIMar 17
Cascade-Aware Multi-Agent Routing: Spatio-Temporal Sidecars and Geometry-Switching

Davide Di Gioia

A common architectural pattern in advanced AI reasoning systems is the symbolic graph network: specialized agents or modules connected by delegation edges, routing tasks through a dynamic execution graph. Current schedulers optimize load and fitness but are geometry-blind: they do not model how failures propagate differently in tree-like versus cyclic regimes. In tree-like delegation, a single failure can cascade exponentially; in dense cyclic graphs, failures tend to self-limit. We identify this observability gap, quantify its system-level cost, and propose a lightweight mitigation. We formulate online geometry control for route-risk estimation on time-indexed execution graphs with route-local failure history. Our approach combines (i) a Euclidean spatio-temporal propagation baseline, (ii) a hyperbolic route-risk model with temporal decay (and optional burst excitation), and (iii) a learned geometry selector over structural features. The selector is a compact MLP (9->12->1) using six topology statistics plus three geometry-aware signals: BFS shell-growth slope, cycle-rank norm, and fitted Poincare curvature. On the Genesis 3 benchmark distribution, adaptive switching improves win rate in the hardest non_tree regime from 64-72% (fixed hyperbolic variants) to 92%, and achieves 87.2% overall win rate. To measure total system value, we compare against Genesis 3 routing without any spatio-temporal sidecar, using only native bandit/LinUCB signals (team fitness and mean node load). This baseline achieves 50.4% win rate overall and 20% in tree-like regimes; the full sidecar recovers 87.2% overall (+36.8 pp), with +48 to +68 pp gains in tree-like settings, consistent with a cascade-sensitivity analysis. Overall, a 133-parameter sidecar substantially mitigates geometry-blind failure propagation in one high-capability execution-graph system.

LGMar 23
Learning When to Act: Interval-Aware Reinforcement Learning with Predictive Temporal Structure

Davide Di Gioia

Autonomous agents operating in continuous environments must decide not only what to do, but when to act. We introduce a lightweight adaptive temporal control system that learns the optimal interval between cognitive ticks from experience, replacing ad hoc biologically inspired timers with a principled learned policy. The policy state is augmented with a predictive hyperbolic spread signal (a "curvature signal" shorthand) derived from hyperbolic geometry: the mean pairwise Poincare distance among n sampled futures embedded in the Poincare ball. High spread indicates a branching, uncertain future and drives the agent to act sooner; low spread signals predictability and permits longer rest intervals. We further propose an interval-aware reward that explicitly penalises inefficiency relative to the chosen wait time, correcting a systematic credit-assignment failure of naive outcome-based rewards in timing problems. We additionally introduce a joint spatio-temporal embedding (ATCPG-ST) that concatenates independently normalised state and position projections in the Poincare ball; spatial trajectory divergence provides an independent timing signal unavailable to the state-only variant (ATCPG-SO). This extension raises mean hyperbolic spread (kappa) from 1.88 to 3.37 and yields a further 5.8 percent efficiency gain over the state-only baseline. Ablation experiments across five random seeds demonstrate that (i) learning is the dominant efficiency factor (54.8 percent over no-learning), (ii) hyperbolic spread provides significant complementary gain (26.2 percent over geometry-free control), (iii) the combined system achieves 22.8 percent efficiency over the fixed-interval baseline, and (iv) adding spatial position information to the spread embedding yields an additional 5.8 percent.