6.0 MA Mar 23
Human-Inspired Pavlovian and Instrumental Learning for Autonomous Agent Navigation
Jingfeng Shan, Francesco Guidi, Mehrdad Saeidi et al.
Autonomous agents operating in uncertain environments must balance fast responses with goal-directed planning. Classical model-free (MF) reinforcement learning (RL) often converges slowly and may induce unsafe exploration, whereas model-based (MB) methods are computationally expensive and sensitive to model mismatch. This paper presents a human-inspired hybrid RL architecture integrating Pavlovian, Instrumental MF, and Instrumental MB components. Inspired by Pavlovian and Instrumental learning from neuroscience, the framework considers contextual radio cues, understood here as georeferenced environmental features acting as conditioned stimuli (CS), to shape intrinsic value signals and bias decision-making. Learning is further modulated by internal motivational drives through a dedicated motivational signal. A Bayesian arbitration mechanism adaptively blends MF and MB estimates based on predicted reliability. Simulation results show that the hybrid approach accelerates learning, improves operational safety, and reduces navigation in high-uncertainty regions compared to standard RL baselines. Pavlovian conditioning promotes safer exploration and faster convergence, while arbitration enables a smooth transition from exploration to efficient, plan-driven exploitation. Overall, the results highlight the benefits of biologically inspired modularity for robust and adaptive autonomous systems under uncertainty.
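The abstract does not spell out the arbitration rule, so the following is only a minimal sketch of one common way to blend two value estimates by reliability: precision-weighted fusion of two independent Gaussian estimates. The function name `blend_q_values` and the use of estimate variance as a proxy for "predicted reliability" are assumptions for illustration, not the authors' mechanism.

```python
def blend_q_values(q_mf, var_mf, q_mb, var_mb):
    """Fuse a model-free and a model-based value estimate.

    Assumption (not from the paper): each estimate is treated as an
    independent Gaussian, and its reliability is the inverse of its variance.
    Returns (blended value, blended variance, weight given to the MB estimate).
    """
    if var_mf <= 0 or var_mb <= 0:
        raise ValueError("variances must be strictly positive")
    prec_mf = 1.0 / var_mf  # precision = reliability of the MF estimate
    prec_mb = 1.0 / var_mb  # precision = reliability of the MB estimate
    w_mb = prec_mb / (prec_mf + prec_mb)
    q_blend = (1.0 - w_mb) * q_mf + w_mb * q_mb
    var_blend = 1.0 / (prec_mf + prec_mb)
    return q_blend, var_blend, w_mb
```

Under this assumption, an estimate with lower variance pulls the blended value toward itself, so the weight can shift between MF and MB as their uncertainties change during learning.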
SP Oct 31, 2023
Multi-Base Station Cooperative Sensing with AI-Aided Tracking
Elia Favarelli, Elisabetta Matricardi, Lorenzo Pucci et al.
In this work, we investigate the performance of a joint sensing and communication (JSC) network consisting of multiple base stations (BSs) that cooperate through a fusion center (FC) to exchange information about the sensed environment while concurrently establishing communication links with a set of user equipments (UEs). Each BS within the network operates as a monostatic radar system, enabling comprehensive scanning of the monitored area and generating range-angle maps that provide information regarding the position of a group of heterogeneous objects. The acquired maps are subsequently fused in the FC. Then, a convolutional neural network (CNN) is employed to infer the category of the targets, e.g., pedestrians or vehicles, and such information is exploited by an adaptive clustering algorithm to group the detections originating from the same target more effectively. Finally, two multi-target tracking algorithms, the probability hypothesis density (PHD) filter and multi-Bernoulli mixture (MBM) filter, are applied to estimate the state of the targets. Numerical results demonstrate that our framework provides remarkable sensing performance, achieving an optimal sub-pattern assignment (OSPA) error below 60 cm, while maintaining communication services to UEs with a reduction of the communication capacity on the order of 10% to 20%. The impact of the number of BSs engaged in sensing is also examined, and we show that in the specific case study, 3 BSs ensure a localization error below 1 m.
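For readers unfamiliar with the OSPA metric quoted above, here is a small sketch of its standard definition for two finite point sets. The cutoff `c` and order `p` are free parameters whose values in the paper are not given in the abstract, so the defaults below are assumptions. The brute-force search over assignments is only practical for a handful of targets.

```python
import itertools
import math


def ospa(X, Y, c=1.0, p=2):
    """OSPA distance between two lists of points (tuples of coordinates).

    c: cutoff distance (assumed value), p: order (assumed value).
    Brute-force assignment, intended for small sets only.
    """
    if len(X) > len(Y):
        X, Y = Y, X  # ensure X is the smaller set
    m, n = len(X), len(Y)
    if n == 0:
        return 0.0  # both sets empty
    # Best assignment of the m points in X to distinct points in Y,
    # with each pairwise distance capped at the cutoff c.
    best = min(
        sum(min(math.dist(x, y), c) ** p for x, y in zip(X, perm))
        for perm in itertools.permutations(Y, m)
    )
    # Each unassigned point in Y is penalized with the cutoff c.
    return ((best + (c ** p) * (n - m)) / n) ** (1.0 / p)
```

The metric combines localization error (the matched distances) and cardinality error (missed or false targets) into one number expressed in the same units as the coordinates.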
ML Oct 4, 2019
Model Order Selection Based on Information Theoretic Criteria: Design of the Penalty
Andrea Mariani, Andrea Giorgetti, Marco Chiani
Information theoretic criteria (ITC) have been widely adopted in engineering and statistics for selecting, among an ordered set of candidate models, the one that best fits the observed sample data. The selected model minimizes a penalized likelihood metric, where the penalty is determined by the criterion adopted. While rules for choosing a penalty that guarantees a consistent estimate of the model order are known, theoretical tools for its design with finite samples have never been provided in a general setting. In this paper, we study model order selection for finite samples under a design perspective, focusing on the generalized information criterion (GIC), which embraces the most common ITC. The theory is general, and as case studies we consider: a) the problem of estimating the number of signals embedded in additive white Gaussian noise (AWGN) by using multiple sensors; b) model selection for the general linear model (GLM), which includes e.g. the problem of estimating the number of sinusoids in AWGN. The analysis reveals a trade-off between the probabilities of overestimating and underestimating the order of the model. We then propose to design the GIC penalty to minimize underestimation while keeping the overestimation probability below a specified level. For the considered problems, this method leads to analytical derivation of the optimal penalty for a given sample size. A performance comparison between the penalty-optimized GIC and the common AIC and BIC is provided, demonstrating the effectiveness of the proposed design strategy.
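As a rough illustration of penalized-likelihood order selection, the sketch below scores each candidate order k with the commonly used GIC form, -2 ln L_k + nu * phi(k), where phi(k) is the number of free parameters. Setting nu = 2 gives AIC and nu = ln(n) gives BIC. The function name `select_order` and its inputs are hypothetical, and the paper's optimized penalty is not reproduced here.

```python
import math


def select_order(log_likelihoods, num_free_params, nu):
    """Return (index of the selected model, list of GIC scores).

    log_likelihoods[k]: maximized log-likelihood of candidate k.
    num_free_params[k]: number of free parameters phi(k) of candidate k.
    nu: penalty weight (2 for AIC, ln(n) for BIC with n samples).
    """
    scores = [
        -2.0 * ll + nu * phi
        for ll, phi in zip(log_likelihoods, num_free_params)
    ]
    best = min(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Illustrative numbers only (made up for this sketch, not from the paper).
example_ll = [-120.0, -100.0, -98.5, -98.0]
example_phi = [1, 2, 3, 4]
aic_choice, _ = select_order(example_ll, example_phi, nu=2.0)
bic_choice, _ = select_order(example_ll, example_phi, nu=math.log(50))
```

With these made-up numbers the heavier BIC penalty selects a lower order than AIC, which is the overestimation versus underestimation trade-off that the penalty design in the abstract is meant to control.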