When is Cognitive Radar Beneficial?
This work gives radar engineers guidelines for choosing a waveform selection strategy, though its contribution is incremental because the analysis is confined to specific scenarios.
The paper investigates when online reinforcement learning-based cognitive radar outperforms rule-based adaptive waveform selection in dynamic spectrum access, finding that learning approaches generalize better for realistic channels but may underperform in short time-horizon problems due to convergence limitations.
When should an online reinforcement learning-based frequency-agile cognitive radar be expected to outperform a rule-based adaptive waveform selection strategy? We seek insight into this question by examining a dynamic spectrum access scenario in which the radar wishes to transmit in the widest unoccupied bandwidth during each pulse repetition interval. Online learning is compared to a fixed rule-based sense-and-avoid strategy. We show that, given a simple Markov channel model, the problem can be examined analytically in basic cases via stochastic dominance. Additionally, we show that under more realistic channel assumptions, learning-based approaches demonstrate a greater ability to generalize. However, for well-specified problems with a short time horizon, we find that machine learning approaches may perform poorly because of the inherent limitation of convergence time. We draw conclusions as to when learning-based approaches are expected to be beneficial and provide guidelines for future study.
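To make the scenario concrete, the following is a minimal sketch of a rule-based sense-and-avoid baseline, not the paper's implementation. It assumes the band is split into equal sub-bands, that each sub-band's occupancy follows an independent two-state Markov chain, and that the radar acts on the occupancy sensed in the previous pulse repetition interval (PRI). All function names, parameters, and default values are hypothetical and chosen only for illustration.

```python
import random


def widest_free_block(occupied):
    """Return (start, length) of the longest run of unoccupied sub-bands."""
    best_start, best_len = 0, 0
    run_start, run_len = 0, 0
    for i, busy in enumerate(occupied):
        if busy:
            run_len = 0
        else:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
    return best_start, best_len


def step_channel(occupied, p_stay_busy, p_stay_free, rng):
    """Advance every sub-band by one PRI (assumed independent Markov chains)."""
    nxt = []
    for busy in occupied:
        p_stay = p_stay_busy if busy else p_stay_free
        nxt.append(busy if rng.random() < p_stay else not busy)
    return nxt


def simulate_sense_and_avoid(num_bands=8, num_pris=1000,
                             p_stay_busy=0.9, p_stay_free=0.9, seed=0):
    """Return (collision rate, mean collision-free bandwidth in sub-bands)."""
    rng = random.Random(seed)
    occupied = [rng.random() < 0.5 for _ in range(num_bands)]
    collisions = 0
    clean_bandwidth = 0
    for _ in range(num_pris):
        # Rule: transmit in the widest block that was free when last sensed.
        start, length = widest_free_block(occupied)
        # The channel evolves before the pulse is actually transmitted.
        occupied = step_channel(occupied, p_stay_busy, p_stay_free, rng)
        if any(occupied[start:start + length]):
            collisions += 1
        else:
            clean_bandwidth += length
    return collisions / num_pris, clean_bandwidth / num_pris


if __name__ == "__main__":
    rate, bandwidth = simulate_sense_and_avoid()
    print(f"collision rate: {rate:.3f}, mean clean bandwidth: {bandwidth:.2f}")
```

In this sketch the fixed rule needs no training time, which is the property that favors it in short-horizon problems. A learning-based agent would replace the call to `widest_free_block` with a policy updated from observed collisions and bandwidth, and its per-PRI rewards could then be compared against this baseline.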