LG SP · Jan 2, 2024

Reinforcement Learning for SAR View Angle Inversion with Differentiable SAR Renderer

Yanni Wang, Hecheng Jia, Shilei Fu, Huiping Lin, Feng Xu

arXiv:2401.01165v14.62 citations · h-index: 8

Originality: Highly original

AI Analysis

This work addresses the electromagnetic inverse problem for SAR imaging. The contribution is incremental: it builds on existing learning-based approaches by incorporating a differentiable renderer and reinforcement learning to handle data scarcity and background interference.

The study tackles the problem of inverting radar view angles from SAR images given a target model. It proposes an interactive deep reinforcement learning framework with an embedded differentiable SAR renderer, which outperforms reference methods in cross-domain applications by mitigating the inconsistency between simulated and real domains.

The electromagnetic inverse problem has long been a research hotspot. This study aims to invert radar view angles in synthetic aperture radar (SAR) images given a target model. However, the scarcity of SAR data, combined with intricate background interference and imaging mechanisms, limits the applicability of existing learning-based approaches. To address these challenges, we propose an interactive deep reinforcement learning (DRL) framework in which an electromagnetic simulator, the differentiable SAR renderer (DSR), is embedded to facilitate the interaction between the agent and the environment, simulating a human-like process of angle prediction. Specifically, DSR generates SAR images at arbitrary view angles in real time. The sequential and semantic differences between the images corresponding to the view angles are used to construct the DRL state space, which suppresses complex background interference, enhances sensitivity to temporal variations, and improves the ability to capture fine-grained information. Additionally, to maintain the stability and convergence of the method, a series of reward mechanisms, such as memory difference, smoothing, and boundary penalty, are combined to form the final reward function. Extensive experiments on both simulated and real datasets demonstrate the effectiveness and robustness of the proposed method. In cross-domain settings, the proposed method greatly mitigates the inconsistency between simulated and real domains and significantly outperforms reference methods.
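
The abstract names the ingredients of the state and reward but not their formulas. As a rough illustration only, the sketch below shows one way such a state (difference images) and reward (memory difference, smoothing, boundary penalty) could be assembled. Everything in it is an assumption made for illustration and is not taken from the paper: `toy_render` is a hypothetical stand-in for DSR, and the error measure, the weights `w_smooth` and `w_bound`, and the angle bounds are invented.

```python
import numpy as np

ANGLE_MIN, ANGLE_MAX = 0.0, 360.0  # assumed azimuth bounds in degrees


def toy_render(angle_deg, size=32):
    """Hypothetical stand-in for a renderer: a bright bar rotated by angle_deg."""
    ys, xs = np.mgrid[0:size, 0:size] - (size - 1) / 2.0
    theta = np.deg2rad(angle_deg)
    # Distance of each pixel from a line through the image center.
    dist = np.abs(xs * np.sin(theta) - ys * np.cos(theta))
    return np.exp(-(dist ** 2) / 4.0)


def build_state(target, current, previous):
    """Stack a target-vs-current and a current-vs-previous difference image."""
    return np.stack([target - current, current - previous], axis=0)


def step_reward(err_now, err_prev, angle, delta, prev_delta,
                w_smooth=0.01, w_bound=1.0):
    """Illustrative reward: improvement over the last step, minus penalties."""
    memory_term = err_prev - err_now                   # positive if the error shrank
    smooth_term = -w_smooth * abs(delta - prev_delta)  # discourage erratic steps
    out_of_range = angle < ANGLE_MIN or angle > ANGLE_MAX
    bound_term = -w_bound if out_of_range else 0.0     # penalize leaving the range
    return memory_term + smooth_term + bound_term


# One interaction step: the agent moved from 10 to 20 degrees, the target is 40.
target = toy_render(40.0)
prev_img = toy_render(10.0)
curr_img = toy_render(20.0)

state = build_state(target, curr_img, prev_img)
err_prev = float(np.mean((target - prev_img) ** 2))
err_now = float(np.mean((target - curr_img) ** 2))
r = step_reward(err_now, err_prev, angle=20.0, delta=10.0, prev_delta=10.0)
```

Here the state has two channels, and the reward is positive because the step moved the rendered image closer to the target. A real system would replace `toy_render` with the differentiable renderer and would likely compare learned features rather than raw pixels.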
