Muhammed Murat Özbek

5.5ROOct 13, 2022Code

Harfang3D Dog-Fight Sandbox: A Reinforcement Learning Research Platform for the Customized Control Tasks of Fighter Aircrafts

Muhammed Murat Özbek, Süleyman Yıldırım, Muhammet Aksoy et al.

The advent of deep learning (DL) gave rise to significant breakthroughs in Reinforcement Learning (RL) research. Deep Reinforcement Learning (DRL) algorithms have reached super-human level skills when applied to vision-based control problems as such in Atari 2600 games where environment states were extracted from pixel information. Unfortunately, these environments are far from being applicable to highly dynamic and complex real-world tasks as in autonomous control of a fighter aircraft since these environments only involve 2D representation of a visual world. Here, we present a semi-realistic flight simulation environment Harfang3D Dog-Fight Sandbox for fighter aircrafts. It is aimed to be a flexible toolbox for the investigation of main challenges in aviation studies using Reinforcement Learning. The program provides easy access to flight dynamics model, environment states, and aerodynamics of the plane enabling user to customize any specific task in order to build intelligent decision making (control) systems via RL. The software also allows deployment of bot aircrafts and development of multi-agent tasks. This way, multiple groups of aircrafts can be configured to be competitive or cooperative agents to perform complicated tasks including Dog Fight. During the experiments, we carried out training for two different scenarios: navigating to a designated location and within visual range (WVR) combat, shortly Dog Fight. Using Deep Reinforcement Learning techniques for both scenarios, we were able to train competent agents that exhibit human-like behaviours. Based on this results, it is confirmed that Harfang3D Dog-Fight Sandbox can be utilized as a 3D realistic RL research platform.

2.5AIJan 14, 2022

Reinforcement Learning based Air Combat Maneuver Generation

Muhammed Murat Ozbek, Emre Koyuncu

The advent of artificial intelligence technology paved the way of many researches to be made within air combat sector. Academicians and many other researchers did a research on a prominent research direction called autonomous maneuver decision of UAV. Elaborative researches produced some outcomes, but decisions that include Reinforcement Learning(RL) came out to be more efficient. There have been many researches and experiments done to make an agent reach its target in an optimal way, most prominent are Genetic Algorithm(GA) , A star, RRT and other various optimization techniques have been used. But Reinforcement Learning is the well known one for its success. In DARPHA Alpha Dogfight Trials, reinforcement learning prevailed against a real veteran F16 human pilot who was trained by Boeing. This successor model was developed by Heron Systems. After this accomplishment, reinforcement learning bring tremendous attention on itself. In this research we aimed our UAV which has a dubin vehicle dynamic property to move to the target in two dimensional space in an optimal path using Twin Delayed Deep Deterministic Policy Gradients (TD3) and used in experience replay Hindsight Experience Replay(HER).We did tests on two different environments and used simulations.

Muhammed Murat Özbek

2 Papers