LG | ML | Jul 17, 2018

Airline Passenger Name Record Generation using Generative Adversarial Networks

arXiv:1807.06657v1 | 58 citations
Originality: Incremental advance
AI Analysis

This work addresses data sharing problems for airlines and other travel industry actors, but the advance is incremental: it adapts existing GAN techniques to a specific domain with mixed categorical and numerical features.

The paper tackles the challenge of generating realistic synthetic Passenger Name Records (PNRs) to work around data ownership issues in the travel industry. Its GAN-based method matches the real data distribution well without memorizing records, and the synthetic data can be used to train models for business applications such as client segmentation and nationality prediction.

Passenger Name Records (PNRs) are at the heart of the travel industry. Created when an itinerary is booked, they contain travel and passenger information. Airlines and other actors in the industry commonly exchange and access each other's PNRs, which creates the challenge of using them without infringing data ownership laws. To address this difficulty, we propose a method to generate realistic synthetic PNRs using Generative Adversarial Networks (GANs). Unlike the data in other GAN applications, PNRs consist of categorical and numerical features with missing/NaN values, which makes the use of GANs challenging. We propose a solution based on Cramér GANs, categorical feature embedding and a Cross-Net architecture. The method was tested on a real PNR dataset and evaluated in terms of distribution matching, memorization, and the performance of predictive models for two real business problems: client segmentation and passenger nationality prediction. Results show that the generated data matches the real PNRs well without memorizing them, and that it can be used to train models for real business applications.
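
The abstract names three ingredients for mixed-type records: handling of missing/NaN values, categorical feature embedding, and a Cross-Net architecture. The sketch below is only an illustration of how such pieces might fit together, not the paper's implementation. It assumes that missing numeric values are zero-filled and flagged with an indicator column, that embeddings are a plain lookup table, and that "Cross-Net" refers to a cross layer of the form x_{l+1} = x_0 (x_l · w) + b + x_l. The field names (`country_codes`, `stay_days`) and all sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode_numeric(col):
    """Zero-fill NaN entries and return (filled values, missing indicator)."""
    col = np.asarray(col, dtype=float)
    missing = np.isnan(col)
    filled = np.where(missing, 0.0, col)
    return filled, missing.astype(float)

def embed_categorical(codes, table):
    """Look up one embedding row per integer category code."""
    return table[np.asarray(codes)]

def cross_layer(x0, xl, w, b):
    """One cross layer, applied row-wise: x0 * (xl . w) + b + xl."""
    return x0 * (xl @ w)[:, None] + b + xl

# Hypothetical toy records: one categorical field, one numeric field with a gap.
country_codes = [0, 2, 1]        # indices into an assumed vocabulary of 3 values
stay_days = [3.0, np.nan, 10.0]  # NaN marks a missing value

table = rng.normal(size=(3, 4))  # 3 categories, embedding dimension 4 (assumed)
filled, missing = encode_numeric(stay_days)

# Concatenate embedding, filled numeric value and missing flag per record.
x0 = np.concatenate(
    [embed_categorical(country_codes, table), filled[:, None], missing[:, None]],
    axis=1,
)

w = rng.normal(size=x0.shape[1])  # cross-layer weights (random here, learned in practice)
b = np.zeros(x0.shape[1])
x1 = cross_layer(x0, x0, w, b)    # explicit feature crossing of the encoded record
```

In a GAN setting, an encoding like `x0` would feed the critic, while the generator would need the inverse mapping (for example, picking the nearest category and re-inserting NaN where the flag is set); that part is omitted here.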

