LG AI Feb 28, 2024

Autoencoder-based General Purpose Representation Learning for Customer Embedding

Jan Henrik Bertrand, David B. Hoffmann, Jacopo Pio Gargano, Laurent Mombaerts, Jonathan Taws

arXiv:2402.18164v24.63 citations h-index: 3

Originality Incremental advance

AI Analysis

This work addresses the problem of learning general-purpose embeddings for diverse entities in tabular data. The advance is incremental because it builds on existing autoencoder methods.

The paper tackles the challenge of representing complex tabular entities in a latent space by introducing DEEPCAE, a novel method for calculating the regularization term of multi-layer contractive autoencoders (CAEs). Compared with a stacked CAE across 13 datasets, DEEPCAE achieves a 34% improvement in reconstruction error.

Recent advances in representation learning have successfully leveraged the underlying domain-specific structure of data across various fields. However, representing diverse and complex entities stored in tabular format within a latent space remains challenging. In this paper, we introduce DEEPCAE, a novel method for calculating the regularization term for multi-layer contractive autoencoders (CAEs). Additionally, we formalize a general-purpose entity embedding framework and use it to empirically show that DEEPCAE outperforms all other tested autoencoder variants in both reconstruction performance and downstream prediction performance. Notably, when compared to a stacked CAE across 13 datasets, DEEPCAE achieves a 34% improvement in reconstruction error.
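The abstract does not spell out how the regularization term is computed, so the following is background only: a minimal sketch of the standard contractive penalty, the squared Frobenius norm of the encoder's Jacobian with respect to its input. For a multi-layer sigmoid encoder it computes the penalty in two ways, once from the Jacobian of the whole encoder (via the chain rule) and once as a sum of per-layer penalties, as a layer-wise stacked setup would. Which of these, if either, matches DEEPCAE is not stated here; the function names, the sigmoid activation, and the toy weights are all assumptions made for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense_sigmoid(W, b, x):
    """One hypothetical encoder layer: h = sigmoid(W x + b). W is a list of rows."""
    return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + bj)
            for row, bj in zip(W, b)]


def layer_jacobian(W, h):
    """Jacobian dh/dx of a sigmoid layer: diag(h * (1 - h)) @ W."""
    return [[hj * (1.0 - hj) * w for w in row] for row, hj in zip(W, h)]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def frob_sq(J):
    """Squared Frobenius norm: sum of squared entries."""
    return sum(v * v for row in J for v in row)


def contractive_penalties(layers, x):
    """Return (whole_encoder_penalty, layerwise_sum_penalty) at input x.

    layers is a list of (W, b) pairs applied in order.
    """
    full_J = None
    layerwise = 0.0
    h = x
    for W, b in layers:
        h_next = dense_sigmoid(W, b, h)
        J = layer_jacobian(W, h_next)      # Jacobian w.r.t. this layer's input
        layerwise += frob_sq(J)            # per-layer penalty, summed
        full_J = J if full_J is None else matmul(J, full_J)  # chain rule
        h = h_next
    return frob_sq(full_J), layerwise


# Toy two-layer encoder with identity weights and zero biases (assumed values).
identity = [[1.0, 0.0], [0.0, 1.0]]
zero_bias = [0.0, 0.0]
full, layerwise = contractive_penalties(
    [(identity, zero_bias), (identity, zero_bias)], [0.0, 0.0]
)
print(full, layerwise)
```

With a single layer the two quantities coincide. With more layers they generally differ, because the whole-encoder Jacobian is a product of the per-layer Jacobians rather than a sum of their norms.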
