LG, AI | Aug 2, 2023

EmbeddingTree: Hierarchical Exploration of Entity Features in Embedding

Yan Zheng, Junpeng Wang, Chin-Chia Michael Yeh, Yujie Fan, Huiyuan Chen, Liang Wang, Wei Zhang

arXiv:2308.01329v1 | 6.64 citations | h-index: 26

Originality: Incremental advance

AI Analysis

This work addresses the lack of structural interpretation in embedding learning, a problem for researchers and practitioners who need to understand and manipulate embeddings. The advance is incremental because it builds on existing embedding methods.

The paper tackles the problem of interpreting how features are encoded in learned embedding spaces by proposing EmbeddingTree, a hierarchical exploration algorithm that relates entity semantics to embedding vectors, and demonstrates its efficacy on industry-scale merchant data and a public music dataset.

Embedding learning transforms discrete data entities into continuous numerical representations, encoding features/properties of the entities. Despite the outstanding performance reported for different embedding learning algorithms, few efforts have been devoted to structurally interpreting how features are encoded in the learned embedding space. This work proposes EmbeddingTree, a hierarchical embedding exploration algorithm that relates the semantics of entity features with the less-interpretable embedding vectors. An interactive visualization tool is also developed based on EmbeddingTree to explore high-dimensional embeddings. The tool helps users discover nuanced features of data entities, perform feature denoising/injecting in embedding training, and generate embeddings for unseen entities. We demonstrate the efficacy of EmbeddingTree and our visualization tool through embeddings generated for industry-scale merchant data and the public 30Music listening/playlists dataset.
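
To make the idea of hierarchically relating entity features to embedding vectors concrete, here is a minimal sketch. It is not the paper's algorithm: the abstract does not state the split criterion, so this example assumes a simple one (pick the categorical feature whose grouping most reduces the within-group squared distance to the centroid). The names `sse`, `split_by`, and `build_tree`, and the toy music-like data, are all hypothetical.

```python
from collections import defaultdict

def sse(vectors):
    """Sum of squared distances from each vector to the group centroid."""
    if not vectors:
        return 0.0
    dim = len(vectors[0])
    centroid = [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]
    return sum((v[d] - centroid[d]) ** 2 for v in vectors for d in range(dim))

def split_by(entities, feature):
    """Group (features, vector) pairs by the value of one feature."""
    groups = defaultdict(list)
    for feats, vec in entities:
        groups[feats[feature]].append((feats, vec))
    return groups

def build_tree(entities, features, max_depth=2):
    """Recursively split on the feature that best explains embedding spread.

    Assumed criterion (not taken from the paper): minimize the total
    within-group sum of squared errors after the split.
    """
    node = {"size": len(entities), "split": None, "children": {}}
    if max_depth == 0 or not features or len(entities) < 2:
        return node
    best_feature, best_sse = None, sse([vec for _, vec in entities])
    for f in features:
        groups = split_by(entities, f)
        total = sum(sse([vec for _, vec in g]) for g in groups.values())
        if total < best_sse - 1e-12:  # require a real improvement
            best_feature, best_sse = f, total
    if best_feature is None:
        return node
    node["split"] = best_feature
    remaining = [f for f in features if f != best_feature]
    for value, group in split_by(entities, best_feature).items():
        node["children"][value] = build_tree(group, remaining, max_depth - 1)
    return node

# Hypothetical toy entities: (feature dict, 2-D embedding vector).
toy = [
    ({"genre": "rock", "decade": "90s"}, [1.0, 0.0]),
    ({"genre": "rock", "decade": "00s"}, [1.1, 0.1]),
    ({"genre": "jazz", "decade": "90s"}, [-1.0, 0.0]),
    ({"genre": "jazz", "decade": "00s"}, [-1.1, 0.1]),
]
tree = build_tree(toy, ["genre", "decade"])
print(tree["split"], sorted(tree["children"]))
```

On this toy data the root splits on `genre`, because the two genres sit far apart in the embedding space while the decades overlap. Reading the tree top-down then suggests which features the embedding encodes most strongly, under the assumed criterion.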
