AdaTag: Multi-Attribute Value Extraction from Product Profiles with Adaptive Decoding
This work addresses the challenge of efficiently extracting multiple attribute values from product data for e-Commerce platforms, representing an incremental advancement in sequence labeling architectures.
The paper tackles multi-attribute value extraction from product profiles in e-Commerce by proposing AdaTag, which uses adaptive decoding to balance knowledge sharing against attribute specificity. The decoder is parameterized by pretrained attribute embeddings through a hypernetwork and a Mixture-of-Experts module, and the authors report marked improvements over previous methods on a real-world dataset.
Automatic extraction of product attribute values is an important enabling technology in e-Commerce platforms. This task is usually modeled using sequence labeling architectures, with several extensions to handle multi-attribute extraction. One line of previous work constructs attribute-specific models, through separate decoders or entirely separate models. However, this approach constrains knowledge sharing across different attributes. Other contributions use a single multi-attribute model, with different techniques to embed attribute information. Yet sharing all of the network's parameters across all attributes can limit the model's capacity to capture attribute-specific characteristics. In this paper we present AdaTag, which uses adaptive decoding to handle extraction. We parameterize the decoder with pretrained attribute embeddings, through a hypernetwork and a Mixture-of-Experts (MoE) module. This allows for separate, but semantically correlated, decoders to be generated on the fly for different attributes. This approach facilitates knowledge sharing, while maintaining the specificity of each attribute. Our experiments on a real-world e-Commerce dataset show marked improvements over previous methods.
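The idea of generating a decoder on the fly from an attribute embedding can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the dimensions, the random stand-in weights, the linear per-token tag scorer, and the function names (`generate_decoder`, `moe_scores`, `tag_scores`) are all assumptions made for illustration, and the way the hypernetwork and MoE outputs are combined here is only one plausible arrangement.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions, not values from the paper).
HIDDEN = 8       # token representation size from a shared encoder
EMB = 4          # pretrained attribute embedding size
NUM_TAGS = 3     # e.g. B, I, O
NUM_EXPERTS = 2

# Shared parameters, randomly initialised here as stand-ins for trained ones.
hyper_W = rng.normal(scale=0.1, size=(EMB, HIDDEN * NUM_TAGS))
hyper_b = rng.normal(scale=0.1, size=(EMB, NUM_TAGS))
gate_W = rng.normal(size=(EMB, NUM_EXPERTS))
expert_W = rng.normal(scale=0.1, size=(NUM_EXPERTS, HIDDEN, NUM_TAGS))


def softmax(x):
    """Numerically stable softmax over the last axis."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)


def generate_decoder(attr_emb):
    """Hypernetwork: map an attribute embedding to decoder weights."""
    W = (attr_emb @ hyper_W).reshape(HIDDEN, NUM_TAGS)
    b = attr_emb @ hyper_b
    return W, b


def moe_scores(tokens, attr_emb):
    """Mixture-of-Experts: gate shared experts by the attribute embedding."""
    gate = softmax(attr_emb @ gate_W)                        # (NUM_EXPERTS,)
    per_expert = np.einsum("th,ehk->etk", tokens, expert_W)  # (E, T, K)
    return np.tensordot(gate, per_expert, axes=1)            # (T, K)


def tag_scores(tokens, attr_emb):
    """Per-token tag scores from an attribute-specific, generated decoder."""
    W, b = generate_decoder(attr_emb)
    return tokens @ W + b + moe_scores(tokens, attr_emb)
```

All attributes share `hyper_W`, `hyper_b`, `gate_W`, and `expert_W`, which is where knowledge sharing would come from in this sketch, while each attribute embedding yields its own decoder weights and expert mixture. Attributes with nearby embeddings therefore get similar, but not identical, decoders.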