Damian Konrad Kowalczyk

LGNov 16, 2022

CASPR: Customer Activity Sequence-based Prediction and Representation

Pin-Jung Chen, Sahil Bhatnagar, Sagar Goyal et al.

Tasks critical to enterprise profitability, such as customer churn prediction, fraudulent account detection or customer lifetime value estimation, are often tackled by models trained on features engineered from customer data in tabular format. Application-specific feature engineering adds development, operationalization and maintenance costs over time. Recent advances in representation learning present an opportunity to simplify and generalize feature engineering across applications. When applying these advancements to tabular data researchers deal with data heterogeneity, variations in customer engagement history or the sheer volume of enterprise datasets. In this paper, we propose a novel approach to encode tabular data containing customer transactions, purchase history and other interactions into a generic representation of a customer's association with the business. We then evaluate these embeddings as features to train multiple models spanning a variety of applications. CASPR, Customer Activity Sequence-based Prediction and Representation, applies Transformer architecture to encode activity sequences to improve model performance and avoid bespoke feature engineering across applications. Our experiments at scale validate CASPR for both small and large enterprise applications.

CVApr 26, 2020

On the Limits to Multi-Modal Popularity Prediction on Instagram -- A New Robust, Efficient and Explainable Baseline

Christoffer Riis, Damian Konrad Kowalczyk, Lars Kai Hansen

Our global population contributes visual content on platforms like Instagram, attempting to express themselves and engage their audiences, at an unprecedented and increasing rate. In this paper, we revisit the popularity prediction on Instagram. We present a robust, efficient, and explainable baseline for population-based popularity prediction, achieving strong ranking performance. We employ the latest methods in computer vision to maximize the information extracted from the visual modality. We use transfer learning to extract visual semantics such as concepts, scenes, and objects, allowing a new level of scrutiny in an extensive, explainable ablation study. We inform feature selection towards a robust and scalable model, but also illustrate feature interactions, offering new directions for further inquiry in computational social science. Our strongest models inform a lower limit to population-based predictability of popularity on Instagram. The models are immediately applicable to social media monitoring and influencer identification.

Damian Konrad Kowalczyk

2 Papers