Machine Learning to Predict Digital Frustration from Clickstream Data
This work addresses user frustration, which costs businesses sales and generates complaints, but its contribution is incremental: it applies standard classifiers and an LSTM to a specific domain.
The research tackled predicting user frustration in e-commerce sessions using clickstream data, achieving up to 91% accuracy and a ROC AUC of 0.9705 with an LSTM model, and showed reliable prediction within the first 20-30 interactions.
Many businesses depend on their mobile apps and websites, so user frustration while completing a task on these channels can cause lost sales and complaints. In this research, I use clickstream data from a real e-commerce site to predict whether a session is frustrated. I define frustration with rules based on rage bursts, back-and-forth navigation (U-turns), cart churn, search struggle, and long wandering sessions, and apply these rules to 5.4 million raw clickstream events (304,881 sessions). From each session, I build tabular features and train standard classifier models. I also train a discriminative LSTM classifier on the full event sequence. XGBoost reaches about 90% accuracy and a ROC AUC of 0.9579, while the LSTM performs best, with about 91% accuracy and a ROC AUC of 0.9705. Finally, the research shows that the LSTM already predicts frustration reliably from only the first 20 to 30 interactions.
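The rule-based labeling described above could be sketched roughly as follows. This is a minimal illustration, not the study's implementation: the `Event` schema, the function names, and every threshold (three clicks in two seconds, two U-turns, fifty events, and so on) are assumptions, since the abstract names the rule families but not their exact definitions.

```python
from dataclasses import dataclass


@dataclass
class Event:
    t: float      # seconds since session start (assumed schema)
    kind: str     # "click", "view", "search", "add_to_cart", "remove_from_cart", "purchase"
    target: str   # element id, page path, query, or product id


def count_rage_bursts(events, min_clicks=3, window=2.0):
    """Count bursts of >= min_clicks clicks on one target within `window` seconds."""
    clicks = [e for e in events if e.kind == "click"]
    bursts, i = 0, 0
    while i < len(clicks):
        j = i
        while (j + 1 < len(clicks)
               and clicks[j + 1].target == clicks[i].target
               and clicks[j + 1].t - clicks[i].t <= window):
            j += 1
        if j - i + 1 >= min_clicks:
            bursts += 1
            i = j + 1   # skip past the burst so it is counted once
        else:
            i += 1
    return bursts


def count_u_turns(events):
    """Count A -> B -> A patterns in the sequence of viewed pages."""
    pages = [e.target for e in events if e.kind == "view"]
    return sum(1 for a, b, c in zip(pages, pages[1:], pages[2:]) if a == c and a != b)


def count_cart_churn(events):
    """Count removals of items that were added earlier in the same session."""
    in_cart, churn = set(), 0
    for e in events:
        if e.kind == "add_to_cart":
            in_cart.add(e.target)
        elif e.kind == "remove_from_cart" and e.target in in_cart:
            in_cart.discard(e.target)
            churn += 1
    return churn


def longest_search_run(events):
    """Length of the longest run of consecutive searches (a proxy for search struggle)."""
    best = run = 0
    for e in events:
        run = run + 1 if e.kind == "search" else 0
        best = max(best, run)
    return best


def session_features(events):
    """Tabular features for one session; also usable on a prefix such as events[:20]."""
    return {
        "rage_bursts": count_rage_bursts(events),
        "u_turns": count_u_turns(events),
        "cart_churn": count_cart_churn(events),
        "search_run": longest_search_run(events),
        "n_events": len(events),
        "purchased": any(e.kind == "purchase" for e in events),
    }


def is_frustrated(events, wander_len=50):
    """Label a session as frustrated if any (assumed) rule threshold is met."""
    f = session_features(events)
    return (f["rage_bursts"] >= 1
            or f["u_turns"] >= 2
            or f["cart_churn"] >= 2
            or f["search_run"] >= 3
            or (f["n_events"] >= wander_len and not f["purchased"]))
```

Under these assumptions, `is_frustrated(session)` would give the label from the full session, while `session_features(session[:20])` would give features from only the first 20 interactions, which mirrors the early-prediction setting: the label comes from the whole session, the model input from a prefix.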