Fast Online "Next Best Offers" using Deep Learning
This work addresses the need for fast, real-time recommendations in online settings. Its contribution is incremental: targeted efficiency optimizations to the real-time streaming stack and to the machine-learning implementations.
The paper tackles low-latency online recommendation of 'next-best-offers' by presenting iPrescribe, a scalable architecture that uses an ensemble of deep learning and machine learning algorithms and achieves a 90th-percentile recommendation latency of 38 milliseconds.
In this paper, we present iPrescribe, a scalable, low-latency architecture for recommending 'next-best-offers' in an online setting. We describe the design of iPrescribe and compare the performance of implementations built on different real-time streaming technology stacks. iPrescribe uses an ensemble of deep learning and machine learning algorithms for prediction. We describe the scalable real-time streaming technology stack and the optimized machine-learning implementations that together achieve a 90th-percentile recommendation latency of 38 milliseconds. The optimizations include a novel mechanism for efficiently deploying recurrent Long Short-Term Memory (LSTM) deep learning networks.
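To make the headline metric concrete, the following is a minimal sketch of how a 90th-percentile latency could be computed from per-request timings. It is not the authors' benchmarking code: the helper names `percentile` and `measure_latencies_ms`, the nearest-rank percentile definition, and the `recommend` callable are all assumptions introduced for illustration.

```python
import math
import time


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must be non-empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    # Multiply before dividing to limit floating-point rounding error.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def measure_latencies_ms(recommend, requests):
    """Time each call to `recommend` (a hypothetical stand-in for the
    recommendation service) and return the latencies in milliseconds."""
    latencies = []
    for request in requests:
        start = time.perf_counter()
        recommend(request)
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies
```

Under this definition, a 90th-percentile latency of 38 milliseconds means that at least 90% of the measured requests completed in 38 milliseconds or less. Percentiles are commonly preferred over the mean for latency reporting because they are less sensitive to a few very slow requests.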