Analytics at Wharton Research

‘Children of the HMM’: Modeling Longitudinal Customer Behavior at Hulu.com

Stand-alone marketing models are well suited to capturing distinct behavioral features such as variation in transaction frequency (customer heterogeneity with latent classes), recency and attrition (“buy ’til you die” models), and more general changes in customer transaction rates (hidden Markov models, HMMs). We unite these modeling approaches in an integrative framework as special cases, or “children,” of the HMM. We then selectively constrain the general model to assess the impact of each component on model performance. Instead of selecting latent-state models primarily using likelihood-based criteria, we favor a multi-faceted empirical evaluation using summaries of posterior predictive distributions, thereby focusing model checking on managerially relevant features of the data, such as reach, frequency, and “streakiness.”
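
The nesting idea can be illustrated with a minimal sketch. It assumes a two-state HMM for daily viewing incidence and one plausible reading of the constraints: an identity transition matrix gives a latent-class model (no switching), an absorbing zero-activity state gives a “buy ’til you die” model, and an unconstrained matrix gives the general HMM. All parameter values, function names, and the definitions of reach, frequency, and streakiness (here, the longest run of consecutive viewing days) are illustrative assumptions, not the specification or estimates used in the study.

```python
import random

def simulate_hmm(P, p_view, init, T, rng):
    """Simulate T days of 0/1 viewing incidence from a discrete-time HMM.

    P      : transition matrix between latent states (rows sum to 1)
    p_view : daily viewing probability in each latent state
    init   : initial distribution over latent states
    """
    state = rng.choices(range(len(init)), weights=init)[0]
    days = []
    for _ in range(T):
        days.append(1 if rng.random() < p_view[state] else 0)
        state = rng.choices(range(len(P)), weights=P[state])[0]
    return days

def reach(panel):
    """Share of customers with at least one viewing day (assumed definition)."""
    return sum(1 for days in panel if any(days)) / len(panel)

def frequency(panel):
    """Mean viewing days among customers who viewed at least once (assumed definition)."""
    active = [sum(days) for days in panel if any(days)]
    return sum(active) / len(active) if active else 0.0

def longest_streak(days):
    """Longest run of consecutive viewing days; a simple stand-in for streakiness."""
    best = run = 0
    for d in days:
        run = run + 1 if d else 0
        best = max(best, run)
    return best

# Hypothetical "children" of a two-state HMM, expressed as constraints on P.
MODELS = {
    # No switching: customers stay in their initial class forever.
    "latent class": {"P": [[1.0, 0.0], [0.0, 1.0]], "p_view": [0.6, 0.1]},
    # State 1 is absorbing with zero activity: attrition without return.
    "buy 'til you die": {"P": [[0.95, 0.05], [0.0, 1.0]], "p_view": [0.6, 0.0]},
    # Unconstrained: back-and-forth movement between states.
    "general HMM": {"P": [[0.9, 0.1], [0.2, 0.8]], "p_view": [0.6, 0.1]},
}

if __name__ == "__main__":
    rng = random.Random(0)
    for name, m in MODELS.items():
        # One simulated panel per model: 500 customers observed for 60 days.
        panel = [simulate_hmm(m["P"], m["p_view"], [0.5, 0.5], 60, rng)
                 for _ in range(500)]
        mean_streak = sum(longest_streak(d) for d in panel) / len(panel)
        print(f"{name}: reach={reach(panel):.2f}, "
              f"frequency={frequency(panel):.1f}, mean longest streak={mean_streak:.1f}")
```

In a posterior predictive check, panels like these would be simulated from posterior draws of the parameters rather than from fixed values, and each summary would be compared with the same summary computed on the observed data.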

We apply our methods to daily viewing incidence data from Hulu.com, a leading U.S. online streaming video provider. We find that increasing model complexity can improve some aspects of model performance (as expected) but worsen others in non-obvious ways. For instance, only models allowing back-and-forth movements among latent states can capture streakiness (a pattern of growing importance given the increasing availability of digital media data); as a trade-off, however, these models perform worse than their simpler counterparts both in forecasting and in capturing other audience measurement criteria. Finally, using machine-learning classification techniques, we group customers based on similar model fit and features of their past consumption patterns. This allows researchers and managers to predict the “winning model” before fitting the models for all customers. We discuss the generality of the methods and findings for different mixes of patterns of customer behavior.

Keywords: Hidden Markov Models, Non-Stationarity, Customer Migration, Streakiness, Posterior Predictive Model Checking, Nested Models, Hierarchical Bayes, Machine Learning, Classification Trees