When More Data Isn’t Better: Rethinking Granularity in Marketing Analytics

Choosing the right level of detail when analyzing data—whether by time, geography, or customer segment—is a daily challenge for marketing leaders. New research by Mingyung Kim (a former Wharton PhD student, now an Assistant Professor at Ohio State) and her dissertation co-advisors, Eric Bradlow (Wharton Marketing Professor and Vice Dean of AI & Analytics at Wharton) and Raghu Iyengar (Wharton Marketing Professor), introduces a practical framework for making smarter choices about data aggregation and parameter granularity, with significant implications for forecasting, pricing, and segmentation.

Key Takeaways for Marketing Professionals

  • Granularity is a Choice That Matters
    Managers must decide how detailed their data should be (e.g., weekly vs. monthly sales, city vs. state-level analysis), and how finely they should model customer differences (e.g., each store vs. store clusters). These choices affect model performance and decision outcomes.

  • Most Granular Isn’t Always Best
    Sometimes the most detailed data actually makes predictions worse. Other times, data that is too aggregated hides meaningful differences. The research shows the best results come from choosing the right level of detail—not simply the most detail.

  • Bayesian Dual Clustering (BDC)
    A new approach that helps analysts find the most useful level of detail in both their data (how finely it is aggregated) and their models (how finely parameters vary). It automatically identifies when information should be pooled together and when it should be kept separate.

  • Try It Yourself
    Even without BDC, which requires significant computation, anyone can test multiple data structures and choose the one that performs best in-sample and, especially, out-of-sample, as sketched in the code after this list.
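
For readers who want to experiment, here is a minimal sketch of that workflow in Python. It is not the authors' code: the data is simulated, the model is a deliberately simple lag regression, and "weekly" versus "monthly" stands in for whatever aggregation choice you face. The point is only that granularity itself can be treated as a tunable choice and scored on held-out data.

```python
# Minimal sketch: treat data granularity as a choice and score it out-of-sample.
# All data is simulated; the model is a deliberately simple lag-1 regression.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Two years of daily "sales" with a trend, weekly seasonality, and noise
days = pd.date_range("2022-01-01", periods=730, freq="D")
t = np.arange(730)
daily = pd.Series(100 + 0.05 * t + 10 * np.sin(2 * np.pi * t / 7)
                  + rng.normal(0, 8, 730), index=days)


def holdout_mape(series: pd.Series, n_test: int) -> float:
    """Fit y[t] = a + b * y[t-1] by OLS on a training split and return
    the mean absolute percentage error on the final n_test points."""
    y = series.to_numpy()
    x, target = y[:-1], y[1:]          # lagged value predicts next period
    split = len(target) - n_test
    b, a = np.polyfit(x[:split], target[:split], 1)
    pred = a + b * x[split:]
    return float(np.mean(np.abs(pred - target[split:]) / target[split:]))


# Candidate granularities: same raw data, different levels of aggregation
candidates = {
    "weekly": daily.resample("W").sum(),
    "monthly": daily.resample("MS").sum(),
}

for name, series in candidates.items():
    n_test = max(3, len(series) // 4)  # hold out roughly the final quarter
    print(f"{name:8s} holdout MAPE: {holdout_mape(series, n_test):.3f}")
```

In a real comparison you would score every candidate at the same forecast horizon, over the same holdout window, and with a model you actually trust; the scaffold stays the same.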

Real-World Application: Juice Sales, Granularity, and Price Elasticity

Short on time? Here’s the takeaway:

When the researchers applied their method to a large dataset of juice sales, they found that being thoughtful about how products are grouped, rather than automatically using the most detailed data, led to more accurate forecasts and more realistic measures of price sensitivity. For instance, brand mattered more than package size, a finding that can help businesses set smarter prices and avoid overfitting noisy data.

Want to dive deeper? Read on for the full analysis:

To demonstrate the power of the BDC method, the team applied it to a large Nielsen dataset of SKU-level juice sales across multiple stores and time periods. They found that BDC gave better predictions than methods that rely on either extremely detailed data or overly broad categories. It also uncovered more meaningful product groupings.

Notably, the model revealed that price elasticity varied more by brand than by package size, challenging common practices in retail pricing analytics. Traditional models that fix the analysis at the most granular level missed this distinction and risked misinforming pricing strategies. BDC’s more deliberate structure helped avoid these pitfalls, yielding clearer insights for demand planning and marketing ROI.
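
BDC itself is a Bayesian model and computationally heavy, but the question it answers—which grouping of SKUs best explains price response—can be illustrated with a toy example. The sketch below is hypothetical and is neither the authors' implementation nor the Nielsen data: it simulates SKUs whose true elasticity differs by brand, then fits one log-log elasticity per group under two candidate groupings and compares holdout error.

```python
# Toy illustration: does grouping SKUs by brand or by package size better
# explain price response? Simulated data; true elasticity varies by brand.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
true_beta = {"A": -2.5, "B": -1.2, "C": -0.6}  # hypothetical brand elasticities

rows = []
for brand, beta in true_beta.items():
    for size in ["32oz", "64oz"]:
        price = rng.uniform(1.5, 4.0, 120)  # 120 weekly prices per SKU
        log_q = 5.0 + beta * np.log(price) + rng.normal(0, 0.3, 120)
        rows.append(pd.DataFrame({"brand": brand, "size": size,
                                  "log_p": np.log(price), "log_q": log_q}))
df = pd.concat(rows, ignore_index=True)


def grouped_rmse(df: pd.DataFrame, key: str, test_frac: float = 0.25) -> float:
    """Fit one elasticity per group (OLS of log quantity on log price,
    training split only) and return holdout RMSE of log-quantity predictions."""
    sq_errors = []
    for _, g in df.groupby(key):
        n_test = int(len(g) * test_frac)
        train, test = g.iloc[:-n_test], g.iloc[-n_test:]
        beta, alpha = np.polyfit(train["log_p"], train["log_q"], 1)
        resid = test["log_q"] - (alpha + beta * test["log_p"])
        sq_errors.append(resid.to_numpy() ** 2)
    return float(np.sqrt(np.concatenate(sq_errors).mean()))


for key in ["brand", "size"]:
    print(f"grouping by {key:5s}: holdout RMSE = {grouped_rmse(df, key):.3f}")
# Because elasticity truly differs by brand here, pooling by brand fits
# held-out data better than pooling by size.
```

BDC searches over such groupings, and over the data aggregation itself, automatically; the hand comparison above simply shows why the choice is worth testing rather than defaulting to the most granular split.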

About Wharton AI & Analytics Insights

Wharton AI & Analytics Insights is a thought leadership series from the Wharton AI & Analytics Initiative. Featuring short-form videos and curated digital content, the series highlights cutting-edge faculty research and real-world business applications in artificial intelligence and analytics. Designed for corporate partners, alumni, and industry professionals, the series brings Wharton expertise to the forefront of today’s most dynamic technologies.