Identifying Key Customers and Industries through Customer Lifetime Value (CLV) Modeling and Regression Analysis

About the Datathon

Twenty self-formed teams, made up of approximately 80 Penn/Wharton students, were given the same core set of contract-level customer data from Ryder System, Inc., and asked to model the customers and industry segments most valuable to the business. Ryder sought a forward-looking enterprise value model that would allow the company to tie customer-centric programs more directly to future customer value.

The teams had one week to develop their own goals and methodologies, identify external data sources, and create an enterprise value model coded in R. Drawing on their diverse analytics skills, the teams were tasked with developing a creative solution for Ryder to implement.


Ryder needed a predictive model to estimate the enterprise value of each of its customers, something of a “modified CLTV” model that would incorporate internal customer data, contract data, and external measures like industry growth rates to paint a holistic picture of each customer’s future potential.

Specifically, Ryder wanted the analysis to help inform resource allocation for various sales lead generation, retention/loyalty, and upsell/cross-sell marketing campaigns. Ryder’s goal was to explicitly tie retention spending to the expected future value of a customer.


The student teams presented their findings and solutions to Ryder executives at a Datathon symposium. The teams' models ranged from standard regression analysis to a beta-geometric/beta-binomial model for capturing customer lifetime value (CLV) to random forest models. In the end, Ryder chose the team that used a CLV Buy-Till-You-Die modeling framework.

By the Numbers


Approx. 80 Penn/Wharton students competed in the Ryder Datathon.


The winning team identified that 13% of customers contributed 75% of revenue.

The Winning Team


The winning team used a CLV model to identify the customers with the highest projected future revenue and focused on those within targeted industries that contribute more revenue.


The team reviewed customer transactions across all three business segments: Fleet Management Solutions, Supply Chain Solutions, and Dedicated Transportation Solutions. They interpreted each contract change as a customer transaction and used a combination of regression analysis and customer lifetime value (CLV) modeling to predict future contract purchases.
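To make the "contract change as transaction" framing concrete, the sketch below turns a log of contract changes into the per-customer frequency, recency, and observation-length summary that Buy-Till-You-Die models typically take as input. It is only an illustration under stated assumptions: the team's actual model was coded in R and is not reproduced here, and the function name, the (customer, date) record layout, and the toy dates are all hypothetical.

```python
from collections import defaultdict
from datetime import date


def summarize_transactions(events, period_end):
    """Summarize (customer_id, change_date) pairs per customer.

    Hypothetical helper: each contract change counts as one transaction.
    Returns {customer_id: {"frequency", "recency", "T"}} with durations in days.
    """
    dates_by_customer = defaultdict(set)
    for customer_id, change_date in events:
        if change_date <= period_end:  # ignore anything after the observation window
            dates_by_customer[customer_id].add(change_date)

    summary = {}
    for customer_id, dates in dates_by_customer.items():
        first, last = min(dates), max(dates)
        summary[customer_id] = {
            "frequency": len(dates) - 1,        # repeat transactions after the first
            "recency": (last - first).days,     # time from first to last transaction
            "T": (period_end - first).days,     # time from first transaction to window end
        }
    return summary


# Toy data, invented for illustration; not Ryder data.
events = [
    ("A", date(2015, 1, 1)),
    ("A", date(2015, 7, 1)),
    ("A", date(2016, 1, 1)),
    ("B", date(2015, 3, 1)),
]
summary = summarize_transactions(events, period_end=date(2016, 12, 31))
print(summary["A"])
```

A customer with several contract changes spread over time (customer A) ends up with a higher frequency and a longer recency than a customer with a single contract and no later changes (customer B), which is the signal a CLV model uses to project future purchases.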


By combining a CLV model with regression analysis, the team identified that 13% of customers contributed 75% of revenue. The team also found that customers in the Fleet Management Solutions business unit and in the trucking industry have the highest CLV.
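A concentration figure of this kind can be computed by ranking customers by revenue and finding the smallest group that reaches a target share of the total. The sketch below shows only that arithmetic; it is not the team's method, the function name is hypothetical, and the revenue figures are invented toy values rather than Ryder data.

```python
def customer_share_for_revenue(revenues, target_share=0.75):
    """Return the smallest fraction of customers whose combined revenue
    reaches target_share of total revenue (hypothetical helper)."""
    ordered = sorted(revenues, reverse=True)  # largest customers first
    total = sum(ordered)
    if total <= 0:
        raise ValueError("total revenue must be positive")

    running = 0.0
    for count, revenue in enumerate(ordered, start=1):
        running += revenue
        if running >= target_share * total:
            return count / len(ordered)
    return 1.0


# Toy revenues for ten customers, invented for illustration.
toy_revenues = [500, 250, 50, 40, 40, 30, 30, 20, 20, 20]
share = customer_share_for_revenue(toy_revenues)
print(f"{share:.0%} of customers account for 75% of revenue")
```

The same calculation works on projected future revenue instead of historical revenue, which is closer to the forward-looking view Ryder asked for.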