Forecasting Drug Utilization and Enabling Cost-Effective Substitution: A Predictive Analytics Collaboration with RxSense

Objective
The objective of the project was to develop a forecasting model to predict drug utilization trends for 2025, enabling RxSense’s clients—pharmacy benefit managers (PBMs) and health plans—to more effectively budget for and manage pharmacy costs.
The goal was not only to build accurate and scalable predictive tools, but also to generate actionable insights that support cost containment and resource planning. Our team was given multi-year anonymized claims data spanning more than 95 million records across three healthcare organizations of various sizes.
About RxSense
RxSense, founded in 2015 by Rick Bates (WG’96), is a healthcare technology company that provides modern, cloud-based platforms to improve the management of, and access to, cost-effective pharmacy benefits.
Approach
To achieve the objective, we broke the project into three key components: (1) base model utilization forecast, (2) external data integration, and (3) cost savings and drug substitution. This modular structure allowed us to explore core business questions from multiple angles while testing scalable analytical methods.
- Base Model Utilization Forecast
We evaluated seven statistical and machine learning models to determine which best captured utilization trends across organizations and drug classes. We developed monthly predictions and validated performance against 2024 actuals. Key questions covered model accuracy (MAPE, RMSE), interpretability, and how to build a generalized forecasting framework.
- External Data Integration
We investigated which external signals could be integrated, and how, to enhance model performance and forecast sensitivity. This included exploratory integration of Google Trends data, informed by academic research, as well as team-driven ideation. We also piloted two frameworks: a drug relationship database inspired by knowledge graph research, and the Drug Uptake Propensity Score (DUPS) model, designed to forecast utilization for newly approved drugs without historical claims data.
- Cost Savings & Substitution
We developed a dual approach to substitution identification: a rule-based method using GPI-4 classification and a graph-based pipeline to assess clinical similarity. Outputs were verified using LLM prompts to generate substitution tables and assign suitability flags based on predefined thresholds.
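The rule-based half of this approach can be illustrated with a minimal sketch. It assumes that GPI-4 means the first four characters of the Generic Product Identifier (commonly described as drug group plus drug class) and that a candidate substitute is any drug in the same GPI-4 bucket with a lower unit cost. The `Drug` record, the GPI values, the costs, and the function names are all hypothetical and are not taken from the project's code or data.

```python
from __future__ import annotations

from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Drug:
    name: str         # hypothetical drug label
    gpi: str          # made-up 14-character GPI code, for illustration only
    unit_cost: float  # hypothetical cost per unit


def gpi4(drug: Drug) -> str:
    """Return the first four GPI characters (assumed: drug group + drug class)."""
    return drug.gpi[:4]


def cheaper_alternatives(drugs: list[Drug]) -> dict[str, list[str]]:
    """For each drug, list same-GPI-4 drugs with a strictly lower unit cost, cheapest first."""
    by_class: dict[str, list[Drug]] = defaultdict(list)
    for d in drugs:
        by_class[gpi4(d)].append(d)

    result: dict[str, list[str]] = {}
    for d in drugs:
        candidates = [c for c in by_class[gpi4(d)] if c.unit_cost < d.unit_cost]
        candidates.sort(key=lambda c: c.unit_cost)
        result[d.name] = [c.name for c in candidates]
    return result


# Made-up formulary: three drugs share GPI-4 "1111", one sits in "2222".
formulary = [
    Drug("drug_a", "11110010000310", 4.20),
    Drug("drug_b", "11110060000320", 0.35),
    Drug("drug_c", "11110075000330", 1.10),
    Drug("drug_d", "22220050000320", 0.90),
]
substitutions = cheaper_alternatives(formulary)
```

A rule like this only narrows the candidate set by class and price. It says nothing about clinical suitability, which is why the project paired it with a similarity pipeline and a separate verification step.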
Solution
For utilization forecasting, we built a flexible, generalized pipeline that lets users configure inputs such as organization and drug class, select from multiple models and prediction periods, and generate 2025 predictions with built-in performance metrics (MAPE, RMSE). Based on model testing, we recommended a hybrid approach using statistical models (e.g., NBD) for magnitude estimation and deep learning models (e.g., LSTM, GRU) to capture temporal patterns and seasonality.
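For readers less familiar with the two metrics, the sketch below shows the standard definitions of MAPE and RMSE applied to a short monthly series. The numbers are invented for illustration, and skipping zero-actual months in MAPE is an assumption of this sketch rather than a documented choice of the project.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Months with a zero actual are skipped (an assumption of this sketch),
    because the percentage error is undefined there.
    """
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    if not pairs:
        raise ValueError("MAPE is undefined when every actual value is zero")
    return 100.0 * sum(abs((a - p) / a) for a, p in pairs) / len(pairs)


def rmse(actual, predicted):
    """Root mean squared error, in the same units as the series."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Invented monthly claim counts and forecasts.
actual_claims = [100, 200, 400, 500]
forecast_claims = [110, 180, 400, 450]

mape_pct = mape(actual_claims, forecast_claims)    # 7.5 (percent)
rmse_claims = rmse(actual_claims, forecast_claims)  # sqrt(750), about 27.4 claims
```

MAPE is scale-free, which makes it convenient for comparing organizations of different sizes, while RMSE is expressed in claims and penalizes large misses more heavily. Reporting both gives a fuller picture than either alone.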
For external data integration, we initially explored incorporating public signals like Google Trends into ARIMA models to improve short-term forecast accuracy. However, this integration showed no meaningful improvement in performance. We then shifted focus to developing two exploratory tools backed by extensive research: the DUPS framework for forecasting uptake of newly approved drugs using regulatory and clinical features, and a drug relationship database inspired by graph-based research to support future modeling and substitution efforts.
For drug substitution, we developed a generalized substitution framework that enhances GPI-4 logic with a graph-based similarity engine. Outputs were validated using LLM prompts to flag clinical suitability and generate substitution tables, helping PBMs and health plans identify cost-saving alternatives that align with therapeutic goals.
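One way to picture a graph-based similarity check is as a bipartite graph linking drugs to attribute nodes, with similarity scored by how many neighbors two drugs share. The sketch below uses Jaccard similarity over neighbor sets. The source does not state which similarity measure the project used, so Jaccard, the attribute labels, the `0.5` threshold, and all names here are assumptions for illustration only.

```python
def jaccard(a: set, b: set) -> float:
    """Share of attribute nodes two drugs have in common (0.0 when both sets are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical bipartite graph: each drug maps to the attribute nodes it connects to.
drug_graph = {
    "drug_a": {"class:x", "indication:1", "route:oral"},
    "drug_b": {"class:x", "indication:1", "indication:2", "route:oral"},
    "drug_c": {"class:y", "indication:3", "route:injection"},
}


def similar_drugs(graph, target, threshold=0.5):
    """Return (drug, score) pairs at or above the threshold, most similar first."""
    scored = [
        (other, jaccard(graph[target], neighbors))
        for other, neighbors in graph.items()
        if other != target
    ]
    kept = [(name, score) for name, score in scored if score >= threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)


matches_for_a = similar_drugs(drug_graph, "drug_a")  # [("drug_b", 0.75)]
```

A score like this can rank candidates that a class-based rule would group together or miss entirely, but it remains a screening signal. Clinical suitability still requires the kind of downstream validation described above.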
As a next step, the predictive power of both the base models and the external data frameworks could be significantly enhanced if RxSense incorporated richer member-level demographic information, such as age and gender, which was not available in the current dataset. These features would improve forecast precision, enrich the DUPS framework, and create new opportunities for targeted external signal integration. Additionally, the forecasting and substitution tools could be integrated into RxSense’s production environment with real-time pricing and formulary data, enabling broader client adoption and ongoing refinement based on live feedback.
Impact
The RxSense team was highly complimentary of our progress and outcomes over the eight-week engagement. The project’s focus areas—forecasting utilization trends and identifying cost-effective drug substitutes—directly support high-priority initiatives for RxSense and their clients. RxSense already dedicates internal resources to both forecasting and clinical substitution, and our contributions provided fresh perspectives, expanded frameworks, and new tooling that extend their existing capabilities.
First, we introduced new concepts and approaches, such as a graph-based drug relationship model that can be layered into their substitution framework, and a DUPS model to anticipate usage for newly approved drugs. We also offered model testing insights that can guide refinement of their internal forecasting efforts. Second, we delivered a generalized forecasting pipeline and supporting database that RxSense can now enhance with their proprietary member-level data and tailor for different use cases. Lastly, we created early automation tools, such as a self-contained flowchart to identify lower-cost therapeutic alternatives, which can streamline current workflows.
Overall, the project sparked meaningful discussions, introduced reusable components, and laid groundwork that RxSense’s internal teams can further develop and scale based on their richer data assets and organizational priorities.
Data Visualizations
Sample Prediction Forecast

High-Level Flow Chart of Drug Substitution and Cost Savings Framework

About the AI & Analytics Accelerator
The AI & Analytics Accelerator, part of the Wharton AI & Analytics Initiative, partners with organizations to develop cutting-edge AI and data-driven solutions for real-world challenges. Through collaboration with Wharton faculty, researchers, and students, the Accelerator transforms complex data into actionable insights, driving innovation across industries.