Why Better AI Tutors Aren’t About Better Answers

Executives investing in AI-powered learning often focus on improving the chatbot: better explanations, more accurate answers, smarter prompts. But new research from the University of Pennsylvania and the Wharton School suggests that’s not where the biggest gains come from.
The real opportunity isn’t in how AI responds; it’s in how AI guides.
Learn the latest from researchers Angel Tsai-Hsuan Chung, doctoral candidate in Wharton’s Operations, Information and Decisions (OID) department; Botong Zhang, incoming PhD student in Penn’s Computer and Information Science (CIS) department; Ling-Chieh Kung, associate professor with National Taiwan University’s College of Management; Hamsa Bastani, associate professor in OID; and Osbert Bastani, associate professor in Computer and Information Science at the University of Pennsylvania.
Key Takeaways
- Smarter sequencing beats smarter answers.
Most AI tutors are reactive, waiting for users to ask questions. This research shows that proactively sequencing the right problems at the right time significantly improves learning outcomes, even when the chatbot itself stays the same.
- Personalization drives measurable performance gains.
In a five-month randomized controlled trial of 770 students, adaptive problem sequencing (dynamically adjusting the difficulty and order of tasks based on a learner’s progress) improved final exam performance by 0.15 standard deviations, equivalent to roughly 6–9 months of additional learning.
- Engagement, not content, is the real bottleneck.
The gains weren’t driven by more content or harder problems. They came from increased engagement: students spent more time, persisted longer, and interacted more productively with the AI.
- AI can infer skill more effectively than traditional systems.
Instead of relying on simple signals like correct/incorrect answers, the system analyzed how students worked, including code edits and AI conversations, to estimate skill and adapt in real time.
- The biggest gains come from less-experienced learners.
Personalization had the strongest impact on beginners, suggesting AI-driven learning systems can help close skill gaps rather than widen them.
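For readers who want a concrete picture of adaptive problem sequencing, here is a minimal Python sketch of the general idea: keep a running estimate of a learner’s skill and pick the next problem slightly above it. This is an illustration under stated assumptions, not the researchers’ system. The names `Problem`, `update_skill`, and `next_problem`, the 0-to-1 difficulty scale, and the `rate` and `stretch` values are all hypothetical. The study’s system inferred skill from richer signals such as code edits and AI conversations; the solved/unsolved flag here is a simplified stand-in.

```python
from dataclasses import dataclass


@dataclass
class Problem:
    name: str
    difficulty: float  # hypothetical scale: 0.0 (easiest) to 1.0 (hardest)


def update_skill(skill: float, solved: bool, difficulty: float, rate: float = 0.3) -> float:
    """Nudge the skill estimate toward the evidence from the latest attempt.

    Assumption: solving a problem suggests skill at or above its difficulty,
    and failing suggests skill at or below it.
    """
    target = max(skill, difficulty) if solved else min(skill, difficulty)
    return skill + rate * (target - skill)


def next_problem(skill: float, remaining: list[Problem], stretch: float = 0.1) -> Problem:
    """Pick the remaining problem closest to a target slightly above current skill."""
    target = skill + stretch
    return min(remaining, key=lambda p: abs(p.difficulty - target))


if __name__ == "__main__":
    # Hypothetical problem bank and starting estimate.
    bank = [Problem("loops", 0.2), Problem("functions", 0.5), Problem("recursion", 0.8)]
    skill = 0.4
    chosen = next_problem(skill, bank)
    print("Next problem:", chosen.name)
    skill = update_skill(skill, solved=True, difficulty=chosen.difficulty)
    print("Updated skill estimate:", round(skill, 2))
```

The point of the sketch is the loop itself: estimate, select, observe, update. The tutor chatbot never appears in it, which mirrors the finding that sequencing can improve outcomes even when the chatbot stays the same.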
Real-World Application

Short on time? Here’s the takeaway:
Many organizations are deploying AI for training as a Q&A layer — employees ask questions, and AI responds. This research suggests that approach leaves significant value on the table.
The study tested a different model: AI systems that don’t just answer questions, but actively structure the learning journey. By dynamically adjusting task difficulty based on real-time behavior, the system improved outcomes without increasing instruction time or instructor involvement.
For L&D leaders, the implication is clear:
If your AI strategy focuses only on better content delivery, you’re optimizing the wrong layer.
The bigger opportunity is designing systems that:
- Guide employees through progressively challenging tasks
- Adapt in real time to skill level and learning speed
- Sustain engagement over time, not just provide answers on demand
This content was created with the assistance of generative AI. All AI-generated materials are reviewed and edited by the Wharton AI & Analytics Initiative to ensure accuracy, clarity, and alignment with our standards.
About Wharton AI & Analytics Insights
Wharton AI & Analytics Insights is a thought leadership series from the Wharton AI & Analytics Initiative. Featuring short-form videos and curated digital content, the series highlights cutting-edge faculty research and real-world business applications in artificial intelligence and analytics. Designed for corporate partners, alumni, and industry professionals, the series brings Wharton expertise to the forefront of today’s most dynamic technologies.
