How DoorDash's AI Cut Errors by 90%
DoorDash processes an exabyte of data daily; its three-pillar support system reduced hallucinations by 90% and serious compliance issues by 99%.
When you're processing an exabyte of data daily (like watching Netflix's entire library 3,500 times!), every second counts in customer support.
The enterprise challenge
At DoorDash's scale, even small inefficiencies in support systems can create significant operational challenges. The company's traditional knowledge-base support system faced three limitations that hurt operational efficiency.
First, the discoverability problem. Although the knowledge base was comprehensive, support articles were difficult to locate within the system.
This created unnecessary friction in the support process, leading to longer resolution times and increased dependency on human agents for even basic queries.
Second, the information extraction challenge. The traditional system required support agents and Dashers to manually parse articles to find relevant information.
This time-intensive process reduced the efficiency of both the support team and delivery operations, creating a bottleneck in DoorDash's fast-paced delivery ecosystem.
Third, the language barrier. With a diverse contractor base operating across multiple regions, English-only support content significantly limited the system's effectiveness.
This not only impacted support accessibility but also created inconsistencies in service delivery across different markets.
The scale of operations magnified these challenges. DoorDash processes an exabyte of data daily.
At this scale, DoorDash needed a solution that could provide instant and accurate support, operate across languages, and optimize the allocation of human support resources.
A three-pillar approach for intelligent support
To address these challenges, DoorDash developed an advanced support system built on three core components: a Retrieval-Augmented Generation (RAG) system, an LLM Guardrail system, and a comprehensive Quality Monitor.
(Overview of the three components of the RAG-based support system: RAG system, LLM Guardrail, and LLM Judge)
The brain: RAG system
DoorDash's RAG system operates like a highly efficient digital librarian. When a support query arrives, the system first condenses multi-message conversations into a clear, single-point issue.
This condensed query is then matched against historical cases and knowledge-base articles to find relevant information.
The system integrates this information into a carefully crafted response template, ensuring answers are both contextually appropriate and accurate.
This approach is compelling because it can handle complex support scenarios. Rather than relying on pre-built decision trees, the system dynamically assembles responses based on real-world support cases and verified knowledge base content.
This results in more nuanced, accurate support that adapts to evolving business needs.
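To make the flow concrete, here is a minimal Python sketch of that pipeline. The helper names (condense_conversation, retrieve_articles, llm_complete) are illustrative stand-ins for DoorDash's internal components, not their actual APIs.

```python
from dataclasses import dataclass

@dataclass
class Article:
    title: str
    body: str

RESPONSE_TEMPLATE = """You are a Dasher support assistant.
Issue: {issue}
Relevant knowledge-base articles:
{articles}
Answer the issue using only the articles above."""

def condense_conversation(messages: list[str], llm_complete) -> str:
    """Collapse a multi-message conversation into a single-point issue."""
    prompt = "Summarize this support conversation as one clear issue:\n" + "\n".join(messages)
    return llm_complete(prompt)

def build_prompt(issue: str, articles: list[Article]) -> str:
    formatted = "\n".join(f"- {a.title}: {a.body}" for a in articles)
    return RESPONSE_TEMPLATE.format(issue=issue, articles=formatted)

def answer(messages: list[str], retrieve_articles, llm_complete) -> str:
    issue = condense_conversation(messages, llm_complete)
    # retrieve_articles matches the condensed issue against historical cases and KB content
    articles = retrieve_articles(issue, top_k=3)
    return llm_complete(build_prompt(issue, articles))
```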
The guardian: LLM Guardrail system
The Guardrail system serves as a sophisticated quality control mechanism, using a two-tier approach to ensure response accuracy:
A cost-effective initial check uses semantic similarity comparisons to verify response relevance
A more sophisticated LLM-based evaluator provides a detailed analysis of response accuracy and compliance
This dual-layer approach has proven remarkably effective, reducing hallucinations by 90% and cutting serious compliance issues by 99%.
The system achieves this while maintaining operational efficiency—an important consideration for a business processing thousands of support requests daily.
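A rough sketch of how such a two-tier check could be wired up, assuming hypothetical embed() and llm_judge() helpers and an illustrative similarity threshold:

```python
import numpy as np

SIMILARITY_THRESHOLD = 0.75  # illustrative value, not DoorDash's actual setting

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def guardrail_check(response: str, retrieved_context: str, embed, llm_judge) -> bool:
    # Tier 1: cheap semantic-similarity screen between the response and its sources.
    if cosine(embed(response), embed(retrieved_context)) < SIMILARITY_THRESHOLD:
        return False  # likely ungrounded; block or escalate

    # Tier 2: only responses that pass the cheap check pay for a full LLM
    # evaluation of factual accuracy and policy compliance.
    verdict = llm_judge(
        f"Context:\n{retrieved_context}\n\nResponse:\n{response}\n\n"
        "Is the response fully supported by the context and policy-compliant? Answer yes or no."
    )
    return verdict.strip().lower().startswith("yes")
```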
The judge: quality monitoring system
DoorDash's LLM Judge system provides continuous quality assurance across five critical dimensions:
Retrieval correctness
Response accuracy
Grammar and language precision
Contextual coherence
Request relevance
The monitoring system combines automated evaluation with human expert review, creating a robust feedback loop for continuous improvement.
This hybrid approach ensures the system maintains high-performance standards and evolves with changing support needs.
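A minimal sketch of an LLM-as-judge pass over these five dimensions might look like the following; the dimension names come from the article, while the prompt, scoring scale, and llm_complete helper are assumptions for illustration.

```python
import json

DIMENSIONS = [
    "retrieval correctness",
    "response accuracy",
    "grammar and language precision",
    "contextual coherence",
    "request relevance",
]

JUDGE_PROMPT = """Rate the support response on each dimension from 1 to 5.
Dimensions: {dims}
Query: {query}
Retrieved context: {context}
Response: {response}
Return JSON mapping each dimension to a score."""

def score_response(query: str, context: str, response: str, llm_complete) -> dict[str, int]:
    raw = llm_complete(JUDGE_PROMPT.format(
        dims=", ".join(DIMENSIONS), query=query, context=context, response=response))
    return json.loads(raw)

def needs_human_review(scores: dict[str, int], floor: int = 3) -> bool:
    # A low score on any dimension routes the transcript to human expert review,
    # closing the feedback loop described above.
    return any(v < floor for v in scores.values())
```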
Strategic implementation
DoorDash's implementation strategy focused on four key areas, each crucial for creating a support system that could operate reliably at scale while maintaining high-quality standards.
Knowledge base improvement
The knowledge base serves as the foundation of accurate AI responses. DoorDash's approach went beyond simple content updates, focusing on structural improvements:
The team conducted comprehensive reviews to eliminate misleading terminology and outdated information. They developed a developer-friendly management portal to streamline article updates and expansion.
This systematic approach to knowledge management ensures that the AI system always works with verified, current information—critical for maintaining response accuracy in a dynamic business environment.
Retrieval system optimization
DoorDash implemented a two-pronged approach to improve information retrieval:
First, query contextualization streamlines complex support conversations into precise, actionable prompts while maintaining comprehensive context.
Second, the team optimized their vector store with carefully selected embedding models to enhance retrieval accuracy.
This balanced approach ensures fast, relevant responses while effectively managing computational costs.
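As an illustration of this retrieval path, the sketch below contextualizes a conversation into a standalone query, embeds it, and searches a vector index. The embedding model (a small sentence-transformers model) and the flat FAISS index are placeholder choices, not the components DoorDash selected.

```python
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder embedding model

def build_index(articles: list[str]) -> faiss.Index:
    vectors = model.encode(articles, normalize_embeddings=True)
    index = faiss.IndexFlatIP(vectors.shape[1])  # inner product == cosine on normalized vectors
    index.add(np.asarray(vectors, dtype=np.float32))
    return index

def contextualize(conversation: list[str], llm_complete) -> str:
    # Rewrite the running conversation as one standalone search query.
    return llm_complete(
        "Rewrite this conversation as a single standalone support question:\n"
        + "\n".join(conversation))

def retrieve(query: str, index: faiss.Index, articles: list[str], k: int = 3) -> list[str]:
    vec = model.encode([query], normalize_embeddings=True).astype(np.float32)
    _, ids = index.search(vec, k)
    return [articles[i] for i in ids[0]]
```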
Prompt engineering
The team recognized that the effectiveness of their LLM system heavily depended on how they communicated with it.
Their prompt refinement strategy followed three key principles:
Complex prompts are broken down into manageable components, enabling parallel processing where possible.
The team deliberately avoids negative language constructions, which typically challenge AI models. Instead, they focus on clear, positive action statements with illustrative examples.
They've implemented chain-of-thought prompting, encouraging the model to show its reasoning process—making it easier to identify and correct potential logic errors.
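The sketch below illustrates all three principles together: two small sub-prompts run in parallel, instructions phrased as positive actions with an example, and an explicit chain-of-thought request. The prompt text and the ask_llm callable are illustrative, not DoorDash's actual prompts.

```python
from concurrent.futures import ThreadPoolExecutor

CLASSIFY_PROMPT = """Classify the Dasher's issue into one category: pay, delivery, account.
Example: "My weekly deposit never arrived" -> pay
Issue: {issue}
Think step by step, then give the category on the last line."""

EXTRACT_PROMPT = """List the order IDs and dates mentioned in the message.
Example: "Order #123 on May 2 was wrong" -> #123, May 2
Message: {issue}"""

def analyze_issue(issue: str, ask_llm) -> tuple[str, str]:
    # Two independent sub-prompts run in parallel instead of one monolithic prompt.
    with ThreadPoolExecutor() as pool:
        category = pool.submit(ask_llm, CLASSIFY_PROMPT.format(issue=issue))
        details = pool.submit(ask_llm, EXTRACT_PROMPT.format(issue=issue))
        return category.result(), details.result()
```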
Quality assurance framework
To maintain consistent performance, DoorDash developed an open-source evaluation tool comparable to unit testing in software development. This framework enables:
Rapid prompt refinement and response evaluation through automated testing
Predefined test suites that run on every prompt change, preventing problematic updates from reaching production
Systematic addition of newly identified issues to the test suites, creating a continuously improving quality firewall
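In pytest terms, such a regression suite could look like the sketch below; the support_bot module and its generate_response / passes_guardrails functions are hypothetical entry points for the system under test.

```python
import pytest

# Hypothetical entry points of the support system under test.
from support_bot import generate_response, passes_guardrails

REGRESSION_CASES = [
    # (conversation, phrase the response is expected to contain)
    (["My delivery was marked complete but I never received it"], "refund"),
    (["How do I update the bank account for my payouts?"], "bank"),
]

@pytest.mark.parametrize("conversation,expected_phrase", REGRESSION_CASES)
def test_prompt_regression(conversation, expected_phrase):
    response = generate_response(conversation)
    assert expected_phrase in response.lower()   # behavioural expectation
    assert passes_guardrails(response)           # responses must still clear the guardrail
```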
Impact beyond numbers
The transformation goes beyond the impressive 90% reduction in errors and 99% decrease in compliance issues. DoorDash now:
Automatically handles thousands of support requests daily
Delivers consistent support across multiple languages
Enables support agents to focus on complex problem-solving
Maintains high-quality support even as volume scales
Learn more about it here.