How CarGurus Slashed Service Production Time by 90%
This auto marketplace took a radical approach to decomposing its monolith and turned it into a developer-friendly ecosystem.
Every engineering organization hits an inflection point where adding more senior engineers stops solving the problem.
At CarGurus, that moment came when their monolithic architecture, which had served them well through years of growth, began showing classic signs of strain: increasingly long build and deploy times and an uptick in broken workflows that slowed them down.
Most organizations in this situation rush to break their monolith into microservices, but CarGurus first confronted the root cause.
In a system with hundreds of services and thousands of jobs, nobody could clearly tell who owned what anymore.
This realization led them to build Showroom, a developer portal that would fundamentally reshape how they thought about system evolution at scale.
In this edition, I share interesting details about a journey that challenged traditional assumptions about monolith decomposition by starting with a deceptively simple question: Who owns what?
Their decomposition journey with Showroom
The solution started with a simple service catalog. With hundreds of services sprawling across their ecosystem, getting everything in one place was a wise first step.
The challenging part wasn't just building a catalog but also ensuring it stayed relevant. So they introduced something called RoadTests, a gatekeeper in their CI system.
They kept the registration process for a new service quick and simple, just a few button clicks, because they knew that adding friction would only create resistance.
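To make the idea of a CI gatekeeper concrete, here is a minimal sketch of a check that fails a build when a service is missing from the catalog. The source does not describe how RoadTests works internally, so everything here is an assumption for illustration: the `CatalogEntry` type, the `find_unregistered` and `gate` functions, and the service names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    # Hypothetical catalog record: a service name and the team that owns it.
    service: str
    owner_team: str


def find_unregistered(services_in_repo, catalog):
    """Return services that are absent from the catalog or have no owner."""
    registered = {e.service for e in catalog if e.owner_team}
    return sorted(s for s in services_in_repo if s not in registered)


def gate(services_in_repo, catalog):
    """Return a CI-style exit code: 0 if everything is registered, 1 otherwise."""
    missing = find_unregistered(services_in_repo, catalog)
    for name in missing:
        print(f"unregistered service: {name}")
    return 1 if missing else 0


# Illustrative data only.
catalog = [CatalogEntry("search-api", "team-search"), CatalogEntry("pricing", "")]
exit_code = gate(["search-api", "pricing", "dealer-inbox"], catalog)
```

A non-zero exit code is what lets a CI step block the merge, which is one plausible way a catalog can stay current without relying on people remembering to update it.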
But services were only half the story. Thousands of jobs ran across four different environments in Rundeck, and tracking ownership was becoming a nightmare.
So, they clearly defined ownership first
Rather than manually cataloging everything, they built an intelligent classification system that could automatically figure out job ownership with 90% accuracy.
For the remaining 10%, they added a smart little feature – a banner that would nudge developers to claim ownership of unclassified jobs when they spotted them.
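The source does not say how the classification system decides ownership. As a rough sketch of the general pattern (classify automatically what you can, then surface the rest for a human to claim), the example below uses simple name-based rules. The rule patterns, team names, and job names are hypothetical, and a real system could just as well use other signals.

```python
import re
from typing import Optional

# Hypothetical rules: (pattern on the job name, owning team).
OWNERSHIP_RULES = [
    (re.compile(r"^search[-_]"), "team-search"),
    (re.compile(r"^pricing[-_]"), "team-pricing"),
    (re.compile(r"(^|[-_])email([-_]|$)"), "team-messaging"),
]


def classify_owner(job_name: str) -> Optional[str]:
    """Return the first team whose pattern matches, or None if unclassified."""
    name = job_name.lower()
    for pattern, team in OWNERSHIP_RULES:
        if pattern.search(name):
            return team
    return None


def split_jobs(job_names):
    """Partition jobs into auto-classified ones and ones needing a human claim."""
    owned, unclaimed = {}, []
    for job in job_names:
        team = classify_owner(job)
        if team is None:
            unclaimed.append(job)  # these would trigger the "claim me" banner
        else:
            owned[job] = team
    return owned, unclaimed


# Illustrative data only.
owned, unclaimed = split_jobs(
    ["search-reindex", "pricing_nightly", "send-email-digest", "cleanup-tmp"]
)
```

The `unclaimed` list plays the role of the remaining 10%: jobs the rules cannot place, which the UI can then nudge developers to claim.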
Now, every service and job had a home. The service registry was enforced, jobs were automatically synced, and, most importantly, everyone knew exactly where to look for information.
They unknowingly laid the foundation for the first two pillars of their developer portal: discoverability and governance.
Next, turning compliance into a game
While having a central catalog was great, developers wouldn't naturally gravitate to it unless it solved everyday problems. They started by tackling a common headache: production readiness checklists.
Instead of these manual checklists in Excel sheets and wiki docs, they built compliance rules. These were automated checks that verified everything from API documentation to test coverage.
The system was pluggable, too, so teams could easily add new rules as their needs evolved.
To make it interesting, they turned it into a game. Each service got a compliance score right in the UI. Developers started competing for those high green scores in the 90s, aiming for that perfect 100%.
(Showroom compliance checks)
This gamification worked better than they expected. Developers who came to check their compliance scores started discovering other useful features.
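A pluggable rule system with a per-service score, like the one described above, could be sketched roughly as follows. This is not Showroom's actual implementation: the `rule` decorator, the rule names, the metadata fields, and the 80% coverage threshold are all assumptions chosen for illustration.

```python
from typing import Callable, Dict

# A rule takes service metadata and returns True when the service passes.
Rule = Callable[[dict], bool]
RULES: Dict[str, Rule] = {}


def rule(name: str):
    """Decorator that registers a check under a name, so teams can add new ones."""
    def register(fn: Rule) -> Rule:
        RULES[name] = fn
        return fn
    return register


@rule("has_api_docs")
def has_api_docs(service: dict) -> bool:
    return bool(service.get("api_docs_url"))


@rule("coverage_ok")
def coverage_ok(service: dict) -> bool:
    # The 80% threshold is an illustrative assumption.
    return service.get("coverage", 0.0) >= 0.80


@rule("has_owner")
def has_owner(service: dict) -> bool:
    return bool(service.get("owner_team"))


def compliance_score(service: dict) -> int:
    """Percentage of registered rules the service passes, rounded to an integer."""
    if not RULES:
        return 100
    passed = sum(1 for check in RULES.values() if check(service))
    return round(100 * passed / len(RULES))


# Illustrative data only: this service has an owner and coverage but no API docs.
partial = {"owner_team": "team-search", "coverage": 0.9}
score = compliance_score(partial)
```

Because every rule is just a registered function, adding a new check means writing one more decorated function, and the score updates without any change to the scoring code.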
But the real feature that would make their portal indispensable was still to come: deployments...
Developers would often choose the wrong build, accidentally deploy untested commits, or struggle with complex manual processes. In a monorepo environment with multiple services, the complexity was mind-boggling.
So, how did they solve this problem?
They integrated directly with GitHub to provide a comprehensive deployment view. Developers could now see exactly what changes were going into a build, including impactful commits.
(Showroom deployments)
This eliminated human error almost entirely. Developers could deploy, roll back, or get instant logs with a single click. They even integrated Slack notifications to match their team's workflow, encouraging developers to use the new UI.
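The core of a deployment view like this is working out which commits separate what is deployed from what is about to be deployed, and which of those touch the service in question. The sketch below shows that logic on in-memory data. It is an assumption-laden illustration, not the Showroom code: in practice the commit list would come from GitHub, and the `Commit` type, the directory layout, and the idea that "impactful" means "touches the service's directory" are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Commit:
    sha: str
    message: str
    paths: Tuple[str, ...]  # files changed by the commit


def commits_in_build(history: List[Commit], deployed_sha: str, build_sha: str) -> List[Commit]:
    """Commits after deployed_sha up to and including build_sha (history is oldest first)."""
    shas = [c.sha for c in history]
    start = shas.index(deployed_sha) + 1
    end = shas.index(build_sha) + 1
    if end < start:
        raise ValueError("build is older than what is currently deployed")
    return history[start:end]


def impactful(commits: List[Commit], service_dir: str) -> List[Commit]:
    """Keep only commits that touch files under the service's directory."""
    prefix = service_dir.rstrip("/") + "/"
    return [c for c in commits if any(p.startswith(prefix) for p in c.paths)]


# Illustrative monorepo history, oldest first.
history = [
    Commit("a1", "initial search service", ("services/search/app.py",)),
    Commit("b2", "fix pricing rounding", ("services/pricing/calc.py",)),
    Commit("c3", "tune search ranking", ("services/search/rank.py",)),
    Commit("d4", "update docs", ("README.md",)),
]
pending = commits_in_build(history, deployed_sha="a1", build_sha="d4")
relevant = impactful(pending, "services/search")
```

In a monorepo, most commits in any given range belong to other services, so filtering by path is one simple way to show a developer only the changes that matter for the service they are deploying.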
The results were remarkable. Just by launching this feature, they saved approximately 7,000 developer hours. Deployment times plummeted, and human errors became almost non-existent.
And that’s how they reimagined operational efficiency. Developers no longer needed to remember different deployment processes for different services. Whether working on a monolith or a microservice, they got the exact same streamlined experience.
What were the results?
Service production time collapsed from a sluggish 75 days to a mere 7 days. Deployment times plummeted by 97%, while lead times for changes decreased by 60%.
The team's reliability metrics showed equally impressive gains, with change failure rates dropping from a risky 25% to a robust 5%.
Ultimately, build times accelerated by 96% largely because developers became 220% more efficient.
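As a quick sanity check on how the day counts and failure rates above translate into percentage reductions, the arithmetic can be written out. The helper name `percent_reduction` is just for illustration; the input numbers are the ones quoted in this section.

```python
def percent_reduction(before: float, after: float) -> float:
    """Relative decrease from `before` to `after`, as a percentage."""
    return 100 * (before - after) / before


# Service production time: 75 days down to 7 days.
production = percent_reduction(75, 7)

# Change failure rate: 25% down to 5%.
failure = percent_reduction(25, 5)

print(f"production time reduced by {production:.1f}%")
print(f"change failure rate reduced by {failure:.1f}%")
```

The 75-to-7-day drop works out to roughly a 90.7% reduction, which is where the 90% in the headline comes from, and the change failure rate fell by 80% in relative terms.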
Showroom has become more than just a source of truth for ownership - it is now a centralized developer hub for working more efficiently in many different ways.