The rapid adoption of AI-powered code generation has created a new problem for engineering teams: software ships faster than ever, but keeping it running in production remains a slow, manual grind. Resolve AI, the production-operations startup backed by Greylock and Lightspeed Venture Partners, is rolling out an expanded platform designed to close that gap.
The company announced a major update that introduces always-on background agents, a redesigned investigation architecture and a shared workspace where engineers and AI agents collaborate on live incidents in real time. The centerpiece is a multi-agent investigation system developed by Resolve AI's in-house research lab.
Instead of deploying a single agent to diagnose a production failure, much like one engineer pulling an on-call shift, the platform now dispatches a coordinated team of specialized agents. These agents pursue multiple hypotheses in parallel, independently verify each other's conclusions and construct complete causal chains from root cause to symptom.
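The parallel fan-out described above can be illustrated with a minimal sketch. This is not Resolve AI's implementation, which the company has not published; the `Finding` type, the `investigate` and `run_investigation` functions and the sample signals are all hypothetical names chosen for illustration.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Finding:
    hypothesis: str
    supported: bool
    causal_chain: list[str]  # ordered from root cause to symptom


async def investigate(hypothesis: str, signals: dict[str, bool]) -> Finding:
    """Hypothetical specialist agent: checks one hypothesis against observed signals."""
    await asyncio.sleep(0)  # stand-in for querying logs, metrics or traces
    supported = signals.get(hypothesis, False)
    chain = [hypothesis, "elevated error rate", "on-call alert"] if supported else []
    return Finding(hypothesis, supported, chain)


async def run_investigation(hypotheses: list[str], signals: dict[str, bool]) -> list[Finding]:
    # Dispatch one agent per hypothesis and wait for all of them concurrently.
    findings = await asyncio.gather(*(investigate(h, signals) for h in hypotheses))
    # Keep only the hypotheses the evidence supports.
    return [f for f in findings if f.supported]


# Invented sample data: which hypotheses the (simulated) telemetry backs up.
signals = {"bad deploy": True, "database saturation": False, "expired certificate": False}
supported = asyncio.run(run_investigation(list(signals), signals))
for finding in supported:
    print(" -> ".join(finding.causal_chain))
```

In this toy version each agent simply looks up a precomputed signal; in a real system that step would be replaced by actual queries against observability data.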
Accuracy gains from agent teamwork
Resolve AI CEO and co-founder Spiros Xanthos said the new architecture delivers more than a twofold improvement in root cause accuracy on internal benchmarks compared to earlier versions. He described the shift as moving from one agent working alone to a team of agents collaborating like human engineers debugging an issue together.
The accuracy claim comes from internal evaluations built to mirror real-world complexity at enterprise customers such as Coinbase, Salesforce, DoorDash and Zscaler. Xanthos acknowledged the benchmarks are not third-party audited but said they represent hundreds of difficult cases similar to what those companies encounter daily.
In practice, Resolve AI's agents now act as first responders for every on-call alert, typically completing triage within five minutes, before a human engineer becomes involved. In previous disclosures, the company cited DoorDash reducing time to root cause by up to 87 percent.
Preventing hallucinated answers in live outages
A core challenge for large language models in high-stakes environments is their tendency to generate plausible-sounding but incorrect answers. In the context of a live outage, that could send an engineering team chasing the wrong fix while services stay down.
Xanthos acknowledged this directly. He said out-of-the-box models always try to give an answer, even without enough evidence, which often leads to wrong conclusions.
Resolve AI's countermeasure is layered verification among its agents. Each agent investigating a hypothesis must cite every piece of evidence it relies on and present that evidence to another agent for independent review. Peer agents actively attempt to disprove the theory by identifying gaps in logic.
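A rough sketch of that cite-then-review pattern follows. It is an illustration under stated assumptions, not Resolve AI's code: the `Claim` and `Hypothesis` types, the `review` and `accept` functions and the evidence identifiers are all hypothetical, and the reviewer here only checks that citations exist and are known, a far simpler test than a model actively trying to disprove a theory.

```python
from dataclasses import dataclass, field


@dataclass
class Claim:
    statement: str
    evidence: list[str] = field(default_factory=list)  # IDs of cited logs, metrics, traces


@dataclass
class Hypothesis:
    root_cause: str
    claims: list[Claim]


def review(hypothesis: Hypothesis, known_evidence: set[str]) -> list[str]:
    """Hypothetical peer reviewer: returns the gaps it found, empty if none."""
    gaps = []
    for claim in hypothesis.claims:
        if not claim.evidence:
            gaps.append(f"uncited claim: {claim.statement}")
        for ref in claim.evidence:
            if ref not in known_evidence:
                gaps.append(f"unverifiable evidence {ref!r} for: {claim.statement}")
    return gaps


def accept(hypothesis: Hypothesis, known_evidence: set[str]) -> bool:
    # A hypothesis survives review only if the reviewer finds no gaps.
    return not review(hypothesis, known_evidence)


# Invented example: one claim is backed by a known metric, the other cites nothing.
known = {"log:4821", "metric:p99-latency"}
candidate = Hypothesis(
    root_cause="bad deploy",
    claims=[
        Claim("latency rose after the deploy", ["metric:p99-latency"]),
        Claim("the new build leaks connections"),
    ],
)
gaps = review(candidate, known)
print(gaps)
```

The design choice worth noting is that the reviewer returns the specific gaps rather than a bare yes or no, so the investigating agent, or a human engineer, can see which claim lacked support.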
Why this matters
The software industry faces an acute tension: adoption of AI code generation has exploded, but keeping that software running remains overwhelmingly manual. For companies shipping more code than ever, production failures can cascade quickly when debugging still depends on human engineers who spend anywhere from tens of minutes to hours diagnosing each issue.
Resolve AI's approach directly addresses this bottleneck by automating incident response with systems designed not just for speed but for reliability through cross-verification among specialized agents. For engineering teams at large tech companies dealing with frequent deployments and complex microservice architectures, this could mean fewer outages and faster recovery times when things go wrong.