AI agents have a fundamental weakness: they cannot learn from their own mistakes. Anthropic is trying to change that with a new system called “dreaming.”

The company unveiled the feature Tuesday at its Code with Claude developer conference in San Francisco. Dreaming lets Claude’s managed agents review past sessions, extract patterns and write structured “playbooks” that future sessions can reference. The system improves agent performance over time without modifying the underlying model weights.

Anthropic also moved two experimental features into public beta. Outcomes lets developers define success criteria for agent tasks. Multi-agent orchestration allows multiple AI agents to collaborate on complex workflows. Both were previously available only in research preview.

Together, the three features tackle what Anthropic calls the hardest problems in running AI agents at scale: accuracy, learning and avoiding bottlenecks on complex work.

How Dreaming Works

Dreaming operates at a higher level of abstraction than standard agent memory. It runs as a scheduled process that reviews an agent’s past sessions and memory stores. The system surfaces insights no single session could detect on its own.

Alex Albert, who leads research product management at Anthropic, compared dreaming to how people create skills after working through a task. “They might do a workflow with Claude and at the end of that workflow, after they’ve iterated and zigzagged a little bit, they want to record that path from A to B,” Albert said. “A very similar thing is happening with dreaming.”

Dreaming does not change the AI model itself. The agent writes learnings as plain-text notes and structured playbooks. Humans can inspect and audit every step.

When asked about trust, Albert acknowledged that users must place some confidence in the system. But he noted that all memories are observable. “They’re learning to write better notes for their future self,” he said.

Real-World Results

Early adopters reported significant improvements. Legal AI company Harvey saw task completion rates increase roughly six times after implementing dreaming. Medical document review firm Wisedocs cut its document review time by 50% using outcomes. Netflix now processes logs from hundreds of builds simultaneously using multi-agent orchestration.

During the keynote, Anthropic demonstrated all three features live. The team simulated a lunar drone landing mission with three specialist agents. After an initial round of imperfect results, a dreaming session ran overnight. The system produced a detailed descent playbook that improved success rates in subsequent tests.

Why This Matters

Enterprises have demanded self-correcting AI agents before trusting them with production workloads. Dreaming directly addresses that need. It allows agents to learn from experience and adapt without requiring engineers to retrain models or manually label data.

The timing aligns with Anthropic’s explosive growth. CEO Dario Amodei disclosed that the company saw 80x annualized revenue and usage growth in the first quarter of 2026. API volume is up nearly 70x year over year. The average Claude Code developer now spends 20 hours per week using the tool.

“We tried to plan very well for a world of 10x growth per year,” Amodei said. “And yet we saw 80x.”

Dreaming and the other features give developers a clear path to building agents that improve over time. For companies running AI at scale, that could be the difference between a proof of concept and a production system.