For months, enterprises raced to push every possible task through large language models, encouraged by C-suite mandates and vendor hype. Now the bill has arrived. Leaked audio from consulting firm Accenture reveals a growing crisis: companies are burning through token budgets so fast that even executives who championed AI adoption are struggling to justify the expense.
In the audio obtained by 404 Media, Accenture staff acknowledge that offloading simple tasks to AI, especially when agentic workflows are involved, leads to massive and unpredictable token overspend. The firm’s agentic AI strategy lead, Justive Kwak, described the problem as “rapid escalation in AI token spend” as organizations move from chatbots to deployment of tools like Copilot, Claude Code and Codex. “It’s really not a niche problem,” Kwak said. “It is a problem that every enterprise will face if they are bullish on AI.”
This is not an isolated issue. Amazon recently cut its internal AI leaderboard, and an unnamed company reportedly burned through $500 million worth of tokens in a single month. Uber is capping AI tool usage to reduce costs. Even OpenAI CEO Sam Altman has acknowledged that token costs are becoming a major concern. The shift from flat subscription pricing to per-token billing has created a perfect storm: companies pay for every input, every output, every verbose answer and every mistake.
The Cost of Unchecked AI Consumption
Accenture had previously been among the most vocal advocates for aggressive AI adoption, even tying promotions to AI usage. That policy now looks unsustainable. The leaked audio shows Accenture is grappling with overspend both in its own operations and at client organizations. Measuring return on investment has become “all but impossible,” Kwak noted, because no one can predict how many tokens a given task will require or whether the AI will produce a usable result on the first attempt.
At the root of the problem lies a fundamental uncertainty. Large language models are powerful but unreliable. They generate verbose responses, require follow-up corrections and consume tokens even when they fail. For CFOs and COOs, this unpredictability makes budgeting a nightmare. As Kwak put it in the audio, “Leadership, especially at the CFO, COO and CIO level, are still asking the question of whether they’re getting value from what we’re spending on in the context of AI.”
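The budgeting arithmetic behind that uncertainty can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the token counts, the per-million-token prices and the 50% first-try success rate are hypothetical figures, not numbers from the leaked audio or from any vendor's price list, and the sketch assumes each retry costs the same and succeeds independently of the others.

```python
def attempt_cost(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Dollar cost of one model call under per-token billing.

    Prices are quoted per million tokens (hypothetical values below).
    """
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000


def expected_task_cost(cost_per_attempt, success_rate):
    """Expected spend until the first usable result.

    Assumes every attempt costs the same and succeeds independently with
    probability `success_rate`, so the expected number of attempts is
    1 / success_rate. Failed attempts are billed like successful ones.
    """
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return cost_per_attempt / success_rate


# Hypothetical task: 20,000 input tokens, 4,000 output tokens,
# priced at $3 and $15 per million tokens respectively.
cost = attempt_cost(20_000, 4_000, 3.0, 15.0)

# Hypothetical reliability: a usable answer on half of all attempts.
expected = expected_task_cost(cost, 0.5)

print(f"per attempt: ${cost:.2f}, expected per task: ${expected:.2f}")
```

Under these made-up numbers, halving the success rate doubles the expected bill for the same task, which is one way to see why a forecast built on a single clean attempt tends to come in low.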
Why This Matters
The AI token crisis directly affects every enterprise that has invested in generative AI tools. Companies that rushed to deploy AI without cost controls now face ballooning expenses and difficult trade-offs. Some are switching to cheaper models; others are capping usage outright. If unchecked, AI overspend could erode the productivity gains the technology promised to deliver. For employees, tighter budgets may mean fewer AI assistants or stricter oversight of how they use them. For vendors like OpenAI and Anthropic, the backlash could slow adoption and force pricing changes.
How Companies Are Responding
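Across industries, organizations are beginning to rein in spending. The measures reported so far include Amazon cutting its internal AI leaderboard, Uber capping AI tool usage, and companies switching to cheaper models.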
These moves mark a sharp departure from the “tokenmaxxing” era, when Jensen Huang urged Nvidia engineers to spend half their annual salary on AI tokens every year. The shift toward cost discipline suggests the market is maturing from blind adoption to measured deployment.
Measuring What Matters
The deeper challenge remains: how to quantify the value of AI output. Traditional productivity metrics like time saved or tasks completed are inadequate when token costs vary wildly. Accenture’s internal conversation highlights a growing recognition that enterprises need new frameworks for evaluating return on AI investment. Without them, the current pullback may only be the beginning of a broader correction.
As Sam Altman has admitted, the cost problem is now a meme. But for the CFOs and COOs listening to the Accenture leak, it is anything but a joke. The era of unrestrained AI spending is ending. The question now is whether companies can find a sustainable balance between innovation and expense before the next quarterly earnings call.



