Major technology companies are rethinking their artificial intelligence strategies as rising costs tied to token usage threaten to outpace returns. A new report from Goldman Sachs warns that the emergence of AI agents could drive token demand up by as much as 24 times current levels.

The investment bank's analysis highlights a growing financial strain on firms like Uber and Microsoft, which are already feeling the bite of tokenized billing models. These models charge companies by the number of tokens, the small chunks of text a large language model reads and writes, so the bill climbs in step with usage.

The Token Cost Problem

Tokenized billing ties AI expenses directly to computational volume. Each query or task an AI agent performs consumes tokens, and as agents become more autonomous and widespread, consumption multiplies. The Goldman Sachs report suggests this surge could make current pricing structures unsustainable for many enterprises.
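
To make the arithmetic concrete, here is a minimal sketch in Python of how a bill scales under token-based pricing. The baseline volume and the per-token price are hypothetical figures chosen for illustration and do not come from the Goldman Sachs report; only the 24x multiplier is taken from it.

```python
def monthly_cost(tokens_per_month: int, price_per_million_usd: float) -> float:
    """Return the monthly bill in USD for a given token volume."""
    return tokens_per_month / 1_000_000 * price_per_million_usd


# Hypothetical inputs, for illustration only.
BASELINE_TOKENS = 500_000_000   # assumed tokens processed per month today
PRICE_PER_MILLION = 2.00        # assumed blended price in USD per million tokens
AGENT_MULTIPLIER = 24           # upper-bound demand increase cited in the report

baseline = monthly_cost(BASELINE_TOKENS, PRICE_PER_MILLION)
agent_scenario = monthly_cost(BASELINE_TOKENS * AGENT_MULTIPLIER, PRICE_PER_MILLION)

print(f"Baseline bill:       ${baseline:,.2f}")
print(f"Agent scenario bill: ${agent_scenario:,.2f}")
```

Because the price per token is held fixed in this sketch, the bill grows by exactly the same factor as token demand. Real contracts may include volume discounts or separate rates for input and output tokens, which this example ignores.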

Uber and Microsoft are among those grappling with these rising costs. Both companies have invested heavily in AI features but now face difficult decisions about where to deploy resources most effectively. The report indicates that without significant efficiency gains or pricing changes, the financial burden could limit further adoption.

Why This Matters

The findings are relevant to any business using or considering generative AI tools. For startups and large enterprises alike, token costs are a variable expense that is easy to overlook and can quickly erode profit margins. Companies may need to prioritize high-value use cases over experimental deployments.

Consumers could also feel the impact indirectly through higher prices or reduced access to free-tier AI services. If major providers pass along cost increases, the affordability of everyday AI tools may decline.

A Strategic Reassessment

The report comes at a time when many tech leaders are questioning whether current investments in generative AI deliver proportional value. Limited return on investment has already prompted some firms to scale back ambitious projects.

Goldman Sachs analysts suggest that companies will need to develop more disciplined approaches to deploying agents and managing token consumption. This may involve optimizing prompts, caching responses or shifting toward smaller specialized models for routine tasks.
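
Of the tactics mentioned, response caching is the simplest to illustrate. The Python sketch below is one possible approach, not a method described in the report: the `CachedClient` class and the `fake_model` function are hypothetical stand-ins for a real model API, and the cache treats prompts as identical only after trimming whitespace and lowercasing.

```python
import hashlib
from typing import Callable, Dict


class CachedClient:
    """Wraps a model call so repeated prompts do not consume new tokens."""

    def __init__(self, call_model: Callable[[str], str]) -> None:
        self._call_model = call_model
        self._cache: Dict[str, str] = {}
        self.model_calls = 0  # counts calls that would actually be billed

    @staticmethod
    def _key(prompt: str) -> str:
        # Normalize case and whitespace so trivially different prompts match.
        normalized = " ".join(prompt.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def ask(self, prompt: str) -> str:
        key = self._key(prompt)
        if key not in self._cache:
            self.model_calls += 1
            self._cache[key] = self._call_model(prompt)
        return self._cache[key]


def fake_model(prompt: str) -> str:
    # Hypothetical stand-in for a billed model API call.
    return f"answer to: {prompt}"


client = CachedClient(fake_model)
first = client.ask("What is our refund policy?")
second = client.ask("what is our   refund policy?")  # served from the cache

print(client.model_calls)
```

In this sketch the second question never reaches the model, so it incurs no token charge. A production cache would also need an expiry policy and a decision about when a stale answer is acceptable, neither of which is covered here.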

The broader implication is clear: the era of unlimited experimentation with large language models may be ending. Businesses must now balance innovation against operational costs in a way they have not had to before.