AI’s cost paradox
Something has quietly changed in how buyers talk about Artificial Intelligence (AI). The early discussions centered on feasibility: where AI can be used, how quickly it can be piloted, and what value it can deliver. Today, the conversations revolve around AI adoption costs and urgency. Buyers are now questioning rapidly rising AI expenses, even as headline model prices appear to be moving in the right direction.
There is no doubt that the per-token cost of Large Language Models (LLMs) from OpenAI and Anthropic has been dropping significantly. However, this creates a token cost illusion: buyers focus on the declining unit cost while their AI Total Cost of Ownership (TCO) continues to rise. The real issue is the disconnect between falling model prices and rising enterprise AI spend. This illusion is increasingly frustrating buyers who expected AI-led productivity gains to outpace the associated costs by a significant margin. Exhibit 1 illustrates the sharp decline in flagship model pricing across OpenAI and Anthropic over time.
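The arithmetic behind the illusion is simple. The sketch below uses hypothetical prices and volumes (not actual vendor figures) to show how a steep per-token price cut can coexist with a rising total bill when consumption grows faster:

```python
# Hypothetical illustration of the token cost illusion: the per-token
# price falls sharply, but total spend still rises because consumption
# grows even faster. All numbers are illustrative assumptions.

price_per_million_2023 = 30.00  # $ per 1M input tokens (assumed)
price_per_million_2025 = 2.50   # $ per 1M input tokens (assumed, ~12x cheaper)

monthly_tokens_2023 = 50_000_000       # 50M tokens/month (assumed pilot scale)
monthly_tokens_2025 = 2_000_000_000    # 2B tokens/month (assumed production scale)

spend_2023 = monthly_tokens_2023 / 1_000_000 * price_per_million_2023
spend_2025 = monthly_tokens_2025 / 1_000_000 * price_per_million_2025

print(f"2023 spend: ${spend_2023:,.0f}/month")  # $1,500/month
print(f"2025 spend: ${spend_2025:,.0f}/month")  # $5,000/month
```

Even with a twelvefold unit price cut, a fortyfold jump in consumption more than triples the monthly bill, which is exactly the pattern buyers are reacting to.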
Exhibit 1: Pricing trends across flagship OpenAI and Anthropic Claude models

Source: Published list price trends for input tokens from OpenAI and Anthropic
What is really driving the spike?
Although unit economics appear favorable, organizations often underestimate token consumption, the largest cost multiplier. Additionally, several other cost layers are influencing the total AI economics.
We have seen the same pattern in enterprise cloud adoption. The pitch was cost savings; the reality was a new category of sprawl. AI is running a similar play.
Here are the top four reasons driving the spike in overall costs:
1: Tokenmaxxing is being rewarded
One key observation is that AI adoption is being measured more aggressively than AI value realization. Some organizations have introduced dashboards to track AI use in terms of token consumption. Many organizations implicitly assume that higher token use automatically drives higher productivity. This leads to tokenmaxxing, where heavy AI consumption is treated as a badge of effectiveness. NVIDIA Chief Executive Officer (CEO) Jensen Huang famously said that he would be deeply alarmed if a $500,000 engineer did not consume tokens worth at least $250,000. Many other technology leaders are also reinforcing this mindset. The risk, however, is that token volume becomes a proxy for progress instead of value realization.
2: Large context window = more tokens per request
Another prominent but less obvious factor driving AI consumption is rapid context window expansion. Over the past few quarters, OpenAI and Anthropic have announced significant increases in context length, with newer models now supporting up to 1 million tokens:
- In April 2025, OpenAI announced a 1 million-token context window for GPT-4.1, a fivefold increase from the previous threshold
- In August 2025, Anthropic announced a 1 million-token context window for Claude Sonnet 4
- In February 2026, Anthropic rolled out Claude Opus 4.6 with 1 million-token context support
This matters because larger context windows encourage users to pass more data, instructions, and history into each prompt, which can materially increase tokens consumed per request.
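The per-request effect is easy to quantify. The sketch below uses an assumed per-token price and hypothetical prompt sizes to show how stuffing a large context window multiplies the cost of a single request, even when the unit price stays constant:

```python
# Illustrative: how passing more context into each prompt multiplies
# per-request cost at a constant per-token price. All figures are assumptions.

price_per_million = 3.00  # $ per 1M input tokens (assumed)

def request_cost(context_tokens: int, price_per_million: float) -> float:
    """Input-token cost of a single request."""
    return context_tokens / 1_000_000 * price_per_million

# A lean prompt vs. one that packs documents, instructions, and chat
# history into a large context window.
lean_prompt = request_cost(4_000, price_per_million)       # ~4K tokens
stuffed_prompt = request_cost(800_000, price_per_million)  # ~800K tokens

print(f"Lean request:    ${lean_prompt:.4f}")    # $0.0120
print(f"Stuffed request: ${stuffed_prompt:.2f}")  # $2.40
```

A 200x increase in context per request translates directly into a 200x increase in input cost per request, which is why context discipline matters as much as model choice.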
3: Increasing underlying infrastructure cost
Another factor increasing the overall AI cost is the supporting cloud infrastructure needed to run AI at scale. Vector databases are a good example; while per-unit costs have improved over the past couple of years, consumption has grown even faster than overall AI workload growth as Retrieval-Augmented Generation (RAG) became the default architecture for enterprise AI. Every chunked and embedded document adds vectors, making the overall AI economics more complex than model pricing alone would suggest.
4: Poor financial discipline and governance
Poor financial discipline and governance around AI consumption are also driving up the AI costs. In many enterprises, AI use is scaling faster than the mechanisms needed to monitor, govern, and optimize it. The result is that enterprises cannot clearly track which models teams are using, how costs are building across the stack, and whether that consumption is actually delivering measurable business value. Without that oversight, AI can quickly shift from a productivity enabler to another growing cost center.
Why are prices expected to grow further?
Here is the critical insight: AI companies are not passing the true cost to customers. Venture capital and large technology firms are subsidizing the industry. Billions of dollars in external funding currently offset the gap between the astronomical costs of Graphics Processing Unit (GPU) clusters, power, and model training, and what customers pay. Major AI firms currently prioritize adoption over profitability.
As the market matures, this gap is likely to narrow. For buyers, it means present-day economics should not automatically be treated as the steady-state AI economics. The costs will likely rise further if enterprises maintain the status quo.
What must buyers do?
Rising AI costs are not a reason to pump the brakes on adoption, but they are a reason to be mindful of how that spending is managed.
While some cost growth is entirely justified, AI is moving from pilot projects to production-scale deployment, and that transition comes with real infrastructure, talent, and integration expenses. The problem is not the AI spending itself; it is the lack of spending discipline.
Organizations must treat AI economics as a board-level priority. This means enterprises must track spending, usage, and business outcomes as closely as innovation efforts. A practical first step is establishing cross-functional AI Cost Control Teams, bringing finance, Information Technology (IT), and business units together to review AI spending monthly, surface cost drivers early, and make informed optimization decisions before overruns become headlines.
Equally important is avoiding vendor lock-in. The AI landscape is evolving faster than any single platform can keep pace with. Enterprises that commit exclusively to one provider risk overpaying and underperforming. A deliberate multi-model strategy, where different models are selected for different tasks based on capability and cost, is no longer optional. Organizations often implement this approach through model routing, which can reduce AI costs by an estimated 25-40% without compromising output quality.
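The routing idea can be sketched in a few lines. Everything here is a hypothetical assumption: the model names, the prices, the crude prompt-length heuristic, and the traffic mix are illustrative only, and production routers typically use learned classifiers rather than word counts:

```python
# Minimal sketch of cost-aware model routing: simple requests go to a
# cheap model, complex ones to a premium model. Model names, prices,
# the heuristic, and the traffic mix are all hypothetical assumptions.

CHEAP_MODEL = ("small-model", 0.50)       # ($ name, $ per 1M tokens, assumed)
PREMIUM_MODEL = ("flagship-model", 15.00)

def route(prompt: str) -> tuple[str, float]:
    """Route by a crude complexity heuristic (prompt word count)."""
    return PREMIUM_MODEL if len(prompt.split()) > 200 else CHEAP_MODEL

def blended_cost(requests, tokens_per_request=2_000):
    """Total input-token cost across routed requests."""
    total = 0.0
    for prompt in requests:
        _, price = route(prompt)
        total += tokens_per_request / 1_000_000 * price
    return total

# Assumed mix: 80% simple traffic, 20% complex traffic
traffic = ["short question"] * 80 + [("word " * 250)] * 20
routed = blended_cost(traffic)
all_premium = 100 * 2_000 / 1_000_000 * PREMIUM_MODEL[1]
print(f"Routed: ${routed:.2f} vs all-premium: ${all_premium:.2f}")
```

Under these assumed numbers, routing cuts the bill from $3.00 to $0.68, roughly in line with the savings range cited above, though actual results depend entirely on the real traffic mix and price gap.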
Finally, negotiate proactively, not reactively. Large enterprises with significant workloads hold more leverage than they typically use. Volume commitments, multi-year contracts, and competitive benchmarking across providers can meaningfully drive down unit costs. The vendors want the business, and in an increasingly crowded market, they will negotiate.
Bottom line
AI spend left unmanaged will grow unchecked. Enterprises that treat cost discipline as a core competency rather than an afterthought will compound their AI advantages faster and more sustainably than those that do not.
Implementing this mindset is challenging while organizations remain focused on rapid AI adoption and experimentation. However, many enterprises begin addressing AI cost governance only after unexpected spending materially impacts budgets or operating margins. The opportunity is to act earlier, establishing financial visibility, governance, and accountability before AI costs become difficult to control.
If you found this blog interesting, check out AI-powered observability: The next frontier in modern operations – Everest Group Research Portal, which delves deeper into another AI-related topic.
Reach out to Prateek Gupta ([email protected]) and Rohan Pant ([email protected]) for more information.

