By Sebastian Pellejero
NEW YORK, July 23 (Reuters Breakingviews) - Are artificial intelligence startups like software companies? Since OpenAI’s ChatGPT chatbot made its viral debut late in 2022, investors have treated the new technology as if it shares the same financial characteristics as previous purveyors of computer code: low marginal costs and the potential for vast scale. This belief has helped propel OpenAI’s valuation to $300 billion, while rival Anthropic is seeking a $100 billion price tag. The bet is that those firms that grab users early will entrench themselves as the default interface for AI, capturing data, distribution, and pricing power. After all, that is how software giants like Microsoft (MSFT.O), Salesforce (CRM.N) and Oracle (ORCL.N) established themselves.
This analogy is starting to crack because AI firms have fundamentally different costs. Training a large language model can run to hundreds of millions of dollars. But getting the model to generate responses in real time – a process known as inference – is an ongoing expense. Every query spins up thousands of chips, consuming power and communications bandwidth. This looks less like software and more like a utility: an energy-intensive, infrastructure-heavy business whose running costs depend on consumer demand.
The stakes are massive. Meeting future demand for inference services worldwide will require a minimum $3.7 trillion investment in AI-focused data centers, reckons McKinsey. That outlay is expected to drive incremental electricity demand to roughly 733 terawatt-hours, as much energy as powers 68 million homes, according to the International Energy Agency.
To see why, consider how AI queries work. Mid-sized models typically run on clusters of two to four chips, while larger tasks can require up to eight. Generating a 1 million-token output, a common benchmark for advanced workloads, can take 30 to 40 minutes on an 8-chip cluster, translating to roughly $40 to $55 in direct compute costs when using Nvidia’s (NVDA.O) H100 processors, based on current pricing on Amazon Web Services. Unlike traditional software, where marginal costs approach zero, each AI query incurs new expenses.
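The arithmetic behind that estimate can be sketched in a few lines of Python. This is an illustration only: the hourly rate of about $80 for an 8-chip H100 cluster is an assumption backed out from the figures above ($40 for 30 minutes), not a quoted Amazon Web Services list price, and the function names are hypothetical.

```python
# Hypothetical hourly rental rate for an 8-chip H100 cluster, in dollars.
# Assumption: implied by the article's figures, not an AWS list price.
CLUSTER_RATE_PER_HOUR = 80.0


def compute_cost(minutes: float, rate_per_hour: float = CLUSTER_RATE_PER_HOUR) -> float:
    """Direct compute cost of occupying the cluster for `minutes`."""
    return minutes / 60.0 * rate_per_hour


def implied_hourly_rate(cost: float, minutes: float) -> float:
    """Hourly cluster rate implied by a total cost and a run time."""
    return cost / (minutes / 60.0)


if __name__ == "__main__":
    for minutes in (30, 40):
        print(f"{minutes} min on the cluster: ${compute_cost(minutes):.2f}")
    print(f"Rate implied by $55 for 40 min: ${implied_hourly_rate(55, 40):.2f}/hour")
```

The point of the sketch is that cost scales with time on the hardware: a run that takes twice as long costs twice as much, whereas serving one more user of conventional software costs close to nothing.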
Power compounds the burden. An H100 chip draws around 700 watts at full load, and at industrial rates of 10 to 15 cents per kilowatt-hour in the United States, energy costs quickly add up. A firm serving 10 billion tokens daily could face an annual electricity bill in the millions of dollars, with cooling and inefficiencies often doubling that load.
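A rough version of that electricity estimate can be reproduced as follows. The chip draw, cluster size, power prices and daily token volume come from the figures above; the throughput is an assumption carried over from the 1 million-token benchmark (30 to 40 minutes per million tokens on one 8-chip cluster, with no batching), and the function name is hypothetical.

```python
CHIP_WATTS = 700          # H100 draw at full load
CHIPS_PER_CLUSTER = 8     # chips in one cluster
TOKENS_PER_DAY = 10e9     # 10 billion tokens served daily


def annual_electricity_bill(minutes_per_million: float,
                            price_per_kwh: float,
                            overhead: float = 1.0) -> float:
    """Yearly power cost in dollars.

    Assumption: every million tokens keeps one full cluster busy for
    `minutes_per_million` minutes. `overhead` scales the load for cooling
    and inefficiencies (2.0 means the load doubles).
    """
    cluster_kw = CHIP_WATTS * CHIPS_PER_CLUSTER / 1000.0
    kwh_per_million = cluster_kw * minutes_per_million / 60.0
    kwh_per_year = kwh_per_million * (TOKENS_PER_DAY / 1e6) * 365
    return kwh_per_year * price_per_kwh * overhead


if __name__ == "__main__":
    low = annual_electricity_bill(30, 0.10)
    high = annual_electricity_bill(40, 0.15)
    print(f"Chips only:   ${low:,.0f} to ${high:,.0f} a year")
    print(f"With cooling: ${2 * low:,.0f} to ${2 * high:,.0f} a year")
```

Under these assumptions the bill comes to roughly $1 million to $2 million a year for the chips alone, and about double that once cooling and inefficiencies are included.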
Yet while the cost of each AI query has remained stubbornly high, revenue per query is falling. Since 2022, token prices have dropped as much as 280-fold, says Stanford University. Larger users can negotiate volume deals that cut standard token prices by up to half.
The price war is largely strategic. Model developers are slashing rates to drive adoption, stake out market share, and lock in users. But that land grab has run ahead of efficiencies and squeezed margins. Providers that depend on renting cloud computing capacity must deftly manage those costs while competing with the big tech platforms they rely on.
This explains why tech giants are racing to control their own infrastructure. Direct energy procurement, paired with in-house chip development, helps lower costs. Cloud computing giants like Google owner Alphabet (GOOGL.O), Microsoft, and Amazon (AMZN.O) are locking in electricity supplies through long-term power purchase agreements while ramping up investments in energy. This allows them to report gross margins between 60% and 70%.
Developers like OpenAI and Anthropic face a split, however. By selling directly to enterprises, they can charge prices that fully cover the costly compute, enabling similar gross margins. But usage routed through cloud partners often turns unprofitable. Steep volume discounts for tokens leave little room for the cost of renting AI infrastructure. Each new query adds to the headache.
That trade-off is already reshaping AI strategy. Even Microsoft, which benefits from OpenAI traffic through its partnership with the company led by Sam Altman, is pouring capital into expanding its own AI footprint. As users demand longer, more complex outputs, the infrastructure burden will only grow. OpenAI, for its part, is developing its own chip and building dedicated data centers.
Optimists argue that better engineering will eventually fix this problem. Developers have already made models more efficient by using simpler, faster math, avoiding repeated calculations when generating long responses, and processing multiple user requests at once. These tweaks let the same hardware handle more requests while using less energy.
Newer chips promise further gains. Nvidia’s latest Blackwell processors are expected to deliver two-and-a-half times better energy efficiency and four times faster inference performance than the H100, according to company presentations. However, hardware gains also enable more complex models, boosting demand and cancelling out the efficiency savings.
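A minimal sketch shows how demand growth can offset an efficiency gain. The 2.5-fold efficiency figure is the one cited for Blackwell; the demand multipliers and the function name are hypothetical and chosen only to illustrate the break-even point.

```python
BLACKWELL_EFFICIENCY_GAIN = 2.5  # energy efficiency versus the H100, per the text


def relative_energy_use(demand_multiplier: float,
                        efficiency_gain: float = BLACKWELL_EFFICIENCY_GAIN) -> float:
    """Total energy use relative to today.

    Assumption: energy scales linearly with compute demand and inversely
    with chip efficiency. A result above 1.0 means total energy use rises.
    """
    return demand_multiplier / efficiency_gain


if __name__ == "__main__":
    # Hypothetical demand scenarios: flat, break-even, and fourfold growth.
    for demand in (1.0, 2.5, 4.0):
        ratio = relative_energy_use(demand)
        print(f"Demand x{demand}: energy use x{ratio:.2f}")
```

On this simple model, total energy use falls only while compute demand grows less than 2.5-fold; beyond that point, the efficiency gain is fully absorbed.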
Others argue that business models will adapt. Over time, the industry may discourage users from using AI for low-value queries, like writing tweets or summarizing web pages, while prioritizing high-value tasks like reviewing computer code, which customers are willing to pay for. That could allow revenue to better reflect the cost of serving complex, high-demand queries.
OpenAI, meanwhile, is trying to move beyond just providing AI-based answers. It’s close to launching a web browser, is building a payments system, and is offering enterprise consulting. These initiatives are designed to generate revenue from other activities while embedding the company more deeply into users’ workflows.
In this world, however, the winners are likely to be the ones who already own the entire information stack. Amazon and Alphabet, with their custom chips and power contracts, can contain unit costs and reinvest at scale across their large existing businesses. By contrast, firms that do not control their own infrastructure, like Anthropic, Perplexity and even OpenAI, face narrower margins and limited pricing leverage.
Even for the giants, a payoff is not guaranteed. They are spending tens of billions of dollars on a bet there’s a deep, durable market for AI queries that generate more revenue than they cost to serve up. From railroads to telecoms, history is littered with examples of companies mispricing a new service or miscalculating demand for it. AI may become ubiquitous, but ubiquity alone does not mean the investors who paid to build it will see a return.