GPU pricing, a bellwether for AI costs, could help IT leaders at budget time

Monday, December 15, 2025, 08:00 AM, from ComputerWorld
AI is now as much a utility as any other ongoing business cost, and IT leaders setting out their AI budgets for 2026 need to consider the costs of the underlying resources — the GPUs in modern data centers that are unlocking AI’s potential.

In the three years since ChatGPT arrived, the push for ever more — and better — generative AI tools has continued at a rapid clip. That growth has come at a cost, however: spiraling AI budgets amid low GPU availability and limited energy capacity to run those data centers.

Efforts are now underway to reduce the cost of using GPUs, and the attendant cost of using genAI tools, with smaller data centers, billing tools, software tools, and alternative hardware leading the charge.

Traditional AI budgeting is heavily reliant on GPU pricing, hours, and instance rates. GPU instances are “eye-wateringly expensive” at $30+ per hour for high-end on-demand configurations, said Corey Quinn, chief cloud economist at Duckbill, a cloud cost analysis firm.

“For serious AI workloads, GPU costs often become the dominant line item, which is why you’re seeing companies scramble for reserved capacity and spot instances,” he said, adding that AI billing through cloud services “is a mess.”

IT leaders can’t commit to fixed computing resources because of the unpredictability of AI workloads. Hyperscalers muddy the waters further with managed GPU services, AI credits, and committed-use discounts.

Then there are “shadow costs everyone forgets — data transfer, storage for training data, and the engineering time to make any of it work,” Quinn said.
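The budget dynamic Quinn describes can be sketched as a simple line-item model. All of the rates below are illustrative placeholders except the $30/hour on-demand GPU figure cited above; the egress, storage, and engineering rates are assumptions, not quotes from any provider.

```python
# Illustrative AI budget model: the visible GPU bill plus the "shadow
# costs" (data transfer, training-data storage, engineering time).
# All rates are hypothetical except the $30/hr GPU figure from the article.

def monthly_ai_cost(gpu_hours, gpu_rate_per_hr,
                    egress_tb, egress_rate_per_tb,
                    storage_tb, storage_rate_per_tb,
                    eng_hours, eng_rate_per_hr):
    """Return (gpu_cost, shadow_cost, total) for one month."""
    gpu = gpu_hours * gpu_rate_per_hr
    shadow = (egress_tb * egress_rate_per_tb
              + storage_tb * storage_rate_per_tb
              + eng_hours * eng_rate_per_hr)
    return gpu, shadow, gpu + shadow

# Example: 8 GPUs running around the clock for a 30-day month.
gpu, shadow, total = monthly_ai_cost(
    gpu_hours=8 * 24 * 30, gpu_rate_per_hr=30.0,  # 5,760 GPU-hours
    egress_tb=50, egress_rate_per_tb=90.0,        # hypothetical egress rate
    storage_tb=200, storage_rate_per_tb=23.0,     # hypothetical storage rate
    eng_hours=160, eng_rate_per_hr=120.0)         # hypothetical loaded labor
print(f"GPU: ${gpu:,.0f}  shadow: ${shadow:,.0f}  total: ${total:,.0f}")
```

Even with generous shadow costs, the GPU line dominates the total here, which is the pattern Quinn describes.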

At the same time, smaller cloud providers — also called neoclouds — are getting their hands on more GPUs and making them available to IT users. Those companies include CoreWeave, Lambda Labs, and Together AI.

“They’re picking up meaningful market share by focusing exclusively on GPU workloads and often undercutting hyperscaler pricing by 30% to 50%,” Quinn said.

Neoclouds focus more on discounted GPUs within a smaller geographic footprint, something some companies can live with, Quinn said.

IT leaders don’t need the latest and shiniest GPUs from Nvidia or AMD for their AI workloads, said Laurent Gil, cofounder of Cast AI. Older generations of GPUs perform equally well on certain AI workloads, and IT leaders need to know where to find them to save money.

“AWS spot pricing for Nvidia’s A100 and H100 has decreased by 80% between last year and this year — just not everywhere,” Gil said.
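To put that 80% figure in dollar terms, here is a hedged arithmetic sketch. The $30/hour on-demand rate comes from earlier in the article; the starting spot discount is an assumption, not a published price.

```python
# What an 80% year-over-year spot-price decline means in dollars.
# Only the $30/hr on-demand rate comes from the article; the 40% spot
# discount assumed for last year is hypothetical.

on_demand = 30.00                        # high-end on-demand rate cited above
spot_last_year = 0.60 * on_demand        # assumed spot discount last year
spot_now = spot_last_year * (1 - 0.80)   # the 80% decrease Gil describes

hours = 720  # one GPU, one month
print(f"spot now: ${spot_now:.2f}/hr -> ${spot_now * hours:,.0f}/month "
      f"vs ${on_demand * hours:,.0f}/month on-demand")
```

The point of Gil's caveat ("just not everywhere") is that this spread only materializes in the regions where the price has actually fallen, which is why workload mobility matters.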

Cast AI offers the necessary software tools and AI agents to move workloads to cheaper GPUs across cloud providers and regions. “Our agents do what a human does once a month, except they do it every second,” Gil said.

Cast AI’s tools also optimize for CPUs, which consume far less power than GPUs. (Energy consumption is becoming a major bottleneck for AI workloads, Gil said.)

Some companies are also looking to make pricing and GPU availability more transparent.

One startup, Internet Backyard, lets data center operators offer real-time quotes, billing, payments, and reconciliation for GPU capacity. The white-label software is embedded in the operators’ own systems.

“From the tenant side of the data center, we have the tenant portal, where you can see real-time GPU pricing and energy matching with your actual consumption,” said Mai Trinh, CEO of Internet Backyard.

The startup isn’t yet collaborating with hyperscalers; for now, it focuses more on emerging data centers that need to standardize billing, quoting, and payment processing. “When we talk to people who build a data center, they tell us that everything’s happening on Excel — there’s no real-time pricing there,” Trinh said.

Since AI capacity is ultimately bought for performance, the company is exploring a performance-based pricing model rather than per-GPU pricing. “It is extremely important for us to base pricing on performance, because that’s what you’re really paying for,” Trinh said. “You’re not paying for someone else’s depreciating asset.”
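One way to make the performance-based idea concrete is to normalize GPU rates by useful work, for example dollars per million tokens rather than dollars per GPU-hour. The throughput and rate figures below are hypothetical, chosen only to show how a cheaper, slower GPU can still win on cost per unit of work.

```python
# Sketch: convert a GPU-hour rate into an effective cost per million
# tokens, the kind of performance-normalized metric Trinh describes.
# All numbers are hypothetical illustrations.

def cost_per_million_tokens(gpu_rate_per_hr, tokens_per_sec):
    """Effective $/1M tokens given an hourly rate and sustained throughput."""
    tokens_per_hr = tokens_per_sec * 3600
    return gpu_rate_per_hr / tokens_per_hr * 1_000_000

# A newer, faster GPU at a premium rate vs. an older, discounted one:
new_gpu = cost_per_million_tokens(gpu_rate_per_hr=30.0, tokens_per_sec=2500)
old_gpu = cost_per_million_tokens(gpu_rate_per_hr=6.0, tokens_per_sec=900)
print(f"new: ${new_gpu:.2f}/1M tokens   old: ${old_gpu:.2f}/1M tokens")
```

Under these assumed figures the older GPU delivers work at roughly half the cost per token, echoing Gil's point above that older generations can be the better buy for certain workloads.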

The startup’s backers include Jay Adelson, a co-founder of Equinix, one of the world’s largest data center companies.

Energy is also an important driver in GPU pricing. GPU demand for AI computing is overwhelming grids, which have power ceilings, and pushing up utility prices.

U.S. data centers could account for 12% of total energy consumption by 2030, according to a 2024 McKinsey study. Meanwhile, electricity prices are soaring amid the data center frenzy. Multiple groups last week sent a letter to the U.S. Congress requesting a moratorium on data center construction.

The energy demands of the largest AI providers’ planned data centers are not sustainable, said Peng Zou, CEO of PowerLattice. “High-density AI clusters are forcing CIOs to rethink their infrastructure roadmap and economics,” Zou said.

PowerLattice makes technology that helps modern chips become more power-efficient. It is among a slew of AI-era chip technologies designed to eke more compute out of systems while reducing power consumption.

“The reliability and uptime of AI and GPU servers are critical, and these are things CIOs care deeply about,” Zou said.
https://www.computerworld.com/article/4104332/gpu-pricing-a-bellwether-for-ai-costs-could-help-it-le...

