OpenAI, Google AI data centers are under stress after new genAI model launches
Friday, March 28, 2025, 05:20 PM, from ComputerWorld
New generative AI (genAI) models introduced this week by Google and OpenAI have put the companies’ data centers under stress — and both companies are trying to catch up to demand.
OpenAI CEO Sam Altman said Thursday that his company was temporarily restricting GPU usage after overwhelming demand for its image-generation service in ChatGPT. The move came one day after OpenAI introduced the 4o image-generation tool. “It’s super fun seeing people love images in ChatGPT. But our GPUs are melting,” Altman wrote in a post on X.

OpenAI relies primarily on Nvidia GPUs to power ChatGPT and has run into capacity problems with its AI infrastructure before. Altman said OpenAI would introduce rate limits, which cap how many generation requests users can make in a given period, until the system becomes more efficient.

Google is dealing with a similar surge in demand for its Gemini 2.5 model, which rolled out Tuesday. “We are seeing a huge amount of demand for Gemini 2.5 Pro right now and are laser focused on getting higher rate limits into the hands of developers ASAP,” Logan Kilpatrick, product lead for Google’s AI Studio developer tools, said in a post on X. Google has built its AI infrastructure on its homegrown TPUs (Tensor Processing Units), custom chips tuned to run Gemini; unlike GPUs, which can run a wide range of AI, graphics, and scientific workloads, TPUs are specialized for machine learning.

The surging demand is a reminder for enterprises to secure stable computing capacity to avoid AI downtime, said Jim McGregor, principal analyst at Tirias Research. “The shift to images, video, agents…, it’s going to drive the demand for more AI compute resources for the foreseeable future,” he said.

OpenAI and Google are widely used by individuals and enterprises alike. Hardware typically takes time to catch up with new AI software, and unplanned interruptions can hurt company productivity, analysts said. OpenAI has always had capacity issues when new models launch, said Dylan Patel, founder of semiconductor consulting firm SemiAnalysis. “The demand for AI is insatiable,” Patel said.
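Neither company has published the details of its throttling, but a common way to implement the kind of rate limits Altman describes is a token bucket: each client gets a budget of tokens that refills at a steady rate, and a request is rejected when the bucket is empty. The sketch below is illustrative only; the class name and parameters are our own, not anything from OpenAI or Google.

```python
import time

class TokenBucket:
    """Token-bucket rate limiter: sustain `rate` requests per second,
    with short bursts up to `capacity` requests."""

    def __init__(self, rate: float, capacity: int):
        self.rate = rate              # tokens added per second
        self.capacity = capacity      # maximum burst size
        self.tokens = float(capacity) # start with a full bucket
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill tokens in proportion to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

bucket = TokenBucket(rate=2.0, capacity=5)
results = [bucket.allow() for _ in range(10)]
# The initial burst of 5 requests passes; later ones wait for refill.
```

Real services layer per-user, per-model, and global limits on top of this basic mechanism, which is why heavy image-generation traffic can be throttled independently of text requests.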
OpenAI’s image-creation tool is more compute-intensive than text generation and demands more from GPUs, said Bob O’Donnell, principal analyst at Technalysis. “That’s just classic system overload,” he said. Nvidia’s GPUs consume massive amounts of power and throttle down performance when overloaded or overheated, so keeping them cool is essential to sustaining peak throughput.

CentML, which provides AI services on Nvidia GPUs, has seen significant spikes in demand, particularly when supporting new models, said Gennady Pekhimenko, CEO of the Toronto-based company. The company saw a surge in sign-ups within the first few days of serving DeepSeek, which was released earlier this year. CentML offers plans that guarantee uptime, reserved instances, and guaranteed output as part of its service-level agreements.

There are many things OpenAI could do to catch up with demand, including shrinking the model or optimizing its code, said Pekhimenko, who is also an associate professor of computer science at the University of Toronto. For some commercial use cases, the large language models (LLMs) behind OpenAI’s and Google’s services may be too heavy; smaller or open-source models that require fewer computing resources and cost less might be enough, he said.

Enterprises can also buy genAI computing capacity from multiple companies, which provides protection against downtime at any one provider, Pekhimenko said. CentML likewise offers options to source compute capacity from major cloud vendors. And unlike previous years, when GPU shortages hobbled AI scaling, there is no lack of computing capacity today, Pekhimenko said.

Altman’s evocative description of “melting” GPUs may also have been a way to promote the new image-generation model. “Probably [OpenAI] also liked to generate a little bit more hype around it. So, they tried to frame it this way,” Pekhimenko said.
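Pekhimenko’s multi-provider suggestion usually takes the form of a failover chain: try the primary provider, and fall back to alternates when it times out or returns an overload error. A minimal sketch, with entirely hypothetical provider names and a stubbed-out call in place of any real SDK:

```python
# Hypothetical provider list; in practice each entry would map to a
# real SDK client and credentials. The primary is stubbed to fail here
# purely to demonstrate the failover path.
PROVIDERS = ["primary-llm", "backup-llm-a", "backup-llm-b"]

def call_provider(name: str, prompt: str) -> str:
    """Stand-in for a real API call to provider `name`."""
    if name == "primary-llm":
        raise TimeoutError(f"{name} is overloaded")
    return f"[{name}] response to: {prompt}"

def generate_with_fallback(prompt: str) -> str:
    """Try each provider in order, moving on when one fails."""
    last_err = None
    for name in PROVIDERS:
        try:
            return call_provider(name, prompt)
        except Exception as err:
            last_err = err
    raise RuntimeError("all providers failed") from last_err

print(generate_with_fallback("hello"))
```

The trade-off is that different providers’ models produce different outputs, so failover works best for workloads where any sufficiently capable model is acceptable.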
Major cloud providers are investing billions in new data centers to keep up with growing demand. US President Donald J. Trump recently touted a $500 billion private-sector investment to build out AI infrastructure, involving companies including OpenAI, SoftBank, and Oracle. But the release of the DeepSeek model from China showed that AI can be run at a more reasonable cost through software optimizations, undercutting the notion that scaling AI always requires more hardware. Recent reports indicate that OpenAI may be looking to build its own data centers even as Microsoft pulls back from data-center projects in the US and Europe, which some read as a sign of potential oversupply of AI computing capacity.
https://www.computerworld.com/article/3856435/openai-google-ai-data-centers-are-under-stress-after-n...