IBM CEO: Smaller, domain-specific genAI models are the future
Tuesday, May 6, 2025, 07:30 PM, from ComputerWorld
Only 1% of enterprise data has so far been accessed by generative AI (genAI) models, because of a lack of integration and coordination between numerous data centers, cloud services and edge environments, according to IBM CEO Arvind Krishna. For that to change, smaller, special-purpose genAI models tailored to specific domain tasks such as HR, sales, retail and manufacturing will be needed.
Speaking at IBM’s Think 2025 conference in Boston on Tuesday, Krishna laid out his company’s focus for the future: integrating both open-source large language models (LLMs) and small language models that can be easily deployed and customized by whatever enterprise is using them.

“Smaller models are incredibly accurate,” Krishna said. “They’re much, much faster. They’re much more cost effective to run. And you can choose to run them where you want. It’s not a substitute for larger [AI] models, it’s an ‘and’ with the larger models you can now tailor … to enterprise needs.”

As well as being simpler to deploy and customize, smaller AI models are as much as 30 times less expensive to run than more conventional LLMs, he said. Just as the costs of storage and computing have dropped dramatically since the 1990s, AI technology will also become significantly cheaper over time, Krishna said. “As that happens, you can throw [AI] at a lot more problems,” he said. “There’s no law in computer science that says AI must remain expensive and large. That’s the engineering challenge we’re taking on.”

Krishna highlighted IBM’s Granite family of open-source AI models (smaller models with between 3 billion and 20 billion parameters) and how they compare to LLMs such as GPT-4, which has more than 1 trillion parameters. (OpenAI, Meta and other AI model builders are also focused on creating “mini” versions of their larger platforms, such as OpenAI’s o3 and o4-mini and Meta’s Llama 2 and Llama 3, all of which are reported to have 8 billion or fewer parameters.) IBM’s latest Granite 3.0 models are integrated into its WatsonX platform, the company’s AI and data platform designed to help enterprises build, train, tune, and deploy AI models at scale, especially for specific business applications.
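Krishna’s cost argument can be made concrete with a toy “small model first” router: domain-specific tasks go to a cheap specialized model, and everything else falls back to a large generalist. This is a minimal sketch under invented model names and costs, not an IBM or WatsonX API; the 30x cost ratio is chosen to mirror the figure Krishna cited.

```python
# Hypothetical sketch of "small model first" routing. All names and
# per-token costs below are illustrative assumptions, not real pricing.

from dataclasses import dataclass

@dataclass
class Model:
    name: str
    params_b: float            # parameter count, in billions
    cost_per_1k_tokens: float  # illustrative relative cost

# Illustrative registry: small domain models plus one large generalist.
MODELS = {
    "hr": Model("small-hr", 3, 0.02),
    "sales": Model("small-sales", 8, 0.03),
    "general": Model("large-general", 1000, 0.60),
}

def route(task_domain: str) -> Model:
    """Prefer a small domain-specific model; otherwise fall back to the large one."""
    return MODELS.get(task_domain, MODELS["general"])

def cost(model: Model, tokens: int) -> float:
    """Illustrative cost of processing `tokens` tokens on `model`."""
    return model.cost_per_1k_tokens * tokens / 1000

small = route("hr")                 # hits the 3B domain model
big = route("unknown-domain")       # falls back to the 1T generalist
ratio = cost(big, 10_000) / cost(small, 10_000)
print(small.name, big.name, round(ratio))  # small-hr large-general 30
```

The point of the sketch is that routing is a policy decision layered on top of the models themselves: the large model remains available (“it’s an ‘and’”), but most domain traffic never needs it.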
Granite 3.0 was introduced last October and is part of IBM’s broader strategy to provide scalable, efficient, and customizable AI solutions for business.

“The era of AI experimentation is over,” Krishna said. “Success is going to be defined by integration and business outcomes. That’s what we’re announcing today. With our WatsonX Orchestrate family of products, you can build your own agent in less than five minutes.” WatsonX Orchestrate also comes with 150 pre-built AI models for various purposes.

To enable AI-embedded networking that connects geographically dispersed data sources, IBM and telecom company Lumen Technologies announced a partnership during Think. The two will focus on delivering real-time AI inferencing closer to where data is generated, which should reduce cost and latency and address security barriers as companies scale up genAI adoption.

Lumen CEO Kate Johnson said her company is launching its largest network upgrade and expansion in decades; Lumen’s networks will now run WatsonX at the edge, enabling more secure access to data where it’s being created and overcoming the latency issues that can arise on more traditional networks.

“We bring the power of proximity to companies that are trying to get the most out of their AI,” she said. “Imagine working with your AI models and constantly sending all that data back to the cloud and waiting for it. It’s costly, it’s slow, it’s not nearly as secure. Our combined capabilities with WatsonX at the edge enable real-time inferencing.”

“All the edge locations are connected to the fabric,” Johnson said. “It’s ubiquitous and covers all the use cases.”

For example, genAI can be used in clinical settings for real-time diagnostics of patient records. As a patient is examined, that data is fed into a local database, which can be accessed by genAI and combined with historical data from another location, such as a hospital’s data center. “That’s game-changing and potentially lifesaving,” Johnson said.
Johnson also illustrated how AI will work at the edge with the example of a lights-out manufacturing facility, run almost completely by robotics and generating terabytes of data as it operates. “Every millisecond matters. What we’re seeing is factories looking for proximity data centers, from networking to power and cooling, and our combined solution gives them something pretty powerful right out of the box,” she said.
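The proximity argument Johnson makes can be illustrated with a back-of-the-envelope latency model: a cloud round trip pays network latency and data-transfer time on every request, while edge inference keeps both small even if the edge hardware is slightly slower. All numbers below are illustrative assumptions, not Lumen or IBM benchmarks.

```python
# Toy comparison of request latency for cloud vs. edge inference.
# Every parameter value here is an invented, illustrative assumption.

def round_trip_ms(network_ms: float, inference_ms: float,
                  payload_mb: float, mb_per_ms: float) -> float:
    """Total request time: network out and back, payload transfer, inference."""
    transfer_ms = payload_mb / mb_per_ms
    return 2 * network_ms + transfer_ms + inference_ms

# Cloud: 40 ms each way, fast inference, slow WAN transfer of a 50 MB payload.
cloud = round_trip_ms(network_ms=40, inference_ms=20, payload_mb=50, mb_per_ms=1.0)

# Edge: 2 ms each way, slightly slower inference, fast local transfer.
edge = round_trip_ms(network_ms=2, inference_ms=25, payload_mb=50, mb_per_ms=10.0)

print(cloud, edge)  # 150.0 34.0
```

Under these assumed numbers the edge request completes in roughly a quarter of the cloud round trip, which is the shape of the argument behind “every millisecond matters”: the win comes from proximity and local transfer speed, not from faster inference hardware.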
https://www.computerworld.com/article/3978675/ibm-ceo-smaller-domain-specific-genai-models-are-the-f...