Intel unveils its AI roadmap, chips to rival Nvidia

Tuesday, April 9, 2024, 10:04 PM, from ComputerWorld

Intel on Tuesday formally introduced its Gaudi 3 processor — aimed at accelerating enterprise generative artificial intelligence (genAI) workloads — at its Vision 2024 conference and unveiled a range of next-gen products and strategic collaborations to grow genAI adoption.

The chipmaker’s strategy encompassed hardware and cloud services roadmaps for everything from data centers to edge devices, including AI-enabled PCs.

During a keynote speech, Intel CEO Pat Gelsinger heralded the age of AI, which includes PCs that will be using a new family of Intel Core Ultra processors. The chipmaker expects to ship 40 million AI PC processors in 2024 and 100 million next year.

Intel first announced the upcoming release of its Gaudi 3 processor for data center AI workloads in December, when it also previewed its 14th-Gen Core Ultra “Meteor Lake” data center processors and 5th-Gen Xeon Scalable CPUs. The company made the official announcements of the latter two processors Tuesday.

Intel also announced that its next-generation Granite Rapids and Sierra Forest processors will be branded “Xeon 6,” replacing older marketing language that used generational terms, such as “Fifth-Gen Xeon Scalable” models.

[Photo: Intel CEO Pat Gelsinger holds an upcoming Xeon 6 processor wafer. Credit: Intel]

The new Xeon 6 processors will incorporate software support for the MXFP4 data format, which reduces next-token latency by up to 6.5 times compared to 4th-generation Xeon using FP16, with the ability to run 70-billion-parameter Llama-2 large language models.
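
To give a sense of why a 4-bit format can cut latency, here is a minimal Python sketch of block-scaled 4-bit quantization. It assumes MXFP4 follows the Open Compute Project's Microscaling (MX) layout: blocks of 32 values that share one power-of-two scale, with each value stored as a 4-bit E2M1 float. The function names are hypothetical, tie rounding is simplified, and the scale's exponent range is not clamped, so this illustrates the idea rather than Intel's implementation.

```python
import math

# Magnitudes representable by a 4-bit E2M1 float (sign is stored separately).
E2M1_VALUES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
BLOCK_SIZE = 32   # assumed MX block size
E2M1_EMAX = 2     # exponent of the largest E2M1 value (6.0 = 1.5 * 2**2)


def quantize_block(block):
    """Return (shared_exponent, quantized_values) for one block of floats."""
    max_abs = max(abs(x) for x in block)
    if max_abs == 0.0:
        return 0, [0.0] * len(block)
    # Choose the power-of-two scale so the largest value lands in E2M1's range.
    shared_exp = math.floor(math.log2(max_abs)) - E2M1_EMAX
    scale = 2.0 ** shared_exp
    quantized = []
    for x in block:
        # Scale down, saturate at the largest magnitude, snap to the nearest value.
        scaled = min(abs(x) / scale, E2M1_VALUES[-1])
        nearest = min(E2M1_VALUES, key=lambda v: abs(v - scaled))
        quantized.append(math.copysign(nearest, x))
    return shared_exp, quantized


def dequantize_block(shared_exp, quantized):
    """Recover approximate floats from the shared exponent and 4-bit values."""
    scale = 2.0 ** shared_exp
    return [q * scale for q in quantized]


# Usage: quantize one block of example weights and reconstruct it.
weights = [0.1 * i - 1.5 for i in range(BLOCK_SIZE)]
exponent, codes = quantize_block(weights)
approx = dequantize_block(exponent, codes)
```

Each value then needs only 4 bits plus a small share of the per-block scale, compared with 16 bits for FP16, which reduces the memory traffic that tends to dominate next-token latency.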

During its onstage presentation, Intel offered new details about the Gaudi 3 architecture, its performance, and the OEMs committed to bringing it to market, and touted a number of new customers. The company cited more than a dozen “partners” using its Gaudi 3 accelerators, including Naver Corp., Bosch, NielsenIQ, and Seekr.

Historically, Nvidia has led the AI hardware market with its GPUs (graphics processing units) and TPUs (tensor processing units), created to power and train large language models and AI applications. Intel positioned its Gaudi 3 as a direct competitor to Nvidia’s H100 GPU.

The Gaudi 3 delivers, on average, 50% better inference performance and 40% better power efficiency than the Nvidia H100 – “at a fraction of the cost,” Gelsinger said. According to Intel, the Gaudi 3 accelerators can deliver four times the AI compute for computer memory systems using the BF16 floating-point format and 1.5 times the memory bandwidth of the Gaudi 2; it also offers twice the networking bandwidth of its predecessor.

Intel used TSMC’s 5nm process to build the Gaudi 3 chips, which are now available to original equipment manufacturers (OEMs) including Dell, HPE, Lenovo, and Supermicro for the AI data center market. The chip is designed to be strung together with thousands of others in racks within data centers.

Last year, Nvidia controlled about 83% of the data center chip market, with much of the remaining 17% dominated by Google’s custom tensor processing units (TPUs).

Benjamin Lee, a professor at the University of Pennsylvania’s School of Engineering and Applied Science, said Intel’s path isn’t an easy one and that the company faces challenges in competing with Nvidia.

“Intel long dominated the design and manufacture of high-performance CPUs, but recent challenges reflect fundamental changes in the computing landscape,” Lee said. “Data centers will continue to deploy CPUs in large numbers to support Internet services and cloud computing, but are increasingly deploying GPUs to support AI, and Intel has struggled to design competitive GPUs.”

Intel’s unique advantage is that it’s the only domestic chip fabrication provider that could possibly compete with TSMC in manufacturing the most advanced chips, “giving it an upper hand against competitors like Nvidia and AMD, which are fabless,” Lee said. “Intel has not yet succeeded in establishing and growing a foundry business like TSMC. This will be essential to its future, given so many technology companies now design their own high-performance processors.”

Intel also has not kept pace with TSMC’s advances in transistor technology or the ability to satisfy contracts with the precision and efficiency to match TSMC’s foundry, Lee said. And Intel currently lacks the fabrication capacity to serve both its own manufacturing needs and a larger customer base.

Intel’s roadmap as laid out by its CEO is sensible, Lee noted, yet “the million-dollar question is whether it can execute it effectively using a fresh injection of federal funding from the CHIPS Act.”

In August 2022, Congress passed the CHIPS and Science Act (CHIPS Act) to address processor shortages exposed by the Covid-19 pandemic. The legislation provided the US Department of Commerce (DoC) with $52.7 billion for a suite of programs under the CHIPS for America program to “revitalize” the US position in semiconductor research, development, and manufacturing. Intel is poised to get about $8.5 billion of those funds.

Intel’s Gelsinger heralded the CHIPS Act as enabling the company’s first chips to emerge from its $20 billion Ocotillo fabrication facility in Chandler, Ariz., last year.

At present, however, the CHIPS Act provides little direct support for chip designers such as Nvidia (GPUs), Apple (NPUs), and Google (TPUs), all of which have historically flourished in the US.

During its Vision conference, Intel also provided updates on its next-gen products and services across all segments of enterprise AI, including its new Intel Xeon 6 processors, which can run retrieval-augmented generation processes, or “RAG” for short.

RAG creates a more customized and accurate genAI model by using an organization’s proprietary data and information; that can greatly reduce known AI problems such as erroneous outputs and hallucinations.

Gelsinger illustrated how unreliable genAI can be when it relies on data scraped from the Internet that’s not updated in real time.

With standard LLMs, “maybe if you’re really good you’re updating and retraining…maybe once a week, maybe once a month?” he said. “When you’re combining [an LLM] with real-time data coming through your vector databases, your streaming unstructured databases — as well and bringing both of those together in real time — we think that’s extraordinarily powerful.”
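
The retrieval step behind RAG can be sketched in a few lines of Python. This is a toy illustration, not Intel's stack: it ranks an organization's documents against a question using simple word-count cosine similarity (a stand-in for the learned embeddings and vector database a real system would use) and places the best match in the prompt sent to the model. The function names and sample documents are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        documents,
        key=lambda d: cosine(q, Counter(tokenize(d))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, documents, k=1):
    """Prepend the retrieved context so the model answers from current data."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Usage: proprietary documents can change at any time without retraining the model.
docs = [
    "The cafeteria opens at 8 AM on weekdays.",
    "Expense reports are due on the fifth business day of each month.",
    "The VPN requires a hardware token for remote access.",
]
prompt = build_prompt("When are expense reports due?", docs)
```

Because the documents are looked up at question time, updating them changes the answers immediately, which is the contrast with weekly or monthly retraining that Gelsinger draws.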

Intel also said that this quarter it will release a new brand for its next-generation processors for data center, cloud, and edge use. The Intel Xeon 6 processors with Efficient-cores (E-cores, formerly code-named Sierra Forest) will offer up to 2.4 times the performance per watt and 2.7 times better rack density compared to 2nd-gen Intel Xeon processors.

Gelsinger described the past decade of Intel’s innovation as mundane, saying the company made PCIe a little bit faster, incrementally upgraded DDR memory, and added “a few more cores” to chips before shipping them out the door.

“Boring,” Gelsinger said. “AI is making everything exciting like we haven’t seen. The fundamental direction computing is taking is the biggest change in technology since the Internet, and it’s going to reshape every aspect of our business and yours.”

The total addressable market for semiconductors is expected to grow from $600 billion now to more than $1 trillion by the end of the decade, he said.

To that end, Gelsinger also announced that the company’s next-generation Core Ultra client processor family (code-named Lunar Lake) will launch later this year. The processors will have more than 100 platform tera operations per second (TOPS) and more than 45 neural processing unit (NPU) TOPS for next-generation AI PCs.

“Intel’s on a mission to bring AI everywhere,” Gelsinger told a packed auditorium in Phoenix, Ariz. “I’m quite excited about the next platform. You know, before competitors shipped their first [AI] chips, we’re launching our second — the Lunar Lake with 3X the AI performance. And, the third generation is in [fabrication].”

Gelsinger compared AI-enabled PCs to Wi-Fi, saying the day will come when a PC without AI capabilities will be considered passé. “Microsoft Copilot, AI developers, Zoom and Teams summarization, translation, contextualization,” he said. “Every application is going through an AI makeover. You’re going to miss out. Simply put, it’s time to refresh your PCs.”

Intel is also working on creating an open Ethernet networking model for AI fabrics, and introduced an array of AI-optimized Ethernet solutions. The company is working through the Ultra Ethernet Consortium (UEC) to design large scale-up and scale-out AI fabrics.

“These innovations enable training and inferencing for increasingly vast models, with sizes expanding by an order of magnitude each generation,” Intel said in a statement. “The lineup includes the Intel AI NIC (network interface card), AI connectivity chiplets for integration into XPUs, Gaudi-based systems, and a range of soft and hard reference AI interconnect designs for Intel Foundry.”