Can Intel’s new chips compete with Nvidia in the AI universe?

Wednesday, June 5, 2024, 12:00 PM, from ComputerWorld

Intel is aiming its next-generation x86 processors at artificial intelligence (AI) tasks, even though the chips won’t actually run AI workloads themselves.

At Computex this week, Intel announced its Xeon 6 processor line, talking up what it calls Efficient-cores (E-cores) that it said will deliver up to 4.2 times the performance of Xeon 5 processors. The first Xeon 6 CPU is the Sierra Forest version (6700 series); a more performance-oriented line, Granite Rapids with Performance cores (P-cores, or 6900 series), will be released next quarter.

The upgraded Xeon processors can enable 3:1 data center rack consolidation while delivering the same performance and up to 2.6 times performance-per-watt gains over their predecessors, according to Intel.

“The data center AI market is hyper focused on the impact of AI power consumption with increasing concerns around the environmental impact and impact on the power grid,” said Reece Hayden, a principal analyst for ABI Research. “Intel Xeon 6 will be used as the CPU head node within Gaudi-powered AI systems. Improved performance per watt and density will reduce the AI systems’ power consumption, which will be positive for AI’s total energy footprint.”

Greater rack density allows for data center consolidation, freeing up room to deploy AI-focused hardware to support training or inferencing, Hayden said.
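To make the consolidation and efficiency figures concrete, here is a rough back-of-the-envelope sketch in Python. The 3:1 ratio and the 2.6x performance-per-watt gain are Intel’s claims as reported above (the latter is an "up to" figure, so it is a best case); the fleet size and power draw are hypothetical numbers chosen only for illustration.

```python
import math

# Figures claimed by Intel, as reported in the article.
RACK_CONSOLIDATION = 3.0    # 3:1 rack consolidation
PERF_PER_WATT_GAIN = 2.6    # "up to" 2.6x, so treat as a best case


def racks_after(old_racks: int) -> int:
    """Racks needed for the same performance after 3:1 consolidation."""
    return math.ceil(old_racks / RACK_CONSOLIDATION)


def power_after_kw(old_power_kw: float) -> float:
    """Power needed for the same performance at 2.6x performance per watt."""
    return old_power_kw / PERF_PER_WATT_GAIN


# Hypothetical data center: these two inputs are assumptions, not from Intel.
old_racks = 200
old_power_kw = 1000.0

new_racks = racks_after(old_racks)
print(f"Racks: {old_racks} -> {new_racks} ({old_racks - new_racks} freed)")
print(f"Power: {old_power_kw:.0f} kW -> {power_after_kw(old_power_kw):.1f} kW")
```

Under those assumptions, the freed racks are the space Hayden describes as available for AI-focused hardware; real savings would depend on workload and on how close a deployment gets to the best-case figure.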

A worker in the Intel Assembly Test facility in Kulim, Malaysia, inspects Intel Xeon 6 Sierra Forest processors with E-cores.
Intel Corp.

Intel also took the wraps off its Lunar Lake line of client processors, which are aimed at the AI PC industry. The x86 chips use up to 40% lower system-on-chip (SoC) power compared with the previous generation, according to Intel.

The Lunar Lake Core Ultra processor line is expected to be available in the third quarter of this year. With neural processing units (NPUs) on board, the chips will deliver more than 100 platform tera operations per second (TOPS) and more than 45 NPU TOPS, and are aimed at a new generation of PCs enabled for generative AI (genAI) tasks.

Intel recently detailed its chip strategy, outlining plans for processor lines that run AI from data centers to edge devices. Within two years, 100% of enterprise PC purchases will be AI computers, according to IDC.

“Intel is one of the only companies in the world innovating across the full spectrum of the AI market opportunity — from semiconductor manufacturing to PC, network, edge and data center systems,” Intel CEO Pat Gelsinger said in a statement from the Computex Conference in Taiwan this week.

Intel also announced pricing for its Gaudi 2 and Intel Gaudi 3 AI accelerator kits — deep learning accelerators aimed at supporting training and inference of artificial intelligence large language models (LLMs). The Gaudi 3 accelerator kit, which includes eight of the AI chips, sells for about $125,000; the earlier generation Gaudi 2 has a list price of $65,000.

Accelerator microprocessors handle two primary purposes for genAI: training and inference. Chips that handle AI training use vast amounts of data to train neural network algorithms that are then expected to make accurate predictions, such as the next word or phrase in a sentence or the next image. Inference chips, in turn, must speedily infer what the answer to a prompt (query) will be.

But LLMs must be trained before they can begin to infer a useful answer to a query. The most popular LLMs provide answers based on massive data sets ingested from the Internet, but can sometimes be inaccurate or offer downright bizarre results, as is the case with genAI hallucinations.

Shane Rau, IDC’s research vice president for computing semiconductors, said Intel’s introduction of Xeon 6 with P-cores and E-cores acknowledges that end-user workloads continue to diversify and, depending on what workload an end-user has, they may need primarily performance (P-cores) or to balance performance and power consumption (E-cores).

“For example, workloads run primarily in a core data center, where there are fewer power constraints and more need for raw performance, can use more P-cores,” Rau said. “In contrast, workloads run primarily in edge systems, like edge servers, need to work within more constrained environments where power consumption and heat output must be limited,” and therefore benefit from E-cores.

“If you think of AI as mimicking what humans do and humans do a lot of different tasks requiring different combinations of capabilities, then it stands to reason that AI will need different capabilities depending on the task,” Rau continued. “Further, not every task requires maximum performance and so needs maximum acceleration (e.g. server GPUs); many tasks can be run on microprocessors only, or on other kinds of specialized accelerators. In this way, AI, like a new market, is maturing and segmenting as it matures.”

Intel vs. Nvidia: who wins this fight?

Intel is hoping its Gaudi line of accelerators can rival Nvidia’s more expensive GPUs, which have propelled the chipmaker to the lead in the AI processor marketplace. Nvidia has reported soaring revenue as its chips continue to dominate AI cloud services and data center rollouts of genAI applications.

Last year, Nvidia controlled about 83% of the data center chip market, with much of the remaining 17% claimed by Google’s custom tensor processing units (TPUs).

“Most estimates suggest these [Intel’s] prices are between one-third and two-thirds [the price] of competitors. This is highly indicative of their approach to the AI data center market — focusing on affordability and undercutting competitors,” Hayden said.

AMD and Nvidia do not discuss pricing of their chips, but according to custom server vendor Thinkmate, a comparable HGX server system with eight Nvidia H100 AI chips can cost more than $300,000.

Intel claims its Gaudi AI accelerators are a third less expensive compared to “competitive platforms” — namely Nvidia’s GPUs.
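A quick sketch of how the quoted prices line up against each other, written in Python. The dollar figures come from the article; treating $300,000 as a floor for the Nvidia system is an assumption that follows from the "more than" wording, and the comparison is only approximate because Thinkmate’s figure is for a full server system while Intel’s prices are for accelerator kits.

```python
# Prices as reported in the article (US dollars).
GAUDI3_KIT_USD = 125_000          # kit with eight Gaudi 3 chips
GAUDI2_KIT_USD = 65_000           # earlier-generation Gaudi 2 kit
H100_SYSTEM_FLOOR_USD = 300_000   # "more than $300,000", so a lower bound


def price_ratio(price: float, reference: float) -> float:
    """Price expressed as a fraction of the reference price."""
    return price / reference


gaudi3_ratio = price_ratio(GAUDI3_KIT_USD, H100_SYSTEM_FLOOR_USD)
gaudi2_ratio = price_ratio(GAUDI2_KIT_USD, H100_SYSTEM_FLOOR_USD)

# Because the H100 price is a floor, these ratios are upper bounds.
print(f"Gaudi 3 kit: at most {gaudi3_ratio:.0%} of the H100 system price")
print(f"Gaudi 2 kit: at most {gaudi2_ratio:.0%} of the H100 system price")

# Hayden's estimate: Intel's prices are one-third to two-thirds of competitors'.
in_hayden_range = 1 / 3 <= gaudi3_ratio <= 2 / 3
print(f"Gaudi 3 within the one-third to two-thirds range: {in_hayden_range}")
```

On these numbers the Gaudi 3 kit comes in at roughly 42% of the Nvidia system price, which sits inside Hayden’s range and is consistent with Intel’s "a third less expensive" claim.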

“Intel is certainly a direct competitor against NVIDIA in the AI market with products across data center/cloud, edge, devices,” Hayden said. “However, NVIDIA is hyper focused on data center accelerators and supporting training and inference at scale. They dominate this market with very high market share. Increasingly, as this market grows, Intel will grow its overall share, but NVIDIA is still likely to dominate.”

Conversely, Intel’s goal is to enable “AI everywhere” with a major focus on edge and device, especially PC AI (Intel Core Ultra). Nvidia has been less focused in that area as data center/cloud GPUs are still a fast-expanding market, Hayden noted.

Smaller LLMs — an opening for Intel?

The LLMs used for genAI tools can consume vast amounts of processor cycles and be costly to use. Smaller, more industry- or business-focused models can often provide better results tailored to business needs, and many end-user organizations and vendors have signaled this is their future direction.

“Intel [is] very bullish around the opportunity of smaller LLMs and [is] looking to embed this within their ‘AI everywhere’ strategy. ABI Research agrees that enabling AI everywhere requires lower power consumption, less cost-intensive genAI models, coupled with power efficient, low TCO hardware,” Hayden said.

While its x86 line of chips won’t actually run AI processes, the combination of Xeon 6 processors with Gaudi AI accelerators in a system can make AI operations faster, cheaper and more accessible, according to Intel.

Forrester Senior Analyst Alvin Nguyen agreed that Intel’s strategy of targeting smaller LLMs and edge devices is a smarter bet than attempting to go head-to-head with Nvidia in cloud data centers, where increasingly larger LLMs are prevalent.

“The approach Intel is taking with AI gives them a reasonable chance of competing with Nvidia: they are not trying to replicate what Nvidia is doing,” Nguyen said. “Instead, [Intel is] taking advantage of their breadth of technology coverage and current generative AI approaches to provide an alternative approach that will be more appealing to the enterprise.

“Inference is where enterprises are at and Intel is well positioned with the supply chain issues to entrench themselves here: ‘good enough’ performance, lower costs, and ubiquity of products help them here,” Nguyen said.

The server and storage infrastructure needed for training extremely large LLMs is expected to take up an increasing portion of the AI infrastructure market, according to IDC. The research firm projects that the worldwide AI hardware market (server and storage) will grow from $18.8 billion in 2021 to $41.8 billion in 2026, representing close to 20% of the broader server and storage infrastructure market.
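For a sense of the growth rate implied by IDC’s projection, the short Python sketch below computes the compound annual growth rate (CAGR) between the two endpoints. The endpoints and the "close to 20%" share are from the article; using exactly 20% to back out the size of the broader market is an assumption, so that last figure is only a rough estimate.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values."""
    return (end_value / start_value) ** (1 / years) - 1


# IDC projection as reported in the article (billions of US dollars).
AI_HW_2021_BN = 18.8
AI_HW_2026_BN = 41.8
YEARS = 2026 - 2021

growth = cagr(AI_HW_2021_BN, AI_HW_2026_BN, YEARS)
print(f"Implied CAGR, 2021-2026: {growth:.1%}")

# "Close to 20%" of the broader market; 0.20 is used here as an approximation.
APPROX_SHARE = 0.20
implied_total_bn = AI_HW_2026_BN / APPROX_SHARE
print(f"Implied broader server and storage market in 2026: ~${implied_total_bn:.0f}B")
```

The projection works out to growth of roughly 17% a year, with the AI hardware segment more than doubling over the five-year span.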