Intel, Ampere show running LLMs on CPUs isn't as crazy as it sounds

Wednesday, May 1, 2024, 01:24 PM, from TheRegister

If you lower your expectations, of course. Think more Llama2-7B, less GPT-4
Popular generative AI chatbots and services like ChatGPT or Gemini mostly run on GPUs or other dedicated accelerators, but as smaller models are more widely deployed in the enterprise, CPU-makers Intel and Ampere are suggesting their wares can do the job too – and their arguments aren't entirely without merit.…

Read more at TheRegister

https://go.theregister.com/feed/www.theregister.com/2024/05/01/intel_ampere_show_running_llms/
