
Benchmarks show even an old Nvidia RTX 3090 is enough to serve LLMs to thousands

Friday, August 23, 2024, 11:00 PM, from The Register
For 100 concurrent users, the card delivered 12.88 tokens per second, just slightly faster than average human reading speed
If you want to scale a large language model (LLM) to a few thousand users, you might think a beefy enterprise GPU is a hard requirement. However, at least according to Backprop, all you actually need is a four-year-old graphics card.…
https://go.theregister.com/feed/www.theregister.com/2024/08/23/3090_ai_benchmark/
