You Can Now Run a GPT-3 Level AI Model On Your Laptop, Phone, and Raspberry Pi

Tuesday, March 14, 2023, 02:00 PM, from Slashdot
An anonymous reader quotes a report from Ars Technica: On Friday, a software developer named Georgi Gerganov created a tool called 'llama.cpp' that can run Meta's new GPT-3-class AI large language model, LLaMA, locally on a Mac laptop. Soon thereafter, people worked out how to run LLaMA on Windows as well. Then someone showed it running on a Pixel 6 phone, and next came a Raspberry Pi (albeit running very slowly). If this keeps up, we may be looking at a pocket-sized ChatGPT competitor before we know it.

Typically, running GPT-3 requires several datacenter-class A100 GPUs (also, the weights for GPT-3 are not public), but LLaMA made waves because it could run on a single beefy consumer GPU. And now, with optimizations that reduce the model size using a technique called quantization, LLaMA can run on an M1 Mac or a lesser Nvidia consumer GPU. After obtaining the LLaMA weights ourselves, we followed [independent AI researcher Simon Willison's] instructions and got the 7B-parameter version running on an M1 MacBook Air, and it runs at a reasonable speed. You call it as a script on the command line with a prompt, and LLaMA does its best to complete it in a reasonable way.
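
To make the size savings concrete, here is a minimal sketch of the general idea behind 4-bit block quantization: store each weight as a small signed integer plus one shared scale per block. It is written in Python with NumPy for illustration only; llama.cpp's actual quantization format differs in its details, and the block size and matrix shape below are arbitrary assumptions.

import numpy as np

def quantize_q4(weights, block_size=32):
    # Map float32 weights to 4-bit signed integers plus one float scale per block.
    w = weights.reshape(-1, block_size)
    scale = np.abs(w).max(axis=1, keepdims=True) / 7.0  # signed 4-bit range is -8..7
    scale[scale == 0] = 1.0                             # avoid dividing by zero
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale.astype(np.float32)

def dequantize_q4(q, scale):
    # Reconstruct approximate float32 weights from the 4-bit codes.
    return (q.astype(np.float32) * scale).reshape(-1)

w = np.random.randn(1024 * 1024).astype(np.float32)     # one toy weight matrix
q, s = quantize_q4(w)
print("mean absolute error:", np.abs(w - dequantize_q4(q, s)).mean())
# Packed two codes per byte on disk, this is roughly an eighth of the
# original 32-bit footprint (a quarter of 16-bit), which is what lets a
# 7B-parameter model fit in laptop RAM.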

There's still the question of how much the quantization affects the quality of the output. In our tests, LLaMA 7B trimmed down to 4-bit quantization was very impressive for running on a MacBook Air -- but still not on par with what you might expect from ChatGPT. It's entirely possible that better prompting techniques might generate better results. Also, optimizations and fine-tuned variants come quickly when everyone has their hands on the code and the weights -- even though LLaMA is still saddled with some fairly restrictive terms of use. The release of Alpaca today by Stanford shows that fine-tuning (additional training with a specific goal in mind) can improve performance, and it's still early days after LLaMA's release. A step-by-step instruction guide for running LLaMA on a Mac can be found here (warning: it's fairly technical).
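
For readers curious what the command-line call mentioned above looks like in practice, here is a rough sketch that shells out to a locally built llama.cpp binary from Python. The binary name, flags, and model filename mirror early llama.cpp builds but are assumptions here; the project's README is the authoritative reference.

import subprocess

result = subprocess.run(
    [
        "./main",                                   # llama.cpp example binary (name may vary)
        "-m", "models/7B/ggml-model-q4_0.bin",      # assumed path to 4-bit quantized 7B weights
        "-p", "Explain quantization in one sentence:",  # the prompt LLaMA will try to complete
        "-n", "128",                                # number of tokens to generate
        "-t", "4",                                  # CPU threads to use
    ],
    capture_output=True,
    text=True,
)
print(result.stdout)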

Read more of this story at Slashdot.
https://slashdot.org/story/23/03/14/050225/you-can-now-run-a-gpt-3-level-ai-model-on-your-laptop-pho...