
  • Sorry, but has anyone in this thread actually tried running local LLMs on CPU? You can easily run a 7B model at varying levels of quantization (e.g., 5-bit quantization) and get a generalized, promptable LLM. Yeah, of course it’s going to take ~4GB of RAM (which is mem-mapped and paged into memory), but you can easily fine-tune smaller, more specific models (like the translation one mentioned above) and get surprising intelligence at a fraction of the resources (see the sketch at the end of this comment).

    Take, for example, phi-2, which performs as well as 13B-parameter models with only 2.7B parameters. Yeah, that’s still going to take ~1.5GB of RAM, which Firefox wouldn’t reasonably ship, but many lighter-weight specialized tasks could easily use something like a fine-tuned 0.3B model with quantization.
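
    To make the resource claims concrete, here’s a minimal sketch of running a quantized GGUF model on CPU with llama-cpp-python. The model filename, quantization level, and thread count are placeholder assumptions on my part, not details from the comment:

        from llama_cpp import Llama  # pip install llama-cpp-python

        # Placeholder: a 5-bit quantized 7B instruct model in GGUF format
        llm = Llama(
            model_path="./mistral-7b-instruct.Q5_K_M.gguf",  # hypothetical file
            n_ctx=2048,    # context window
            n_threads=8,   # CPU threads; tune for your machine
        )

        # The model file is mem-mapped, so resident RAM stays roughly at the
        # quantized model size plus the KV cache for the context window.
        out = llm(
            "Q: Translate 'good morning' to French.\nA:",
            max_tokens=32,
            stop=["Q:"],
        )
        print(out["choices"][0]["text"].strip())

    The same pattern works for a small fine-tuned model (e.g., a 0.3B translation model); only the model_path and the memory footprint change.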



  • While Linus went overboard (as he has a history of doing, and as that history has also brought negativity to the community), this post is still very well liked because it appears to be a strong example of someone calling out the BS that a lot of developers like to throw around. No one’s going to join a circle celebrating Linus picking on some first-time contributor who didn’t know any better, but that seems to be how you’re interpreting the post.

    To add some context: there’s a toxic superiority complex among many developers, where they jump to blame others for issues that actually stem from their own code. You can see it everywhere, from developers who immediately blame users without investigating, to developers within companies who are quick to write off issues as not their team’s problem.

    So, in this example, Linus is calling one of these developers out, which is why the post is so well received.