What are your thoughts on #privacy and #itsecurity regarding the #LocalLLMs you use? They seem to be an alternative to ChatGPT, MS Copilot etc., which basically are creepy privacy black boxes. How can you be sure that local LLMs A) do not “phone home”, B) do not create a profile on you, and C) restrict their analysis to the scope of your terminal? As far as I can see #ollama and #lmstudio do not provide privacy statements.
I run Ollama with Open WebUI at home.
A) the containers they run in by default can’t access the Internet, but they are provided access if we turn on web search or want to download new models. Ollama and Open WebUI are fairly popular products and I haven’t seen any evidence of nefarious activity so far.
B) they create a profile on me and my family members that use them, by design. We can add sensitive documents that the models can use.
C) they are restricted by what we type and the documents we provide.
To add to this, I run the same setup, but add Continue to VSCode. It makes an interface similar to Cursor that uses the Ollama instance.
One thing to be careful of: the Ollama port has no authentication (ridiculous, but it is what it is).
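A rough way to check how exposed that port is, as a sketch rather than a proper audit. It assumes Ollama's default port 11434, its OLLAMA_HOST bind setting, and the /api/tags endpoint that lists installed models. The helper names are made up for illustration, and anything that is not clearly a loopback address is treated as exposed.

```python
import http.client
import ipaddress
import urllib.error
import urllib.request
from urllib.parse import urlsplit


def bind_is_exposed(ollama_host: str) -> bool:
    """Return True if an OLLAMA_HOST-style value binds beyond loopback.

    Hypothetical helper: errs on the side of caution, so unknown
    hostnames and missing hosts count as exposed.
    """
    value = ollama_host.strip()
    if "//" not in value:
        value = "//" + value  # lets urlsplit recognise the host part
    host = urlsplit(value).hostname
    if not host:
        return True  # no host given: assume all interfaces
    if host == "localhost":
        return False
    try:
        return not ipaddress.ip_address(host).is_loopback
    except ValueError:
        return True  # some other hostname: assume others can reach it


def answers_without_auth(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if GET /api/tags succeeds with no credentials at all."""
    url = base_url.rstrip("/") + "/api/tags"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError, ValueError,
            http.client.HTTPException):
        return False
```

Calling answers_without_auth from a second machine on your network, pointed at the server's address, tells you whether anyone else on that network could use the instance too.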
You’ll need either a card with 12-16GB VRAM for the recommended models for code generation and autocomplete, or you may be able to get away with an 8GB card if it’s a second card in the system. You can also run on CPU, but it’s very slow that way.
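For a back-of-the-envelope feel for those VRAM numbers, here is a minimal sketch. It only counts the weights (parameters times bits per weight) and adds a guessed headroom for context and runtime overhead, so treat the result as a lower bound and not a guarantee. The function names and the 2 GB default headroom are my own assumptions.

```python
def weights_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GB (1 GB = 1e9 bytes)."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


def fits_in_vram(params_billions: float, bits_per_weight: float,
                 vram_gb: float, headroom_gb: float = 2.0) -> bool:
    """Rough check only: headroom_gb is a guessed allowance, not a measured one."""
    return weights_gb(params_billions, bits_per_weight) + headroom_gb <= vram_gb
```

For example, weights_gb(7, 4) gives 3.5, so a 7B model quantized to 4 bits per weight needs about 3.5 GB for the weights before any context is loaded.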
@ShotDonkey@lemmy.world
Thank you. As far as I can see these models are free. Doing data mining on users would be a tempting thing, right? Ollama does not specify this on their homepage: no paid plans, no ‘free for private use’ etc. How do they pay their staff and the electricity and hardware bills for model training? Do you know anything about the underlying business models?
Ollama and Open WebUI, as far as I know, are just open source software projects created to run pre-trained models, and have the same business model as many other open source projects on GitHub.
The models themselves come from Google, Meta and others. Have a look at all the models available on Hugging Face. The models themselves are just binary files. They’ve been trained and there are no ongoing costs to use them apart from energy your computer uses to run them.
Thank you!
The English word “free” actually carries two meanings: “free as in free food” (gratis) and “free as in free speech” (libre).
Ollama is both gratis and libre.
And about the money stuff: Ollama used to be Facebook’s proprietary model, an answer to ChatGPT and Bing Chat/Copilot. Facebook lagged behind the other players and they just said “fuck it, we’re going open-source”. That’s how and why it’s free.
Even though models are by design binary blobs, the code that interacts with them and runs them is open-source. If they were connecting to the Internet and phoning home to Facebook, chances are this would’ve been found out by the community due to the open nature of the project.
Even if it weren’t open-source, since it runs locally you could at least block (or view) Internet access.
Basically, even though this is from Facebook, one of the big bads of privacy on the Internet, it’s all good in the end.
Just to be clear, Llama is the Facebook model, Ollama is the software that lets you run Llama (along with many other models).
Ollama has internet access (otherwise how could it download models?). The only true privacy solution is to run it in a container with no internet access after downloading models, or to air-gap your computer.
Thank you for the correction!
Could you not just monitor/block outgoing traffic?
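As a sketch of what that monitoring could boil down to: take the remote addresses a process is connected to (from a tool such as ss or lsof, or from firewall logs) and flag the ones that leave the local network. The function names and the sample data below are made up for illustration. This only classifies addresses you already collected, it does not capture traffic itself.

```python
import ipaddress


def is_external(remote_ip: str) -> bool:
    """True if the address is outside loopback, private, and link-local ranges."""
    addr = ipaddress.ip_address(remote_ip)
    return not (addr.is_loopback or addr.is_private or addr.is_link_local)


def external_connections(connections):
    """Keep only the (process_name, remote_ip) pairs that leave the local network."""
    return [(proc, ip) for proc, ip in connections if is_external(ip)]


# Made-up sample: two local connections and one to a public address.
sample = [
    ("ollama", "127.0.0.1"),
    ("ollama", "192.168.1.10"),
    ("ollama", "8.8.8.8"),
]
flagged = external_connections(sample)
```

With the sample above, only the connection to 8.8.8.8 ends up in flagged. An empty list while you are chatting with a model is the result you are hoping for.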
Great, thanks for this background!
Did you do any research at all?
It’s FB’s model. They made it free as a PR move. If you’re actually worried about it phoning home, you could easily monitor the traffic leaving your PC and see if it’s collecting data.
It’s Facebook, they pay their staff with the astronomical amount of money they have. This is a simpler model, and their goal is to look like the good guy by making this one free, and selling later ones like all the other AI companies are doing. Except FB has fuck you money.
How fast are response times, and how useful are the answers from the open source models you can run on a low-end GPU? I know this will be a “depends” answer, but maybe you can share more of your experience. I often use the newest Claude Sonnet model, and for my use cases it is a real efficiency boost if used right. Around the middle of last year I briefly tested an open source model from Meta and it just wasn’t it. Or do we have to conclude that we’ll need to wait another year until smaller open source models are more proficient?
I’m still trying out combinations of hardware and models, but even my old Intel 8500T CPU will run a stock version of Meta’s Llama 3.2 3B (maybe the one you tried) at around reading speed, with mostly good output. That is fine for rewriting content, answering questions about uploaded document stores etc.
There are thousands of models tuned for various purposes, so one of the key questions is your purpose. If you want to use your setup for something specific (e.g., coding SQL) you are going to be able to find a much more efficient model.
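If you want to put a number on speed for your own hardware, here is a minimal sketch. It assumes a local Ollama instance on the default port and relies on the eval_count and eval_duration (nanoseconds) fields that Ollama's /api/generate response reports. The model tag llama3.2:3b is an assumption about what you have pulled, so swap in whatever you actually run.

```python
import json
import urllib.request


def tokens_per_second(response: dict) -> float:
    """Generation speed from eval_count and eval_duration (nanoseconds)."""
    seconds = response["eval_duration"] / 1e9
    return response["eval_count"] / seconds


def generate(prompt: str, model: str = "llama3.2:3b",
             base_url: str = "http://localhost:11434") -> dict:
    """Send one non-streaming request to a local Ollama instance."""
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode()
    req = urllib.request.Request(
        base_url + "/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Something like tokens_per_second(generate("Summarize this paragraph: ...")) then lets you compare models and hardware on the same prompt instead of going by feel.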