▲ | trilogic 7 days ago
With all respect, you don't seem to understand much of how privacy works. Llama-server works over HTTP. And yes, the app is a bit heavy, since it loads LLM models using the llama.cpp CLI plus multimodal support, which are themselves quite heavy; the DLLs for CPU/GPU are also huge (the one for the NVIDIA GPU alone is around 500 MB, if I'm not mistaken).
▲ | kgeist 7 days ago | parent | next [-]
Unless you expose random ports on the local machine to the Internet, running apps on localhost is pretty safe. Llama-server's UI stores conversations in the browser's localStorage, so they aren't retrievable even if you do expose your port. To me, downloading 500 MB from some random site feels far less safe :)

> the app is a bit heavy as is loading llm models using llama.cpp cli

So it adds the unnecessary overhead of reloading all the weights into VRAM on each message? On some larger models that can take up to a minute. Or do you somehow stream input/output from an attached CLI process without restarting it?
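For context, a minimal sketch of the persistent-server approach being contrasted here, assuming a llama-server instance is already running on localhost:8080 (port, model, and sampling parameters are placeholders): the weights are loaded once at server startup, and each message is just an HTTP request against the OpenAI-compatible endpoint, so no reload happens per message.

    import json
    import urllib.request

    # Assumes a llama-server instance already running locally, e.g.:
    #   llama-server -m model.gguf --host 127.0.0.1 --port 8080
    # The weights stay resident in (V)RAM for the server's lifetime,
    # so each request below only pays for inference, not model loading.

    def chat(prompt: str) -> str:
        payload = {
            "messages": [{"role": "user", "content": prompt}],
            "temperature": 0.7,
        }
        req = urllib.request.Request(
            "http://127.0.0.1:8080/v1/chat/completions",  # OpenAI-compatible endpoint
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            body = json.load(resp)
        return body["choices"][0]["message"]["content"]

    if __name__ == "__main__":
        print(chat("Hello, are you running locally?"))

Spawning the CLI per message, by contrast, pays the full model-load cost every time unless the process is kept alive and its stdin/stdout are streamed.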
▲ | rcakebread 7 days ago | parent | prev | next [-]
Says the guy with a link to a broken privacy policy on their website. | |||||||||||||||||||||||
▲ | giantrobot 7 days ago | parent | prev [-]
> With all respect you do seem to not understand much of how privacy works. Llama-server is working in Http.

What in the world are you trying to say here? llama.cpp can run completely locally, and web access can be limited to localhost only. That's entirely private and offline (after downloading a model). I can't tell if you're spreading FUD about llama.cpp or are just generally misinformed about how it works. You certainly have some motivated reasoning in trying to promote your app, which makes your replies seem very disingenuous.
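As a concrete illustration of the localhost-only setup (a sketch, not the app's actual launch code; the model path is a placeholder, while -m, --host, --port, and -ngl are standard llama-server options):

    import subprocess

    # Start llama-server bound to the loopback interface only, so nothing
    # outside this machine can reach it. Web UI and API are then available
    # at http://127.0.0.1:8080 for local use.
    server = subprocess.Popen([
        "llama-server",
        "-m", "models/model.gguf",   # placeholder model path
        "--host", "127.0.0.1",       # listen on loopback only, not 0.0.0.0
        "--port", "8080",
        "-ngl", "99",                # offload layers to the GPU if available
    ])
    try:
        server.wait()
    except KeyboardInterrupt:
        server.terminate()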