yjftsjthsd-h 7 months ago

You might also try https://github.com/Mozilla-Ocho/llamafile , which may have better CPU-only performance than ollama. It does require you to grab .gguf files yourself (unless you use one of their prebuilts in which case it comes with the binary!), but with that done it's really easy to use and has decent performance.

For reference, this is how I run it:

  $ cat ~/.config/systemd/user/llamafile@.service
  [Unit]
  Description=llamafile with arbitrary model
  After=network.target
  
  [Service]
  Type=simple
  WorkingDirectory=%h/llms/
  ExecStart=sh -c "%h/.local/bin/llamafile -m %h/llamafile-models/%i.gguf --server --host '::' --port 8081 --nobrowser --log-disable"
  
  [Install]
  WantedBy=default.target

And then

  systemctl --user start llamafile@whatevermodel
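
but you can just run that ExecStart command directly and it works, as long as you replace the systemd specifiers yourself (%h is your home directory, %i is the model name after the @).
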
chatmasta 7 months ago

Be careful running this on work machines – it will get flagged by Crowdstrike Falcon and probably other EDR tools. In my case the first time I tried it, I just saw “Killed” and then got a DM from SecOps within two minutes.

broknbottle 7 months ago

the irony, preventing and killing something that is actually useful, while we let crowdcrap hum along consuming tons of memory and bottlenecking IO so it can do snakeoil things...

yjftsjthsd-h 7 months ago

Are they specifically flagging LLMs, or do they not like Cosmopolitan Libc / APE?

chatmasta 7 months ago

Nah, nothing to do with LLMs, it’s just that the way Llamafile works looks a lot like malware: basically zip up an executable, concatenate it with some stuff, throw it in /tmp and execute it under a randomly generated, high-entropy name.

(That said, after I explained it to SecOps they did tell me I would need to “consult legal” if I wanted to use a local LLM, but I’ll give them the benefit of the doubt there…)

SahAssar 7 months ago

Is that `--host` listening on non-local addresses? Might be good to default to local-only.

yjftsjthsd-h 7 months ago

Good callout; in my case, yes, I do want it reachable from other machines on its subnets, so I set that option deliberately (including the IPv6 form), but most people are probably better off binding to loopback. Thanks!
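
For anyone copying the unit above, binding to loopback should be a one-flag change. This is only a sketch: it assumes the same paths and port as my unit and that nothing on another machine needs to reach the server. The only difference is the value passed to --host:

  [Service]
  ExecStart=sh -c "%h/.local/bin/llamafile -m %h/llamafile-models/%i.gguf --server --host 127.0.0.1 --port 8081 --nobrowser --log-disable"

With that in place, only clients on the same machine should be able to connect, e.g. at http://127.0.0.1:8081. After editing the unit file, run systemctl --user daemon-reload and restart the instance so the change takes effect.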