smcleod a year ago

Neat to see more folks writing blogs on their experiences. This however does seem like it's an over-complicated method of building llama.cpp.

Assuming you want to do this interactively (at least for the first time), you should only need to run:

  ccmake .
And toggle the parameters your hardware supports or that you want (e.g. CUDA if you're using Nvidia, Metal if you're using Apple, etc.), then press 'c' (configure) and 'g' (generate), then:

  cmake --build . -j $(expr $(nproc) / 2)

Done.

If you want to move the binaries into your PATH, you could then optionally run `cmake --install .`.
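
For a scripted build, the same steps can be sketched without the interactive UI. This is a rough sketch under stated assumptions, not the project's official recipe: `half_cores`, `detect_cores` and `build_llama` are hypothetical helper names, and any `-D` option you pass (for example `GGML_CUDA`) is an assumption, since llama.cpp's CMake option names have changed between releases; check what `ccmake .` lists for your checkout. It also guards two edge cases in the build one-liner: `nproc` is not available on macOS, and half of one core rounds down to 0.

```shell
# Print half of the given core count, but never less than 1.
half_cores() {
    n=$(( $1 / 2 ))
    if [ "$n" -lt 1 ]; then
        n=1
    fi
    echo "$n"
}

# Print the number of logical cores: nproc on Linux, sysctl on macOS,
# falling back to 1 if neither works.
detect_cores() {
    if command -v nproc >/dev/null 2>&1; then
        nproc
    else
        sysctl -n hw.ncpu 2>/dev/null || echo 1
    fi
}

# Configure in-tree and build with half the available cores.
# Any arguments are passed through to the configure step.
build_llama() {
    cmake "$@" . &&
    cmake --build . -j "$(half_cores "$(detect_cores)")"
}
```

From the repo root, usage on an Nvidia machine might look like `build_llama -DGGML_CUDA=ON`, assuming that option name matches your version.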

lukev a year ago | parent | next [-]

Actually I think even this makes it look scarier than it is if you're on an M-series Apple.

In that case, the steps to building llama.cpp are:

1. Clone the repo.

2. Run `make`.

To start chatting with a model all you need is to:

1. Download a model in GGUF format that fits your hardware (probably the hardest step, but models are readily available on Hugging Face).

2. Run `./llama-server -m model.gguf`.

3. Visit localhost:8080
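
The run steps above can be sketched as a small shell helper. One cheap sanity check before launching: GGUF files start with the four ASCII bytes `GGUF`, so a download that is really an HTML error page or a Git LFS pointer file can be caught early. `is_gguf` and `serve_model` are hypothetical names, not part of llama.cpp, and the sketch assumes `llama-server` sits in the current directory as in step 2.

```shell
# Succeed only if the file exists and starts with the 4-byte magic "GGUF".
# tr strips NUL bytes so the shell does not warn about them on other files.
is_gguf() {
    [ -f "$1" ] && [ "$(head -c 4 "$1" | tr -d '\0')" = "GGUF" ]
}

# Check the model file first, then start the server on it.
serve_model() {
    if is_gguf "$1"; then
        ./llama-server -m "$1"
    else
        echo "not a GGUF file: $1" >&2
        return 1
    fi
}
```

Usage would be `serve_model model.gguf`, then visiting localhost:8080 as before.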

int_19h a year ago | parent [-]

On a Mac, if all you want is to just use it directly, it is also readily available from Homebrew.

SteelPh0enix a year ago | parent | prev | next [-]

Wow, I did not know about ccmake. I'll check it out and edit the post if it's really that easy to use, thanks.

moffkalast a year ago | parent | prev | next [-]

Yeah, the MinGW method on Windows is a ludicrous thing to even think about, and llama.cpp still has that as the suggested option in the README for some weird reason. Endless sourcing of paths that never works quite right. I literally couldn't get it to work when I first tried it last year.

Meanwhile CMake is like two lines and somehow it's the fallback option? I don't get it. And on Linux it's literally just one line with make.

blharr a year ago | parent | next [-]

Building anything on Windows without CMake is just... I don't know why anyone would use anything else. I used to spend hours wrestling with build failures, but after setting up CMake, it just works with everything.

SteelPh0enix a year ago | parent | prev [-]

>Yeah the mingw method on windows is a ludicrous thing to even think about

Building most stuff on Windows is ludicrous; that's nothing uncommon. I chose MSYS there as it's the easiest and least painful way of installing deps for a Vulkan build.

xyc a year ago | parent | prev [-]

You can get a release binary from https://github.com/ggerganov/llama.cpp/releases too.

freehorse a year ago | parent | next [-]

Does it autoupdate? I get it from github so I just have to pull and build again every time I want to update it.
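
That pull-and-rebuild loop can be wrapped in a few lines. A minimal sketch, assuming an in-tree CMake build that has already been configured once; `update_llama` is a hypothetical helper name, and it skips the rebuild when the pull brought in nothing new.

```shell
# Pull the latest commits and rebuild only if HEAD actually moved.
# $1 is the path to the llama.cpp checkout (defaults to the current directory).
update_llama() {
    repo=${1:-.}
    before=$(git -C "$repo" rev-parse HEAD 2>/dev/null) || {
        echo "not a git checkout: $repo" >&2
        return 1
    }
    git -C "$repo" pull --ff-only || return 1
    after=$(git -C "$repo" rev-parse HEAD)
    if [ "$before" = "$after" ]; then
        echo "already up to date"
    else
        # Assumes the build tree lives in the checkout itself.
        cmake --build "$repo"
    fi
}
```

Running `update_llama ~/src/llama.cpp` (path is just an example) from a cron job or by hand gets close to an auto-update.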

smcleod a year ago | parent | prev [-]

Yes, but that's not building it for your system, that's a relatively generic build.