LargoLasskhyfv 2 days ago
Have you tried anything with https://codeberg.org/ikawrakow/illama https://github.com/ikawrakow/ik_llama.cpp and their 4-bit quants? Or maybe even Microsoft's BitNet? https://github.com/microsoft/BitNet https://github.com/ikawrakow/ik_llama.cpp/pull/337 https://huggingface.co/microsoft/bitnet-b1.58-2B-4T-gguf That would be an interesting comparison for running local LLMs on such low-end/edge devices, or on common office machines with only an iGPU.