iugtmkbdfil834 9 hours ago

Seconded. Currently on Ollama for local inference, but I am curious how it compares.

LumielGR 8 hours ago

Lemonade uses llama.cpp for text and vision, with a nightly ROCm build. It can load and serve multiple LLMs at the same time. It can also create images, use whisper.cpp, use TTS models, use the NPU (e.g. Strix Halo amdxdna2), and more!