wilkystyle 7 hours ago
Curious to hear more. My experience is limited to llama.cpp on Apple silicon so far, but I've been eyeing the AMD ecosystem from afar.
craftkiller 6 hours ago
FWIW I run llama.cpp on AMD hardware using Vulkan. I've got no complaints, but also nothing else to compare against.
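For anyone wanting to try the same route, here's a minimal sketch using the llama-cpp-python bindings built against llama.cpp's Vulkan backend (the install flag is the one documented upstream; the model path and quant are placeholders, not a recommendation):

    # Prerequisite (assumption: building the bindings from source with Vulkan):
    #   CMAKE_ARGS="-DGGML_VULKAN=on" pip install llama-cpp-python
    from llama_cpp import Llama

    # Load a GGUF model and offload all layers to the GPU via Vulkan.
    llm = Llama(
        model_path="./models/qwen2.5-7b-instruct-q4_k_m.gguf",  # placeholder path
        n_gpu_layers=-1,  # -1 = offload every layer
        n_ctx=4096,
    )

    out = llm.create_chat_completion(
        messages=[{"role": "user", "content": "Summarize Vulkan in one sentence."}],
        max_tokens=128,
    )
    print(out["choices"][0]["message"]["content"])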
nevi-me 7 hours ago
Perhaps not a good example: I tried running local models a few times, to much disappointment (it actually made me skeptical of LLMs in general for a while). My last experiment, in January, was trying to run a Qwen model locally (RTX 4080; 128GB RAM; 9950X3D). I must have been doing it extremely wrong, because the models I tried either hallucinated severely or got stuck in a loop. The funniest one was stuck in a "but wait, ..." loop.

Fortunately I had started experimenting with Claude, so I opted to pay Anthropic more money for tokens (work already covers the bill; this was for personal use). That whole experience, plus a noisy GPU, put me off the idea of running/building local agents.
verdverm 6 hours ago
The main thing to consider is that how you run the models does not need to be coupled to what you send them (or how you orchestrate agents). I've used several agent frameworks, and they all support many different providers, from cloud to local; these are orthogonal responsibilities. I'm using VertexAI for cloud and ollama on a minisforum with rocm locally. There is a dropdown to change between them.
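To make that decoupling concrete, a minimal sketch (not any particular framework's API): the orchestration code only ever talks to one OpenAI-compatible client, and switching providers is a single config entry. Ollama does expose an OpenAI-compatible endpoint at /v1; the cloud base_url, key, and model names below are placeholders you'd fill in for your provider:

    from openai import OpenAI

    # Each provider is just connection details; the calling code never changes.
    PROVIDERS = {
        # Local: Ollama's OpenAI-compatible endpoint (api_key is required but unused).
        "local": {"base_url": "http://localhost:11434/v1",
                  "api_key": "ollama", "model": "qwen2.5:7b"},
        # Cloud: placeholder values; substitute your provider's endpoint and model.
        "cloud": {"base_url": "https://your-cloud-provider/v1",
                  "api_key": "YOUR_KEY", "model": "hosted-model"},
    }

    def complete(prompt: str, provider: str = "local") -> str:
        cfg = PROVIDERS[provider]
        client = OpenAI(base_url=cfg["base_url"], api_key=cfg["api_key"])
        resp = client.chat.completions.create(
            model=cfg["model"],
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content

    # Swapping providers is the "dropdown": one string.
    print(complete("Say hello in five words.", provider="local"))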