How can I use ByteShape to run LLMs faster on my 32GB MacBook M1 Max? Or has Ollama already optimized that?
Don't use Ollama for this. llama.cpp is the better choice: Ollama vendors its own copy of llama.cpp, which usually lags behind upstream, so you miss the most recent performance work (including Metal backend improvements that matter on Apple Silicon).
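A minimal sketch of running upstream llama.cpp directly on an M1 Max; the model path is a placeholder for whatever GGUF file you download, and the sizing comment is a rough assumption, not a benchmark:

```shell
# Build llama.cpp from source to get the latest Metal backend
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build                          # Metal is on by default for macOS builds
cmake --build build --config Release -j

# Run with all layers offloaded to the GPU (-ngl 99).
# /path/to/model.gguf is a placeholder; on 32GB a 4-bit quant of a
# mid-size model should fit comfortably, but verify memory use yourself.
./build/bin/llama-cli -m /path/to/model.gguf -ngl 99 -p "Hello"
```

Compare tokens/sec against Ollama with the same quantization of the same model; that is the only comparison that actually answers the speed question.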