psychoslave 5 days ago
Which one? I tried a few months ago, and it was like one word every few seconds. I didn't dig far though, I just installed the llm tool, which apparently does what 'mise' does for programming environments, and went with the first locally runnable suggestion I could find.
_1 5 days ago
You might need to play around with the default settings. One of the first models I tried running on my Mac was really slow. It turned out it was preallocating a long context window that wouldn't fit in GPU memory, so it ran on the CPU.
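For a rough sense of why a long context window can push a model off the GPU, here is a back-of-the-envelope sketch of the KV-cache size as a function of context length. The model shape (32 layers, 8 KV heads, head dimension 128, 16-bit cache), the 5 GiB weight size, and the 8 GiB GPU budget are all made-up numbers for illustration, not the settings of any particular model or runtime, and the function names are hypothetical.

```python
GIB = 1024 ** 3


def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len, bytes_per_elem=2):
    """Approximate KV-cache size for a standard transformer.

    Each token stores one key and one value vector (hence the factor 2)
    per layer and per KV head.
    """
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem


def fits_on_gpu(weights_bytes, cache_bytes, gpu_budget_bytes):
    """True if weights plus a fully preallocated cache fit in the budget."""
    return weights_bytes + cache_bytes <= gpu_budget_bytes


if __name__ == "__main__":
    # Hypothetical numbers: ~5 GiB of quantized weights, 8 GiB usable GPU memory.
    weights = 5 * GIB
    budget = 8 * GIB
    for ctx in (4096, 32768, 131072):
        cache = kv_cache_bytes(32, 8, 128, ctx)
        verdict = "fits" if fits_on_gpu(weights, cache, budget) else "does not fit"
        print(f"context {ctx:>6}: cache {cache / GIB:.1f} GiB, {verdict}")
```

With these assumed numbers the cache grows linearly with context length, so lowering the context setting is often the cheapest thing to try before switching models.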