Remix.run Logo
TrajansRow 4 days ago

Unfortunately, IDE integration like this tends to be very prefill intensive (more math than memory). That puts Apple Silicon at a disadvantage without the feature that we’re talking about. Presumably the upcoming M5 will also have dedicated matmul acceleration in the GPU. This could potentially change everything in favor of local AI, particularly on mobile devices like laptops.

evilduck 4 days ago | parent [-]

Cline has a new "compact" prompt enabled for their LM Studio integration which greatly alleviates the long system prompt prefill problem, especially for Macs which suffer from low compute (though it disables MCP server usage, presumably the lost part of the prompt is what made that work well).

It seems to work better for me when I tested it and Cline's supposedly adding it to the Ollama integration. I suspect that type of alternate local configuration will proliferate into the adjacent projects like Roo, Kilo, Continue, etc.

Apple adding hardware to speed it up will be even better, the next time I buy a new computer.