I noticed the performance issues too. I started using Jan recently, and when I ran the same model via llama.cpp and via a local Ollama instance, the llama.cpp one was noticeably faster.
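
If anyone wants to sanity-check this on their own machine, here's a rough sketch of how I'd time the two. It assumes both servers are running locally on their default ports (llama-server on 8080, Ollama on 11434) and hits their OpenAI-compatible chat endpoints; the model name is a placeholder, so swap in whatever each server actually has loaded:

```python
# Rough throughput comparison against two local OpenAI-compatible endpoints.
# Assumptions: llama-server on its default port 8080, Ollama on its default
# 11434, and "MODEL" is a placeholder for the model name each server serves.
import time
import requests

PROMPT = "Explain the difference between a process and a thread."
ENDPOINTS = {
    "llama.cpp": ("http://localhost:8080/v1/chat/completions", "MODEL"),
    "ollama":    ("http://localhost:11434/v1/chat/completions", "MODEL"),
}

for name, (url, model) in ENDPOINTS.items():
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": PROMPT}],
        "max_tokens": 256,
        "stream": False,
    }
    start = time.perf_counter()
    resp = requests.post(url, json=payload, timeout=300)
    elapsed = time.perf_counter() - start
    resp.raise_for_status()
    # Both servers report token counts in the OpenAI-compatible response.
    tokens = resp.json().get("usage", {}).get("completion_tokens", 0)
    rate = tokens / elapsed if elapsed > 0 else 0.0
    print(f"{name}: {tokens} tokens in {elapsed:.1f}s ({rate:.1f} tok/s)")
```

Note this measures wall-clock time, so it lumps prompt processing in with generation; each server's native API exposes finer-grained timing fields if you want to separate the two.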