XCSme 4 hours ago
In my tests [0] it does only slightly better than Kimi K2.5. Kimi K2.6 seems to struggle most with puzzle-style, domain-specific, and trick-question exactness tasks, where it shows frequent instruction misses and wrong answers. It is probably a great coding model, but a bit less intelligent overall than the SOTA models.

[0] https://aibenchy.com/compare/moonshotai-kimi-k2-6-medium/moo...
deepsquirrelnet 3 hours ago (parent)
I tried it on OpenRouter with max tokens set to 8192, and every response is truncated, even in non-thinking mode. Maybe there's an issue with the deployment, but your link also shows it generates tons of output tokens.
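For anyone who wants to reproduce this, here's a minimal sketch against OpenRouter's OpenAI-compatible chat completions endpoint. The model slug is a guess (check the actual listing); a finish_reason of "length" confirms the response was cut off at max_tokens:

    # Reproduce the truncation check described above.
    # Assumes OPENROUTER_API_KEY is set; the model slug below is a
    # guess, not a confirmed OpenRouter identifier.
    import os
    import requests

    resp = requests.post(
        "https://openrouter.ai/api/v1/chat/completions",
        headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"},
        json={
            "model": "moonshotai/kimi-k2.6",  # hypothetical slug
            "messages": [{"role": "user", "content": "Summarize TCP slow start."}],
            "max_tokens": 8192,
        },
        timeout=120,
    )
    choice = resp.json()["choices"][0]
    # finish_reason == "length" means the output hit max_tokens and was truncated.
    print(choice["finish_reason"], len(choice["message"]["content"]))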