| ▲ | deepsquirrelnet 3 hours ago | |
I tried it on OpenRouter and set max tokens to 8192, and every response is truncated, even in non-thinking mode. Maybe there's an issue with the deployment, but your link also shows it generates tons of output tokens. | ||
| ▲ | XCSme 2 hours ago | |
Oh yeah, I just noticed, like 3x the reasoning tokens. | ||