| ▲ | sailingparrot a day ago | |
Thinking is implemented as regular autoregressive generations by everyone, meaning its just regular tokens, but they appear between <thinking></thinking> special tokens which are then programmatically removed from what the user can actually see. Idea somewhat similar to what you describe exist but they make steering/post-training/interpretation much harder. | ||