pka 2 days ago
It seems models are pre-planning though:

> How does Claude write rhyming poetry? Consider this ditty:

> He saw a carrot and had to grab it,
> His hunger was like a starving rabbit

> To write the second line, the model had to satisfy two constraints at the same time: the need to rhyme (with "grab it"), and the need to make sense (why did he grab the carrot?). Our guess was that Claude was writing word-by-word without much forethought until the end of the line, where it would make sure to pick a word that rhymes. We therefore expected to see a circuit with parallel paths, one for ensuring the final word made sense, and one for ensuring it rhymes.

> Instead, we found that Claude plans ahead. Before starting the second line, it began "thinking" of potential on-topic words that would rhyme with "grab it". Then, with these plans in mind, it writes a line to end with the planned word.

[https://www.anthropic.com/research/tracing-thoughts-language...]