jug 5 hours ago
> there are some new capabilities that are big, but they are still fundamentally next-token predictors

Anthropic recently released research showing that when Claude composed poetry, it didn't simply predict token by token, "reacting" when it sensed it might need a rhyme and then searching its context for something appropriate. Instead, it looked several tokens ahead and adjusted, in advance, for where the line was likely to end up. Anthropic also says this adds to evidence seen elsewhere that language models seem to sometimes "plan ahead". Please check out the section "Planning in poems" here; it's pretty interesting! https://transformer-circuits.pub/2025/attribution-graphs/bio...
percentcer 5 hours ago
Isn't this just a form of next-token prediction? I.e., you keep your options open for a potential rhyme by selecting words that have many associated rhyming pairs, and you keep them open further by favoring broad topics over niche ones.
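To make that concrete, here's a toy sketch (the rhyme table and scoring are invented for illustration; this is not Anthropic's mechanism): a generator that never looks ahead, but scores each candidate line-ending word by how many rhyme partners it would leave available, so it appears to "plan" purely through per-token choices.

    import random

    # Toy rhyme table, invented for illustration.
    RHYMES = {
        "cat": ["hat", "mat", "bat"],
        "sky": ["fly", "high"],
        "orange": [],  # famously rhyme-poor
    }

    def pick_line_ending(candidates):
        # A purely local, next-token-style choice: score each candidate
        # by how many rhyming continuations it leaves open, without ever
        # committing to a specific rhyme for the next line.
        scored = [(len(RHYMES.get(w, [])), w) for w in candidates]
        best = max(s for s, _ in scored)
        return random.choice([w for s, w in scored if s == best])

    print(pick_line_ending(["orange", "cat", "sky"]))  # -> "cat"

The distinction the Anthropic result tries to probe is whether the model merely does this kind of option-keeping, or actually settles on a concrete target word in advance and writes toward it.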