Remix.run Logo
graemefawcett 3 days ago

It's not just an issue of tokenization, it's almost a category error. Lisp, accounting and the number of r's in strawberry are all operations that require state. Balancing ((your)((lisp)(parens))) requires a stack, count r's in strawberry requires a register, counting to 5 requires an accumulator to hold 4.

An LLM is a router and completely stateless aside from the context you feed into it. Attention is just routing the probability distribution of the next token, and I'm not sure that's going to accumulate much in a single pass.

BoredomIsFun 3 days ago | parent [-]

> An LLM is a router and completely stateless aside from the context you feed into it.

Not the latest SSM and hybrid attention ones.

graemefawcett 2 days ago | parent [-]

Stateless router to router with lossy scratchpad is a step up, still not going to ask it to check my Lisp. That's what linters are for