Remix.run Logo
astrange 7 hours ago

BF involves a lot of repeated symbols, which is hard for tokenized models. Same problem as r's in strawberry.

bwestergard 7 hours ago | parent [-]

Interesting. So why do the models seem to handle deeply nested Lisp expressions just fine?

kgeist 6 hours ago | parent [-]

Probably because there's a ton of code that deals with nested parentheses across languages in the training data, and models have learned how to work around tokenization limitations, when it comes to parentheses.