orbital-decay 3 days ago
> entirely-opaque-to-the-LLM token groupings

This is clearly not the case: any modern (non-reasoning) model can easily decompose words into their individual characters (try separating them with e.g. Braille spaces), and can produce arbitrary tokenization variants if forced with a sampler. The problem runs much deeper than tokenization. Models struggle precisely with counting items in a list, exact ordering, retrieving scattered data, and similar tasks. LLM context works much more like associative memory than like a sequence that can be iterated over. There are also fundamental biases and model-specific quirks that contribute to this.
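For what it's worth, the Braille-space trick mentioned above is just string preprocessing on the prompt side: interleaving a blank Braille character (U+2800) between letters so each character lands in its own token. A minimal sketch (the function name and separator default are my own choices, not from any particular library):

```python
def spread(word: str, sep: str = "\u2800") -> str:
    """Interleave a Braille blank (U+2800) between characters so that a
    tokenizer is very unlikely to merge adjacent letters into one token."""
    return sep.join(word)

# Each letter of the word is now flanked by U+2800 separators.
print(spread("strawberry"))
```

Feeding the spread-out form to a model typically makes per-character tasks (counting letters, reversing) trivial, which supports the point that the failure isn't about tokens being opaque.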