dist-epoch 3 days ago
It's not strictly a counting task: the LLM sees same-sized tokens, but each token corresponds to a variable number of characters, and that character count is not directly fed into the model. It's like the difference between Unicode code points and UTF-8 bytes: you can't just count UTF-8 bytes to know how many code points you have.
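A minimal sketch of that byte/code-point analogy, using only the Python standard library (the string "héllo" is just an illustrative example):

    s = "héllo"            # 5 code points
    b = s.encode("utf-8")  # 'é' (U+00E9) encodes as two bytes in UTF-8
    print(len(s))          # 5 code points
    print(len(b))          # 6 UTF-8 bytes

The byte count (6) overstates the code-point count (5) because one character occupies two bytes; in the same way, a token count understates or misstates the character count, and the model never sees the per-token character lengths directly.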
omnicognate 3 days ago | parent
There's an aspect of figuring out what to count, but that doesn't make this task visual/spatial in any sense I can make out.