▲ | 101008 6 days ago
It's an example that shows that if these models aren't trained on a specific problem, they may have a hard time solving it for you.
|
| ▲ | altruios 6 days ago | parent | next [-] |
| An analogy is asking someone who is colorblind how many colors are on a sheet of paper. What you are probing isn't reasoning, it's perception. If you can't see the input, you can't reason about the input. |
▲ | 9rx 5 days ago | parent [-]

> What you are probing isn't reasoning, it's perception.

It's both. A colorblind person will admit their shortcomings and, if compelled to be helpful the way an LLM is, will reason their way to a solution that works around their limitations. But as LLMs lack a way to reason, you get nonsense instead.
▲ | altruios 3 days ago | parent | next [-]

What tools does the LLM have access to that would reveal sub-token characters to it? This assumes the colorblind person both believes they are colorblind, in a world where that can be verified, and possesses tools to overcome the limitation. You have to be much more clever to 'see' an atom before the invention of the microscope; if the tool doesn't exist, most of the time you are SOL.
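If such a tool did exist, it wouldn't need to be clever. A hypothetical sketch (the name "spell" and the interface are made up for illustration, not any real framework's API) of a function a tool-calling LLM could invoke to see the characters its tokenizer hides:

    # Hypothetical tool: exposes per-character structure to a model that
    # only ever sees multi-character tokens.
    def spell(word: str) -> str:
        """Return one character per line, with 1-based positions."""
        return "\n".join(f"{i}: {ch}" for i, ch in enumerate(word, start=1))

    print(spell("strawberry"))  # 1: s / 2: t / 3: r / ... / 10: y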
▲ | 5 days ago | parent | prev [-]

[deleted]
|
|
|
| ▲ | Uehreka 6 days ago | parent | prev | next [-] |
| No, it’s an example that shows that LLMs still use a tokenizer, which is not an impediment for almost any task (even many where you would expect it to be, like searching a codebase for variants of a variable name in different cases). |
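You can inspect this directly. A minimal sketch using OpenAI's open-source tiktoken library (an assumption for illustration; other models ship different tokenizers):

    import tiktoken

    # cl100k_base is the encoding used by GPT-4-era models.
    enc = tiktoken.get_encoding("cl100k_base")

    for word in ["myVariableName", "my_variable_name", "MY_VARIABLE_NAME"]:
        ids = enc.encode(word)
        pieces = [enc.decode_single_token_bytes(i).decode("utf-8") for i in ids]
        print(f"{word!r} -> {pieces}")

    # Each casing variant tokenizes into a different sequence of pieces,
    # yet models still match them across cases surprisingly well.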
▲ | 8note 6 days ago | parent [-]

The question remains: is the tokenizer going to be a fundamental limit for my task? How do I know ahead of time?
▲ | worldsayshi 6 days ago | parent [-]

Would it limit a person getting your instructions in Chinese? Tokenisation pretty much means that the LLM is reading symbols instead of phonemes. This makes me wonder whether LLMs work better in Chinese.
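One way to get a feel for the difference (a sketch with tiktoken; whether any of this translates into better reasoning in Chinese is an open question):

    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")

    english = "How many letters are in this sentence?"
    chinese = "这句话里有多少个字母？"  # rough translation of the above

    for text in (english, chinese):
        ids = enc.encode(text)
        print(f"{len(text)} chars -> {len(ids)} tokens")

    # English words often collapse into one token each; Chinese characters
    # map to one or more tokens apiece, so the text-to-token ratio differs.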
|
|
|
| ▲ | victorbjorklund 6 days ago | parent | prev [-] |
No, the issue is with the tokenizer.