blutfink 5 days ago
Of course an LLM uses more space internally for a token. But so do humans. My point was that you compared how the LLM represents a token internally with how "English" transmits a word. That's a category error.
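To make the comparison concrete, here is a back-of-the-envelope sketch in Python; the embedding width and precision are illustrative assumptions, not any particular model's numbers:

    # "cat" as transmitted by written English: 3 ASCII bytes.
    wire_bytes = "cat".encode("ascii")
    print(len(wire_bytes))  # 3

    # The same word held internally by a hypothetical LLM as one token
    # embedding: d_model floats (both sizes assumed for illustration).
    d_model = 4096           # assumed embedding width
    bytes_per_value = 2      # assumed fp16 storage
    print(d_model * bytes_per_value)  # 8192 bytes for one token

Internal representation cost and transmission cost are different quantities, which is the category distinction at issue.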
amelius 5 days ago | parent
But we can feed humans ASCII, whereas LLMs require token inputs. My original question was about that: why can't we just feed the LLMs ASCII and let them figure out how they want to encode it internally, *implicitly*? I.e., we just design a network and feed it ASCII, as opposed to figuring out an encoding in a separate step and feeding the network tokens in that encoding.
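For what it's worth, that end-to-end alternative is roughly what byte-level models such as ByT5 do: skip the separately trained tokenizer, hand the network raw bytes, and let any grouping of characters be learned implicitly. A minimal sketch, assuming PyTorch; the class name and sizes are hypothetical:

    import torch
    import torch.nn as nn

    class ByteLevelEncoder(nn.Module):
        """Embeds raw ASCII bytes directly; no separate tokenizer step."""
        def __init__(self, d_model: int = 512):
            super().__init__()
            # The 256 possible byte values stand in for a learned BPE
            # vocabulary; this table trains end-to-end with the network.
            self.embed = nn.Embedding(256, d_model)

        def forward(self, text: str) -> torch.Tensor:
            byte_ids = torch.tensor(list(text.encode("ascii")))
            return self.embed(byte_ids)  # (seq_len, d_model)

    enc = ByteLevelEncoder()
    print(enc("Hello, world").shape)  # torch.Size([12, 512])

The usual trade-off is sequence length: byte sequences are several times longer than token sequences over the same text, which is one reason most production models still tokenize.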