omcnoe 2 hours ago
I saw a curious post recently that explored this idea and showed that it isn't really the case. The internal layers of the model aren't reasoning in English, or in any human language. Translation in and out of human languages only happens at the edges of the model: internal-layer activations for the same concept are similar regardless of language, while activations in the top and bottom layers diverge. The pattern is reversed for same-language inputs with different content.
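A minimal sketch of how one might probe that claim, not from the post itself: compare per-layer hidden states for the same sentence in two languages and see where similarity peaks. The model choice, the example sentences, and mean pooling are all assumptions for illustration.

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Assumption: any multilingual encoder will do for this probe.
model_name = "xlm-roberta-base"
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name, output_hidden_states=True)
model.eval()

def layer_vectors(text):
    """Mean-pool token states at every layer for one input sentence."""
    inputs = tok(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs)
    # out.hidden_states: one (1, seq_len, dim) tensor per layer, embeddings included.
    return [h.mean(dim=1).squeeze(0) for h in out.hidden_states]

def cosine(a, b):
    return torch.nn.functional.cosine_similarity(a, b, dim=0).item()

# Same meaning, different languages (hypothetical example pair).
en = layer_vectors("The cat is sleeping on the sofa.")
fr = layer_vectors("Le chat dort sur le canapé.")

for i, (a, b) in enumerate(zip(en, fr)):
    print(f"layer {i:2d}: cosine similarity = {cosine(a, b):.3f}")
# If the claim holds, similarity should be highest in the middle layers
# and lower at the first and last layers, where language-specific
# encoding/decoding happens.
```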
ekropotin 2 hours ago
So we do at least agree that the quality of the human language <-> embeddings transition depends on how well the target language is represented in the training dataset? Even if that translation happens only at the edges, it happens on every conversation turn, so I'd assume the small subtleties of meaning that aren't captured can accumulate into a significant error over time.