hmokiguess 3 hours ago
I'm trying to understand what you said; please correct me if I'm wrong here. Would this be sort of like saying that the way embeddings of different primitives across languages end up distributed in a vector space all follows the same principles and "laws"? For example, if I train on a large corpus of English and, separately, on a large corpus of Spanish, will language constructs that are equivalent across the two end up represented by the same vector-space patterns in both cases?
canjobear 18 minutes ago | parent
This does seem to happen, at least close enough that it's possible to align embedding spaces across languages and do some translation without training on parallel texts.
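The geometric idea behind this alignment can be sketched in a few lines. Fully unsupervised methods (e.g. adversarial alignment as in MUSE) need no dictionary at all, but the simpler supervised variant makes the point: if the two spaces really share the same structure, a single orthogonal rotation (solved via the Procrustes problem) maps one onto the other. This is a toy sketch with synthetic embeddings, where the "Spanish" space is by construction a rotated copy of the "English" one:

```python
import numpy as np
from scipy.linalg import orthogonal_procrustes

rng = np.random.default_rng(0)

# Toy "English" embeddings: 200 words in a 50-dim space.
X = rng.normal(size=(200, 50))

# Simulated "Spanish" embeddings for the same 200 concepts: identical
# geometry, just rotated by an unknown orthogonal matrix plus small noise.
Q_true, _ = np.linalg.qr(rng.normal(size=(50, 50)))
Y = X @ Q_true + 0.01 * rng.normal(size=(200, 50))

# Given a seed dictionary (here: the first 100 word pairs), solve
# min_W ||X_seed @ W - Y_seed||_F over orthogonal W (Procrustes).
W, _ = orthogonal_procrustes(X[:100], Y[:100])

# "Translate" the held-out words by nearest neighbour in the aligned space.
aligned = X[100:] @ W
dists = np.linalg.norm(aligned[:, None, :] - Y[100:][None, :, :], axis=-1)
pred = dists.argmin(axis=1)
accuracy = (pred == np.arange(100)).mean()
print(f"held-out translation accuracy: {accuracy:.2f}")
```

With real embeddings trained on monolingual corpora the spaces are only approximately isometric, so accuracy is far below the near-perfect score this synthetic setup gives, but the same rotation-then-nearest-neighbour recipe is what makes dictionary induction across languages work at all.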