| ▲ | cyanydeez 7 hours ago |
There was a paper recently demonstrating that if you input different human languages, the middle layers of the model end up operating on the same probabilistic vectors; only the encoding/decoding layers appear to handle language-specific processing. The conclusion was that these middle layers have their own internal "language": the model converts the input text into this shared representation and then decodes back out of it. It would explain why models sometimes switch to Chinese when they have a lot of Chinese-language inputs, etc.
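A minimal sketch of the kind of measurement behind that claim: compare a model's middle-layer activations for the same sentence in two languages. The vectors below are synthetic stand-ins (a shared "concept" vector plus small language-specific noise), not real model activations; in the actual experiment they would come from a model's hidden states at some middle layer.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

rng = np.random.default_rng(0)

# Hypothetical stand-ins for middle-layer activations. The paper's claim
# predicts that parallel sentences in different languages land near the
# same point in this middle-layer space.
shared_concept = rng.normal(size=64)        # language-agnostic "meaning"
h_english = shared_concept + 0.1 * rng.normal(size=64)  # + English residue
h_french  = shared_concept + 0.1 * rng.normal(size=64)  # + French residue
h_unrelated = rng.normal(size=64)           # a different sentence entirely

sim_parallel  = cosine(h_english, h_french)
sim_unrelated = cosine(h_english, h_unrelated)
```

Under this toy model, `sim_parallel` is close to 1 while `sim_unrelated` hovers near 0, which is the shape of the result the paper reports for real mid-layer activations.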
| ▲ | DrewADesign 7 hours ago |
Ok, that sounds more like a theory than an open-and-shut causal explanation, but I'll read the paper.
| ▲ | skydhash 5 hours ago |
Pretty obvious when you consider that neural networks operate on numbers and very complex formulas (built by combining many simple formulas with various weights). You can map a lot of things to numbers (words, colors, music notes, …), but that does not mean the NN is going to produce useful results.