agentcoops 5 days ago:
Apologies if this comes across as too abstract, but I think your comment raises really important questions.

(1) While studying the properties of the mathematical objects produced is important, I don't think we should treat the situation you describe as a problem to be solved. In older supervised machine-learning methods, human beings were tasked with defining rather crude 'features' of relevance in a data/object domain, so each dimension had some intuitive significance (often binary: 'is tall', 'is blue', etc.). The question now is really about learning the objective geometry of meaning, so the dimensions of the resulting vector don't have to be 'meaningful' in the same way -- and, counter-intuitive as it may seem, this is progress. The question becomes the necessary dimensionality of the mathematical space in which semantic relations can be preserved -- and meaning /is/, in some fundamental sense, the resulting geometry.

(2) This is where the 'Platonic representation hypothesis' research [1] is so fascinating: empirically, the structures learned from text and the structures learned from images converge. That isn't to say we don't need images or sensor-equipped robots; in fact we seem to get the best results when training across modalities (language and image, for example). But it is really fascinating for how we understand language. While any particular text might get things wrong, the language human beings have developed over however many thousands of years really does seem to do a good job of carving out the relevant possible 'features' of experience. The convergence of models trained on language and on images suggests a convergence between what is learnable from sensory experience of the world and the relations human beings have slowly come to know through the relations between words.

[1] https://phillipi.github.io/prh/ and https://arxiv.org/pdf/2405.07987
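To make the 'convergence of learned structures' claim concrete: one simplified stand-in for the kind of alignment metric used in this line of work is mutual nearest-neighbor overlap between two embedding spaces. The sketch below is purely illustrative, not the paper's actual procedure; the embedding matrices and function names are hypothetical, and row i of each matrix is assumed to represent the same underlying item in the two modalities.

```python
import numpy as np

def knn_indices(X, k):
    """Top-k nearest neighbors (by cosine similarity) for each row of X."""
    X = X / np.linalg.norm(X, axis=1, keepdims=True)
    sims = X @ X.T
    np.fill_diagonal(sims, -np.inf)        # never count an item as its own neighbor
    return np.argsort(-sims, axis=1)[:, :k]

def mutual_knn_alignment(emb_a, emb_b, k=10):
    """Mean fraction of neighbors shared across the two spaces (0 ~ unrelated, 1 = same geometry)."""
    nn_a, nn_b = knn_indices(emb_a, k), knn_indices(emb_b, k)
    return float(np.mean([len(set(a) & set(b)) / k for a, b in zip(nn_a, nn_b)]))

# Toy check: two noisy views of the same latent structure align far above chance.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 64))                      # stand-in for "the world"
text_emb  = latent + 0.1 * rng.normal(size=(500, 64))    # hypothetical text-encoder output
image_emb = latent + 0.1 * rng.normal(size=(500, 64))    # hypothetical image-encoder output
print(mutual_knn_alignment(text_emb, image_emb))                    # close to 1
print(mutual_knn_alignment(rng.normal(size=(500, 64)), image_emb))  # near chance (~k/N)
```

The point of the toy check is only that a high score means the two spaces place the same items near each other, i.e. they share a geometry, regardless of what any individual dimension 'means'.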
k__ 5 days ago:
1) Fair. I did some experiments with recommendation systems 15 years ago, and we basically stopped using the dimensions generated by the system because nobody could make anything of them. The human-made dimensions were much easier to create user archetypes from.
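A minimal sketch of the gap being described, with entirely made-up data and illustrative field names: latent factors from a factorized ratings matrix carry no labels, whereas hand-made features do.

```python
import numpy as np

rng = np.random.default_rng(1)
# Made-up user x item ratings matrix.
ratings = rng.integers(0, 6, size=(200, 50)).astype(float)

# System-generated dimensions: truncated SVD gives every user 8 latent factors.
U, S, Vt = np.linalg.svd(ratings, full_matrices=False)
user_factors = U[:, :8] * S[:8]
# Each factor is a direction the model found useful, but nothing says what it
# "means" -- interpreting it requires eyeballing which items load on it.
print(user_factors[0])

# Hand-made dimensions: each field has an agreed meaning up front, so building
# archetypes ("binge-watcher who prefers long films") is simple thresholding.
# (Field names here are hypothetical, not from the original experiments.)
handmade_user = {"age": 34, "avg_session_minutes": 95, "prefers_long_films": True}
print(handmade_user)
```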
niam 5 days ago:
Re: #2, I've never really questioned that text is a suitable stand-in for important bits of reality. I worry instead about meta-limitations of text: can we reliably scale our training corpus without accreting incestuous slop from other models? Sensory bots would seem to provide a convenient way around this problem, but I'm not well-read enough to know.