fl7305 (3 days ago):

> The internal model of a LLM is statistical text. Which is linear and fixed. Not great other than generating text similar to what was ingested.

The internal model of a CPU is also linear and fixed. Yet a CPU can still generate output that is very different from its input. It is not a simple lookup table; it executes complex algorithms.

An LLM has a large amount of input processing power and a large internal state. It executes "cycle by cycle", processing the inputs and internal state to generate output data and a new internal state. So why shouldn't LLMs be capable of executing complex algorithms?
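(Editor's illustrative sketch, not part of the original comment: a minimal autoregressive loop that makes the "cycle by cycle" picture concrete. The growing token sequence plays the role of the state that is fed back in each cycle; forward_pass is a hypothetical stand-in for a real model's next-token scoring, not a specific library call.)

    # Hypothetical sketch of an LLM viewed as a step-by-step state machine.
    def generate(forward_pass, prompt_tokens, max_new_tokens):
        state = list(prompt_tokens)              # context acts as the visible internal state
        for _ in range(max_new_tokens):
            logits = forward_pass(state)         # process inputs + current state
            next_token = max(range(len(logits)), key=logits.__getitem__)  # greedy pick
            state.append(next_token)             # new state = old state + output token
        return state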
skydhash (3 days ago):

It probably can, but how will those algorithms be created? And how are the input and output represented? If it's text, the most efficient way is to construct a formal system, or a statistical model if ambiguous and incorrect results are acceptable in the grand scheme of things. The issue is always input consumption and output correctness.

In a CPU, we take great care with data representation and protocol definition, then we do formal verification on the algorithms, and we can be pretty sure that the outputs are correct. So the issue is that the internal model (for a given task) of an LLM is not consistent enough, and the referential window (keeping track of each item in the system) is always too small.
fl7305 (3 days ago):

Neural networks can be evolved to perform all sorts of algorithms. For example, controlling an inverted pendulum so that it stays balanced.

> In a CPU, we take great care with data representation and protocol definition, then we do formal verification on the algorithms, and we can be pretty sure that the output are correct.

Sure, intelligent design makes for a better design in many ways. That doesn't mean that an evolved design doesn't work at all, right?
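(Editor's sketch of the evolved-controller idea mentioned above, not from the thread: a tiny neural network policy for a simplified cart-pole, improved by a (1+1) evolution strategy. The physics equations and all parameter values are illustrative assumptions, not a specific benchmark implementation.)

    import numpy as np

    def step(state, force, dt=0.02, g=9.8, m_cart=1.0, m_pole=0.1, length=0.5):
        # Simplified cart-pole dynamics, integrated with Euler steps.
        x, x_dot, theta, theta_dot = state
        total = m_cart + m_pole
        sin_t, cos_t = np.sin(theta), np.cos(theta)
        temp = (force + m_pole * length * theta_dot**2 * sin_t) / total
        theta_acc = (g * sin_t - cos_t * temp) / (
            length * (4.0 / 3.0 - m_pole * cos_t**2 / total))
        x_acc = temp - m_pole * length * theta_acc * cos_t / total
        return np.array([x + dt * x_dot,
                         x_dot + dt * x_acc,
                         theta + dt * theta_dot,
                         theta_dot + dt * theta_acc])

    def policy(weights, state):
        # One hidden layer with tanh nonlinearity; output is push-left/right force.
        w1 = weights[:16].reshape(4, 4)
        w2 = weights[16:20]
        hidden = np.tanh(state @ w1)
        return 10.0 if hidden @ w2 > 0 else -10.0

    def episode_reward(weights, steps=500):
        # Fitness = how many steps the pole stays up before falling or running off.
        state = np.random.uniform(-0.05, 0.05, size=4)
        for t in range(steps):
            state = step(state, policy(weights, state))
            if abs(state[0]) > 2.4 or abs(state[2]) > 0.21:
                return t
        return steps

    # (1+1) evolution strategy: keep the best weight vector, mutate, keep improvements.
    rng = np.random.default_rng(0)
    best = rng.normal(size=20)
    best_score = episode_reward(best)
    for gen in range(200):
        child = best + 0.1 * rng.normal(size=20)
        score = episode_reward(child)
        if score >= best_score:
            best, best_score = child, score
    print("survived steps:", best_score)

The point of the sketch is only that a controller can be evolved rather than designed; nothing here is formally verified, which is exactly the trade-off being discussed.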
|
hackinthebochs (3 days ago):

> The internal model of a LLM is statistical text. Which is linear and fixed.

Not at all. Like seriously, not in the slightest.
skydhash (3 days ago):

What does it encode? Images? Scent? Touch? Some higher dimensional qualia?
hackinthebochs (3 days ago):

Well, a simple description is that they discover circuits that reproduce the training sequence. It turns out that in the process of doing this, they recover relevant computational structures that generalize the training sequence. The question of how far they generalize is certainly up for debate, but you can't reasonably deny that they generalize to a certain degree. After all, most sentences they are prompted on are brand new, and they mostly respond sensibly.

Their representation of the input is also not linear. Transformers use self-attention, which relies on the softmax function, which is non-linear.
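(Editor's illustrative sketch of the non-linearity claim, not from the thread: bare-bones scaled dot-product self-attention in numpy, with random weights used only to show that the map is not linear in its input.)

    import numpy as np

    def softmax(z):
        z = z - z.max(axis=-1, keepdims=True)    # numerical stability
        e = np.exp(z)
        return e / e.sum(axis=-1, keepdims=True)

    def self_attention(x, w_q, w_k, w_v):
        # x: (seq_len, d_model); w_q / w_k / w_v: (d_model, d_head)
        q, k, v = x @ w_q, x @ w_k, x @ w_v
        scores = q @ k.T / np.sqrt(k.shape[-1])  # pairwise token similarities
        weights = softmax(scores)                # the non-linear step
        return weights @ v                       # mix values by attention weights

    # Doubling the input does not double the output, so the map is not linear.
    rng = np.random.default_rng(0)
    x = rng.normal(size=(3, 4))
    w = [rng.normal(size=(4, 4)) for _ in range(3)]
    print(np.allclose(self_attention(2 * x, *w), 2 * self_attention(x, *w)))  # False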
|
|
|