ActorNightly a day ago
> What do you think "general language parsing" IS if not learned patterns from real-world data?

"I want you to hertograize the enpostule by brasetting the leekerists, while making sure that the croalbastes are not exhibiting any ecrocrafic effects."

Whatever you understand about that task is what a kernel will "understand" as well. And however you go about solving it, the kernel will also follow similar patterns of behaviour (starting with figuring out what "hertograize" means, which then leads to other tasks, and so on).

> You want an agent to discover the TLS protocol by randomly sending ethernet packets? The combinatorial search space is so large this wouldn't happen before the sun explodes.

In pure combinatorial search, yes; the back-of-envelope sketch below makes the scale concrete. In smart, directed, intelligent search, no. Ideally the kernel could listen for incoming traffic and figure out patterns from that. But the point is that the kernel should figure out that listening for traffic is optimal without you specifically telling it, because it "understands" the concept of other "entities" communicating with it, knows that such communication is bound to be in a structured format, and has internal reward systems that favour figuring it out through listening over expending energy on brute-force search. Whatever that process is, it will be applied identically to much harder problems.

> Transformers already ARE general algorithms with zero hardcoded linguistic knowledge. The architecture doesn't know what a noun is. It doesn't know what English is. It learns everything from data through gradient descent. That's the entire damn point.

It doesn't learn what a noun is or what English is; it's a statistical mapping that just tends to work well. LLMs are efficient lookup maps, and lookup maps can only go so far as to interpolate on the knowledge encoded within them (a toy illustration follows the sketch below). They can simulate intelligence in the sense of recursive lookups, but fundamentally that process is very guided, hence all the manual scaffolding: prompt engineering, MCP servers, agents, skills, and so on.
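A back-of-envelope sketch of that search-space claim. The specific numbers (1500-byte Ethernet MTU, 10^12 guesses per second, roughly 5 billion years left for the sun) are my own assumptions for illustration:

    import math

    # How big is "randomly sending ethernet packets" as a blind search space?
    MTU_BYTES = 1500                        # standard Ethernet payload limit
    GUESSES_PER_SECOND = 10**12             # absurdly generous brute-forcer (assumed)
    SUN_SECONDS = 5e9 * 365.25 * 24 * 3600  # ~5 billion years, ~1.6e17 seconds

    log10_space = MTU_BYTES * math.log10(256)            # distinct 1500-byte payloads
    log10_guesses = math.log10(GUESSES_PER_SECOND * SUN_SECONDS)

    print(f"payload space ~ 10^{log10_space:.0f}")       # ~10^3612
    print(f"total guesses ~ 10^{log10_guesses:.0f}")     # ~10^29

Roughly 10^29 guesses against 10^3612 possibilities: brute force loses by thousands of orders of magnitude, which is exactly why a directed strategy like listening first has to win.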
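And a toy version of the "lookup maps interpolate" claim. The table contents and the linear-interpolation rule are a deliberately simplified stand-in, not anything specific to transformers:

    # Toy "lookup map": stored (x, y) points plus linear interpolation.
    # It answers reasonably between known points and has nothing to say
    # outside them, my stand-in for "interpolating on encoded knowledge".
    TABLE = [(0.0, 0.0), (1.0, 1.0), (2.0, 4.0), (3.0, 9.0)]  # samples of y = x^2

    def lookup(x):
        for (x0, y0), (x1, y1) in zip(TABLE, TABLE[1:]):
            if x0 <= x <= x1:
                t = (x - x0) / (x1 - x0)
                return y0 + t * (y1 - y0)  # blend the two nearest entries
        raise ValueError(f"{x} is outside the encoded knowledge")

    print(lookup(1.5))   # 2.5, close to the true 2.25
    # lookup(10.0) raises ValueError: no amount of interpolation covers it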
ben_w 16 hours ago
> It doesn't learn what a noun is or what English is; it's a statistical mapping that just tends to work well.

The word for creating that statistical map is "learning". Now, you could argue that gradient descent or genetic algorithms or whatever else we have are "slow learners", and I'd agree with that, but the weights and biases in any ML model are most definitely "learned"; a minimal sketch below shows what that looks like.
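A minimal sketch of what "learned" means mechanically: plain gradient descent fitting two weights to data. The data, learning rate, and step count here are made up for illustration; the point is only that the final weights encode a statistical map of the data rather than anything hardcoded:

    # Plain gradient descent: "learn" w and b so that w*x + b fits the data.
    data = [(1.0, 3.1), (2.0, 4.9), (3.0, 7.2), (4.0, 8.8)]  # roughly y = 2x + 1

    w, b = 0.0, 0.0
    lr = 0.01
    for _ in range(5000):
        # gradients of mean squared error with respect to w and b
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / len(data)
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / len(data)
        w -= lr * grad_w
        b -= lr * grad_b

    print(f"learned: w={w:.2f} b={b:.2f}")  # converges to the data's best fit, w~1.94 b~1.15

Nothing about nouns or English is hardcoded there either; creating that map from data is the thing we call learning, just done slowly.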