zk_haider 2 hours ago
I think there’s a huge problem when we need another model to interpret the activations inside the network and translate them (a translation that can be a hallucination in and of itself), and then _that_ is fed to yet another model. Clearly we haven’t built and understood these models from the ground up well enough to evaluate them with any real confidence. This isn’t the human brain we’re operating on; it’s code we create and run ourselves. We should be able to do better.
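To make the objection concrete, the kind of pipeline I mean looks roughly like this (every name below is invented for illustration, not any real interpretability API):

    # Hypothetical sketch of the criticised pipeline; all names are
    # made up for illustration, not a real API.
    def interpret_activations(run_subject, explain, score, prompt):
        activations = run_subject(prompt)   # raw internal states
        explanation = explain(activations)  # model #2 guesses at meaning
        return score(explanation)           # model #3 judges the guess

    # Stub "models" so the sketch runs end to end.
    print(interpret_activations(
        run_subject=lambda p: [0.12, -0.98, 0.33],
        explain=lambda a: f"pattern {a} looks like 'negation'",
        score=lambda e: f"plausible but unverified: {e}",
        prompt="The movie was not good.",
    ))

Each hop is one more place for the interpretation to drift from what the activations actually encode.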
semiquaver 9 minutes ago
The models cannot be “built from the ground up” in the way you’re expecting. The weights are learned by gradient descent over a very high-dimensional loss surface, not added by human hands. We simply don’t know how to make a model that works the way you seem to want. Sure, we could start over from scratch under the constraint that we must perfectly understand everything that’s happening, but there’s an incredibly strong incentive to build on the capability breakthroughs of the last ten years instead.
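To see why, here’s a toy in pure Python: even a single weight is found by descent rather than chosen, and nothing in the source states its final value.

    # Fit y = 2x with one learned parameter w; gradient descent on
    # squared error finds w, no human writes its value down.
    w = 0.0
    data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
    lr = 0.01
    for _ in range(1000):
        for x, y in data:
            grad = 2 * (w * x - y) * x  # d/dw of (w*x - y)^2
            w -= lr * grad
    print(w)  # ~2.0: learned from the loss surface, not authored

Now scale that one opaque number up to hundreds of billions of weights and there is nothing human-legible to read “from the ground up.”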
sfvisser 2 hours ago
Humans may have written the code, but not the network of weights on top, and that’s where the magic happens. Even if we understood precisely how every neuron in our brains works at a molecular level, there is no reason to believe we’d understand how we think. We can’t simply reduce one layer of description to another and expect understanding.