| ▲ | yorwba 2 hours ago | |
If you want to experiment with hardcoding small programs into transformer weights, maybe try ALTA: https://arxiv.org/abs/2410.18077v2 | ||
| ▲ | ACCount37 2 hours ago | |
I'm less interested in turning programs into transformers and more interested in turning programs into subnetworks within large language models, which the blog post brings up as a research direction but never actually elaborates on. The interface between the subnetwork and the rest of the model is a hard problem. I'll check out the link though, thanks. | ||