| ▲ | pankajdoharey 4 hours ago | |
Looks like a Tiny Analytic transformer, RNN is arguably a better choice if you are gonna handwire an architecture to mechanically do addition. Learning is about discovering the patterns and algorithm from data. Wiring a machine to follow a procedure defeats that purpose. | ||
| ▲ | dnautics an hour ago | parent [-] | |
it proves that the algorithm is embeddable in a bigger transformer of ~similar architecture. | ||