Lerc 16 hours ago

You can still consider it logically from the point of view of in-order execution with optional looping and optional skipping. It stops being so combinatorially explodey then, but if you can always append an additional loop and decide whether to skip based on the worthiness of the layer, with varying degrees of threshold, then it could theoretically learn an arbitrary ordering where you skip all-but-one layer per loop.
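A toy sketch of that argument (everything here is illustrative, not from any real model): if a fixed in-order stack is looped, and each pass can skip layers, then keeping exactly one layer per pass emulates any ordering you like.

```python
# Hypothetical sketch: a fixed in-order stack, looped, with a per-pass
# skip mask. Keeping one layer per pass realizes an arbitrary ordering.

def run_looped(x, layers, skip_masks):
    """Apply `layers` in order once per pass; skip layer i on a pass
    when that pass's mask entry is False."""
    for mask in skip_masks:
        for layer, keep in zip(layers, mask):
            if keep:
                x = layer(x)
    return x

# Stand-in "layers" (deliberately non-commutative so order matters).
f = lambda x: x + 1
g = lambda x: x * 2
h = lambda x: x - 3
layers = [f, g, h]

# Desired ordering h, then f, then g: three passes, one layer kept each.
masks = [
    (False, False, True),   # pass 1: only h
    (True,  False, False),  # pass 2: only f
    (False, True,  False),  # pass 3: only g
]

assert run_looped(10, layers, masks) == g(f(h(10)))  # 16, not f-g-h order
```

The cost is depth: an ordering of N layers takes N passes, which is the "theoretically" part of the claim.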

There are probably a number of common sequences of layers that are inevitable when working on a problem, though. I think of it like an expression calculator that could evaluate various parts of an expression tree, merging leaf nodes on each iteration. I wouldn't expect it to be quite so explicit with neural nets, but I feel like the underlying principle of "do the sub-parts, then do the same thing on the result of the sub-parts" must be beneficial to some degree.
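The expression-calculator analogy can be made concrete (a hypothetical sketch, not a claim about any actual network): each iteration collapses only the nodes whose children are already values, so one pass of the "loop" handles one level of the tree, exactly the sub-parts-then-results pattern.

```python
# Hypothetical sketch of the expression-calculator analogy: each
# iteration merges only the op nodes whose children are already
# numbers, so evaluation proceeds one tree level per pass.

import operator

OPS = {'+': operator.add, '-': operator.sub, '*': operator.mul}

def reduce_once(node):
    """One iteration: collapse ops whose children are both leaves;
    otherwise recurse so deeper leaves get merged this pass."""
    if isinstance(node, (int, float)):
        return node
    op, left, right = node
    if isinstance(left, (int, float)) and isinstance(right, (int, float)):
        return OPS[op](left, right)
    return (op, reduce_once(left), reduce_once(right))

# (2 * 3) + (5 - 1): leaves merge on pass 1, the root on pass 2.
expr = ('+', ('*', 2, 3), ('-', 5, 1))
step1 = reduce_once(expr)   # ('+', 6, 4)
step2 = reduce_once(step1)  # 10
```

The number of passes needed equals the tree depth, which is the sense in which a looped stack "could do various parts of an expression tree" without needing a different layer for every shape of expression.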

I think there's probably quite a lot to be revealed from studying the representations in those middle layers. If there's a "how-much-have-we-solved-so-far" signal to be detected in the data between layers, that would open up quite a lot of options, I think.
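One such option, sketched hypothetically: if a simple probe on the between-layer state really can read off a progress score, the loop could exit early once the score crosses a threshold. The probe weights, toy layer, and threshold below are all invented for illustration.

```python
# Hypothetical early-exit sketch: a logistic probe on the hidden state
# acts as the "how-much-have-we-solved-so-far" signal; the loop stops
# once it crosses a threshold. All weights here are made up.

import math

def probe_score(hidden, w, b):
    """Logistic probe on a hidden-state vector: a score in (0, 1)."""
    z = sum(h * wi for h, wi in zip(hidden, w)) + b
    return 1.0 / (1.0 + math.exp(-z))

def run_until_solved(x, layer, probe_w, probe_b,
                     threshold=0.9, max_loops=16):
    """Loop a shared layer, exiting early when the probe says 'done'."""
    loops = 0
    for _ in range(max_loops):
        x = layer(x)
        loops += 1
        if probe_score(x, probe_w, probe_b) >= threshold:
            break
    return x, loops

# Toy layer that nudges a 4-dim state toward 1.0 each pass.
layer = lambda x: [min(v + 0.25, 1.0) for v in x]
state, loops = run_until_solved([0.0] * 4, layer,
                                probe_w=[1.0] * 4, probe_b=0.0)
# Exits after 3 passes: sigmoid(3 * 0.25 * 4) ≈ 0.95 >= 0.9.
```

Whether such a clean scalar signal actually exists between real transformer layers is exactly the open question; this only shows what you could do with it if it does.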