lumost 3 days ago
Did it work? :) The architecture is very similar to offset LSTMs, which have been studied extensively. The main difference is the handover of the hidden state, which my naive mind would assume makes optimization substantially more difficult.
cs702 3 days ago
I haven't had a chance to read the preprint carefully or play with the code yet. The best place to follow what's happening is the GitHub repo, specifically the open and closed issues and pull requests.