I am told that an interesting alternative is the Structured State Space for Sequence Modeling (S4). I don't personally know much about this technique, but I didn't see anybody else mention it in this thread.
https://srush.github.io/annotated-s4/