I really like this use of recurrent modules to augment attention-based models. I think it's a really cool result and a fruitful avenue for future work.