I'd start by watching these lectures:
https://ut.philkr.net/advances_in_deeplearning/
The "Advanced Training" section is especially worth watching to get an idea of the training tricks in common use these days.