About how many training steps are required to get good output?
Not many. Diminishing returns set in before 1000 steps, and past that point you should just add a second or third layer instead of training longer.