msla an hour ago

About how many training steps are required to get good output?

b44 an hour ago

Not many. Diminishing returns start before 1000 steps, and past that point you should just add a second or third layer instead of training longer.