▲ | imranhou 3 days ago | |
So basically preventing dead latents from occurring and whenever they do occur to possibly reviving them through the use of auxiliary loss term in the loss function? Thanks btw | ||
▲ | dontknowit a day ago | parent [-] | |
I imagine this kind of algorithm are like a derivative, they give a unit response, so you would need another filter to stabilize your system, that is some drop out to remove spurious revived latents. |