Remix.run Logo
imranhou 3 days ago

So basically preventing dead latents from occurring and whenever they do occur to possibly reviving them through the use of auxiliary loss term in the loss function? Thanks btw

dontknowit a day ago | parent [-]

I imagine this kind of algorithm are like a derivative, they give a unit response, so you would need another filter to stabilize your system, that is some drop out to remove spurious revived latents.