▲ | imranhou 3 days ago
Coming from a layman's perspective, a genuine question regarding: "Implements SAE training with auxiliary loss to prevent and revive dead latents, and gradient projection to stabilize training dynamics". I struggle to understand the phrase "to prevent and revive". Perhaps this is plain speech to those who understand SAEs, but it feels a bit self-contradictory to me. Could anyone elaborate?
▲ | PaulPauls 3 days ago | parent | next [-]
Just bad wording from me, trying to combine too much information in one sentence. The auxiliary loss is supposed to prevent dead latents from occurring in the first place (hence "prevent dead latents"), and it is also supposed to revive latents that are already dead (hence "revive dead latents"). Now that I review that sentence again, I see that I applied two verbs to the same object, and that object reads differently depending on the verb. Mea culpa. I hope you still gained some insight into it =)
▲ | versteegen 3 days ago | parent | prev [-]
A dead latent is one that is never active and hence doesn't (seem to) represent anything. The auxiliary loss is a loss term that reduces how often that happens and, if it does happen, pushes the latent back to being active sometimes.
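To make that a bit more concrete, here is a rough NumPy sketch of one way such a loss term can work. It assumes a top-k SAE where latents that haven't fired for a while are asked to reconstruct whatever error the live latents leave behind. I haven't checked whether this matches the project's actual code, and every name in it (`topk_mask`, `sae_losses`, `steps_since_fired`, `k_aux`, `dead_after`) is made up for illustration. It only computes the loss values; a real training loop would do this in an autodiff framework.

```python
import numpy as np

def topk_mask(x, k):
    """Keep the k largest entries in each row of x and zero out the rest."""
    idx = np.argpartition(x, -k, axis=-1)[..., -k:]
    out = np.zeros_like(x)
    np.put_along_axis(out, idx, np.take_along_axis(x, idx, axis=-1), axis=-1)
    return out

def sae_losses(x, W_enc, W_dec, steps_since_fired, k=4, k_aux=8, dead_after=1000):
    """Hypothetical sketch: main reconstruction loss plus a dead-latent auxiliary loss.

    x: (batch, d_model), W_enc: (d_model, n_latents), W_dec: (n_latents, d_model)
    steps_since_fired: (n_latents,) integer counter, 0 means "fired last step".
    """
    pre = np.maximum(x @ W_enc, 0.0)       # non-negative pre-activations
    z = topk_mask(pre, k)                  # only the top-k latents stay active
    err = x - z @ W_dec                    # what the live latents fail to explain
    main_loss = float(np.mean(err ** 2))

    # A latent counts as "dead" if it has not fired for dead_after steps (assumed rule).
    dead = steps_since_fired >= dead_after
    n_dead = int(dead.sum())
    aux_loss = 0.0
    if n_dead > 0:
        # Hide live latents, then let the strongest dead ones try to explain err.
        pre_dead = np.where(dead, pre, -np.inf)
        z_aux = topk_mask(pre_dead, min(k_aux, n_dead))
        aux_loss = float(np.mean((err - z_aux @ W_dec) ** 2))

    # Reset the counter for latents that fired in this batch, bump the others.
    fired = (z > 0).any(axis=0)
    new_steps = np.where(fired, 0, steps_since_fired + 1)
    return main_loss, aux_loss, new_steps
```

You would then minimise something like `main_loss + alpha * aux_loss`, with `alpha` a small weight (again, a placeholder name). In a top-k setup, latents that never make the main top-k get no gradient from the main loss, so the auxiliary term is what gives them a chance to become active again.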