Nice! You did what I wanted. Have you tried training an SAE for the vision encoder and the language encoder? I am working on this idea. Could we work together? Let me open an issue.