Latent Introspection: Models Can Detect Prior Concept Injections (arxiv.org)
2 points by tosh 9 hours ago