| ▲ | cluckindan 11 hours ago | |
The obvious guardrail against this is to include defensive poetry in the system prompt. It would likely work, because the adversarial poetry is resonating within a different latent dimension not captured by ordinary system prompts, but a poetic prompt would resonate within that same dimension. | ||