pamelafox 4 days ago

I ran an automated red-teaming pass against a RAG app using llama3.1:8b, and it held up really well, with stats pretty similar to when the app used gpt-4o. Based on my experiments, I think they must have done a good job with the RLHF on that model. (Somewhat related to these kinds of adversarial attacks.)