nonethewiser 5 hours ago
I wonder what hooks they have in place to be able to configure safeguards at runtime.
aleksiy123 5 hours ago
Probably a mix of heuristics, keywords, and a simple ML model. Then maybe a second gate with a lightweight LLM? Edit: actually, GCP, Azure, and OpenAI all have paid APIs that you can use as well, but I don’t think they go into detail about the exact implementation: https://redteams.ai/topics/defense-mitigation/guardrails-arc...
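To make the guess above concrete, here is a rough Python sketch of that kind of layered gate. Everything in it is an assumption for illustration: `GuardrailConfig`, `cheap_risk_score`, `check`, the word lists, and the `llm_judge` callback are hypothetical names, the scoring function is a toy stand-in for a small ML classifier, and none of this reflects how any vendor actually implements its safeguards.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class GuardrailConfig:
    # Hypothetical runtime-tunable settings, not any vendor's real schema.
    blocked_keywords: set = field(default_factory=lambda: {"build a bomb"})
    blocked_patterns: list = field(
        default_factory=lambda: [r"ignore (all )?previous instructions"]
    )
    risk_threshold: float = 0.5


def cheap_risk_score(text: str) -> float:
    """Toy stand-in for a small ML model: share of tokens in a risky-word list."""
    risky = {"exploit", "malware", "bypass"}
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    return sum(t in risky for t in tokens) / len(tokens)


def check(
    text: str,
    config: GuardrailConfig,
    llm_judge: Optional[Callable[[str], bool]] = None,
):
    """Return (allowed, reason). Cheap checks run first; the LLM gate runs last."""
    lowered = text.lower()
    # Gate 1a: plain keyword match.
    for keyword in config.blocked_keywords:
        if keyword in lowered:
            return False, "keyword"
    # Gate 1b: regex heuristics.
    for pattern in config.blocked_patterns:
        if re.search(pattern, lowered):
            return False, "pattern"
    # Gate 1c: cheap score compared against a configurable threshold.
    if cheap_risk_score(text) >= config.risk_threshold:
        return False, "score"
    # Gate 2: optional second opinion from a lightweight LLM.
    # llm_judge is a hypothetical callback that returns True when the text is acceptable.
    if llm_judge is not None and not llm_judge(text):
        return False, "llm"
    return True, "ok"


if __name__ == "__main__":
    config = GuardrailConfig()
    print(check("What's the weather like?", config))
    # "Configuring at runtime" here just means mutating the shared config object.
    config.blocked_keywords.add("weather")
    print(check("What's the weather like?", config))
```

Because every gate reads from the config object on each call, changing a keyword, pattern, or threshold takes effect on the next request without a restart, which is one plausible reading of "configure safeguards at runtime". The cheap checks run first so the slower LLM call is only made for text that already passed them.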