| ▲ | reassess_blind 5 hours ago | |
It looks as if this tool has traditional static rules to allow/deny requests, as well as a secondary LLM-as-a-judge layer for, I imagine, the kinds of rules that would be messy or too convoluted to implement using standard rules. | ||
| ▲ | stingraycharles 3 hours ago | parent [-] | |
I think the parent’s point is that this should be implemented using e.g. Bayesian statistics rather than an LLM, as the judge LLM is vulnerable to the exact same types of attacks that it’s trying to protect against. Most proper LLM guardrails products use both. | ||