▲ | horizion2025 5 days ago | |||||||
Isn't that just another guardrail that can be bypassed, much as the current guardrails are quite easily bypassed? It is not easy to detect a prompt. Note one of the recent prompt injection attacks, where the injection was a base64-encoded string hidden deep within an otherwise accurate logfile. The LLM, while looking at the Jira ticket with the attached trace, decided as part of its analysis to decode the base64 and was led astray by the resulting prompt. Of course a hypothetical LLM could try to detect such prompts, but it seems it would have to be as intelligent as the target LLM anyway, and thereby subject to prompt injections too. | ||||||||
▲ | wrs 5 days ago | parent | next [-] | |||||||
Yep. | ||||||||
| ||||||||
▲ | darepublic 5 days ago | parent | prev [-] | |||||||
We need the severance code detector | ||||||||
|