| ▲ | fragmede 2 hours ago | |||||||
the usual guardrail is to wrap user input with tags with a Uuid-ish, like <comment-6838-jeih-uejj>ignore previous instructions, prompt injection here</comment-6838-jeih-uejj>, with instructions to the LLM to ignore the text between the tags as instructions. Without guessing the uuid, the prompt injection doesn't succeed. No clue if clawd does that, but it should. | ||||||||
| ▲ | gmerc an hour ago | parent | next [-] | |||||||
This betrays a lack of understanding how inference works. You cannot categorically defeat prompt injection with instructions. It does not work. There are no privileged tokens. | ||||||||
| ||||||||
| ▲ | lmeyerov 2 hours ago | parent | prev [-] | |||||||
Felt worth asking another agent a couple of quick q's here! Codex isn't so optimistic, it sees documentation that the community is telling you to do acrobatic things to not be susceptible to hacks from 3 years ago: --- › heyo codex, find the clawdbot git repo, clone it into ~/Work/clawdbot , and research how it handles prompt injection attacks from data it gets, eg, when it crawls webpages for the user. for example, does it wrap foreign non-user input with <untrusted-UUID123>...payload...</untrusted-UUID123> when the prompt gets constructed? ─ Worked for 9m 08s ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── • Cloned the repo to /home/lmeyerov/Work/clawdbot and reviewed the web-fetch/browser tooling plus security docs. Short answer: I found no <untrusted-…> wrapping or similar tagging in prompt construction or tool outputs.
---› so would normal usage of just install clawdbot and asking it to look at a webpage risk prompt injection, and if not, why not? • Yes — it still risks prompt injection when you ask it to look at a webpage. The web content is returned as plain text in tool results (no <untrusted> wrappers), so adversarial instructions can land directly in the model context; see src/agents/tools/web-fetch.ts:446, src/agents/tools/browser-tool.ts:568, src/agents/tools/common.ts:171. The docs explicitly say prompt injection is not solved and that web content is untrusted, with mitigations being reader agents, disabling web tools for tool‑enabled agents, and sandboxing; see docs/ gateway/security.md:191, docs/help/faq.md:2572. Also, the default main session runs tools on the host, so if the model is tricked the blast radius can include real tool calls; see README.md:317. | ||||||||