schipperai 4 hours ago
Good challenges! xargs falls through to unknown -> ask, and find -exec goes through a flag classifier that detects the inner command. For example, find / -exec rm -rf {} + is caught as filesystem_delete outside the project.

The npm test is a good one. Content inspection catches rm -rf and other sketchy stuff at write time, but something more innocent-looking could slip through. That said, a realistic threat model here is accidental damage or prompt injection, not Claude deliberately poisoning its own package.json.

But I hear you. Two improvements are coming to address this class of attack:

- Script execution inspection: when nah sees python script.py, read the file and run content inspection + LLM analysis before execution
- LLM inspection for Write and Edit: content that's suspicious but doesn't match any deterministic pattern gets routed to the LLM for a second opinion

This won't close the gap 100% (a sandbox is the answer to that), but it gets a lot better.
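To make the find -exec case concrete, here is a rough Python sketch of the idea. It is not nah's actual code: the names (PROJECT_ROOT, Verdict, inner_command, search_paths, classify) and the filesystem_read label are made up for illustration, and only the unknown and filesystem_delete labels come from the comment above. It also only recognizes rm as the inner command, where a real classifier would presumably cover far more.

```python
import posixpath
import shlex
from typing import List, NamedTuple, Optional

PROJECT_ROOT = "/home/user/project"  # hypothetical project root (assumption)

# find primaries that run another command on each match
EXEC_FLAGS = ("-exec", "-execdir", "-ok", "-okdir")


class Verdict(NamedTuple):
    action: str            # e.g. "filesystem_delete" or "unknown"
    outside_project: bool  # True if any search path escapes PROJECT_ROOT


def inner_command(argv: List[str]) -> Optional[List[str]]:
    """Return the tokens after the first exec-style flag, up to ';' or '+'."""
    for i, tok in enumerate(argv):
        if tok in EXEC_FLAGS:
            inner = []
            for t in argv[i + 1:]:
                if t in (";", "+"):  # simplified terminator handling
                    break
                inner.append(t)
            return inner
    return None


def search_paths(argv: List[str]) -> List[str]:
    """Collect find's starting points: tokens before the first expression."""
    rest = argv[1:]
    while rest and rest[0] in ("-H", "-L", "-P"):  # skip symlink options
        rest = rest[1:]
    paths = []
    for tok in rest:
        if tok.startswith("-") or tok in ("(", "!"):
            break
        paths.append(tok)
    return paths or ["."]  # assume the current directory when none is given


def is_outside_project(path: str, root: str = PROJECT_ROOT) -> bool:
    """Resolve path lexically against root and check whether it escapes it."""
    full = posixpath.normpath(posixpath.join(root, path))
    return posixpath.commonpath([full, root]) != root


def classify(command: str) -> Verdict:
    argv = shlex.split(command)
    if not argv or argv[0] != "find":
        # Everything this sketch does not understand (xargs included)
        # stays "unknown", which the comment says maps to asking the user.
        return Verdict("unknown", False)
    inner = inner_command(argv)
    if inner is None:
        return Verdict("filesystem_read", False)  # hypothetical label
    if inner and inner[0] == "rm":
        outside = any(is_outside_project(p) for p in search_paths(argv))
        return Verdict("filesystem_delete", outside)
    return Verdict("unknown", False)
```

With this sketch, classify("find / -exec rm -rf {} +") yields action filesystem_delete with outside_project set, while classify("xargs rm -rf") stays unknown. The path check is purely lexical and does not follow symlinks, which is one more reason a sandbox is the stronger answer.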