wongarsu 3 hours ago

The posted page has an entire section titled "Why didn't Mythos find this?"

tl;dr: the bug spans three components in different code bases, each of which does something reasonable when looked at in isolation. The bug is in their interaction, specifically in the properties each one assumes about the value that eventually gets exposed as request.url.path. That was apparently too subtle for current Anthropic models to spot.
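To make the shape of that kind of bug concrete, here is a minimal sketch in Python. It is not the actual bug (the details aren't given in this thread), and every function name here is hypothetical. It only shows how three pieces that each look fine alone can disagree about what a path value means:

```python
from urllib.parse import unquote


def parse_request_path(raw_target: str) -> str:
    """Component 1 (hypothetical parser): returns the path exactly as received,
    without percent-decoding. Reasonable on its own."""
    return raw_target.split("?", 1)[0]


def auth_middleware_allows(path: str) -> bool:
    """Component 2 (hypothetical middleware): blocks anything under /admin.
    Reasonable on its own, but silently assumes the path is already decoded."""
    return not path.startswith("/admin")


def route(path: str) -> str:
    """Component 3 (hypothetical router): percent-decodes before matching.
    Also reasonable on its own."""
    decoded = unquote(path)
    return "admin_panel" if decoded.startswith("/admin") else "public_page"


def handle(raw_target: str) -> str:
    """Wires the three components together the way a framework might."""
    path = parse_request_path(raw_target)
    if not auth_middleware_allows(path):
        return "403"
    return route(path)


print(handle("/admin/users"))    # blocked by the middleware
print(handle("/%61dmin/users"))  # middleware sees an undecoded path, router decodes it
```

Reviewing any one of these functions by itself turns up nothing wrong. The problem only shows once you trace the same value through all three and notice they don't agree on whether it has been decoded.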

hsbauauvhabzb 2 hours ago | parent [-]

So an LLM was unable to reason about a codebase to find cross-library vulnerabilities.

Your response was a weak excuse. It’s a clear demonstration of the shortcomings of LLMs, which will inevitably cause headlines in the future.

wongarsu 39 minutes ago | parent [-]

If you point an LLM at a middleware and ask it to find vulnerabilities, then not finding this is a shortcoming.

Whether "LLM failed to spot vulnerability that took humans 8 years to find" is a great headline about the shortcomings of LLMs is questionable, but it is a good example of a category of bug that is particularly hard for humans and LLMs alike to spot.