| ▲ | amazingamazing 3 hours ago |
| And to think some said developers aren’t affected by marketing. The whole thing is a psyop - wow it’s so amazing we can’t give it to you. Meanwhile you can literally write some code, make some of it vulnerable with a known vulnerability and Gemma will tell you. You can go and try it now. There’s nothing mystical about it. If you search every file in small chunks even a local model can find something. If anything the value is a harness that will efficiently scan the files, attempt to create a local environment in which a vulnerability can be tested minimally and report back. |
|
| ▲ | cvwright 3 hours ago | parent | next [-] |
| It’s easy to find sketchy lines of code in any large C project. The big advance that they are claiming with Mythos is the ability to triage all the hundreds of candidate vulns and automatically generate exploits to prove that the real ones are real. And if they’re really finding 27-yr-old 0-days in OpenBSD, then it’s not just hype. |
| |
| ▲ | amazingamazing 3 hours ago | parent | next [-] | | I do not think you need a great model to do this, just great automation. There’s a reason they haven’t open-sourced the actual process by which they did this, stubbing out the Mythos model itself. | | |
| ▲ | klausa 2 hours ago | parent [-] | | About five minutes into this video: https://www.youtube.com/watch?v=1sd26pWhfmg They also say publicly in their Opus 4.6 post (https://red.anthropic.com/2026/zero-days/): >In this work, we put Claude inside a “virtual machine” (literally, a simulated computer) with access to the latest versions of open source projects. We gave it standard utilities (e.g., the standard coreutils or Python) and vulnerability analysis tools (e.g., debuggers or fuzzers), but we didn’t provide any special instructions on how to use these tools, nor did we provide a custom harness that would have given it specialized knowledge about how to better find vulnerabilities. This means we were directly testing Claude’s “out-of-the-box” capabilities, relying solely on the fact that modern large language models are generally-capable agents that can already reason about how to best make use of the tools available. | | |
| ▲ | amazingamazing 2 hours ago | parent [-] | | Again, marketing materials by Anthropic. You realize this is by Anthropic themselves, right? And again, not reproducible by outsiders. So useless. | | |
| ▲ | klausa an hour ago | parent [-] | | You've moved goalposts from "they haven't open-sourced the process" to "these are marketing materials by Anthropic". I think you're right to be skeptical, but they _have_ talked about the process publicly. And I don't think there's anything there that is not reproducible by outsiders? They have access to the same Opus 4.6 that you and I do; though not having to pay for the tokens certainly helps. I'm pretty sure if you wanted to burn a couple thousand bucks, you'd reproduce at least some of these findings. | | |
| ▲ | amazingamazing an hour ago | parent [-] | | The goalpost is the same: reproducibility. Talking about a process isn’t reproducible. This entire discussion is why I feel developers are so gullible. You are defending a process that’s entirely opaque and that you can’t even use. It’s crazy. |
| |
| ▲ | aftbit an hour ago | parent | prev [-] | | What's the CVE for the 27-yr-old 0-day in OpenBSD? | | |
| ▲ | ViewTrick1002 12 minutes ago | parent [-] | | Depends on the impact? CVE scores are known to be a worthless metric when looking at the actual impact. Linux now labels every single bug as a CVE. |
|
| ▲ | thrance an hour ago | parent | prev | next [-] |
| > The whole thing is a psyop - wow it’s so amazing we can’t give it to you. Anyone else still remember when OpenAI refused to release GPT2-xl because it was "too powerful"? |
|
| ▲ | ceejayoz 3 hours ago | parent | prev [-] |
| > make some of it vulnerable with a known vulnerability and Gemma will tell you Well, yeah. Isn't the idea finding unknown vulnerabilities? |
| |
| ▲ | amazingamazing 3 hours ago | parent [-] | | Yes, but the point is that you can actually test what I am asserting right now. Can you use Mythos and reproduce Anthropic's claims? | | |
| ▲ | ceejayoz 2 hours ago | parent [-] | | But I don't need to test that; we all know it's possible. Known vulnerabilities are in the training set! Mythos is being claimed to have new abilities, right? What would testing the old model on a different use case do? | | |
| ▲ | amazingamazing 2 hours ago | parent [-] | | You’re conflating types of vulnerabilities with the vulnerability itself. Take CVE-2026-4747, which was supposedly found by Mythos. The actual issue here is a stack overflow. Opus can find those. | | |