hedora 4 hours ago
Isn't OpenAI's public flagship already beating Mythos on penetration testing? I get the impression Mythos is just valuation-juicing for an IPO more than anything else. The fact that they haven't released it yet suggests a cost/margins issue to me. Short term, I'll probably keep using Anthropic, but my long-term bet is that locally-served models win, if only because the quest for profitability will probably lead to intentionally-nerfed / enshittified frontier models. At other vendors, ad placement within LLM responses is either coming or already here. Anthropic's handling of OpenClaw shows they're willing to engage in anti-competitive behavior, and the courts are not in a hurry to stop them. Why would I pay them $200 a month for such treatment when a $2K box does what I need locally?
ameliaquining 2 hours ago
Mythos is dramatically better specifically at finding zero-day vulnerabilities and developing exploits for them, that being what it was designed to do. On other cybersecurity tasks, GPT-5.5 is at least as good, but finding and exploiting zero-days is a particularly scary capability, which is why Mythos is a big deal. See, e.g., https://forum.effectivealtruism.org/posts/8yztpbjuPkyXsmA6n/....
srmatto 3 hours ago
Which benchmarks are you referencing that compare the models on penetration testing?
senordevnyc an hour ago
Please link to the $2K box that gives Opus-level performance!