rvz 4 hours ago

> Open models are promising and cost a fraction of what the proprietary models cost, which leaves the big two vulnerable once companies start to feel the cost of tokens.

Anthropic are scared of open-weight models and need to fear-monger to keep you paying for their models.

That's the whole point of their 'safety' marketing narrative, the account bans, and Dario playing the AI scarecrow, scaremongering the world about nonsense like 'Mythos'.

'Mythos' is already here in the form of open-weight models that also found the same vulnerabilities as Anthropic did.

danieldoesbio 3 hours ago

Genuine question here about the claim that open-weight models find the same vulnerabilities as Mythos: is it just a matter of false negatives/positives? I've seen a few cases where people show that other models (even Opus) can find the same vulnerabilities given many passes. Is there some disadvantage to the extra passes that makes the claimed Mythos performance more valuable (assuming it finds them in fewer)?

intothemild 2 hours ago

The thing is, Mythos found those with multiple passes, thousands of passes... So given thousands of passes, or perhaps the same budget, yes, cheaper open-weight models could potentially find (and have found) the same or similar vulnerabilities.

Mythos screams of marketing hype, and nothing more. Opus 4.7 isn't really a meaningful upgrade in any sense, other than being more expensive.

Once you see what something like Qwen3.6-35B-A3B can do... at just a FRACTION of the size of the larger models, you'll understand that the future is open-weight models you can run yourself.

Same goes for companies: bringing inference on-site isn't hard, and I'm actively building tooling to orchestrate it.

danieldoesbio 21 minutes ago

What is the failure state for a pass that doesn't find a real vulnerability? Do the models report no issues, or do they hallucinate issues that aren't real? I'm trying to run open-weight local models and am finding them really impressive... Just also trying to understand the cybersecurity side of all this.