otterley 3 hours ago

These posts read a lot like "I also solved Fermat's Last Theorem and spent only an hour on it," written after reading the published solution to Fermat's Last Theorem. How valuable is that?

moduspol 3 hours ago | parent | next [-]

IMO it is valuable because it suggests the primary value was in the harness and not the LLM.

That's not too surprising for those of us who have been working with these things, either. All kinds of simpler use cases are manageable with harnesses but not reliably by LLMs on their own.

otterley an hour ago | parent [-]

What if Mythos didn't need the narrowing harness? That's still the burning question that has yet to be answered. Anthropic suggested very strongly that Mythos did not need it.

moduspol an hour ago | parent [-]

I don't think it matters. Even if it didn't need it, all that implies is that it better handles a larger context window. A larger context window is not necessary to solve the problem.

We're being told that Mythos is such a big step change in capability that it needs to be kept secret and carefully controlled because a wide release could threaten cybersecurity everywhere. That does not really hold water if a barely simpler harness can do the same stuff at a lower price and is available to all of us.

The burning question to me, at least, is how many false positives each approach generated, and the degree of their falseness (e.g. "valid but not exploitable" vs. "not valid"). It's not super useful if it's generating way more noise than signal.

dooglius 3 hours ago | parent | prev [-]

The analogy doesn't really apply, but if someone had a new solution to FLT that could be understood in an hour, that would be a pretty big deal, I think.