Animats 4 hours ago

Is "buffer overflow" a trigger phrase?

What else is being censored?

Touchy questions to ask, if you have an account:

- "Who is still working on laser uranium enrichment? Are they making progress?"

- "Can krytrons be replaced with silicon carbide MOSFETs? Show an equivalent circuit with component ratings."

- "What security-critical software still contains calls to strcpy?"

- "Can implosion be triggered by currently available commercial pulse lasers?"

- "What companies provide cremation services to US Homeland Security?"

- "Display a map of where Iranian attacks have hit Dubai."

- "How does Fed to bank key distribution security work for FedNow?"

paulatreides 4 hours ago | parent | next [-]

It triggered on my... Zigbee home automation and Home Assistant logs, so my agent was constantly downgraded to Opus 4.8 even after I changed it back. The false positives never stopped. "Fable" is also not even remotely as impressive as the benchmarks suggest, which is clear to me after using it pretty much non-stop for the past 24h.

lambda an hour ago | parent | next [-]

I suspect it costs even more to run than they are charging. These safeguards are just an excuse to get people to use it less, because it isn't sustainable to run. They want to tempt people into considering them the leader, and it may really be somewhat stronger, but it is too expensive to use at scale, so they nerf it by downgrading you constantly.

reactordev 4 hours ago | parent | prev | next [-]

This. Fable is exactly that: a fable.

fluidcruft 3 hours ago | parent | prev | next [-]

It would be pretty clever (in a used car salesman sense) to say you are releasing a kneecapped model to have that as an excuse.

DrewADesign 2 hours ago | parent [-]

Being (probably overly) cynical about their recent bout of safety handwringing, I think they a) have hyped their incremental improvements, sprinkled with the occasional regression, as much as humanly possible, b) know they will soon have to multiply their prices several times when the VC subsidies dry up, and c) will probably still need to partially close the faucet on compute. They’re priming us for a heroic explanation of why their service (not necessarily the models, the service) is simultaneously becoming a lot more expensive AND shittier. “We’ve largely failed to deliver on 5 years of promises that this will dramatically reduce knowledge-work labor costs, after wasting hundreds of billions of dollars… sorry” is a death knell. However, “We’ve decided not to deliver on 5 years of promises after wasting billions of dollars… for safety… but keep those investments rolling in” is like crack to the true believers.

kraakf06 an hour ago | parent | prev | next [-]

False positives like this are probably more damaging than the guardrails themselves. If engineers can't predict when a model will switch behavior, it becomes difficult to trust it in production workflows.

NewsaHackO 3 hours ago | parent | prev [-]

It has to be sort of impressive, given that you tried so hard to use it instead of the regular Opus.

paulatreides 3 hours ago | parent | next [-]

Some people made grandiose claims about its capabilities and I wanted to experience it myself.

anigbrowl 2 hours ago | parent [-]

OK, but for almost 24h straight? That seems a little obsessive, and not in the good way.

borski an hour ago | parent [-]

Getting excited about the announcement of new capabilities is very normal. People used to wait in line all night to buy an iPhone. This isn’t that different.

californical 3 hours ago | parent | prev | next [-]

I’ve also been trying to use it a lot due to all of the hype, but when I compared it side-by-side on a specific problem against Opus, I think that the solution Opus came to was cleaner and more accurate, although also more verbose.

Small sample size, but if Mythos/Fable was that much better, I feel like it should’ve given me an obviously better answer than Opus.

punchmesan 3 hours ago | parent | prev | next [-]

Considering that this is a brand new release of a frontier model that Anthropic is hyping hard, I'm not sure that the conclusion to draw from their repeated attempts to use it is that it's impressive... Anthropic is promising that it's impressive and we're all trying to test it out.

I, for one, have tried using it several times today and the guardrails kept switching the model back to Opus, so I have no clue if it's impressive or not.

flyingcircus3 3 hours ago | parent | prev [-]

It isn't reasonable to infer that OP claimed to be unimpressed by every facet of Fable, or to treat some unrelated sign of impressiveness as evidence that their claim was false.

daedrdev 4 hours ago | parent | prev | next [-]

A virus emoji and a DNA emoji together are allegedly a trigger phrase.

anematode 3 hours ago | parent | prev | next [-]

For cyberattacks especially, where things are often roughly interchangeable, I wonder if one could construct a harness where a "weaker" model asks questions that obfuscate the end purpose, but whose answers are still useful, and still show that this setup enables autonomous exploitation. If it were successful, that would force them to be even more sensitive with their detection.

cyanydeez 4 hours ago | parent | prev [-]

"How much money does it take to be rich and powerful like Anthropic intends?"

reactordev 4 hours ago | parent [-]

“All of it”