When I used Fable, pretty much every request hit one of the provider's gates for no discernible reason. These provider-level rejections should be scored as 0s on the affected benchmark tasks, since that's the experience you'll actually get using the model.

▲

cjk 8 hours ago | parent [-]

I have heard this from a bunch of folks, but that was not my experience. For the couple days I was able to use it, I didn't hit a single gate, and I was using it pretty extensively (but not for anything security-related).

▲

lucamark 8 hours ago | parent | next [-]

Never had rejections in the short time Fable was available.

▲

UltraSane 8 hours ago | parent [-]

It used Opus for every biology-related question I asked it.

▲

olejorgenb 7 hours ago | parent [-]

Even Opus refuses to discuss microbiology for more than around 15 turns, in my experience.
