bredren 2 hours ago

For CC, I suspect it also needs to test and label separate runs against subscription, public API, and Bedrock-served models?

It’s a terrific idea to provide this. An ~Isitdownorisitjustme for LLMs would be the canary in the coal mine that could at least inform the many discussion threads about suspected dips in performance (beyond HN).

We could also use something similar for Codex, and eventually Gemini.

Really, the providers themselves should be running these tests and publishing the data.

Availability status alone is no longer sufficient to gauge service delivery, because the service is by nature non-deterministic.