cgorlla 3 days ago

I checked with the team, and it may have been a temporary rate-limiting issue. We've corrected the results; it seems to be an isolated case.

https://www.ctgt.ai/benchmarks

rancar2 3 days ago

Thanks for the thoroughness! I look forward to the next steps as you all apply this approach in other unique ways to have even better results.

SomaticPirate 3 days ago

Are these benchmarks correct in showing that adding Anthropic's Constitutional AI system prompt lowered results across all the models?