| ▲ | the_gipsy 5 hours ago | |
With AIs, it seems like there is never a useful comparison. | ||
| ▲ | theptip 3 hours ago | |
You can build evals. Look at Harbor or Inspect. It’s just more work than most are interested in doing right now. | ||
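For a sense of what "building evals" involves at its smallest, here is a rough sketch in plain Python. It does not use the Harbor or Inspect APIs; the cases, the exact-match scorer, and the stand-in models `model_a` and `model_b` are all hypothetical placeholders for real prompts and real model calls.

```python
from typing import Callable, Dict, List, Tuple

Model = Callable[[str], str]

# Hypothetical eval set: (prompt, expected answer) pairs.
CASES: List[Tuple[str, str]] = [
    ("2 + 2", "4"),
    ("capital of France", "Paris"),
    ("3 * 3", "9"),
]


def accuracy(model: Model, cases: List[Tuple[str, str]]) -> float:
    """Fraction of cases where the stripped model output equals the expected answer."""
    if not cases:
        return 0.0
    hits = sum(1 for prompt, expected in cases if model(prompt).strip() == expected)
    return hits / len(cases)


def compare(models: Dict[str, Model], cases: List[Tuple[str, str]]) -> Dict[str, float]:
    """Score every model on the same cases so the numbers are directly comparable."""
    return {name: accuracy(fn, cases) for name, fn in models.items()}


# Stand-in "models" (hypothetical); a real eval would call an actual model here.
def model_a(prompt: str) -> str:
    return {"2 + 2": "4", "capital of France": "Paris"}.get(prompt, "")


def model_b(prompt: str) -> str:
    return {"2 + 2": "4"}.get(prompt, "")


if __name__ == "__main__":
    for name, score in compare({"model_a": model_a, "model_b": model_b}, CASES).items():
        print(f"{name}: {score:.2f}")
```

The extra work is mostly in what this sketch leaves out: picking cases that reflect your own tasks and writing a scorer that is better than exact match.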
| ▲ | jascha_eng 5 hours ago | |
Yup, it's all vibes. And Anthropic is still winning on those, in my book. | ||