swyx 8 hours ago
yes, well aware :) the numbers shown are on "house" harnesses, e.g. Codex with GPT and Claude Code with Opus. fwiw we have examples of each model doing better on NON-house harnesses too. speaking just for myself, i think the "the labs are RLing on their own harnesses" narrative is kinda overstated once you consider that the labs want to have a meaningful API business. often the labs will give guidance on what is preferred, and the agent labs can easily match their tool contract to that. which is to say, the "home turf advantage" isn't as large as you think it is if you try a little bit.
Bolwin 5 hours ago
What is the "house" harness for MiniMax? They haven't released one.
chris_st 7 hours ago
What "non-house" harnesses have you found to work best?