tomtom1337 5 hours ago

Any suggestions for «orchestrating» this type of experiment?

And how does one compare the results in a way that is easy to parse? Having 7 models produce 1 PR each is one way, but 7 separate PRs don’t feel very easy to compare.