yanis_t 6 hours ago

Simon, does your pelican test really capture differences among models, or should you at least run it something like 10 times to average out the random effects?

simonw 6 hours ago

I've been meaning to do a "run 3 times and pick the best" version for quite a while. I should really pull the trigger on that one. Currently it's one-shot only.

xiphias2 5 hours ago

Best-of-3 would be cheating and would ruin the test. Middle-of-3 makes more sense.

nik736 5 hours ago

Why would you need the 3rd run if you pick the "one in the middle"?

jmaw 3 hours ago

Middle as in neither the best nor the worst, as opposed to the second one generated in sequence.

But "not the best, not the worst" is somewhat subjective, so I'm not sure how well that would work.
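
To make the best-of-3 vs. middle-of-3 distinction concrete, here is a minimal sketch. It assumes each run has already been given a numeric quality score, which is exactly the subjective part mentioned above. The function names and the example scores are hypothetical, not anything from the actual pelican test.

```python
def best_of(scores):
    """Return the index of the highest-scoring run."""
    return max(range(len(scores)), key=scores.__getitem__)


def middle_of(scores):
    """Return the index of the run with the median score.

    For three runs this is the one that is neither best nor worst,
    regardless of the order in which the runs were generated.
    """
    order = sorted(range(len(scores)), key=scores.__getitem__)
    return order[(len(order) - 1) // 2]


# Hypothetical scores for runs 0, 1 and 2, in generation order.
scores = [5.0, 1.0, 9.0]

print("best-of-3 picks run", best_of(scores))
print("middle-of-3 picks run", middle_of(scores))
```

With these made-up scores the middle pick is run 0, not run 1, which is the point of the reply above: "middle" refers to rank by quality, not position in the sequence. All three runs are needed because the ranking cannot be known until every run has been scored.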