▲ | mkarrmann a day ago | |
Horace He at Thinking Machines just dropped an awesome article describing exactly this: https://thinkingmachines.ai/blog/defeating-nondeterminism-in... TL;DR: assuming you've squashed all regular non-determinism (itself a tall ask), you either need to ensure you always batch requests deterministically, or ensure all kernels are "batch invariant" (which is absolutely not common practice to do). | ||
▲ | a day ago | parent [-] | |
[deleted] |