Remix.run Logo
dataflow 8 hours ago

I have zero actual experience in training models, but in general, when parallelizing work: there can be fundamental nondeterminism (e.g., some race conditions) that is tolerated, whose recording/reproduction can be prohibitive performance-wise.