jsheard a day ago:
| Doesn't that depend on the implementation? There's a trade-off between performance and determinism for sure, but if determinism is what you want then it should be possible. |

jb1991 a day ago:
If you fix random seeds, disable dropout, and configure deterministic kernels, you can get reproducible outputs locally. But you still have to control for GPU non-determinism, parallelism, and even library version differences. Some frameworks (like PyTorch) have flags (torch.use_deterministic_algorithms(True)) to enforce this.
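
(As a concrete illustration, a minimal PyTorch sketch of the settings described above; the seed value, model, and input are arbitrary placeholders, not anything from this thread.)

    # Reproducibility settings sketch; seed, model, and input are placeholders.
    import os
    import torch

    os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"  # needed for deterministic cuBLAS on GPU

    torch.manual_seed(0)                       # fix the CPU RNG seed
    torch.cuda.manual_seed_all(0)              # ...and the seed on every GPU (no-op without CUDA)
    torch.use_deterministic_algorithms(True)   # raise an error if a non-deterministic kernel runs
    torch.backends.cudnn.benchmark = False     # stop cuDNN from picking kernels by timing

    model = torch.nn.Sequential(torch.nn.Linear(8, 8), torch.nn.Dropout(0.1))
    model.eval()                               # eval mode disables dropout

    x = torch.randn(4, 8)
    with torch.no_grad():
        y1 = model(x)
        y2 = model(x)
    assert torch.equal(y1, y2)                 # bit-identical on the same hardware and library stack

Even with all of that, a different GPU, driver, or library version can still change the bits, which is the point above.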

geor9e a day ago:
what if you set top_p=1, temperature=0, and always run it on the same local hardware?
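
(For reference, a minimal sketch of those sampling settings using Hugging Face transformers; the gpt2 checkpoint and the prompt are just placeholders. do_sample=False is the greedy-decoding equivalent of temperature=0.)

    # Greedy decoding: do_sample=False takes the argmax token at every step,
    # which is what temperature=0 / top_p=1 amounts to. Model choice is arbitrary.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    inputs = tok("The same prompt every time", return_tensors="pt")
    out = model.generate(**inputs, do_sample=False, max_new_tokens=20)
    print(tok.decode(out[0]))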

mkarrmann a day ago:
Horace He at Thinking Machines just dropped an awesome article describing exactly this: https://thinkingmachines.ai/blog/defeating-nondeterminism-in...

TL;DR: assuming you've squashed all regular non-determinism (itself a tall ask), you either need to ensure you always batch requests deterministically, or ensure all kernels are "batch invariant" (which is absolutely not common practice to do).
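
(To make the batching point concrete, a plain-Python illustration of why reduction order matters: floating-point addition is not associative, so summing the same values in a different order, which is effectively what changing batch sizes or splits does inside a kernel, can give a slightly different result. The values below are arbitrary.)

    # Same values, different summation order, different floating-point result.
    vals = [1e8, 1.0, -1e8, 1e-8] * 1000

    left_to_right = 0.0
    for v in vals:
        left_to_right += v

    reordered = 0.0
    for v in sorted(vals):               # same numbers, different order
        reordered += v

    print(left_to_right, reordered)
    print(left_to_right == reordered)    # False: the two sums differ

A "batch-invariant" kernel, in the article's sense, is one whose reduction order (and hence result) for a given request doesn't change when the surrounding batch does.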

daemonologist a day ago:
Maybe if you run it on CPU. (Maybe on GPU if all batching is disabled, but I wouldn't bet on it.)

mrheosuper a day ago:
cosmic rays will get you

worble a day ago:
| Yes, that's the joke |

jb1991 a day ago:
| This. I’m still amazed how many people don’t understand how this technology actually works. Even those you would think would have a vested interest in understanding it. |