cyanydeez 6 days ago

Just try to find some benchmark top_k, temp, etc. parameters for llama.cpp. There's no consistent framing of any of these things. Temp should be effectively 0 so it's at least deterministic in its random probabilities.
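To illustrate the "temp effectively 0" point, here is a minimal sketch of temperature-scaled softmax. It is not llama.cpp's actual sampler; the function name `softmax_with_temperature` and the logit values are made up for illustration. It only shows that as temperature shrinks, nearly all probability mass lands on the top logit, so sampling behaves like a plain argmax.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn logits into probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max so exp() cannot overflow
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # hypothetical values, not from any real model

warm = softmax_with_temperature(logits, 1.0)   # probability spread over all tokens
cold = softmax_with_temperature(logits, 0.01)  # almost everything on the top token

print("temp 1.0 :", warm)
print("temp 0.01:", cold)
```

Note that the temperature has to stay strictly positive here; an exact 0 would divide by zero, which is why implementations typically special-case it as greedy argmax.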

Neywiny 5 days ago | parent | next [-]

Right. There are countless parameters and seeds and whatnots to tweak. But theoretically, if all the inputs are the same, the outputs should be within epsilon of a known-good result. I wouldn't even mandate that temperature or any other parameter be a specific value, just that it's the same across runs. That way you can make sure even the pseudorandom processes are the same, so long as nothing pulls from a hardware RNG or something like that. Which seems reasonable for them to do, so idk, maybe an "insecure RNG" mode.
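A rough sketch of the "same inputs, same pseudorandom stream" idea, using only Python's standard `random` module. The helper `sample_tokens` and the weights are hypothetical and stand in for a real sampler; the point is just that a fixed seed and fixed parameters reproduce the same draws, whatever the parameter values are.

```python
import random

def sample_tokens(weights, seed, n=10):
    """Draw n token ids from a weighted distribution using a private, seeded PRNG."""
    rng = random.Random(seed)  # own generator, so no global or hardware entropy is involved
    tokens = list(range(len(weights)))
    return rng.choices(tokens, weights=weights, k=n)

weights = [0.5, 0.3, 0.2]  # made-up distribution for illustration

run_a = sample_tokens(weights, seed=1234)
run_b = sample_tokens(weights, seed=1234)

print(run_a)
print(run_b)
print("identical:", run_a == run_b)
```

This only covers the sampling side. It says nothing about whether the numbers feeding the sampler are bit-identical between runs.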

andai 3 days ago | parent | prev [-]

>Temp should be effectively 0 so it's at least deterministic in its random probabilities.

Is this a thing? I read an article about how due to some implementation detail of GPUs, you don't actually get deterministic outputs even with temp 0.

But I don't understand that, and haven't experimented with it myself.

kingstnap 3 days ago | parent [-]

By default CUDA isn't deterministic because of thread scheduling.

The main difference comes from the order in which the partial results of a reduction get added up and rounded.

It only makes a small difference, unless you have an unstable floating-point algorithm. But if you have an unstable floating-point algorithm on a GPU at low precision, you were doomed from the start.
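The rounding-order effect can be reproduced on a CPU without any GPU involved. The sketch below is plain Python double-precision arithmetic, not CUDA, and the variable names are made up; it just shows that floating-point addition is not associative, so a reduction that combines terms in a different order can round differently.

```python
# Same three numbers, two groupings: the difference is tiny but not zero.
left_first = (0.1 + 0.2) + 0.3
right_first = 0.1 + (0.2 + 0.3)
print(left_first, right_first, left_first == right_first)

# An unstable case: cancellation between huge terms makes the order decisive.
absorbed = (1e16 + 1.0) - 1e16   # the 1.0 is lost when added to 1e16
preserved = (1e16 - 1e16) + 1.0  # the huge terms cancel first, so the 1.0 survives
print(absorbed, preserved)
```

The first pair differs only in the last bit, which is the "small difference" case. The second pair differs by the entire answer, which is the unstable case.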