cma 3 days ago

> But I can tell the quality drops even when you do that

Dario said in a recent interview that they never switch to a lower-quality model, meaning one with different parameters, during times of load. But he left room for interpretation on whether they could still use quantization or sparsity. His answer also wasn't clear enough to know whether they use a lower depth of beam search or other cheaper sampling techniques.

He said the only time you might get a different model is when they are A/B testing just before a newly announced release.

And I think he clarified that all of this applies to the web UI and not just the API.

(edit: I'm rate-limited on HN, so here's the source, in reply to the question below: https://www.youtube.com/watch?v=ugvHCXCOmm4&t=42m19s )

dr_dshiv 3 days ago | parent | next [-]

Rate limited on hn! Share more please

cma 3 days ago | parent [-]

https://news.ycombinator.com/item?id=34129956

avarun 3 days ago | parent | prev [-]

Source?