janalsncm an hour ago

I worked on it for a more specialized task (query rewriting). It’s blazing fast.

A lot of inference code is set up for autoregressive decoding now. Diffusion is less mature. Not sure if Ollama or llama.cpp supports it.

stavros 3 minutes ago

How was the quality?