kgeist a day ago

What about constrained decoding (with JSON schemas)? I noticed that my vLLM instance pegs one CPU core at 100%.
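
For context, here is a toy sketch of why constrained decoding could end up CPU-bound, assuming the allowed-token mask is recomputed on the host at every step. The names `allowed_token_mask` and `constrained_greedy_decode` are hypothetical and are not vLLM APIs. A real backend would compile the JSON schema into a grammar or automaton instead of enumerating valid outputs as done here.

```python
from typing import List, Sequence


def allowed_token_mask(prefix: str, vocab: Sequence[str],
                       valid_outputs: Sequence[str]) -> List[bool]:
    """Mark each token that keeps prefix + token a prefix of some valid output.

    This scans the whole vocabulary in a single-threaded Python loop,
    which is the part that would occupy one CPU core.
    """
    mask = []
    for token in vocab:
        candidate = prefix + token
        mask.append(any(out.startswith(candidate) for out in valid_outputs))
    return mask


def constrained_greedy_decode(logits_per_step: Sequence[Sequence[float]],
                              vocab: Sequence[str],
                              valid_outputs: Sequence[str]) -> str:
    """Greedy decoding that only considers tokens allowed by the mask."""
    text = ""
    for logits in logits_per_step:
        mask = allowed_token_mask(text, vocab, valid_outputs)
        best = None
        for i, (score, ok) in enumerate(zip(logits, mask)):
            if ok and (best is None or score > logits[best]):
                best = i
        if best is None:  # no token can extend the text legally
            break
        text += vocab[best]
    return text


# Toy vocabulary and a stand-in for "all strings matching the schema".
vocab = ['{"a":', "1", "2", "}", "x"]
valid_outputs = ['{"a":1}', '{"a":2}']

# Hypothetical model scores; unconstrained greedy would pick "x" first.
logits_per_step = [
    [0.1, 0.2, 0.3, 0.4, 0.9],
    [0.0, 0.1, 0.5, 0.9, 0.9],
    [0.9, 0.9, 0.9, 0.1, 0.9],
]

result = constrained_greedy_decode(logits_per_step, vocab, valid_outputs)
```

The mask work grows with vocabulary size and runs once per generated token, so with a real vocabulary it can become the bottleneck even when the GPU is mostly idle.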