osaariki | 3 days ago
You're exactly right. The llguidance library [1,2] seems to have emerged as the go-to solution for this by virtue of being >10X faster than its competition. It's work from some past colleagues of mine at Microsoft Research, based on the theory of (regex) derivatives, which we previously used to ship a novel kind of regex engine for .NET. It's cool work and AFAIK should ensure full adherence to a JSON grammar. llguidance is used in vLLM, SGLang, internally at OpenAI, and elsewhere. At the same time, I also see a non-trivial JSON error rate from Gemini models in large-scale synthetic generations, so perhaps Google hasn't seen the "llight" yet and is using something less principled.

1: https://guidance-ai.github.io/llguidance/llg-go-brrr

2: https://github.com/guidance-ai/llguidance
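For anyone curious what "derivatives" buys you here, below is a minimal sketch of the Brzozowski-derivative idea in Python. This is illustrative only, not llguidance's actual API: the derivative of a regex with respect to a character is the regex matching whatever may legally follow, so a constrained decoder can mask out any next character (or token) that would drive the grammar into a dead state.

```python
# Minimal sketch of the regex-derivatives idea (Brzozowski derivatives),
# NOT llguidance's actual API. d_c(r) is the regex matching what may
# follow after consuming character c; a constrained decoder rejects any
# continuation whose derivative is a dead (empty) language.
from dataclasses import dataclass

class Re: pass

@dataclass(frozen=True)
class Empty(Re): pass            # matches nothing (dead state)

@dataclass(frozen=True)
class Eps(Re): pass              # matches the empty string

@dataclass(frozen=True)
class Chr(Re):
    c: str                       # matches a single character

@dataclass(frozen=True)
class Seq(Re):
    a: Re; b: Re                 # concatenation

@dataclass(frozen=True)
class Alt(Re):
    a: Re; b: Re                 # alternation

@dataclass(frozen=True)
class Star(Re):
    a: Re                        # Kleene star

def nullable(r: Re) -> bool:
    """Does r accept the empty string?"""
    match r:
        case Eps() | Star(_): return True
        case Seq(a, b): return nullable(a) and nullable(b)
        case Alt(a, b): return nullable(a) or nullable(b)
        case _: return False

def deriv(r: Re, c: str) -> Re:
    """Brzozowski derivative: the language of suffixes after consuming c."""
    match r:
        case Chr(x): return Eps() if x == c else Empty()
        case Seq(a, b):
            left = Seq(deriv(a, c), b)
            return Alt(left, deriv(b, c)) if nullable(a) else left
        case Alt(a, b): return Alt(deriv(a, c), deriv(b, c))
        case Star(a): return Seq(deriv(a, c), r)   # d_c(a*) = d_c(a)·a*
        case _: return Empty()

def is_dead(r: Re) -> bool:
    """Can r no longer match anything at all?"""
    match r:
        case Empty(): return True
        case Seq(a, b): return is_dead(a) or is_dead(b)
        case Alt(a, b): return is_dead(a) and is_dead(b)
        case _: return False

def allowed_next(r: Re, alphabet: str) -> set[str]:
    """Characters a constrained decoder may emit next without leaving
    the grammar -- the per-step token mask, in miniature."""
    return {c for c in alphabet if not is_dead(deriv(r, c))}

def lit(s: str) -> Re:
    """Regex matching exactly the string s."""
    r: Re = Eps()
    for ch in s:
        r = Seq(r, Chr(ch))
    return r

# Toy grammar: "true" | "false" (a sliver of JSON's boolean rule).
bool_re = Alt(lit("true"), lit("false"))
print(allowed_next(bool_re, "abcdefghijklmnopqrstuvwxyz"))            # {'t', 'f'}
print(allowed_next(deriv(bool_re, "t"), "abcdefghijklmnopqrstuvwxyz")) # {'r'}
```

The real engine of course has to handle full JSON-schema-derived grammars and ~100k-token vocabularies per decoding step, which is where the "go brrr" speed work in [1] comes in; the sketch above is just the core idea.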
red2awn | 2 days ago
Cool stuff! I don't get how all the open source inference frameworks have this down but the big labs don't... Gemini [0] is falsely advertising this:

> This capability guarantees predictable and parsable results, ensures format and type-safety, enables the programmatic detection of refusals, and simplifies prompting.

[0]: https://ai.google.dev/gemini-api/docs/structured-output?exam...
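Whatever the provider claims, the "parsable" part is cheap to verify yourself. A tiny sketch of how one might measure the JSON error rate mentioned upthread over a batch of generations; `generations` here is just a stand-in for whatever strings your API calls returned:

```python
# Hedged sketch: count responses that fail to parse as JSON, the way
# one might audit a large synthetic-generation run. `generations` is a
# hypothetical list of raw model outputs, not any specific API's type.
import json

def json_error_rate(generations: list[str]) -> float:
    """Fraction of responses that are not valid JSON."""
    failures = 0
    for text in generations:
        try:
            json.loads(text)
        except json.JSONDecodeError:
            failures += 1
    return failures / len(generations) if generations else 0.0

print(json_error_rate(['{"ok": true}', '{"ok": tr']))  # 0.5
```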