gruez 3 days ago

>What about XML? The plugin can heal XML output as well - contact us if you’d like access.

Isn't this exactly how we got weird html parsing logic in the first place, with "autohealing" logic for mismatched closing tags or quotes?

AlexCoventry 3 days ago

This is probably a bit different. An LLM outputs a token at a time ("autoregressively") by sampling from a per-position token probability distribution, which depends on all the prior context so far. While the post doesn't describe OpenRouter's approach, most structured LLM output works by putting a mask over that distribution, so that any token which would break the intended structure has probability zero and cannot be sampled. So for instance, in the broken example from the post,

    {"name": "Alice", "age": 30
a standard LLM would have stopped there by emitting an end-of-sequence (EOS) token. But because ending here would leave the JSON syntactically invalid, the EOS token is assigned probability zero, and the model is forced to either extend the number "30", add more entries to the object, or close it with "}".
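The masking idea can be sketched in a few lines. This is a toy illustration, not OpenRouter's actual implementation: real constrained decoding masks logits over the model's full vocabulary using a grammar automaton, whereas here the only rule is that EOS is allowed solely when the partial output already parses as complete JSON.

```python
import json

def allowed_tokens(partial: str, vocab: list[str]) -> dict[str, bool]:
    """Toy structured-output mask: forbid the EOS token unless the
    partial output is already a complete, valid JSON document."""
    mask = {}
    for tok in vocab:
        if tok == "<EOS>":
            try:
                json.loads(partial)
                mask[tok] = True   # valid JSON: the model may stop here
            except json.JSONDecodeError:
                mask[tok] = False  # incomplete JSON: stopping is forbidden
        else:
            mask[tok] = True       # (a real grammar would also filter these)
    return mask

vocab = ["<EOS>", "}", ",", '"', "0"]
partial = '{"name": "Alice", "age": 30'

mask = allowed_tokens(partial, vocab)
# EOS is masked out here, because the object is still open;
# after appending "}" the JSON parses, so EOS becomes allowed again.
```

A token whose mask entry is False gets probability zero before sampling, which is exactly why the example above cannot terminate until the "}" appears.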

I haven't played much with structured output, but I imagine the biggest risk is that you may force the model to work with contexts outside its training data, leading it to produce garbage, though hopefully syntactically-correct garbage.

I don't understand, though, why the probability of incorrect JSON wouldn't go to 0, under this framework (unless you hit the max sequence length before the JSON ended.) The post implies that JSON errors still happen, so it's possible they're doing something else.