▲ | juxtaposicion 4 days ago |
This looks great; very useful for (for example) ranking outputs by confidence so you can route the low-confidence ones to human review. Any chance we can get Pydantic support?
▲ | themanmaran 3 days ago | parent | next [-]
Fyi logprobs !== confidence. If you run "bananas,fishbowl,phonebook," and get {"sponge": 0.76}, it doesn't mean that "sponge" was the correct answer with 76% probability. Just that "sponge" was the next most likely word for the model to generate.
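A minimal sketch of the distinction (the tokens and numbers here are made up for illustration): exp(logprob) recovers the model's probability of *emitting* a token, which is a statement about generation, not about the answer being right.

```python
import math

# Hypothetical per-token logprobs, as a model might return them for the
# next token after the prompt (values invented for this example).
token_logprobs = {
    "sponge": math.log(0.76),
    "placemat": math.log(0.14),
    "doormat": math.log(0.06),
}

# exp(logprob) converts each log probability back to a probability.
token_probs = {tok: math.exp(lp) for tok, lp in token_logprobs.items()}

# 0.76 here is P(next token = "sponge" | prompt) -- how likely the model
# was to generate that token -- not the probability "sponge" is correct.
most_likely = max(token_probs, key=token_probs.get)
```

So a high value tells you the model was decisive about what to say next, which can still be useful as a ranking signal, but it is not a calibrated correctness score.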
▲ | ngrislain 3 days ago | parent | prev [-]
Actually, OpenAI provides Pydantic support for structured output (see client.beta.chat.completions.parse in https://platform.openai.com/docs/guides/structured-outputs). The library is compatible with that, but does not use Pydantic beyond that.