juxtaposicion 4 days ago

This looks great; very useful for (for example) ranking outputs by confidence so you can do human reviews of the low-confidence ones.

Any chance we can get Pydantic support?

themanmaran 3 days ago | parent | next [-]

FYI, logprobs !== confidence.

If you run "bananas,fishbowl,phonebook," and get {"sponge": 0.76}

It doesn't mean that "sponge" is 76% likely to be the correct answer, just that "sponge" was the most likely next word for the model to generate.
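To make the distinction concrete, here is a minimal sketch that turns log probabilities into plain probabilities. The candidate tokens and their values are made up for illustration; they are not real model output, and `to_probabilities` is a hypothetical helper.

```python
import math

# Hypothetical top log probabilities for the next token after
# "bananas,fishbowl,phonebook,". Illustrative numbers only.
top_logprobs = {
    "sponge": math.log(0.76),
    "lamp": math.log(0.15),
    "pencil": math.log(0.09),
}


def to_probabilities(logprobs):
    """Convert natural-log probabilities into plain probabilities."""
    return {token: math.exp(lp) for token, lp in logprobs.items()}


probs = to_probabilities(top_logprobs)
# The token the model is most likely to emit next.
most_likely = max(probs, key=probs.get)
```

Here the 0.76 describes how much probability mass the model puts on emitting "sponge" next. It says nothing about whether "sponge" is a correct answer to anything.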

ngrislain 3 days ago | parent | prev [-]

Actually, OpenAI provides Pydantic support for structured output (see client.beta.chat.completions.parse in https://platform.openai.com/docs/guides/structured-outputs).

The library is compatible with that, but does not use Pydantic beyond it.

juxtaposicion 3 days ago | parent [-]

Right, the hope was to go further. E.g. if the input is:

```
from typing import Literal
from pydantic import BaseModel

class Classification(BaseModel):
    color: Literal['red', 'blue', 'green']
```

then the output type would be:

```
from typing import Dict, Literal
from pydantic import BaseModel

class ClassificationWithLogProbs(BaseModel):
    color: Dict[Literal['red', 'blue', 'green'], float]
```

Don't take this too literally; I'm not convinced that this is the right way to do it. But it would provide structure and scores without dealing with a mess of complex JSON.
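As a rough sketch of how such a per-label score dict could be filled in, the snippet below collapses a token-to-logprob mapping into a distribution over the allowed labels. It uses only the standard library, `label_distribution` is a hypothetical helper and not part of any library mentioned here, and it assumes each label is emitted as a single token, which may not hold for real tokenizers.

```python
import math
from typing import Dict, Sequence


def label_distribution(
    top_logprobs: Dict[str, float], labels: Sequence[str]
) -> Dict[str, float]:
    """Renormalize token probabilities over the allowed labels only."""
    # Labels missing from top_logprobs get zero mass (assumption: the
    # API only returned the top few candidates).
    mass = {
        label: math.exp(top_logprobs[label]) if label in top_logprobs else 0.0
        for label in labels
    }
    total = sum(mass.values())
    if total == 0.0:
        raise ValueError("none of the labels appear in top_logprobs")
    return {label: p / total for label, p in mass.items()}


# Illustrative numbers only; "purple" is not an allowed label and is dropped.
scores = label_distribution(
    {"red": math.log(0.6), "blue": math.log(0.2), "purple": math.log(0.2)},
    ["red", "blue", "green"],
)
```

Renormalizing over the allowed labels is one design choice among several; keeping the raw probabilities instead would preserve how much mass went to tokens outside the schema.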

lyu07282 3 days ago | parent [-]

But this ultimately just converts to JSON Schema, or to the OpenAI function-calling definition format.

One question I've always had: what about the descriptions you can attach to the class and its attributes (Field(description=...) in Pydantic)? Is the model made aware of those descriptions?