westoncb 6 hours ago

Interesting that compaction is done using an encrypted message that "preserves the model's latent understanding of the original conversation":

> Since then, the Responses API has evolved to support a special /responses/compact endpoint that performs compaction more efficiently. It returns a list of items that can be used in place of the previous input to continue the conversation while freeing up the context window. This list includes a special type=compaction item with an opaque encrypted_content item that preserves the model’s latent understanding of the original conversation. Now, Codex automatically uses this endpoint to compact the conversation when the auto_compact_limit is exceeded.
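A minimal sketch of how a client could use those items, going only by the quoted docs: the /responses/compact path, the type=compaction item, and its opaque encrypted_content field are from the source; the exact request/response shapes and field names beyond that are assumptions.

```python
def build_next_input(compacted_items, new_user_message):
    """Replace the full prior input with the compacted items, then append
    the next user turn. The type=compaction item carries an opaque
    encrypted_content blob standing in for the earlier conversation."""
    return compacted_items + [
        {"type": "message", "role": "user", "content": new_user_message}
    ]

# Hypothetical list returned by POST /responses/compact:
compacted = [
    {"type": "compaction", "encrypted_content": "gAAAAB..."}  # opaque blob
]

next_input = build_next_input(compacted, "Continue where we left off.")
```

The point is that the client never decodes encrypted_content; it just passes it back as-is on the next request.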

icelancer 5 hours ago | parent | next [-]

Their compaction endpoint is far and away the best in the industry. Claude's has to be dead last.

nubg 2 hours ago | parent | next [-]

Help me understand: how is a compaction endpoint not just a prompt + JSON dump of the message history? I would understand if the prompt were the secret sauce, but you make it sound like there is more to a compaction system than just a clever prompt.
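The naive approach described above can be sketched as follows; the prompt text and the llm_call hook are illustrative placeholders, not any provider's actual API.

```python
import json

# Illustrative summary prompt; the wording here is made up for the sketch.
SUMMARY_PROMPT = (
    "Summarize the following conversation, preserving decisions, "
    "open questions, and any file or identifier names:\n\n{history}"
)

def naive_compact(messages, llm_call):
    """Collapse a message history into a single summary message using a
    plain prompt plus a JSON dump of the history. llm_call is any
    text-in, text-out completion function."""
    prompt = SUMMARY_PROMPT.format(history=json.dumps(messages, indent=2))
    summary = llm_call(prompt)
    return [{"role": "system", "content": f"Summary of earlier turns: {summary}"}]
```

The contrast with the quoted endpoint is that this produces lossy natural-language text, whereas the compaction item carries an opaque encrypted blob rather than a human-readable summary.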

kordlessagain 2 hours ago | parent | prev [-]

Yes, agree completely.

swalsh 4 hours ago | parent | prev | next [-]

Is it possible to use the compaction endpoint independently? I have my own agent loop I maintain for my domain-specific use case. We built a compaction system, but I imagine this would perform better.

westoncb 4 hours ago | parent | next [-]

I would guess you can if you're using their Responses API for inference within your agent.

__jl__ 4 hours ago | parent | prev [-]

Yes you can and I really like it as a feature. But it ties you to OpenAI…

jswny 4 hours ago | parent | prev [-]

How does this work for other models that aren’t OpenAI models?

westoncb 4 hours ago | parent [-]

It wouldn’t work for other models, since the compacted state is encoded in a latent representation specific to OpenAI’s own models.