the_mitsuhiko an hour ago

Author here. I’m with you on the abstractions part. I dumped a lot of my thoughts on this into a follow-up post: https://lucumr.pocoo.org/2025/11/22/llm-apis/

thierrydamiba 14 minutes ago

Excellent write-up. I’ve been thinking a lot about caching and agents, so this was right up my alley.

Have you experimented with using a semantic cache on the chain of thought (which we get back from the providers anyway) and sending that to a dumb model for similar queries to “simulate” thinking?
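
To make the idea concrete, here is a rough sketch of what such a cache could look like. Everything in it is an assumption for illustration: `embed` is a toy bag-of-words stand-in for a real embedding model, `ReasoningCache` and `prompt_for_small_model` are hypothetical names, the 0.6 threshold is arbitrary, and no provider API is called.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class ReasoningCache:
    """Hypothetical cache mapping past queries to their reasoning traces."""

    def __init__(self, threshold=0.6):
        self.threshold = threshold  # arbitrary cutoff, would need tuning
        self.entries = []  # list of (query embedding, reasoning trace)

    def store(self, query, reasoning):
        self.entries.append((embed(query), reasoning))

    def lookup(self, query):
        """Return the trace of the most similar past query, or None on a miss."""
        q = embed(query)
        best_score, best_reasoning = 0.0, None
        for vec, reasoning in self.entries:
            score = cosine(q, vec)
            if score > best_score:
                best_score, best_reasoning = score, reasoning
        return best_reasoning if best_score >= self.threshold else None


def prompt_for_small_model(query, cache):
    """Build a prompt that hands cached reasoning to a cheaper model."""
    reasoning = cache.lookup(query)
    if reasoning is None:
        return None  # cache miss: fall back to the reasoning model
    return (
        "Reasoning from a similar earlier question:\n"
        f"{reasoning}\n\nQuestion: {query}"
    )
```

On a hit, the returned prompt would go to the cheap model; on a miss, the query would go to the reasoning model and its trace would be stored for next time. Whether reused reasoning actually helps the smaller model is the open question here.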