embedding-shape 9 hours ago

Yeah, no way I'd do this if I paid per token. The next experiment will probably be local-only with GPT-OSS-120b, which according to my own benchmarks still seems to be the strongest local model I can run myself. It'll be even cheaper then (as long as we don't count the money it took to acquire the hardware).

mercutio2 6 hours ago | parent [-]

What toolchain are you going to use with the local model? I agree it's a strong model, but it's so slow for me with large contexts that I've stopped using it for coding.

embedding-shape 3 minutes ago | parent [-]

I have my own agent harness, and the inference backend is vLLM.
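For anyone curious what that kind of setup looks like, here's a minimal sketch of a harness loop talking to a local vLLM instance through its OpenAI-compatible endpoint. The model name, port, and the `ask` helper are illustrative assumptions, not details from the comment above.

    # Minimal sketch of a harness round trip against a local vLLM server.
    # Assumes vLLM is serving gpt-oss-120b on localhost:8000 via its
    # OpenAI-compatible API (e.g. started with `vllm serve openai/gpt-oss-120b`);
    # the model name and port are assumptions, not the poster's actual config.
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

    def ask(prompt: str) -> str:
        # Single chat-completion call; a real agent harness would wrap this
        # with tool calls, retries, and context management.
        resp = client.chat.completions.create(
            model="openai/gpt-oss-120b",
            messages=[{"role": "user", "content": prompt}],
            temperature=0.2,
        )
        return resp.choices[0].message.content

    if __name__ == "__main__":
        print(ask("Summarize what an agent harness does in one sentence."))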