otabdeveloper4 3 days ago

Most likely it's still a 32k-token context under the hood, with some context slicing/averaging hacks so that inference doesn't error out on arbitrarily long input.

(That's what I do locally with llama.cpp)
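
For anyone wondering what "context slicing" can look like, here is a rough sketch of one common variant: keep a fixed prefix of the prompt, drop about half of the older tokens after it, and keep going. This is only an illustration under assumptions, not llama.cpp's actual code; the function name `shift_context` and the parameters `n_ctx` and `n_keep` are hypothetical names chosen for this example, and the averaging variant is not shown.

```python
def shift_context(tokens, n_ctx, n_keep):
    """Trim a token list until it is shorter than n_ctx.

    Hypothetical helper: keeps the first n_keep tokens (e.g. a system
    prompt) and repeatedly discards the oldest half of the remainder.
    """
    if n_keep >= n_ctx:
        raise ValueError("n_keep must be smaller than n_ctx")

    tokens = list(tokens)
    while len(tokens) >= n_ctx:
        n_left = len(tokens) - n_keep        # tokens eligible for discarding
        n_discard = max(1, n_left // 2)      # drop about half, at least one
        # Keep the prefix, skip the discarded span, keep the recent tail.
        tokens = tokens[:n_keep] + tokens[n_keep + n_discard:]
    return tokens


# Usage: a 10-token history squeezed into a window of 8, keeping 2 prefix tokens.
trimmed = shift_context(list(range(10)), n_ctx=8, n_keep=2)
```

The model never sees more than `n_ctx` tokens this way, so the "infinite" input is really a sliding window that forgets the middle of the conversation.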