Remix clone Hacker News

	otabdeveloper4 3 days ago
		Most likely still 32k tokens under the hood, but with some context slicing/averaging hacks so that inference doesn't error out on arbitrarily long input. (That's what I do locally with llama.cpp.)
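
		For readers unfamiliar with the idea, here is a minimal sketch of the "slicing" half only, written in plain Python over a list of token ids. It is not the commenter's actual setup and not llama.cpp's API: the function name `slice_context`, the `n_keep` parameter, and the policy of keeping the start of the prompt plus the most recent tokens are all assumptions made for illustration. The 32,768 figure is just the "32k" from the comment.

```python
from typing import List, Sequence

MAX_CTX = 32_768  # assumed window size, taken from the "32k" in the comment


def slice_context(
    tokens: Sequence[int],
    max_ctx: int = MAX_CTX,
    n_keep: int = 256,  # hypothetical: how many leading tokens to always keep
) -> List[int]:
    """Fit `tokens` into `max_ctx` by keeping the first `n_keep` tokens
    (e.g. a system prompt) plus the most recent ones; the middle is dropped."""
    if max_ctx <= 0:
        raise ValueError("max_ctx must be positive")
    # Clamp n_keep so the head alone can never exceed the window.
    n_keep = max(0, min(n_keep, max_ctx))
    if len(tokens) <= max_ctx:
        return list(tokens)  # already fits, nothing to drop
    n_tail = max_ctx - n_keep
    head = list(tokens[:n_keep])
    tail = list(tokens[len(tokens) - n_tail:])
    return head + tail


if __name__ == "__main__":
    # Tiny window so the effect is visible: keep 2 head tokens + 4 most recent.
    print(slice_context(list(range(10)), max_ctx=6, n_keep=2))
```

		The point is only that the model never sees more than the fixed window, however long the input is; whatever falls in the dropped middle is simply gone, which is why this kind of trick keeps inference from erroring out without actually extending the usable context.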