ktzar | 3 days ago
Also, the way models are evolving (thinking processes, LLMs waiting on interactions with external entities via MCP, mixture-of-experts architectures, ...) is making "useful chatbot responses" far more expensive than they were when you were pretty much hitting an autocomplete. It's getting to the point where running these models locally at a decent tokens/s speed is prohibitive, and we're being tied to using their hosted models.