A startup claims it broke through a bottleneck that's holding back LLMs (technologyreview.com)
		3 points by zacharyozer 5 hours ago | 1 comment

	zacharyozer 5 hours ago
		> According to Dangel, it costs $2,600 to run Anthropic’s LLM Opus 4.6 through RULER 128, a test developed by Nvidia to assess a model’s ability to retrieve information from large data sets. And SubQ? “It cost us eight dollars,” he says.

		> SubQ does seem to be able to handle a lot of text at once. The model has a context window (roughly akin to a working memory) up to 12 million tokens long. Most top models today have context windows one million tokens long. In a demo that Whedon ran for me, he asked SubQ to perform a task that required it to reason about information contained in 400 documents. It responded in seconds. When he gave Perplexity, a popular LLM-powered search engine, the same task, it failed to load all 400 documents.