Remix clone Hacker News

new | show | ask | jobs Github

	▲	metaopai 4 hours ago
		Datacenters are a hedge the investor class is pushing and it's not really needed. Once you realize: 1) LLM = CPU 2) Session Context = L1, L2, L3, CPU Cache the entire AI industry is operating within the CPU cache of the LLM provider this is why cost moves quadratically, increases noise - signal ration, regenerative feedback loop, it dilutes the user narration, and it actually creates an architectural induce hallucination. We've literally solved this problem in the 1960's with a memory architecture: the OS has a memory controller, tasked with taking data from persistant structure storage (HD) loading into CPU Cache and the CPU computers, the output is stored in RAM and then moved into HD. this is required on all AI applications, what the industry has done, is supplement a RAG which is summarizing context, however the entire context summarized chain is still being processed by the LLM. if you employ a well sustain memory architecture you can retrieve the context you only need to feed to the llm. reduce token cost and then reduce energy therefore less demand of datacenters. checkout my article and what i built metaop.ai it's a humble promotion but no one cares about ai memory. https://x.com/metaopai/status/2070187664192524528