Remix clone Hacker News


	▲	jiggawatts an hour ago
		> exclusively search-based rewards so that the model isn't required to compress a large proportion of the internet into their weights.

		That just gave me an idea! I wonder how useful (and for what) a model would be if it was trained using a two-phase approach:

		1) Put the training data through an embedding model to create a giant vector index of the entire Internet.

		2) Train a transformer LLM, but instead of only utilising its weights, it can also do lookups against the index.

		It's like a MoE where one (or more) of the experts is a fuzzy Google search. The best thing is that adding up-to-date knowledge won’t require retraining the entire model!
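A rough sketch of the lookup half of this idea, under heavy assumptions: `embed` is a toy hashed bag-of-words stand-in for a real embedding model, `VectorIndex` is a brute-force in-memory index, and `build_prompt` is a hypothetical helper that shows where retrieved text would be handed to the LLM. None of these names come from an existing library, and the transformer itself (and any training against the index) is left out entirely.

```python
import hashlib
import math
import re
from collections import Counter

DIM = 256  # toy embedding size (assumption; real embedding models are learned)


def embed(text: str) -> list[float]:
    """Toy stand-in for an embedding model: hashed bag-of-words, L2-normalised."""
    vec = [0.0] * DIM
    for word, count in Counter(re.findall(r"[a-z0-9]+", text.lower())).items():
        bucket = int(hashlib.md5(word.encode()).hexdigest(), 16) % DIM
        vec[bucket] += count
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


class VectorIndex:
    """Phase 1: a brute-force vector index over the training documents."""

    def __init__(self) -> None:
        self.entries: list[tuple[list[float], str]] = []

    def add(self, text: str) -> None:
        # New knowledge is added here, with no model weights involved.
        self.entries.append((embed(text), text))

    def search(self, query: str, k: int = 1) -> list[str]:
        # Rank stored documents by cosine similarity (vectors are unit length).
        q = embed(query)
        ranked = sorted(
            self.entries,
            key=lambda entry: -sum(a * b for a, b in zip(q, entry[0])),
        )
        return [text for _, text in ranked[:k]]


def build_prompt(index: VectorIndex, question: str, k: int = 1) -> str:
    """Phase 2 (inference side only): give the model retrieved text to read."""
    context = "\n".join(index.search(question, k))
    return f"Context:\n{context}\n\nQuestion: {question}"


index = VectorIndex()
index.add("The Eiffel Tower is in Paris")
index.add("Rust is a systems programming language")

# Later: add fresh knowledge without touching any model weights.
index.add("The zorblat release shipped in March")
print(build_prompt(index, "what shipped in the zorblat release"))
```

The last few lines illustrate the "no retraining" point: the new document becomes retrievable as soon as it is added to the index. A real system would swap the brute-force scan for an approximate nearest-neighbour index and the toy `embed` for a learned model.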