Remix clone Hacker News

	gwern 7 hours ago
		So if it's not using attention, and it turns the entire input into an embedding that is processed in one go, I guess this is neither a Transformer nor an RNN but just an MLP?