jpcompartir an hour ago

Fair pushback, but I do think the LSTM vs Transformers point supports my position in the limit rather than refuting it. Once the compute bottleneck is removed, LSTMs scale favourably: https://arxiv.org/pdf/2510.02228 (I believe there's similar work on vanilla LSTMs, but I'd have to go digging.)

So the bottleneck was compute, which is compatible with 'data or compute'. But to grant your point: at the time, the algorithmic advances were genuinely useful, in that they did unlock/remove the bottleneck.

A wider point is that eventually (once compute and data are scaled enough) the algorithms all converge on learning the same representations: https://arxiv.org/pdf/2405.07987

And of course the canon: https://nonint.com/2023/06/10/the-it-in-ai-models-is-the-dat... http://www.incompleteideas.net/IncIdeas/BitterLesson.html

Scaling compute & data > algorithmic cleverness