Remix clone Hacker News

	▲	daveguy 3 days ago
		Well, then none of their model's numbers would be bold and that's not what they/AIs usually see in publications!
	▲	cubefox 3 days ago
		They do look pretty good compared to the two other linear (non-Transformer) models. Conventional attention is hard to beat in benchmarks, but its time and memory costs are quadratic in sequence length.
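
To make the quadratic cost concrete, here is a minimal pure-Python sketch of naive softmax attention that counts how many query-key scores it computes. The names `softmax_attention` and `toy_inputs` are made up for illustration and are not from any paper or library discussed in the thread. This version works one query row at a time; a straightforward batched implementation would hold all n × n scores at once, which is where the quadratic memory comes from.

```python
import math

def softmax_attention(Q, K, V):
    """Naive softmax attention over n tokens; returns (outputs, scores computed)."""
    d = len(Q[0])
    scale = 1.0 / math.sqrt(d)
    outputs, pair_count = [], 0
    for q in Q:                          # n queries ...
        scores = []
        for k in K:                      # ... each compared with all n keys
            scores.append(scale * sum(a * b for a, b in zip(q, k)))
            pair_count += 1
        m = max(scores)                  # subtract the max for numerical stability
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        # Each output row is a weighted average of the value rows.
        out = [sum(w * v[j] for w, v in zip(weights, V)) / total
               for j in range(len(V[0]))]
        outputs.append(out)
    return outputs, pair_count

def toy_inputs(n, d=4):
    """Hypothetical deterministic inputs, used only for illustration."""
    return [[math.sin(i * d + j) for j in range(d)] for i in range(n)]

if __name__ == "__main__":
    for n in (8, 16, 32):
        X = toy_inputs(n)
        _, pairs = softmax_attention(X, X, X)
        print(n, pairs)                  # doubling n quadruples the score count
```

Doubling the sequence length from 8 to 16 to 32 takes the score count from 64 to 256 to 1024, so the work grows with the square of the length rather than linearly.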