RossBencina 7 days ago

One claim from that podcast was that the xLSTM attention mechanism is (in practical implementation) more efficient than (transformer) flash attention, and therefore promises to significantly reduce the time/cost of test-time compute.

korbip 7 days ago | parent [-]

Test it out here:

https://github.com/NX-AI/mlstm_kernels

https://huggingface.co/NX-AI/xLSTM-7b
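
For anyone who wants a quick starting point, here is a minimal sketch (untested) of loading that checkpoint through the Hugging Face transformers Auto* interface; depending on your transformers version it may need trust_remote_code=True and the mlstm_kernels package installed, and the exact generation settings here are just illustrative:

    # Minimal sketch, assuming NX-AI/xLSTM-7b exposes a causal-LM head
    # through the standard transformers Auto* API.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "NX-AI/xLSTM-7b"

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,   # 7B weights; bf16 keeps memory manageable
        device_map="auto",
        trust_remote_code=True,       # may be needed for the custom xLSTM code path
    )

    prompt = "The xLSTM architecture differs from a transformer in that"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    with torch.no_grad():
        out = model.generate(**inputs, max_new_tokens=64)

    print(tokenizer.decode(out[0], skip_special_tokens=True))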