Remix clone Hacker News


	▲	lambda 3 hours ago
		Yeah, I looked up some models I've actually run locally on my Strix Halo laptop, and it's predicting much lower performance than I actually get on the models I've tested. For MoE models, the memory-bandwidth calculation should use the active parameters, not the total parameters.
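
To make the point concrete, here's a rough sketch of the usual bandwidth-bound estimate for decode speed. It assumes each generated token reads every active weight from memory exactly once and ignores KV cache traffic, compute limits, and overhead, so treat it as an upper bound. The function name and all the numbers (256 GB/s, a 30B-total / 3B-active model, 4-bit weights) are hypothetical, not taken from the tool or from my own benchmarks.

```python
def est_tokens_per_sec(bandwidth_gb_s: float, params_b: float, bytes_per_param: float) -> float:
    """Bandwidth-bound upper limit on decode tokens/s.

    Assumption: generating one token streams each counted weight from memory once.
    params_b is in billions of parameters; bandwidth_gb_s is in GB/s (1e9 bytes/s).
    """
    bytes_per_token = params_b * 1e9 * bytes_per_param
    return bandwidth_gb_s * 1e9 / bytes_per_token


# Hypothetical numbers, for illustration only.
BANDWIDTH_GB_S = 256.0   # assumed memory bandwidth
TOTAL_PARAMS_B = 30.0    # assumed total parameters of an MoE model
ACTIVE_PARAMS_B = 3.0    # assumed parameters active per token
BYTES_PER_PARAM = 0.5    # assumed 4-bit quantization

# What you get if the MoE model is (wrongly) treated as dense.
using_total = est_tokens_per_sec(BANDWIDTH_GB_S, TOTAL_PARAMS_B, BYTES_PER_PARAM)

# What you get if only the weights actually read per token are counted.
using_active = est_tokens_per_sec(BANDWIDTH_GB_S, ACTIVE_PARAMS_B, BYTES_PER_PARAM)

print(f"total params:  {using_total:.1f} tok/s")
print(f"active params: {using_active:.1f} tok/s")
```

With these made-up numbers the two estimates differ by the total-to-active ratio (10x here), which is the kind of gap you'd see if a calculator plugs total parameters into the formula for an MoE model.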