Remix clone Hacker News

	HPsquared 2 years ago
		I suppose the gold standard would be a multimodal model that also looks at the screen (maybe only if the captions aren't making much sense).