Remix clone Hacker News


	▲	dilyevsky 2 hours ago
		> Because you somehow need a giant training set which describes images in natural language, no?

		That's definitely one way: they train a text encoder together with an image encoder on a labelled set of images. WL & 3b1b made a nice video on it: https://www.youtube.com/watch?v=iv-5mZ_9CPY
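
		To make "train a text encoder together with an image encoder" a bit more concrete, here is a rough sketch of one common objective for that kind of joint training, a symmetric contrastive loss. This is an assumption: nothing above says which loss is actually used. The embeddings are toy hand-written vectors standing in for encoder outputs, and the function names and the temperature value are made up for illustration.

```python
import math

def normalize(v):
    # Scale a vector to unit length so dot products become cosine similarities.
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cross_entropy_diagonal(logits):
    # Mean cross-entropy where the correct "class" for row i is column i,
    # i.e. image i should match caption i.
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)

def contrastive_loss(image_embs, text_embs, temperature=0.07):
    # image_embs[i] and text_embs[i] are assumed to come from the same
    # labelled (image, caption) pair. The temperature is a hypothetical default.
    imgs = [normalize(v) for v in image_embs]
    txts = [normalize(v) for v in text_embs]
    # Similarity of every image to every caption in the batch.
    logits = [[sum(a * b for a, b in zip(i, t)) / temperature for t in txts]
              for i in imgs]
    # Transpose to score every caption against every image.
    logits_t = [list(col) for col in zip(*logits)]
    # Average the image-to-text and text-to-image directions.
    return 0.5 * (cross_entropy_diagonal(logits) + cross_entropy_diagonal(logits_t))

# Toy batch: two "image" embeddings and their two "caption" embeddings.
images = [[1.0, 0.0], [0.0, 1.0]]
matched = [[0.9, 0.1], [0.2, 0.8]]
shuffled = [matched[1], matched[0]]  # captions paired with the wrong images

loss_matched = contrastive_loss(images, matched)
loss_shuffled = contrastive_loss(images, shuffled)
print(loss_matched < loss_shuffled)
```

		The loss is low when each image embedding sits closest to its own caption's embedding and high when the pairs are mismatched. In real training the gradient of this loss would update both encoders at once, which is why you need the labelled image set.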
	▲	jcattle an hour ago
		Thanks, I'll check out that video.