Remix clone Hacker News

	johnb231 a day ago
		The latest models are natively multimodal: audio, video, images, and text are all tokenised and interpreted by the same model.