Remix clone Hacker News

	XenophileJKO 2 hours ago
		It mostly depends on "how" the models work. Unified multimodal text/image sequence-to-sequence models can do this pretty well; diffusion models don't.