Remix clone Hacker News

	▲	vunderba 7 days ago
		I think most of the SOTA models could probably handle this, but you'd probably get better results using a pipeline:
		1. Reduce the article to a synopsis using an LLM.
		2. Generate 4-5 varying description prompts from the synopsis.
		3. Feed the prompts to an imagegen model.
		Though I'd wager that gpt-image-1 (in ChatGPT), being multimodal, could probably manage it as well.
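
		A rough sketch of that three-step pipeline, for illustration only. The `llm` and `imagegen` callables are hypothetical stand-ins for whatever text and image model clients you use (they are not any particular vendor's API), and the prompt wording is just an assumption:

```python
from typing import Callable, List


def article_to_images(
    article: str,
    llm: Callable[[str], str],         # hypothetical: text prompt in, text out
    imagegen: Callable[[str], bytes],  # hypothetical: image prompt in, image bytes out
    n_prompts: int = 4,
) -> List[bytes]:
    """Turn an article into several candidate illustrations."""
    # Step 1: reduce the article to a short synopsis.
    synopsis = llm(f"Summarize this article in 3-4 sentences:\n\n{article}")

    # Step 2: ask for several varied image descriptions, one per line.
    raw = llm(
        f"Write {n_prompts} different image description prompts, one per line, "
        f"for an illustration of this synopsis:\n\n{synopsis}"
    )
    # Drop blank lines and cap the list at n_prompts.
    prompts = [line.strip() for line in raw.splitlines() if line.strip()][:n_prompts]

    # Step 3: feed each prompt to the image model.
    return [imagegen(p) for p in prompts]
```

		Passing the models in as plain callables keeps the sketch independent of any SDK, so you can swap in real clients or test it with stubs.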