NullCascade 4 days ago

Considering most SOTA LLMs are also multimodal/vision models, could they get better results if the LLM got visual feedback as well?