tartoran 6 hours ago
How could they be any good at visuals? They are trained on text after all.
comex 6 hours ago
Supposedly the frontier LLMs are multimodal and trained on images as well, though I don't know how much that helps for tasks that don't use the native image input/output support. Whatever the cause, LLMs have gotten significantly better over time at generating SVGs of pelicans riding bicycles: https://simonwillison.net/tags/pelican-riding-a-bicycle/ But they're still not very good.
astrange 6 hours ago
Claude is multimodal and can see images, though it's not good at thinking in them.
msephton 6 hours ago
Shapes can be described as text or mathematical formulas.
tempest_ 6 hours ago
An SVG is just text.