motoboi 6 days ago

You probably know this, but it can already generate accurate diagrams. Just ask for the output in a diagram language like Mermaid or Graphviz.
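For anyone who hasn't seen these formats: a diagram language is just text that a renderer lays out for you. A minimal sketch of what Graphviz DOT looks like, built here from an edge list (the `to_dot` helper and the node names are made up for illustration, not part of any library):

```python
def to_dot(edges, name="G"):
    """Build Graphviz DOT source for a directed graph from (src, dst) pairs."""
    lines = [f"digraph {name} {{"]
    for src, dst in edges:
        # Quote node names so spaces and punctuation are safe in DOT.
        lines.append(f'  "{src}" -> "{dst}";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical three-node system, purely as an example.
dot = to_dot([("client", "api"), ("api", "db")])
print(dot)
```

The model only has to emit text like this; node positions and edge routing are left to the renderer.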

bangaladore 6 days ago | parent | next [-]

My experience is that it often produces terrible diagrams. Things clearly overlap, and lines make no sense. I'm not surprised: if you told me to lay out a diagram in XML/YAML, there would be obvious mistakes and layout issues too.

I'm not really certain a text-output model can ever do well here.

resters 6 days ago | parent | next [-]

FWIW I think a multimodal model could be trained to do extremely well with it given sufficient training data. A combination of textual description of the system and/or diagram, source code (mermaid, SVG, etc.) for the diagram, and the resulting image, with training to translate between all three.

bangaladore 6 days ago | parent [-]

Agreed. Even a simpler version would help: I'm sure a service like this already exists (or could easily exist) where the workflow is something like:

1. User provides information

2. LLM generates structured output for whatever modeling language

3. The same or another multimodal LLM reviews the generated graph for styling / positioning issues and ensures it matches the user's request.

4. LLM generates structured output based on the feedback.

5. etc...

But you could probably fine-tune a multimodal model to do it in one shot, or way more effectively.
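The loop in steps 1-5 can be sketched roughly as below. This is only an illustration under stated assumptions: `generate` and `review` are hypothetical callables standing in for the LLM calls (no real API is implied), and the stubs at the bottom fake their behavior so the loop can run on its own.

```python
from typing import Callable, Optional


def refine_diagram(
    request: str,
    generate: Callable[[str, Optional[str], Optional[str]], str],
    review: Callable[[str, str], Optional[str]],
    max_rounds: int = 3,
) -> str:
    """Draft diagram source, then revise it until the reviewer has no feedback."""
    source = generate(request, None, None)  # step 2: first draft
    for _ in range(max_rounds):
        feedback = review(request, source)  # step 3: critique the result
        if feedback is None:  # reviewer is satisfied
            break
        source = generate(request, source, feedback)  # step 4: revise
    return source


# Stub "models" (assumptions for the demo, not real LLM calls).
def fake_generate(request, previous, feedback):
    if feedback is None:
        return "graph TD\n  A --> B"
    return previous + "\n  B --> C"


def fake_review(request, source):
    return None if "B --> C" in source else "missing edge B --> C"


result = refine_diagram("A to B to C", fake_generate, fake_review)
print(result)
```

In a real setup the reviewer would presumably look at the rendered image rather than the source text, and `max_rounds` caps the cost if the two never converge.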

behnamoh 6 days ago | parent | prev [-]

I had a LaTeX TikZ diagram problem which Sonnet 3.7 couldn't handle even after 10 attempts. Gemini 2.5 Pro solved it on the second try.

gunalx 6 days ago | parent [-]

Had the same experience. o3-mini failed miserably, Claude 3.7 as well, but Gemini 2.5 Pro solved it perfectly. (The task: going from an image of a diagram, without its source, to a TikZ diagram.)

resters 6 days ago | parent | prev | next [-]

I've had mixed and inconsistent results, and it hasn't been able to iterate effectively when it gets close. Could be that I need to refine my approach to prompting. I've tried Mermaid and SVG mostly, but will also try Graphviz based on your suggestion.

antman 6 days ago | parent | prev [-]

PlantUML (action) diagrams are my go-to.