Legend2440 11 hours ago

I think there's a lot of potential for AI in 3D modeling. But I'm not convinced text is the best user interface for it, and current LLMs seem to have a poor understanding of 3D space.

bdcravens 11 hours ago | parent | next [-]

Text being a challenge is a symptom of the bigger problem: most people have a hard time thinking spatially, and so struggle to communicate their ideas (and that's before you add on modeling vocabulary like "extrude", "chamfer", etc)

LLMs struggle because I think there's a lot of work to be done with translating colloquial speech. For example, someone might describe creating a tube in fairly ambiguous language, even though they can see it in their head: "Draw a circle and go up 100mm, 5mm thick" as opposed to "Place a circle on the XY plane, offset the circle by 5mm, and extrude 100mm along the Z axis"
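
To make the gap concrete, here is a minimal Python sketch that emits OpenSCAD for the precise version of that tube. The function name `tube_scad` and the `inner_diameter` parameter are made up for illustration. Note that the colloquial description never gives a diameter, nor says whether the 5mm wall grows inward or outward, so both are assumptions here.

```python
def tube_scad(inner_diameter: float, wall: float, height: float) -> str:
    """Return OpenSCAD source for a hollow tube standing on the XY plane.

    Assumption: the wall is added outward from the drawn circle, which is
    only one of two possible readings of "5mm thick".
    """
    if min(inner_diameter, wall, height) <= 0:
        raise ValueError("all dimensions must be positive")
    return "\n".join([
        f"linear_extrude(height = {height})  // extrude along Z",
        "difference() {",
        f"    offset(r = {wall}) circle(d = {inner_diameter});  // outer profile",
        f"    circle(d = {inner_diameter});  // bore",
        "}",
    ])


print(tube_scad(inner_diameter=20, wall=5, height=100))
```

Every argument the function demands is something the casual description left for the listener to guess.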

numpad0 6 hours ago | parent | next [-]

I don't get the text obsession, beyond LLMs being so immensely useful that you might as well use an LLM for <insert task here>. I believe that some things live in text, some in variable-size n-dimensional arrays, or in a fixed set of parameters, and so on - I mean, our brains don't run on text alone.

guhidalg 5 hours ago | parent [-]

But our brains do map high-dimensionality input to dimensions low enough to be describable with text.

You can represent a dog as a specific multi-dimensional array (raster image), but the word dog represents many kinds of images.

numpad0 3 hours ago | parent [-]

Yeah, so, that's a lossy/ambiguous process. That represent_in_text(raster_image) -> "dog" doesn't contain a meaningful amount of the original data. The idea of LLM-aided CAD sounds to me like claiming that a sufficiently long hash should contain the data it represents. That doesn't make a lot of sense to me.

nitwit005 6 hours ago | parent | prev | next [-]

But, you need the ambiguity, or the AI isn't really a help. If you know the exact coordinates and dimensions of everything, you've already got an answer.

gmueckl 4 hours ago | parent [-]

Not necessarily. Sometimes, the desired final shape is clear, but the path there isn't when using typical parametric modeling steps with the desire to get a clean geometry.

arjie 7 hours ago | parent | prev [-]

When I use Claude to model I actually just speak to it in common English and it translates the concepts. For example, I might say something like this:

    I'm building a mount for our baby monitor that I can attach to the side of the changing table. The pins are x mm in diameter and are y mm apart. [Image #1] of the mounting pins. So what needs to happen is that the pin head has to be large, and the body of the pin needs to be narrow. Also, add a little bit of a flare to the bottom and top so they don't just get knocked off the rest of the mount.
And then I'll iterate.

    We need a bit of slop in the measurements there because it's too tight.
And so on. I'll do little bits that I want and see if they look right before asking the LLM to union it to the main structure. It knows how to use OpenSCAD to generate preview PNGs and inspect it.

Amusingly, I did this just a couple of weeks ago and that's how I learned what a chamfer is: a flat angled transition. The adjustment I needed to make to my pins where they are flared (but at a constant angle) is a chamfer. Claude told me this as it edited the OpenSCAD file. And I can just ask it in-line for advice and so on.
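
For anyone else meeting the term for the first time, here is a rough Python sketch that emits OpenSCAD for a pin whose narrow body widens to the head through a chamfer. This is not the actual file from that session; the function name and all dimensions are hypothetical. It relies on OpenSCAD's `cylinder` accepting two different end diameters (`d1`, `d2`), which gives the constant-angle transition.

```python
def chamfered_pin_scad(body_d: float, head_d: float, body_h: float,
                       chamfer_h: float, head_h: float) -> str:
    """OpenSCAD for a narrow pin body that widens to a head via a chamfer."""
    if head_d <= body_d:
        raise ValueError("head must be wider than the body")
    return "\n".join([
        "union() {",
        f"    cylinder(h = {body_h}, d = {body_d});  // narrow body",
        f"    translate([0, 0, {body_h}])",
        f"        cylinder(h = {chamfer_h}, d1 = {body_d}, d2 = {head_d});  // chamfer",
        f"    translate([0, 0, {body_h + chamfer_h}])",
        f"        cylinder(h = {head_h}, d = {head_d});  // wide head",
        "}",
    ])


print(chamfered_pin_scad(body_d=4, head_d=8, body_h=10, chamfer_h=2, head_h=3))
```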

skybrian 4 hours ago | parent | prev | next [-]

I think a good UI would be to prompt it with something like "how far is that hole from the edge?" and it would measure it for you, and then "give me a slider to adjust it," and it gives you a slider that moves it in the appropriate direction. If there were already a dimension for that, it wouldn't help much, but sometimes the distance is derived.
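
A toy Python sketch of the derived-distance case, assuming a flat plate with one hole whose position is stored from the left edge. The `Plate` class and its fields are invented for illustration; a real CAD kernel would solve this through its constraint system instead.

```python
from dataclasses import dataclass


@dataclass
class Plate:
    width: float    # plate extent along X
    hole_x: float   # hole center from the left edge (the stored dimension)
    hole_d: float   # hole diameter

    def gap_to_right_edge(self) -> float:
        """Derived value: material between the hole's rim and the right edge."""
        return self.width - self.hole_x - self.hole_d / 2

    def set_gap_to_right_edge(self, gap: float) -> None:
        """What the slider would call: solve for the stored dimension."""
        new_x = self.width - gap - self.hole_d / 2
        if not self.hole_d / 2 <= new_x <= self.width - self.hole_d / 2:
            raise ValueError("hole would leave the plate")
        self.hole_x = new_x


plate = Plate(width=50, hole_x=30, hole_d=6)
print(plate.gap_to_right_edge())   # 50 - 30 - 3 = 17.0
plate.set_gap_to_right_edge(10)
print(plate.hole_x)                # 50 - 10 - 3 = 37.0
```

The slider edits a number that exists nowhere in the model, and the setter maps it back onto the one that does.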

I'd love to have that kind of UI for adjusting dimensions in regular (non-CAD) images. Or maybe adjusting the CSS on web pages?

WillNickols 2 hours ago | parent [-]

I think that would make a lot of sense for non-CAD images, but the particular task you described there is do-able in just a few clicks in most CAD systems already. I think the AI would almost always take a longer time to do those kinds of actions than if you did it yourself.

skybrian 2 hours ago | parent [-]

For experts maybe, but beginners would probably find asking questions about how to do things useful.

bob1029 9 hours ago | parent | prev | next [-]

> LLMs seem to have a poor understanding of 3D space.

This is definitely my experience as well. However, in this situation it seems we are mostly working in "local" space, not "world" space wherein there are a lot of objects transformed relative to one another. There is also the massive benefit of having a fundamentally parametric representation of geometry.

I've been developing something similar around Unity, but I am not making competence in spatial domains a mandatory element. I am more interested in the LLM's ability to query scene objects, manage components, and fully own the scripting concerns behind everything.

carshodev 9 hours ago | parent | prev | next [-]

Opus 4.5 seems to be a step above every other model in terms of creating SVGs. Before, most models couldn't make something that looked half decent.

But I think this shows that these models can improve drastically on specific domains.

I think if there were some good datasets/mappings for spatial relations and CAD files -> text, then a fine-tune/model with this in its training data could improve the output a lot.

I assume this project is using a general LLM with a unique system prompt/context/MCP for this.

knicholes 6 hours ago | parent | prev | next [-]

So there's OpenSCAD, which is basically programming the geometry parametrically. But... I'd liken it to generating an SVG of a pelican on a bicycle at the current levels of LLMs.

abdullahkhalids 5 hours ago | parent | next [-]

Will echo sibling. I have tried using Claude Sonnet for OpenSCAD to design a simple soap mold and it failed terribly in getting the rounded shape I wanted. (1) It's really difficult to explain 3d figures in text, and I doubt there is a lot of training material out there. (2) OpenSCAD is limited in what it can do. So the combination is pretty bad.

moffkalast 6 hours ago | parent | prev [-]

I needed some gears generated recently, and figured I could just get it done with Claude or ChatGPT in OpenSCAD in a few minutes... but oh man was I wrong. I was so wrong.

Wasted half an hour generating absolute nonsense, if it even compiled, and ended up going with one of those SVG gear generators instead lmao.

CamperBob2 5 hours ago | parent [-]

You'd probably have been better off giving it a basic summary of OpenSCAD grammar and asking for a C or Python program to emit the code.
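
Roughly what that could look like in Python: the program does the trigonometry and only prints OpenSCAD at the end. This is a sketch under loud assumptions: the function name and parameters are made up, and the teeth are straight-sided, not an involute profile, so treat the result as a placeholder shape that won't mesh properly.

```python
import math


def crude_gear_scad(teeth: int, root_r: float, tip_r: float, thickness: float) -> str:
    """Emit OpenSCAD for a gear-like disc with straight-sided teeth.

    Not an involute profile; this only illustrates computing the outline
    in Python and emitting a single polygon.
    """
    if teeth < 3 or not 0 < root_r < tip_r:
        raise ValueError("need teeth >= 3 and 0 < root_r < tip_r")
    step = 2 * math.pi / teeth
    points = []
    for i in range(teeth):
        base = i * step
        # Four points per tooth: root, tip, tip, root (fractions of one pitch).
        for frac, r in ((0.0, root_r), (0.15, tip_r), (0.35, tip_r), (0.5, root_r)):
            a = base + frac * step
            points.append(f"[{r * math.cos(a):.3f}, {r * math.sin(a):.3f}]")
    return (f"linear_extrude(height = {thickness})\n"
            f"    polygon(points = [{', '.join(points)}]);")


print(crude_gear_scad(teeth=12, root_r=10, tip_r=12, thickness=5))
```

The model only has to get ordinary Python right, and the emitted OpenSCAD stays down to two primitives.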

WillNickols 11 hours ago | parent | prev | next [-]

Curious what you think is the best interface for it? We thought about this ourselves and talked to some folks but it didn't seem there was a clear alternative to chat.

nancyminusone 10 hours ago | parent | next [-]

Solidworks's current controls are the best interface for it. "Draw a picture" is something you're going to find really difficult to beat. Most of the other parametric CADs work the same way, and Solidworks is widely known as one of the best interfaces on top of that. They've spent decades building one that is both unambiguous and fast.

thesuitonym 10 hours ago | parent [-]

Maybe it's just the engineers I've worked with, but I've never heard anyone describe Solidworks as "fast."

(Of course, you and I know it is, it's just that you're asking it to do a lot)

jwagenet an hour ago | parent | next [-]

Modelling isn't the slow part. If one is copying a drawing and has exact dimensions, it's pretty straightforward in most software, even if the software is bloated.

nancyminusone 10 hours ago | parent | prev [-]

Haha yes, I've never heard any engineer describe any CAD package as anything other than slow and full of bugs. But of the alternatives, I think most would still pick Solidworks.

gmueckl 4 hours ago | parent | next [-]

I wonder how many of these bugs are actually situations where the underlying algorithms are simply confronted with situations outside their valid input domains. This can happen easily with 3d surface representations of geometries.

FuriouslyAdrift 8 hours ago | parent | prev [-]

All of our production engineers that use CATIA think SolidWorks is fast...

I guess it's all in the perspective

strobe 9 hours ago | parent | prev | next [-]

Maybe some combination of visual representation with text. For example, it's not easy to intuitively come up with the names of operations that could be applied to some surface. But if you could say something like 'select top side of cylinder' and it showed you some possible operations (with illustrations/animations) that could be applied to it, then it's easy to say back what it needs to do without knowing in advance what's actually possible. As a result, it may be a much quicker way to interact with CAD than what we are using currently.

fragmede 9 hours ago | parent | prev [-]

The clear alternative is VR. You put on hand trackers and then physically describe the part to the machine. "It should be this wide," you say, and move your hands that wide apart.

https://shapelabvr.com/

fragmede 10 hours ago | parent | prev [-]

It's the Star Trek future way of interfacing with things. I don't know SOLIDWORKS at all. I'm a total noob at Fusion 360, but I've made a couple of things with it. Sketch and extrude. But what I can do is English. Using a combination of Claude and OpenSCAD and my knowledge of programming, I was able to make something that I could 3D print, without having to learn SOLIDWORKS. Same with Ableton for music. It's frustrating when Claude does the wrong thing, but where it shines is when you give it the skill to render the object to a PNG, so it can look at what the SCAD it's generating actually produces and iterate until it makes what you're looking for. It's with the human out of the loop that leaps are being made.
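
The render step itself is small. A minimal Python sketch, assuming the `openscad` binary is on the PATH; the helper names are made up, and the only OpenSCAD CLI options used are `-o` and `--imgsize`. How the PNG gets back to the model is left out, since that depends on the tool setup.

```python
import shutil
import subprocess
from pathlib import Path


def render_preview_cmd(scad: Path, png: Path, size=(800, 600)) -> list[str]:
    """Build the OpenSCAD command line that renders `scad` to `png`."""
    return ["openscad", "-o", str(png), f"--imgsize={size[0]},{size[1]}", str(scad)]


def render_preview(scad: Path, png: Path) -> bool:
    """Run the render; return False if OpenSCAD is missing or reports an error."""
    if shutil.which("openscad") is None:
        return False
    result = subprocess.run(render_preview_cmd(scad, png),
                            capture_output=True, text=True)
    return result.returncode == 0 and png.exists()


print(render_preview_cmd(Path("mount.scad"), Path("mount.png")))
```

Each iteration is then just: edit the .scad, call `render_preview`, look at the image, repeat.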