poisonborz 5 days ago

Software should still come with documentation that LLMs can train on, plus they have all the learnings from interactions with developers asking about it - who will more and more just go this route (following whatever guidance they get) and not think of searching for other material, let alone write guides for others. I'm not saying this is all that good, but that's the reasonable outcome.

creesch 17 hours ago | parent [-]

Given it has been a few days, it is unlikely that you will read this. But I figured I'd reply anyway in case you do.

I mean this with no hostile intent, but have you honestly stopped and thought about what you typed here?

What I mean by that is, have you looked at the complete picture to see if what you are saying makes sense in relation to what you initially said?

You questioned the need for documentation. Now you are saying there needs to be good documentation for LLMs to train on. Good documentation for LLMs to train on is actually much more extensive than the documentation written for humans to begin with. So, you are effectively saying there needs to be more documentation.

Secondly, how can developers ask about something when they don't have decent documentation to start with?

poisonborz 10 hours ago | parent [-]

There are some services that notify on replies :)

Despite your intent, your comment is kinda meanly worded, and it is perhaps you who did not read it, or at least mixed up terms. In my parent comment I did not mention documentation, but tutorials, as in guide articles where devs not associated with a project write about how to achieve some goal.

To be more specific, I think there are two distinct types of text that get written about a project during its lifecycle, in parallel:

#1 Documentation, written by the maintainers. This will always contain all functions, methods, API endpoints, components, whatever. It's the complete description of the full extent of features. It may or may not also contain the second type. An LLM can theoretically interpret and use the whole project based on this info.

#2 Guides, tutorials, reviews, forum posts, describing or giving tips on the whole project or specific features, or describing methods using that project ("Use x and y to process queries 20x faster on z!"). These writings were essential for spreading and marketing the project in question. I think this is what the OP article was about. My argument was that these would not be sought out anymore, devs would just ask LLMs "how to achieve x with this tool".

creesch 4 hours ago | parent [-]

> LLM can theoretically interpret and use the whole project based on this info.

That's the thing though, LLMs really can't. At least not to a degree that they are able to act on it at the same level as when trained on everything else, including tutorials and such.

Languages and technologies that LLMs excel at are those that are widely spread with numerous examples.

Just plain documentation with just the API calls isn't enough to train an LLM on. They effectively learn from examples.

So with just #1 and no longer #2 aimed at humans, you will never get to a point where you can ask an LLM about the technology.

This is what prompted me to remark that I feel you haven't thought this through. Which you might have, but that makes me think you have an overly optimistic view of what data is enough to reliably train LLMs on.

Again, to stress the point, just documentation isn't enough. So you really do need humans adopting the technology first, widening the base of examples to train on.