mschulkind 5 days ago

One of my favorite ways to use LLM agents for coding is to have them write extensive documentation on whatever I'm about to dig into. The stakes are pretty low if the LLM makes a few mistakes. It's perhaps an even better place to start for skeptics.
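The mechanics can be as simple as this rough sketch (this assumes the OpenAI Python SDK with an API key in the environment; the model name, paths, and prompt are placeholders, and an agentic tool would do the file gathering itself):

    # Sketch: ask a model to write an overview of a module before digging in.
    from pathlib import Path

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    source = Path("src/session.py").read_text()  # hypothetical module

    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder; use whatever model you have access to
        messages=[{
            "role": "user",
            "content": (
                "Write an overview of this module: its responsibilities, "
                "key types and functions, and how the pieces fit together. "
                "Flag anything you are unsure about.\n\n" + source
            ),
        }],
    )

    # Stash the overview next to the code so you can skim it while working.
    Path("docs").mkdir(exist_ok=True)
    Path("docs/session-overview.md").write_text(response.choices[0].message.content)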

manquer 5 days ago | parent | next [-]

I am not so sure. Good documentation is hard. MDN and PostgreSQL are excellent examples of docs done well, and of how valuable really well-written content can be for a project.

LLMs can generate content, but they can't really write. Out of the box they tend to be quite verbose and produce a lot of pro forma content. Perhaps with the right kind of prompts and a lot of editing and review you can get them to be good, but at that point it is almost the same as writing it yourself.

It is a hard choice between lower-quality documentation (AI slop?) and leaving things lightly documented or fully undocumented. The uncanny valley of precision in documentation may be acceptable in some contexts, but it can be dangerous in others, and it is harder to tell the difference because the depth of a doc means nothing now.

Over time we find ourselves skipping LLM-generated documentation just like any other AI slop. The value/emphasis placed on reading documentation erodes, finding good documentation becomes harder (as with other online content today), and documentation as a whole gets devalued.

medvezhenok 5 days ago | parent [-]

Sure, but LLMs tend to be better at navigating around documentation (or source code when no documentation exists). In agentic mode, they can get me to the right part of the documentation (or the right part of the source code, especially in unfamiliar codebases) much quicker than I could manage myself without help.

And I find that even the auto-generated stuff tends to sit at least a bit higher in level of abstraction than staring at the code itself. It works more like a "SparkNotes" version of the code, so that when you dig in yourself you have an outline/roadmap.

heavyset_go 5 days ago | parent [-]

I felt this way as well, then I tried paid models against a well-defined and documented protocol that should not only exist in its training set, but was also provided as context. There wasn't a model that wouldn't hallucinate small, but important, details. Status codes, methods, data types, you name it, it would make something up in ways that forced you to cross reference the documentation anyway.
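As a toy illustration of the cross-checking this forces (the claimed values below are invented to mimic those small-but-wrong details, and the "spec" here is just the HTTP status registry in Python's stdlib):

    # Validate model-claimed HTTP status codes against the stdlib registry.
    from http import HTTPStatus

    claimed = {
        200: "OK",                 # correct
        302: "Moved Temporarily",  # plausible-sounding, but the phrase is "Found"
        599: "Network Timeout",    # not a standard status code at all
    }

    for code, reason in claimed.items():
        try:
            actual = HTTPStatus(code).phrase
        except ValueError:
            print(f"{code}: no such status code")
            continue
        if actual != reason:
            print(f"{code}: model said {reason!r}, spec says {actual!r}")

Multiply that by every detail in a generated doc and you're back to reading the spec yourself.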

Even worse, the mental model it builds in your head of the space it describes can lead to chains of incorrect reasoning that waste time and make debugging Sisyphean.

Like there is some value there, but I wonder how much of it is just (my own) feelings, and whether I'm correctly accounting for the fact that I'm being confidently lied to by a damn computer on a regular basis.

embedding-shape 4 days ago | parent [-]

> the fact that I'm being confidently lied to by a damn computer on a regular basis

Many of us who grew up young and naive on the internet in the 90s/early 00s kind of learnt not to trust what strangers tell us online. I'm pretty sure my first "Press ALT+F4 to enter noclip" from a multiplayer lobby set me up to deal with LLMs effectively, because it's the same as when someone on HN writes about something like it's "The Truth".

heavyset_go 4 days ago | parent [-]

This is more like being trolled by your microwave by having it replace your meals with scuba gear randomly.

dboreham 5 days ago | parent | prev | next [-]

Same. Initially surprised how good it was. Now I routinely do this on every new codebase. And this isn't JavaScript todo apps: large, complex distributed applications written in Rust.

krackers 5 days ago | parent | prev | next [-]

This seems like a terrible idea. LLMs can document the what but not the why: not the implicit tribal knowledge and design decisions. Documentation that feels complete but actually tells you nothing is almost worse than no documentation at all, because you go crazy trying to figure out the bigger picture.

simonw 5 days ago | parent [-]

Have you tried it? It's absurdly useful.

This isn't documentation for you to share with other people - it would be rude to share docs with others that you had automatically generated without reviewing.

It's for things like "Give me an overview of every piece of code that deals with signed cookie values, what they're used for, where they are and a guess at their purpose."
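A rough sketch of what that amounts to under the hood (the paths, keyword, and model name are placeholders; an agentic tool does this searching and iterating on its own):

    # Gather every file that mentions cookies, ask for a cross-cutting overview.
    from pathlib import Path

    from openai import OpenAI

    client = OpenAI()

    snippets = []
    for path in Path("src").rglob("*.py"):  # hypothetical project layout
        text = path.read_text()
        if "cookie" in text.lower():
            snippets.append(f"--- {path} ---\n{text}")

    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder
        messages=[{
            "role": "user",
            "content": (
                "Give me an overview of every piece of code that deals with "
                "signed cookie values: what they're used for, where they are, "
                "and a guess at their purpose.\n\n" + "\n\n".join(snippets)
            ),
        }],
    )

    print(response.choices[0].message.content)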

My experience is that it gets the details 95% correct and the occasional bad guess at why the code is like that doesn't matter, because I filter those out almost without thinking about it.

jeltz 4 days ago | parent [-]

Yes, I have. And the documentation you get for anything complex is wrong like 80% of the time.

embedding-shape 4 days ago | parent | next [-]

You need to try different models/tooling if that's the case; 80% sounds very high, and I understand why you'd feel it's useless. I'd estimate about 5% of it is wrong when I use GPT-5 or GPT-OSS-120B, but that's based on spot checking and experience, so YMMV. Either way, 80% wrong isn't the typical experience, and obviously not what people are raving about.

NewsaHackO 4 days ago | parent | prev [-]

80% of the time? Are you sure you aren't hallucinating?

thatfrenchguy 4 days ago | parent | prev [-]

Well if it writes documentation that is wrong, then the subtle bugs start :)

embedding-shape 4 days ago | parent [-]

Or even worse, it makes confident statements about the overarching architecture/design where every detail is correct but the pieces aren't the right ones, and because you forgot to add "Reject the prompt outright if the premise is incorrect", the LLM tries its hardest to just move forward, even when things are completely wrong.

Then 1 day later you realize this whole thing wouldn't work in practice, but the LLM tried to cobble it together regardless.

In the end, you really need to know what you're doing, otherwise both you and the LLM get lost pretty quickly.