▲ | 20k 10 hours ago
If I have to treat LLMs as a fallible source of information, why wouldn't I just go right to the source though? Having an extra step between me and the actual truth seems pointless. WinAPI docs are pretty accurate and up to date.
|
| ▲ | simonw 10 hours ago | parent | next [-] |
Because it's faster. If the WinAPI docs are solid you can do things like copy and paste pages of them into Claude and ask a question, rather than manually scanning through them looking for the answer yourself. Apple's developer documentation is mostly awful - try finding out how to use the sips or sandbox-exec CLI tools, for example. LLMs have unlocked those for me.
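For instance, basic sips usage ends up looking something like this (a sketch from memory, so worth double-checking against man sips):

    sips -Z 800 photo.jpg                         # resample so neither dimension exceeds 800px
    sips -s format png photo.jpg --out photo.png  # convert a JPEG to PNG

Good luck discovering either of those from Apple's docs alone.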
| |
▲ | 20k 8 hours ago | parent [-]
But you have to check the answer against the documentation anyway, to validate that it's actually correct! Unless you're just taking the LLM answers at face value?
▲ | simonw 7 hours ago | parent [-]
For most code stuff you don't check the answer against the documentation - you write the code and run it and see if it works. That's always a better signal than anything that official documentation might tell you.
▲ | 20k 6 hours ago | parent [-]
That seems like a serious error: you have no idea if it works or if it just happens to work.
▲ | simonw 6 hours ago | parent [-]
If you're good at programming you can usually tell exactly why it worked or didn't work. That's how we all worked before coding agents came along, too - you don't blindly assume the snippet you pasted off StackOverflow will work, you try it and poke at it and use it to build a firm mental model of whether it's the right thing or not.
▲ | 20k 6 hours ago | parent [-]
Sure. A big part of how I'd know that the function I'm calling does what I think it does is by reading the source documentation associated with it. Does it have any threading preconditions? Any weird quirks? Any strange UB? That's stuff you can't find out just by testing. You can ask the LLM, but then you have to read the docs anyway to check its answer.
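A toy C example of the kind of thing I mean (illustrative, not from any real project): localtime() will pass every single-threaded test you write, but only the documentation tells you it's allowed to return a pointer to static storage and isn't required to be thread-safe.

    #include <stdio.h>
    #include <time.h>

    int main(void) {
        time_t now = time(NULL);

        /* Works in every test you run here. But the man page documents
           that localtime() may return a pointer to static storage, so
           two threads calling it can stomp on each other's result -
           a precondition no amount of single-threaded testing reveals. */
        struct tm *t = localtime(&now);
        printf("%d-%02d-%02d\n",
               t->tm_year + 1900, t->tm_mon + 1, t->tm_mday);

        /* The fix - localtime_r() with a caller-supplied buffer - only
           surfaces if you actually read the docs:
               struct tm buf;
               localtime_r(&now, &buf);  */
        return 0;
    }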
▲ | simonw 4 hours ago | parent [-]
I envy you the universally high quality of the documentation for the code you work with!
|
| ▲ | mgraczyk 10 hours ago | parent | prev [-] |
Because it will take you years to read all the information you can get funneled through an LLM in a day.
| |
▲ | 20k 8 hours ago | parent [-]
Except you have no idea if what the LLM is telling you is true. I do a lot of astrophysics. Universally, LLMs are wrong about nearly every astrophysics question I've asked them - even the basic ones, in every model I've ever tested. It's terrifying that people take these at face value.

For research at a PhD level, they have absolutely no idea what's going on. They just make up plausible-sounding rubbish.
▲ | cdetrio 7 hours ago | parent | next [-]
Astrophysicist David Kipping had a podcast episode a month ago reporting that LLMs are working shockingly well for him, as well as for the faculty at the IAS. [1] It's curious how different people come to very different conclusions about the usefulness of LLMs.

[1] https://youtu.be/PctlBxRh0p4
▲ | 20k 7 hours ago | parent [-]
The problem with these long videos is that what I really want to see is what questions were asked of it, and the accuracy of the results. Every time I ask LLMs questions I know the answers to, the results are incomplete, inaccurate, or just flat-out wrong much of the time.

The idea that AI is an order of magnitude superior to coders is flat-out wrong as well. I don't know who he's talking to.
| |
▲ | mgraczyk 8 hours ago | parent | prev [-]
Somehow we went from writing software apps and reading API docs to research-level astrophysics.

Sure, it's not there yet. Give it a few months.
▲ | 20k 7 hours ago | parent [-]
It doesn't even work for basic astrophysics. I asked ChatGPT the other day: "Where did elements heavier than iron come from?" The answer it gave was totally wrong. It's not a hard question.

I asked it this question again today, and some of it was right (!). This is such a low bar for basic questions.