simonw 9 hours ago

Human teachers make mistakes too. If you aren't consuming information with a skeptical eye you're not learning as effectively as you could be no matter what the source is.

The trick to learning with LLMs is to treat them as one of multiple sources of information, and work with those sources to build your own robust mental model of how things work.

If you exclusively rely on official documentation you'll miss out on things that the documentation doesn't cover.

20k 9 hours ago | parent

If I have to treat LLMs as a fallible source of information, why wouldn't I just go right to the source? Having an extra step between me and the actual truth seems pointless

WinAPI docs are pretty accurate and up to date

simonw 9 hours ago | parent | next

Because it's faster.

If the WinAPI docs are solid you can do things like copy and paste pages of them into Claude and ask a question, rather than manually scanning through them looking for the answer yourself.

Apple's developer documentation is mostly awful - try finding out how to use the sips or sandbox-exec CLI tools for example. LLMs have unlocked those for me.

20k 8 hours ago | parent

But you have to check the answer against the documentation anyway, to validate that it's actually correct!

Unless you're just taking the LLM answers at face value?

simonw 7 hours ago | parent

For most code stuff you don't check the answer against the documentation - you write the code and run it and see if it works.

That's always a better signal than anything that official documentation might tell you.

20k 6 hours ago | parent

That seems like a serious error: you have no idea if it works or if it just happens to work

simonw 6 hours ago | parent

If you're good at programming you can usually tell exactly why it worked or didn't work. That's how we've all worked before coding agents came along too - you don't blindly assume the snippet you pasted off StackOverflow will work, you try it and poke at it and use it to build a firm mental model of whether it's the right thing or not.

20k 5 hours ago | parent

Sure. A big part of how I'd know that the function I'm calling does what I think it does is by reading the source documentation associated with it

Does it have any threading preconditions? Any weird quirks? Any strange UB? That's stuff you can't find out just by testing. You can ask the LLM, but then you have to read the docs anyway to check its answer

simonw 4 hours ago | parent

I envy you for the universally high quality of documentation that the code you are working with has!

mgraczyk 9 hours ago | parent | prev

Because it will take you years to read all the information you can get funneled through an LLM in a day

20k 8 hours ago | parent

Except you have no idea if what the LLM is telling you is true

I do a lot of astrophysics. LLMs are wrong about nearly every astrophysics question I've asked them - even the basic ones, in every model I've tested. It's terrifying that people take these at face value

For research at a PhD level, they have absolutely no idea what's going on. They just make up plausible sounding rubbish

cdetrio 6 hours ago | parent | next

Astrophysicist David Kipping had a podcast episode a month ago reporting that LLMs are working shockingly well for him, as well as for the faculty at the IAS.[1]

It's curious how different people come to very different conclusions about the usefulness of LLMs.

https://youtu.be/PctlBxRh0p4

20k 6 hours ago | parent

The problem with these long videos is that what I really want to see is what questions were asked of it, and the accuracy of the results

Every time I ask LLMs questions I know the answers to, their answers are incomplete, inaccurate, or just flat out wrong much of the time

The idea that AI is an order of magnitude superior to human coders is flat out wrong as well. I don't know who he's talking to

mgraczyk 7 hours ago | parent | prev

Somehow we went from writing software apps and reading API docs to research-level astrophysics

Sure it's not there yet. Give it a few months

20k 6 hours ago | parent

It doesn't even work for basic astrophysics

I asked chatgpt the other day:

"Where did elements heavier than iron come from?"

The answer it gave was totally wrong. It's not a hard question. I asked it this question again today, and some of it was right (!). This is such a low bar for basic questions