sigmar 3 days ago

I don't think that analogy works unless you could write a script that automatically removes incorrect medical advice, because then you would indeed have an LLM-with-a-script that was an expert doctor (which you can do for illegal chess moves, but obviously not for evaluating medical advice)
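
For chess that filter really is a small script. A minimal sketch using the python-chess library, where the hard-coded candidate moves stand in for whatever an LLM might suggest:

    # Reject illegal chess moves mechanically with python-chess.
    import chess

    def legal_or_none(board, uci):
        """Return the move if it's legal in this position, else None."""
        try:
            move = chess.Move.from_uci(uci)
        except ValueError:  # not even syntactically a move, e.g. "Ke9"
            return None
        return move if move in board.legal_moves else None

    board = chess.Board()
    for uci in ["e2e4", "e7e5", "e1g1"]:  # "e1g1" (castling) is illegal here
        move = legal_or_none(board, uci)
        if move is None:
            print("rejected illegal move:", uci)
        else:
            board.push(move)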

wavemode 3 days ago

You can write scripts that correct bad math, too. In fact, most of the time ChatGPT will just call out to a calculator function. That's a smart solution, and very useful for end users! But we still shouldn't use it to claim that LLMs have a good understanding of math.
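
The shape of that delegation, as a hedged sketch (the tool-call dict is hypothetical glue, not any particular vendor's API):

    # The model decides THAT a calculation is needed and emits the
    # expression; a deterministic evaluator produces the actual number.

    def calculator(expression):
        """Plain arithmetic; no token prediction involved."""
        if not set(expression) <= set("0123456789+-*/(). "):
            raise ValueError("unsupported expression: %r" % expression)
        return eval(expression, {"__builtins__": {}}, {})

    # A hypothetical model turn, expressed as a tool call rather than
    # as predicted digits:
    tool_call = {"tool": "calculator", "input": "1234 * 5678"}
    print(calculator(tool_call["input"]))  # 7006652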

afro88 3 days ago

If a script were applied that corrected "bad math" and now the LLM could solve complex math problems that you can't one-shot throw at a calculator, what would you call it?
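
For the sake of argument, here's what that script-plus-LLM loop could look like; llm_solve is a hypothetical stand-in for the model, and sympy plays the correcting script:

    # "Corrected bad math" as a generate-and-verify loop.
    import sympy

    x = sympy.symbols("x")
    equation = sympy.Eq(x**2 - 5*x + 6, 0)

    def llm_solve(eq):
        """Stand-in for the model's guessed solutions (may be wrong)."""
        return [2, 4]  # one right, one wrong

    verified = [c for c in llm_solve(equation) if equation.subs(x, c)]
    print(verified)  # [2] -- the wrong guess was filtered out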

sixfiveotwo 3 days ago

It's a good point.

But this math analogy is not quite appropriate: there's abstract math and there's arithmetic. A good math practitioner (LLM or human) can be bad at arithmetic yet good at abstract reasoning; the latter doesn't (necessarily) require the former.

In chess, I don't think you can build a good strategy if it relies on illegal moves, because tactics and strategy are tied together.

danparsonson 3 days ago

If I had wings, I'd be a bird.

Applying a corrective script to weed out bad answers is also not "one-shot" solving anything, so I would call your example an elaborate guessing machine. That doesn't mean it's not useful, but it's not how a human being does maths when they understand what they're doing; in fact, you can readily program a computer to solve general maths problems correctly the first time. This is also exactly the problem with saying that LLMs can write software: a series of elaborate guesses is undeniably useful and impressive, but without a corrective guiding hand it is ultimately useless, and it does not demonstrate generalised understanding of the problem space. The dream of AI is surely that the corrective hand becomes unnecessary?
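
For illustration, the "correctly the first time" route via a computer algebra system such as sympy, which derives answers instead of guessing:

    # A computer algebra system solves symbolically: no guess-and-check.
    import sympy

    x = sympy.symbols("x")
    print(sympy.solve(x**2 - 2, x))                           # [-sqrt(2), sqrt(2)]
    print(sympy.diff(x * sympy.sin(x), x))                    # x*cos(x) + sin(x)
    print(sympy.integrate(sympy.exp(-x), (x, 0, sympy.oo)))   # 1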

at_a_remove 3 days ago

Then you could replace the LLM with a much cheaper RNG and let it guess until the "bad math filter" lets something through.
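
Taken to its toy extreme (and assuming the filter is a perfect oracle, which is exactly the part that doesn't exist):

    # "RNG plus a perfect bad-math filter" as rejection sampling.
    import random

    def bad_math_filter(question, answer):
        """The posited oracle: accepts only the correct answer."""
        return answer == eval(question)  # toy stand-in for 'just knows'

    def rng_mathematician(question):
        attempts = 0
        while True:
            attempts += 1
            guess = random.randint(0, 1000)
            if bad_math_filter(question, guess):
                return guess, attempts

    print(rng_mathematician("12 * 34"))  # (408, ~1000 attempts on average)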

I was once asked by one of the Clueless Admin types if we couldn't just "fix" various sites such that people couldn't input anything wrong. Same principle.

vunderba 3 days ago

Agreed. It's not the same thing and we should strive for precision (LLMs are already opaque enough as it is).

An LLM that recognizes an input as "math" and calls out to a non-LLM to solve the problem, and an LLM that recognizes an input as "math" and also uses next-token prediction to produce an accurate response, ARE DIFFERENT.

henryfjordan 3 days ago

At what point does "knows how to use a calculator" equate to knowing how to do math? Feels pretty close to me...

Tepix 3 days ago

Well, LLMs are bad at math but they're ok at detecting math and delegating it to a calculator program.

It's kind of like humans.

kcbanner 3 days ago

It would be possible to employ an expert doctor, instead of writing a script.

ben_w 3 days ago

Which is cheaper:

1. having a human expert create every answer

or

2. having an expert check 10 answers, each of which has a 90% chance of being right, and then manually redoing the one that was wrong (back-of-envelope arithmetic below)

Now add the complications that:

• option 1 also isn't 100% correct

• nobody knows which errors in option 2 are correlated with each other, or whether they're correlated with human errors, so we might be systematically unable to even recognise them

• even if we could, humans not only get lazy without practice but also get bored if the work is too easy, so a short-term study of efficiency changes doesn't tell you things like "after 2 years you get mass resignations from the competent doctors, while the incompetent ones just say 'LGTM' to all the AI answers"
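
Back-of-envelope, with made-up costs (only the 90% figure comes from option 2 above; the minute counts are assumptions):

    # Toy cost comparison for options 1 and 2; all numbers illustrative.
    WRITE_COST = 10.0  # expert-minutes to author an answer (assumed)
    CHECK_COST = 2.0   # expert-minutes to verify an AI answer (assumed)
    P_CORRECT = 0.9    # per-answer chance the AI is right (from option 2)
    N = 10

    option1 = N * WRITE_COST                                     # author all
    option2 = N * CHECK_COST + N * (1 - P_CORRECT) * WRITE_COST  # check + redo
    print(option1, option2)  # 100.0 vs 30.0 expert-minutes

    # The complications above are why this flatters option 2: correlated
    # errors lower the effective P_CORRECT, and bored checkers raise the
    # real CHECK_COST over time.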