> had to iterate with them for 4/5 times each. Gemini got it right but then used deprecated methods

How hard would it be to automate these iterations?

How hard would it be to automatically check and improve the code to avoid deprecated methods?
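
For the deprecation part at least, a rough sketch of an automatic check could look like the following in Python. It assumes the generated code is safe to execute (in practice you would want a sandbox), and the helper name `triggers_deprecation` is made up for illustration. It only catches deprecations that actually fire at runtime through the standard `warnings` machinery, so it is a partial check at best.

```python
import warnings


def triggers_deprecation(source: str) -> bool:
    """Return True if compiling or running `source` raises a DeprecationWarning.

    Hypothetical helper: assumes `source` is trusted enough to exec.
    """
    with warnings.catch_warnings():
        # Turn DeprecationWarning into an exception so we can detect it.
        warnings.simplefilter("error", DeprecationWarning)
        try:
            exec(compile(source, "<candidate>", "exec"), {})
        except DeprecationWarning:
            return True
    return False
```

A result of `True` could then be fed back to the model as "this uses a deprecated API, try again".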

I agree that most products are still underwhelming, but that doesn't mean the underlying tech isn't already good enough to deliver better LLM-based products. Lately I've been using LLMs more and more to get started with writing tests for components I'm not familiar with, and it really helps.

jaccola 3 months ago | parent | next [-]

How hard can it be to create a universal "correctness" checker? Pretty damn hard!

Our notion of "correct" for most things is basically derived from a very long training run on reality, with the loss function being how long a gene propagated.

	hiq 3 months ago | parent [-]
		You don't need a full correctness checker to get a useful product though. New code generated by the current generation of LLMs, which also compiles and passes existing tests, is likely to be somewhat useful in my experience. The problem is that we still get too much code that doesn't pass these basic requirements.
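
To make that concrete, here is a minimal sketch of such a gate in Python: keep asking for candidates until one compiles and passes the existing tests, or give up after a few attempts. Everything here is an assumption for illustration: `generate` stands in for whatever LLM call you use, `passes_tests` for your existing test runner, and the feedback strings are arbitrary.

```python
from typing import Callable, Optional


def compiles(source: str) -> bool:
    """Cheap first gate: does the candidate at least parse as Python?"""
    try:
        compile(source, "<candidate>", "exec")
        return True
    except (SyntaxError, ValueError):
        return False


def first_acceptable(
    generate: Callable[[str], str],       # hypothetical LLM call: prompt -> code
    passes_tests: Callable[[str], bool],  # hypothetical runner for existing tests
    prompt: str,
    max_attempts: int = 5,
) -> Optional[str]:
    """Return the first candidate that compiles and passes tests, else None."""
    feedback = prompt
    for _ in range(max_attempts):
        candidate = generate(feedback)
        if not compiles(candidate):
            feedback = prompt + "\nThe previous attempt did not compile."
            continue
        if passes_tests(candidate):
            return candidate
        feedback = prompt + "\nThe previous attempt failed the existing tests."
    return None
```

This is nowhere near a correctness checker. It only filters out candidates that fail the basic requirements, which is the point.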

henryjcee 3 months ago | parent | prev | next [-]

> How hard would it be to automate these iterations?

The fact that we're no closer to doing this than we were when ChatGPT launched suggests that it's really hard. If anything, I think it's _the_ hard bit vs. building something that generates plausible text.

Solving this for the general case is imo a completely different problem to being able to generate plausible text in the general case.

HDThoreaun 3 months ago | parent [-]

This is not true. The chain-of-thought models are able to check their work and try again, given enough compute.

	lelandbatey 3 months ago | parent [-]
		They can check their work and try again an infinite number of times, but the rate at which they succeed seems to just get worse and worse the further from the beaten path (of existing code from existing solutions) that they stray.

9dev 3 months ago | parent | prev [-]

How hard would it be in terms of the energy wasted on it? Is everything we can do worth doing, just for the sake of being able to?