My point still stands. I don't know what the LLM is doing, so my guess is that it's cheating unless there is evidence to the contrary.

▲

red75prime 2 hours ago | parent | next [-]

I guess your answer to "Try to run Claude Code on your own 'ill-defined' problem" would be "I'm not interested." Correct? I think we can stop here then.

▲

KeplerBoy 42 minutes ago | parent | prev | next [-]

Well, that's certainly a challenge when you use LLMs for this test-driven style of programming.

▲

saagarjha 3 hours ago | parent | prev [-]

Why do you assume it’s cheating?

▲

measurablefunc an hour ago | parent [-]

Because it's a well-known failure mode of neural networks & scalar-valued optimization problems in general: https://www.nature.com/articles/s42256-020-00257-z

▲

saagarjha 14 minutes ago | parent | next [-]

Again, you can just read the code.

▲

red75prime 23 minutes ago | parent | prev [-]

And? Anthropic is not aware of this 2020 paper? The problem is not solvable?