measurablefunc 3 hours ago

My point still stands. I don't know what the LLM is doing so my guess is it's cheating unless there is evidence to the contrary.

red75prime 2 hours ago | parent | next [-]

I guess your answer to "Try to run Claude Code on your own 'ill-defined' problem" would be "I'm not interested." Correct? I think we can stop here then.

KeplerBoy 42 minutes ago | parent | prev | next [-]

Well, that's certainly a challenge when you use LLMs for this test-driven style of programming.

saagarjha 3 hours ago | parent | prev [-]

Why do you assume it’s cheating?

measurablefunc an hour ago | parent [-]

Because it's a well-known failure mode of neural networks & scalar-valued optimization problems in general: https://www.nature.com/articles/s42256-020-00257-z

saagarjha 14 minutes ago | parent | next [-]

Again, you can just read the code.

red75prime 23 minutes ago | parent | prev [-]

And? Anthropic is not aware of this 2020 paper? The problem is not solvable?