I just tested this with Claude Code and Opus 4.6, with the following prompt:

"I have an arbitrary width rectangle that needs to be broken into smaller random width rectangles (maintaining depth) within a given min/max range. The solution needs to be highly performant from an algorithmic standpoint, well-tested using TDD and Red/Green testing, written in python, and not have any subtle errors."

It got the answer you ended up with (if I'm understanding you correctly) on the first try, in just over 2 minutes of work, and included input validation and a solid test suite covering edge cases.
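For anyone following along, here is a minimal sketch of one way to solve the problem as stated in the prompt. To be clear, this is not the code Claude produced and not necessarily the approach either of us ended up with. It assumes integer widths, and the name `split_width` and its parameters are hypothetical. The idea is to pick a feasible piece count first, then draw each width from the range that still leaves a valid split for the remaining pieces.

```python
import random


def split_width(total, lo, hi, rng=None):
    """Split `total` into random integer widths, each in [lo, hi].

    Hypothetical helper for illustration; assumes integer inputs.
    Raises ValueError when no valid split exists.
    """
    rng = rng or random.Random()
    if not (0 < lo <= hi):
        raise ValueError("need 0 < lo <= hi")
    if total <= 0:
        raise ValueError("total must be positive")

    # A piece count n is feasible exactly when n*lo <= total <= n*hi.
    n_min = -(-total // hi)  # ceil(total / hi)
    n_max = total // lo      # floor(total / lo)
    if n_min > n_max:
        raise ValueError("no valid split exists for this total and range")

    n = rng.randint(n_min, n_max)
    widths = []
    remaining = total
    for pieces_left in range(n - 1, -1, -1):
        # Constrain this width so the pieces after it can still fit:
        # pieces_left*lo <= remaining - w <= pieces_left*hi
        w_lo = max(lo, remaining - pieces_left * hi)
        w_hi = min(hi, remaining - pieces_left * lo)
        w = rng.randint(w_lo, w_hi)
        widths.append(w)
        remaining -= w
    return widths
```

This runs in time linear in the number of pieces and never needs to retry or patch up a leftover sliver at the end. One caveat: drawing the widths left to right like this does not sample uniformly over all valid splits, so if the exact distribution matters, a different scheme would be needed.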

YesBox 11 hours ago | parent [-]

How can we verify this if you don't post the code?

I appreciate you testing, even though it's not a great comparison:

- My feedback cycle of LLM prompting forced me to be more explicit with each call, which benefited your prompt since I gave you exactly what to look for with fewer nuances.

- Maybe GPT 5.1 is old, or kneecapped in favor of newer versions of GPT.

- Maybe Opus/Claude is just a way better model :P

Please post the code!

Edit: Regarding "exactly what to look for": when solving a new problem, all the nuance is rarely available on the first iteration.

	gilbetron an hour ago | parent [-]
		I didn't prompt anything odd, just standard prompt "etiquette". I actually prompted significantly less than I usually would, trying to keep the prompt simple like yours.