willmadden a day ago

Build a new feature. If you aren't bogged down in bureaucracy it will happen much faster.

YesBox 19 hours ago | parent | next [-]

I don't use LLMs much. When I do, the experience always feels like search 2.0: information at your fingertips, but you need to know exactly what you're looking for to get exactly what you need. The more complicated the problem, the more fractal/divergent the outcomes are. (I'm forming the opinion that this is going to be the real limitation of LLMs.)

I recently used copilot.com (which uses GPT 5.1) to help me solve a tricky problem:

   I have an arbitrary width rectangle that needs to be broken into smaller 
   random width rectangles (maintaining depth) within a given min/max range. 
The first solution merged the remainder (if less than min) into the last rectangle created (regardless of whether it exceeded the max).
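The flawed first attempt looked roughly like this. (A reconstruction for illustration, not the actual generated code; names and structure are assumptions.)

```python
import random

def split_naive(total, min_w, max_w):
    """Naive split: draw random widths in range, then merge any leftover
    smaller than min_w into the last piece -- even if that pushes it
    past max_w, which is the flaw described above."""
    widths = []
    remaining = total
    while remaining >= min_w:
        w = random.randint(min_w, min(max_w, remaining))
        widths.append(w)
        remaining -= w
    if remaining:
        if widths:
            widths[-1] += remaining  # may now exceed max_w
        else:
            widths.append(remaining)
    return widths
```

The pieces always sum to the original width, but the last one can silently violate the max constraint.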

So I poked the machine.

The next result used dynamic programming and generated every possible output combination. With a sufficiently large (yet still modest) rectangle, this is a combinatorial explosion and stalled the software.
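To see why enumerating every combination stalls, it's enough to count the ordered splits; the count grows exponentially with the width. (A quick sketch with assumed parameters, not the generated code.)

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def count_compositions(n, lo, hi):
    """Number of ordered sequences of widths in [lo, hi] summing to n."""
    if n == 0:
        return 1
    return sum(count_compositions(n - w, lo, hi)
               for w in range(lo, min(hi, n) + 1))
```

For example, a width of 10 with pieces in [2, 4] already has 17 ordered splits, and the count multiplies by roughly 1.4 for every extra unit of width, so materializing all of them quickly becomes hopeless.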

So I poked the machine.

I realized this problem was essentially finding the distinct multisets of numbers that sum to some value. The next result used dynamic programming and only calculated the distinct sets (order is ignored). That way I could choose a random width from the set and then remove that value. (The LLM did not suggest this). However, even this was slow with a large enough rectangle.
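Ignoring order cuts the search down dramatically: enumerating non-decreasing sequences yields each distinct multiset exactly once. A sketch of that enumeration (my reconstruction, not the generated code):

```python
def multisets_summing_to(n, lo, hi):
    """All multisets of parts in [lo, hi] summing to n, order ignored.
    Forcing parts to be non-decreasing avoids counting permutations."""
    results = []
    def rec(remaining, smallest, current):
        if remaining == 0:
            results.append(list(current))
            return
        for w in range(smallest, min(hi, remaining) + 1):
            current.append(w)
            rec(remaining - w, w, current)
            current.pop()
    rec(n, lo, [])
    return results
```

The same width 10 with pieces in [2, 4] has only 5 distinct multisets versus 17 ordered splits, though as noted, even this blows up for a large enough rectangle.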

So I poked my brain.

I realized I could start off with a greedy solution: choose a random width within range and subtract it from the remaining width. Once the remaining width is small enough, switch to dynamic programming. Then I had to handle the edge cases (no valid sets, when it's okay to break the rules, etc.).
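A sketch of that greedy-then-DP hybrid, assuming one way of wiring it together (the `threshold` cutoff, the backtracking-on-failure loop, and all names here are my illustrative guesses, not the original implementation):

```python
import random
from functools import lru_cache

@lru_cache(maxsize=None)
def feasible(n, lo, hi):
    """DP table: can n be written as a sum of parts in [lo, hi]?"""
    if n == 0:
        return True
    return any(feasible(n - w, lo, hi) for w in range(lo, min(hi, n) + 1))

def exact_split(n, lo, hi):
    """One random list of widths in [lo, hi] summing to n, or None.
    Only recurses into branches the DP table marks feasible."""
    if n == 0:
        return []
    choices = [w for w in range(lo, min(hi, n) + 1) if feasible(n - w, lo, hi)]
    if not choices:
        return None
    w = random.choice(choices)
    return [w] + exact_split(n - w, lo, hi)

def split_hybrid(total, lo, hi, threshold=None):
    """Greedy random picks until the remainder is small, then finish
    exactly with the DP-backed solver; backtrack if the tail is stuck."""
    if threshold is None:
        threshold = 4 * hi  # hypothetical cutoff for 'small enough'
    widths, remaining = [], total
    while remaining > threshold:
        w = random.randint(lo, hi)
        widths.append(w)
        remaining -= w
    tail = exact_split(remaining, lo, hi)
    while tail is None and widths:  # undo greedy picks until solvable
        remaining += widths.pop()
        tail = exact_split(remaining, lo, hi)
    return None if tail is None else widths + tail
```

The greedy phase is O(total/lo), and the DP only ever sees a remainder bounded by the threshold, which is what keeps the whole thing fast.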

So the LLMs are useful, but this took 2-3 hours IIRC (thinking, implementation, testing in an environment). Pretty sure I would have landed on a solution within the same time frame on my own. Probably greedy with backtracking to force-fit the output.

gilbetron 9 hours ago | parent | next [-]

I just tested this with Claude Code and Opus 4.6, with the following prompt:

"I have an arbitrary width rectangle that needs to be broken into smaller random width rectangles (maintaining depth) within a given min/max range. The solution needs to be highly performant from an algorithmic standpoint, well-tested using TDD and Red/Green testing, written in python, and not have any subtle errors."

It got the answer you ended up with (if I'm understanding you correctly) the first time in just over 2 minutes of working, and included a solid test suite examining edge cases and with input validation.

YesBox 8 hours ago | parent [-]

How can we verify if you don't post the code?

I appreciate you testing, even though it's not a great comparison:

- My feedback cycle of LLM prompting forced me to be more explicit with each call, which benefited your prompt, since I handed you exactly what to look for with fewer nuances left unstated.

- Maybe GPT 5.1 is old or kneecapped compared to newer GPT versions

- Maybe Opus/Claude is just a way better model :P

Please post the code!

Edit: Regarding "exactly what to look for", when solving a new problem, rarely is all the nuance available for the first iteration.

redhale 12 hours ago | parent | prev [-]

> I don't use LLMs much

Sorry to be so blunt, but it's not surprising that you aren't able to get much value from these tools, considering you don't use them much.

Getting value from LLMs / agents is a skill like any other. If you don't practice it deliberately, you will likely be bad at it. It would be a mistake to confuse lack of personal skill for lack of tool capability. But I see people make this mistake all the time.

YesBox 10 hours ago | parent [-]

Would be helpful if you pointed out what I did wrong :).

If it's "you didn't explain the problem clearly enough", then that aligns with my original comment.

windward 8 hours ago | parent [-]

If you ask the chatbot for best practices, it will list them for you, including that you shouldn't use a chatbot.

bandrami a day ago | parent | prev | next [-]

Most of these are new features, but then they have to integrate with the existing software so it's not really greenfield. (Not to mention that our clients aren't getting any faster at approving new features, either.)

willmadden a day ago | parent [-]

Did you train a self-hosted/open-source LLM on your existing software and documentation? That should make it far more useful. It's not Claude Code, but some of those models are 80% there. In six months they'll be today's Claude Code.

bandrami a day ago | parent [-]

What would that help us with?

willmadden 8 hours ago | parent [-]

The LLM needs to understand your existing codebase if it's going to be useful for building features that integrate with that codebase seamlessly, without breaking things or assuming things that don't exist. That's not something you want to give away to a private AI company, so self-host an open-source model.

sdf2df a day ago | parent | prev [-]

It's this kind of thinking that tells me people can't be trusted with their comments on here re: "Omg I can produce code faster and it'll do this and that".

No, simply "producing a feature" ain't it, bud. That's one piece of the puzzle.