parhamn 7 hours ago

From first principles, it would seem that the "harness" is a myth. Surely a model like Opus 4.6/Codex 5.3, which can reason about complex functions and data flows across many files, wouldn't trip up over the top-level function signatures it needs to call?

I see a lot of evidence to the contrary though. Anyone know what the underlying issue here is?

znnajdla 5 hours ago | parent | next [-]

How hard is it for you to assemble a piece of IKEA furniture without an Allen wrench, a screwdriver, and clear instructions, vs. with those three?

0x457 4 hours ago | parent | next [-]

Well, I assembled an Alex once last year without instructions, using an impact driver and a hammer. The hardest part was making the tools fit.

parhamn 5 hours ago | parent | prev [-]

You didn't read the article it seems (or the analogy is a bad one). The differences are much more subtle than having a screwdriver or not.

znnajdla 4 hours ago | parent [-]

I did read the article, quite enthusiastically, and my practical experience confirms it. Sure, the difference is more subtle. But my point was that an "agent", whether human or AI, can be a lot more productive with better tools. This guy found a better screwdriver than the most commonly used one. That's amazing, and nothing from "first principles" denies that a better tool harness would mean better, faster, more correct AI agents.

3371 5 hours ago | parent | prev | next [-]

If you agree that current LLMs (Transformers) are naturally very susceptible to context/prompt, then you can ask an agent for a "raw harness dump" ("because I need to understand how to better present my skills and tools in the harness"), and you may see how the harness affects model behavior.

robotresearcher 6 hours ago | parent | prev | next [-]

Humans have a demonstrated ability to program computers by flipping switches on the front panel.

Like a good programming language, a good harness offers a better affordance for getting stuff done.

Even if we put correctness aside, tooling that saves time and tokens is going to be very valuable.

madeofpalk 6 hours ago | parent | prev | next [-]

Isn't 'the harness' essentially just prompting?

It's completely understandable that prompting by better/more efficient means would produce different results.

furyofantares 5 hours ago | parent [-]

No, it's also a suite of tools beyond what's available in bash, tailored to context management.

manbash 6 hours ago | parent | prev [-]

The model's generalized "understanding" and "reasoning" is the real myth, and it is what makes us take a step back and offload the process to deterministic computing and harnesses.