parhamn 7 hours ago
From first principles, it would seem that the "harness" is a myth. Surely a model like Opus 4.6/Codex 5.3, which can reason about complex functions and data flows across many files, wouldn't trip up over the top-level function signatures it needs to call? I see a lot of evidence to the contrary, though. Does anyone know what the underlying issue is here?
znnajdla 5 hours ago
How hard is it for you to assemble a piece of IKEA furniture without an Allen wrench, a screwdriver, and clear instructions, versus with all three?
3371 5 hours ago
If you agree that current LLMs (Transformers) are naturally very sensitive to their context and prompt, try asking an agent for a "raw harness dump" "because I need to understand how to better present my skills and tools in the harness". You may then see how the harness affects model behavior.
robotresearcher 6 hours ago
Humans have a demonstrated ability to program computers by flipping switches on the front panel. Like a good programming language, a good harness offers a better affordance for getting stuff done. Even if we put correctness aside, tooling that saves time and tokens is going to be very valuable.
madeofpalk 6 hours ago
Isn't "the harness" essentially just prompting? It's completely understandable that prompting by better or more efficient means would produce different results.
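To make the "just prompting" point concrete, here is a minimal sketch of one thing a harness might do: turn tool signatures into text the model sees. This assumes a harness that describes tools as plain prompt text; `read_file` and `render_tools` are hypothetical names for illustration, not taken from any real harness.

```python
import inspect


def read_file(path: str, max_bytes: int = 4096) -> str:
    """Return up to max_bytes of the file at path."""
    # Hypothetical tool, used only as something to describe.
    with open(path, "rb") as f:
        return f.read(max_bytes).decode("utf-8", errors="replace")


def render_tools(tools) -> str:
    """Render each tool's signature and first docstring line as prompt text."""
    lines = ["You can call these tools:"]
    for fn in tools:
        doc = (inspect.getdoc(fn) or "").splitlines()
        summary = doc[0] if doc else ""
        # inspect.signature gives the exact parameter list the model must match.
        lines.append(f"- {fn.__name__}{inspect.signature(fn)}: {summary}")
    return "\n".join(lines)


prompt = render_tools([read_file])
print(prompt)
```

How that text is worded and ordered is a prompting decision, which is presumably why two harnesses can get different results from the same model.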
manbash 6 hours ago
The model's generalized "understanding" and "reasoning" is the real myth, and it is what makes us take a step back and offload the process to deterministic computing and harnesses.
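As a rough illustration of offloading to deterministic code, a harness could check a model-proposed tool call against the real function signature before running anything, rather than trusting the model to get it right. This is only a sketch under that assumption; `move_file` and `check_call` are hypothetical names, and the standard library's `inspect.Signature.bind` does the actual check.

```python
import inspect


def move_file(src: str, dst: str, overwrite: bool = False) -> None:
    """Hypothetical tool; its body is irrelevant to the check."""


def check_call(fn, kwargs: dict):
    """Return None if kwargs fit fn's signature, else an error string for the model."""
    try:
        # bind() raises TypeError on missing or unexpected arguments.
        inspect.signature(fn).bind(**kwargs)
    except TypeError as exc:
        return f"{fn.__name__}: {exc}"
    return None


# A well-formed call passes; a call with a misnamed argument is rejected.
ok = check_call(move_file, {"src": "a.txt", "dst": "b.txt"})
bad = check_call(move_file, {"source": "a.txt", "dst": "b.txt"})
print(ok, bad)
```

The error string can be fed back to the model as a retry hint, so the signature check costs no reasoning tokens at all.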