Remix.run Logo
baalimago 2 days ago

>if the issue with deterministic generation would be solved

This can be achieved by utilizing tests. So the SWE agent will write up a set of tests as it understands the task. These are the functional requirements, which should/could be easily inspected by the BI (biological intelligence).

Once the functional requirements have been set, the SWE agent can iterate over and over again until the tests pass. At this point it doesn't really matter what the solution code looks like or how it's written, only that the functional requirements as defined via the tests are upheld. New requirements? Additional tests.

kadhirvelm 2 days ago | parent [-]

Totally agree - I'd bet there will be a bigger emphasis on functional testing to prevent degradation of previously added features. And I'd bet the scope of tests we'll need to write will also go up. For example, I'd bet we'll need to add latency based unit tests to make sure as the LLM compiler is iterating, it doesn't make the user perceived performance worse