folex 8 hours ago
> The executables in our benchmark often have hundreds or thousands of functions — while the backdoors are tiny, often just a dozen lines buried deep within. Finding them requires strategic thinking: identifying critical paths like network parsers or user input handlers and ignoring the noise. Perhaps it would make sense to provide LLMs with some strategy guides written in .md files.
godelski 2 hours ago
Depends what your research question is, but it's very easy to spoil your experiment. Let's say you tell it that there might be small backdoors. You've now primed the LLM to search that way (even using "may"). You've passed information about the test to the test taker! So we have a new variable: is the success due only to the hint? How robust is that prompt? Does subtle wording dramatically change the output? Do "may", "does", "can", and "might" work but "May", "cann", or anything else fail? Have you, the prompter, unintentionally conveyed something important about the test?

I'm sure you can prompt-engineer your way to greater success, but by doing so you also greatly expand the complexity of the experiment and consequently make your results far less robust. Experimental design is incredibly difficult due to all the subtleties. It's a thing most people frequently fail at (including scientists), and even more frequently they fool themselves into believing stronger claims than the experiment can yield. And before anyone says "but humans": yeah, the same complexity applies. It's actually why human experimentation is harder than a lot of other things; there's just far more noise in the system.

But could you get success? Certainly. I mean, you could tell it exactly where the backdoors are. But that's not useful. So now you've got to decide where that line is, and certainly others won't agree.
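The "is the success only due to the hint?" worry can be made concrete by sweeping near-identical prompt wordings and comparing success rates. A minimal sketch — the `run_detection` stub and its hit rates are entirely hypothetical, standing in for a real agent run on one binary:

```python
import random
from statistics import mean

# Toy stand-in for "run the agent once on a binary with this prompt".
# A real experiment would invoke the LLM harness here; this stub just
# makes the hit rate depend on whether the prompt mentions a backdoor,
# so the sweep below has something to measure. (Entirely hypothetical.)
def run_detection(prompt: str, trial: int) -> bool:
    rng = random.Random(f"{prompt}|{trial}")  # deterministic per (prompt, trial)
    hit_rate = 0.6 if "backdoor" in prompt.lower() else 0.3
    return rng.random() < hit_rate

# The experimental-design point: vary only the hint wording and measure
# how much the success rate moves. A large spread between variants means
# the result is a property of the prompt, not of the method under test.
variants = [
    "Audit this binary.",
    "Audit this binary. It may contain a backdoor.",
    "Audit this binary. It might contain a backdoor.",
    "Audit this binary. It contains a backdoor.",
]

results = {v: mean(run_detection(v, t) for t in range(200)) for v in variants}
for v, rate in results.items():
    print(f"{rate:.2f}  {v!r}")
```
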
Arech 6 hours ago
That's what I thought of too. Given their task formulation (they basically said, "check these binaries with these tools at your disposal," and that's it!), their results are already super impressive. With proper guidance and professional oversight, it's a tremendous force multiplier.
selridge 6 hours ago
That's hard. Sometimes you will do that and find it prompts the model into "strategy talk," where it deploys the words and framing you use in your .md files but doesn't actually execute the strategy. Even where it works, it is quite hard to specify human strategic thinking in a way that an AI will follow.
3 hours ago
[deleted]