theptip 4 days ago

I think the more fundamental attribute of interest is how easy it is to verify the work.

Much red-team work is easily verifiable: either the exploit works or it doesn’t. Much blue-team work is not; it can take judgement to figure out whether a feature is promising.

LLMs are extremely powerful (and trainable) on tasks with a good oracle, i.e. a cheap, reliable way to check whether an output is correct.