▲ | _cs2017_ 5 days ago | |
Even if this bug never existed, models can still see lookahead commits during pretraining. Do we expect this bug to have a greater impact than the pretraining leakage? Obviously having something available during test time is more valuable than buried somewhere in the pretraining mixture. But in pretraining it happens presumably with high probability (why wouldn't coding models pretrain on the entire github), while in test time it apparently happened only very occasionally? |