▲ | aramend 2 days ago | |
> For ARC-AGI challenge, we start with all input-output example pairs in the training and the evaluation sets ... At test time, we proceed as follows for each test input in the evaluation set: ... Very often I see people misuse the ARC-AGI data when training. The input examples in the evaluation set are not intended for training your AI system. It is a downside of ARC that its data is (somehow?) complicated enough that even the clever people building AI systems miss the point. People then report and compare results as a single percentage, even though differences in the data mix used for training may mean those numbers are not comparable. |