How many examples did each system need to get good at the task too? It's currently a lot less for humans and we don't know why.