I think they mean performance with the same, rational, task.
Measuring "degredation" for the nonsense task, like you gave, would be difficult.