| ▲ | kostaj 17 minutes ago | |
Good point. Processing the substance of each answer might be too labor-intensive (1,000 claims x 5 models = 5,000 responses), but letting the models "think out loud" might indeed improve the quality of the answers. And we can still force/ask them to end their reasoning with a clear verdict, as per the chosen rubric. | ||
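A rough sketch of how the "reason first, verdict last" idea could be automated, so that nobody has to read all the reasoning by hand. Everything named here is an assumption for illustration: the rubric labels, the `VERDICT:` line format, and the helper names `build_prompt` and `extract_verdict` are hypothetical, since the thread does not say which rubric or output format was chosen.

```python
import re

# Hypothetical rubric labels; swap in whatever rubric was actually chosen.
RUBRIC = ("TRUE", "MOSTLY TRUE", "MIXED", "MOSTLY FALSE", "FALSE")


def build_prompt(claim: str) -> str:
    """Ask the model to reason freely, then close with one machine-readable line."""
    labels = ", ".join(RUBRIC)
    return (
        f"Claim: {claim}\n"
        "Reason step by step, then finish with a single line of the form\n"
        f"VERDICT: <one of {labels}>"
    )


# Match a whole line such as "VERDICT: Mostly True." (case-insensitive).
_VERDICT_RE = re.compile(
    r"^\s*VERDICT:\s*(" + "|".join(re.escape(label) for label in RUBRIC) + r")\s*\.?\s*$",
    re.IGNORECASE | re.MULTILINE,
)


def extract_verdict(response: str):
    """Return the last verdict line in the response, or None if there is none."""
    matches = _VERDICT_RE.findall(response)
    # Take the last match: the model may revise itself while thinking out loud.
    return matches[-1].upper() if matches else None
```

Only the final verdict line is parsed, so the reasoning stays available for spot checks but does not have to be processed for every response. Responses where `extract_verdict` returns `None` can be retried or flagged for manual review.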
| ▲ | airstrike 6 minutes ago | |
[delayed] | ||