recursivecaveat 3 days ago
The Turing test basically subsumes all tests that can be text-encoded, no? Like if you feel that LLMs are abnormally bad at a kind of writing like an All Souls essay, you just ask the other chair to write you such an essay as one of your questions. To be clear, I'm not aware of anyone actually running any serious Turing tests today, because it's very expensive and tedious. There's one being passed around where each conversation is only 4(!) little SMS-sized messages long per side, and ChatGPT gets judged to be the human side twice as often as the actual human.