jcattle 4 hours ago

I thought this part especially was quite ingenious.

If you have this great resource available to you (an LLM), you better show that you read and checked its output. If there's something in the LLM output you do not understand or cannot check to be true, you better remove it.

If you do not use LLMs and just misunderstood something, you will have a (flawed) justification for why you wrote this. If there's something flawed in an LLM's output, the likelihood that you do not have any justification except "the LLM said so" is quite high, and that should thus be penalized more heavily.

One shows a misunderstanding; the other doesn't necessarily show any understanding at all.

Zababa 3 hours ago

>If you have this great resource available to you (an LLM), you better show that you read and checked its output. If there's something in the LLM output you do not understand or cannot check to be true, you better remove it.

You could say the same about what people find on the web, yet LLMs are penalized more than web search.

>If you do not use LLMs and just misunderstood something, you will have a (flawed) justification for why you wrote this. If there's something flawed in an LLM's output, the likelihood that you do not have any justification except "the LLM said so" is quite high, and that should thus be penalized more heavily.

Swap "LLMs" for "websites" and you could say the exact same thing.

The author has this in their conclusions:

>One clear conclusion is that the vast majority of students do not trust chatbots. If they are explicitly made accountable for what a chatbot says, they immediately choose not to use it at all.

This is not true. What is true is that if students are more accountable for their use of LLMs than for their use of websites, they prefer using websites. How much "more"? We have no idea; the author doesn't say. It could be that an error from a website or your own mind costs -1 point and an error from an LLM costs -2, so LLMs would have to make half as many mistakes as websites and your own mind. It could be -1 and -1.25. It could be -1 and -10.
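
To make that concrete, here's a minimal sketch (the penalty values are hypothetical, since the article doesn't state them) of the break-even error rate an LLM would need under different penalty ratios:

    # Hypothetical grading: a website/own-reasoning error costs 1 point,
    # an LLM-sourced error costs `llm_penalty` points (values assumed, not from the article).
    for llm_penalty in (1.25, 2.0, 10.0):
        # Equal expected loss requires llm_penalty * llm_error_rate = 1 * web_error_rate,
        # so the LLM may only make 1/llm_penalty as many mistakes as a website.
        print(f"penalty -{llm_penalty}: the LLM must make {1 / llm_penalty:.0%} "
              f"as many mistakes as a website to break even")

The conclusion "students do not trust chatbots" looks very different depending on whether the ratio was 1.25x or 10x.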

The author even says themselves:

>In retrospect, my instructions were probably too harsh and discouraged some students from using chatbots.

But they don't note the bias their grading scheme introduced against LLMs.