rustystump 5 hours ago

I won't touch how profoundly I disagree with everything you said on reasoning (you clearly already have it figured out), but a fun test I have run with most of the big models is to give them some text input, maybe a short story, and have them rate it. That is, the prompt is: rate this from 1-10.

For Gemini and GPT, they almost always give very similar scores for everything. As long as the grammar isn't off, you cannot get below a 7.

xAI, on the other hand, will rarely give anything above a 7.

Now when you prompt with "rate 1-10 with 5 being average," all of a sudden the scores from OpenAI and Gemini drop, while xAI's remain roughly the same.

All of them will eventually give you a 10 if you keep making tiny edits "fixing" whatever they complain about.

Humans do not do this. Or more specifically, that has not been my experience with humans.