| |
| ▲ | gwern 3 days ago | parent | next [-] | | It is lacking in URLs or references. (The systematic error in the self-referential blog post URLs is also suspicious: an outdated system prompt? If nothing else, it shows the human involved is sloppy when every link is broken.) The assertions are broadly clichéd truisms, and the solutions are trendy buzzwords from a year ago or more (consistent with knowledge cutoffs and an emphasis on mainstream sources/opinions). The tricolons and unordered bolded triplet lists are ChatGPT. The em dashes (which you should not need to be told about at this point) and the it's-not-x-but-y formulation are extremely blatant, if not 4o-level, though lacking emoji or hyperbolic language; hence, it's probably GPT-5. (Sub-GPT-5 ChatGPTs would also generally balk at talking about a 'GPT-5', because they think it doesn't exist yet.) I don't know if it was 100% GPT-5-written, but I do note that when I try the intro thesis paragraph on GPT-5-Pro, it dislikes it and identifies several stupid assertions (e.g. the claim that power-law scaling has now hit 'diminishing returns', which is meaningless because all log or power laws always have diminishing returns), so it was probably not completely GPT-5-written (or at least not Pro-written). | | |
| ▲ | mdp2021 3 days ago | parent [-] | | > when I try the intro thesis paragraph on GPT-5-Pro, it dislikes it I don't know about GPT-5-Pro, but LLMs can dislike their own output (when they work well...). | | |
| ▲ | gwern 3 days ago | parent [-] | | They can, but they are known to have a self-favoring bias, and in this case the error is so easily identified that it raises the question of why GPT-5 would both come up with it and preserve it when it can so easily identify it; whereas if it was part of the OP's original inputs (whatever those were), it is much less surprising (because it is a common human error, mindlessly parroted in a lot of the 'scaling has hit a wall' journalism). | | |
| ▲ | Foreignborn 3 days ago | parent [-] | | Do you have a source? When I've done toy demos where GPT-5, Sonnet 4, and Gemini 2.5 Pro critique/vote on various docs (e.g. PRDs), they did not choose their own material more often than not. My setup wasn't intended as a benchmark, though, so this could be wrong over enough iterations. | | |
| ▲ | gwern 3 days ago | parent [-] | | I don't have a particularly canonical reference I'd cite here, but self-preference bias in LLMs is well-established. (Just search arXiv.) |
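The critique-and-vote demo described in this subthread can be sketched in a few lines: each model judges a pool of documents, some of which it authored, and we check whether any judge picks its own output more often than chance. The model names and vote tallies below are illustrative stand-ins, not real API calls or benchmark data.

```python
def self_preference_rate(votes, authors):
    """votes: {judge: [chosen_doc_id, ...]}; authors: {doc_id: model_name}.
    Returns the fraction of each judge's votes that went to its own documents."""
    rates = {}
    for judge, chosen in votes.items():
        if not chosen:
            continue
        own = sum(1 for doc in chosen if authors[doc] == judge)
        rates[judge] = own / len(chosen)
    return rates

# Hypothetical round: three judges each pick the best doc three times
authors = {"doc_a": "gpt5", "doc_b": "sonnet4", "doc_c": "gemini25pro"}
votes = {
    "gpt5":        ["doc_a", "doc_a", "doc_b"],
    "sonnet4":     ["doc_a", "doc_b", "doc_c"],
    "gemini25pro": ["doc_c", "doc_c", "doc_c"],
}
rates = self_preference_rate(votes, authors)
# With three docs per round, an unbiased judge picks its own ~1/3 of the
# time; rates well above that baseline would suggest self-preference bias.
```

As Foreignborn notes, a handful of rounds is far too few to distinguish bias from noise; a real measurement would need many iterations and a significance test against the 1/3 baseline.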
| |
| ▲ | alangou 3 days ago | parent | prev [-] | | My favorite tell-tale sign: > The gap isn’t just quantitative—it’s qualitative. > LLMs don’t have memory—they engage in elaborate methods to fake it... > This isn’t just database persistence—it’s building memory systems that evolve the way human memory does... > The future isn’t one model to rule them all—it’s hundreds or thousands of specialized models working together in orchestrated workflows... > The future of AGI is architectural, not algorithmic. |
|