logicprog 9 hours ago

I mean, you can literally clone my repo, run the Python that rebuilds the database and does the whole data analysis end to end from scratch, and verify that the numbers are accurate. I made the code for this analysis public for that exact reason. This wasn't just an LLM running unsupervised in a loop. I came up with the methodologies, metrics, and data scraping strategies myself, and iterated on them to try to be as honest as possible about what the data could show.

sanitycheck 9 hours ago | parent | next [-]

I think the point people are making is that when the text has an "AI smell" (it does), we immediately lose trust in the veracity of any claim being made and feel like continuing to read what is possibly a hallucinated fiction is a complete waste of time.

At this point we're all used to skimming through thousands of AI-generated sentences every working day and constantly thinking "this is likely to be 20% bullshit". It's hard to turn that off even if I try.

logicprog 9 hours ago | parent [-]

Do you think it would help if I went through and manually rewrote all of the prose? If it would get people to listen, I'd be totally willing to do it. It's not like I don't like writing. I just was focused on something else when I was making this, namely trying to find a good methodology that isn't insane for this low amount of data.

JasonSage 8 hours ago | parent | next [-]

When there's no discernable human filter on the text output, reading the text suggests it's what the LLM produced and not what a human considered.

This is low-quality--every single day I witness Codex and Claude misunderstand, mislead, and hallucinate responses based on "assumptions" and I have to fact-check them.

If I wanted a statistical analysis and to be the human in the loop, I would ask the LLM myself, and I would definitely NOT read an article that just dumps the LLM output as-is.

bradrn 9 hours ago | parent | prev | next [-]

Yes, that would help considerably.

(Also, I suggest clearly acknowledging where AI was/wasn’t used. I like CuriosityC’s suggestion: https://news.ycombinator.com/item?id=48411968)

logicprog 8 hours ago | parent [-]

Alright, I'll do that. Although, sadly, I already posted it here, so I won't be able to post it again — I'll be stuck with this trash comments section that doesn't deal with any of the actual claims, just the aesthetics.

sanitycheck 8 hours ago | parent | prev [-]

I'm pretty sure more people would read it to the end if it didn't seem like AI output, yes. At the very least you would have fewer (maybe not 0!) comments here saying it's AI slop.

BigTTYGothGF 8 hours ago | parent | prev [-]

> I mean, you can literally

You didn't care enough to make a good writeup, why should we believe that you cared enough to make a good analysis?

skeledrew 8 hours ago | parent [-]

You don't have to believe. The repository is there for anyone to attempt reproducing the results. Criticisms without proof when there's a pretty straightforward way toward that proof are pointless. Go run the experiment and rip that apart if it doesn't hold up. And until then, refrain from criticizing.