krick 2 hours ago

No, seriously, people need to be punished for submitting LLM-generated garbage without specifying that it's LLM-generated garbage. 400+ points, oh my god, people, what's wrong with you...

tomhow 2 hours ago | parent | next [-]

We buried the post for seeming obviously-LLM-generated. But please email us about these (hn@ycombinator.com) rather than posting public accusations.

There are two reasons why emailing us is better:

First, the negative consequences of a false allegation outweigh the benefits of a valid accusation.

Second and more important: we'll likely see an email sooner than we'll see a comment, so we can nip it in the bud quickly, rather than leaving it sitting on the front page for hours.

akulkarni 2 hours ago | parent | next [-]

You buried a popular post because of the public accusation or just your "hunch"?

Why not let your audience decide what it wants to read?

I say this as a long time HN reader, who feels like the community has become grumpier over the years. Which I feel like is a shame. But maybe that's just me.

sangeeth96 an hour ago | parent | next [-]

I’d be grumpy over wasting my time on an HN post that’s LLM-generated and doesn’t state that it is. If I wanted that, I could be prompting any number of chat models available to me instead of meandering over here.

akulkarni 2 hours ago | parent | prev | next [-]

There are also 200+ comments on here and a good discussion IMO which is now unfortunately buried.

Feels like a net negative for the HN community.

tomhow an hour ago | parent | prev [-]

You're welcome to email us about this.

It's my job to read HN posts and comments all day, every day, and these days that means spending a lot of time evaluating whether a post seems LLM-generated. In this case the post seems LLM-generated or heavily LLM-edited.

We have been asking the community not to publicly call out posts for being LLM-generated, for the reasons I explained in the latest edit of the comment you replied to. But if we're going to ask the community that, we also need to ask submitters to not post obviously-LLM-influenced articles. We've been asking that ever since LLMs became commonplace.

> I say this as a long time HN reader, who feels like the community has become grumpier over the years. Which I feel like is a shame. But maybe that's just me.

We've recently added this line to the guidelines: Don't be curmudgeonly. Thoughtful criticism is fine, but please don't be rigidly or generically negative.

HN has become grumpier, and we don't like that. But a lot of it is in reaction to the HN audience being disappointed at a lot of what modern tech companies are serving up, both in terms of products and content, and it doesn't work for us to tell them they're wrong to feel that way. We can try, but we can't force anyone to feel differently. It's just as much up to product creators and content creators to keep working to raise the standards of what they offer the audience.

akulkarni an hour ago | parent [-]

Thanks Tom, I appreciate the openness. You are seemingly overriding the wishes of the community, but it's your community and you have the right to do so. I still think it's a shame, but that's my problem.

tomhow an hour ago | parent [-]

> You are seemingly overriding the wishes of the community

That's false. The overwhelming sentiment of the community is that HN should be free of LLM-generated content or content that has obvious AI fingerprints. Sometimes people don't immediately realise that an article or comment has a heavy LLM influence, but once they realise it does, they expect us to act (this is especially true if they didn't realize it initially, as they feel deceived). This is clear from the comments and emails we get about this topic.

If you can publish a new version of the post that is human-authored, we'd happily re-up it.

akulkarni an hour ago | parent [-]

I'm just sharing my thoughts as a long-time reader. Again, it's your show. You don't have to defend your actions. Thanks for all that you do.

gchamonlive an hour ago | parent | prev [-]

Not sure if I should email this question, but is there any chance those 400+ votes are artificially inflated?

tomhow 23 minutes ago | parent [-]

There’s no evidence of this, but a title that’s easy to agree with can often attract upvotes from people who don’t read the article.

Gagarin1917 2 hours ago | parent | prev | next [-]

They’re upvoting because they agree with the sentiment in the title.

That’s largely how these voting sites work.

ronsor 2 hours ago | parent [-]

Exposing the age-old truth of "commenters and voters don't read articles" I see

ronbenton 2 hours ago | parent | prev | next [-]

I just pasted the first paragraph in an "AI detector" app and it indeed came back as 100% AI. But I heard those things are unreliable. How did you determine this was LLM-generated? The same way?

delish 2 hours ago | parent | next [-]

Apart from the style of the prose (which is my subjective evaluation): this blog post is "a view from nowhere." Tiger Data is a company that sells Postgres in some way (I don't know exactly how, but it doesn't matter for the following). They could speak as themselves and compare themselves to companies that sell other open-source databases, or they could showcase benchmarks _they ran_.

Them saying "What you get: pgvectorscale uses the DiskANN algorithm (from Microsoft Research), achieving 28x lower p95 latency and 16x higher throughput than Pinecone at 99% recall" is marketing unless they explain how you'd replicate those numbers.

Point being: this could have been written by an LLM, because it doesn't represent any work done by Tiger Data.

tim-- 2 hours ago | parent | next [-]

For what it's worth, TigerData is the company that develops TimescaleDB, a very popular and performant time-series database provided as a Postgres extension. I'm surprised that the fact that TigerData is behind it is not mentioned anywhere in the blog post (though TimescaleDB is mentioned 14 times on the page).

akulkarni 2 hours ago | parent | prev [-]

I don't understand your example: pgvectorscale was built and is maintained by Tiger Data

willidiots 2 hours ago | parent | prev | next [-]

Just from using LLMs enough, I've developed a sense for the flavor of the writing. Surely it could be hidden with enough work, but most of the time it's pretty blatant.

ronbenton 2 hours ago | parent [-]

Sometimes I get an "uncanny valley" vibe when reading AI-generated text. It can be pretty unnerving.

chamomeal 2 hours ago | parent | prev | next [-]

It’s got that LLM flow to it. Also the liberal use of formatting. It’s like it cannot possibly emphasize enough; it tries to make every word hit as hard as possible. There’s no filler, nothing slightly tangential or off-topic to add color. Just many vapid points rapid-fire, as if they’re the hardest-hitting truth of the decade lol

dddgghhbbfblk 2 hours ago | parent | prev [-]

ChatGPT has a pretty obvious writing style at the moment. It's not a question of some nebulous "AI detector" gotcha; it's more just basic human pattern matching. The abundant bullet points, copious bold text, pithy one-line summarizing assertions ("In the AI era, simplicity isn’t just elegant. It’s essential."). There are many more tells in how it structures its writing (e.g. "Let’s address this head-on."). Hard to enumerate everything, frankly.

niobe an hour ago | parent | prev | next [-]

Was it leading with a bad analogy that gave it away?

chamomeal 2 hours ago | parent | prev | next [-]

I know everybody just wants to talk about Postgres but it’s still sad to see any sort of engagement with slop. Even though the actual article is essentially irrelevant lol

xyst 2 hours ago | parent | prev [-]

And it’s upvoted 400+ on HN. This place has truly lost its way.

rglynn 2 hours ago | parent [-]

I mean to be fair, "Just use Postgres" will get 400 votes here without people even clicking TFA.