jsheard 2 days ago

The community already seems to have established a policy that copy-pasting a block of LLM text into a comment will get you downvoted into oblivion immediately.

aspenmayer 2 days ago | parent | next [-]

That rubric only works until sufficiently advanced LLM-generated HN posts are indistinguishable from human-generated HN posts.

It also doesn’t speak to whether training LLMs on HN content is permitted, which was another of OP’s main points.

JavierFlores09 2 days ago | parent | next [-]

> That rubric only works until sufficiently advanced LLM-generated HN posts are indistinguishable from human-generated HN posts.

If a comment made by an LLM is indistinguishable from a normal one, it'd be impossible to moderate anyway, unless one starts tracking people across comments and checking the consistency of their replies and overall stance. So I don't think it's particularly useful to worry about people who will go to such lengths to stay undetected.

aspenmayer 2 days ago | parent | next [-]

> If a comment made by an LLM is indistinguishable from a normal one, it'd be impossible to moderate anyway, unless one starts tracking people across comments and checking the consistency of their replies and overall stance. So I don't think it's particularly useful to worry about people who will go to such lengths to stay undetected.

The existence of rule-breakers is not itself an argument against a rules-based order.

tredre3 2 days ago | parent | prev [-]

HN's guidelines aren't "laws" to be "enforced"; they're a list of unwelcome behaviors. There is value in setting expectations for participants in a community, even if some will choose to break them and get away with it.

bawolff 2 days ago | parent [-]

If comments by LLMs were actually as valuable and insightful as human comments, there would be no need for the rule. The rule is in place because they usually aren't.

Relevant xkcd: https://xkcd.com/810/

redox99 2 days ago | parent | prev | next [-]

It's pretty trivial to fine-tune an LLM to output posts that are indistinguishable from human-written ones.

majormajor 2 days ago | parent | prev [-]

That's assuming a certain outcome: indistinguishable posts.

Some would say LLM-generated posts will eventually be superior information-wise. In which case possibly the behavior will change naturally.

Or maybe they don't get there any time soon and stay in the uncanny valley for a long time.

I'm kinda fine with an "if you can't be bothered to even change the standard corporate-BS tone of your copypaste, you get downvoted" norm. For all I know, some people are more clever with their prompting and get something less crap-sounding, and then they just live or die on the coherence of the comment.

fenomas 2 days ago | parent | prev | next [-]

Sure, and I think the reason is that whatever else they are, LLM outputs are disposable. Posting them here is like posting outputs from Math.random() - anyone who wants such outputs can easily generate their own.

Der_Einzige 2 days ago | parent | prev [-]

Bold of you to assume that you will have any idea at all that an LLM generated a particular comment.

If I take a trick like those recommended by the authors of min_p (high temperature + min_p) [1], I do a great job of escaping the "slop" phrasing that normally makes LLM output detectable. Even more so if I use the anti-slop sampler [2].
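Roughly, the combination looks like this. A minimal sketch in plain PyTorch; the function name and the filter-before-temperature ordering are my own illustration, not code from the paper:

    import torch

    def min_p_sample(logits, temperature=3.0, min_p=0.1):
        # min_p truncation: keep only tokens whose probability is at
        # least min_p times the most likely token's probability.
        probs = torch.softmax(logits, dim=-1)
        keep = probs >= min_p * probs.max()
        # A high temperature then flattens what's left, adding variety
        # without re-admitting the junk tail the filter just removed.
        filtered = logits.masked_fill(~keep, float("-inf"))
        return torch.multinomial(torch.softmax(filtered / temperature, dim=-1), 1)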

LLMs are already more creative than humans are today, they're already better than humans at most kinds of writing, and they are coming to a comment section near you.

Good luck proving I didn't use an LLM to generate this comment. What if I did? I claim that I might as well have. Maybe I did? :)

[1] https://openreview.net/forum?id=FBkpCyujtS

[2] https://github.com/sam-paech/antislop-sampler, https://github.com/sam-paech/antislop-sampler/blob/main/slop...

bjourne 2 days ago | parent [-]

Fascinating that very minor variations on established sampling techniques still generate papers. :) Afaik, neither top-p nor top-k sampling has conclusively been proven superior to good old-fashioned temperature sampling. Certainly, recent sampling techniques can make the text "sound different", but not necessarily read better. I.e., you're replacing one kind of bot-generated "slop" with another.
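For reference, here's how little separates them. A rough sketch in plain PyTorch (my own illustration; the helper and its parameters are hypothetical):

    import torch

    def sample(logits, temperature=1.0, top_k=None, top_p=None):
        # Temperature sampling: just rescale the logits.
        probs = torch.softmax(logits / temperature, dim=-1)
        if top_k is not None:
            # top-k: zero out everything but the k most likely tokens.
            kth = torch.topk(probs, top_k).values[-1]
            probs = torch.where(probs >= kth, probs, torch.zeros_like(probs))
        if top_p is not None:
            # top-p (nucleus): keep the smallest set of tokens whose
            # cumulative probability reaches p; drop the rest.
            sorted_p, idx = torch.sort(probs, descending=True)
            cum = torch.cumsum(sorted_p, dim=-1)
            drop = cum - sorted_p >= top_p  # tokens entirely past the nucleus
            probs = probs.clone()
            probs[idx[drop]] = 0.0
        return torch.multinomial(probs / probs.sum(), 1)

Each of these is one masking step away from the others; min_p just swaps in a different mask.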