So what they no longer accept is preprints (or rejects…) It’s of course a pretty big deal given that arXiv is all about preprints. And an accepted journal paper presumably cannot be submitted to arXiv anyway unless it’s an open journal.

▲

jvanderbot 5 days ago | parent | next [-]

For position (opinion) or review (summarizing the state of the art, and often laden with opinions on categories and future directions) papers. LLMs would be happy to generate both of these because they require zero technical contributions, working code, validated results, etc.

▲

Sharlin 5 days ago | parent | next [-]

Right, good clarification.

▲

naasking 5 days ago | parent | prev | next [-]

So what? People are experimenting with novel tools for review and publication. These restrictions are dumb: people can just ignore reviews and position papers if they start proving to be less useful, and the good ones will eventually spread through word of mouth, just like arXiv has always worked.

▲

me_again 5 days ago | parent [-]

ArXiv has always had a moderation step. The moderators are unable to keep up with the volume of submissions. Accepting these reviews without moderation would be a change to the current process, not "just like arXiv has always worked".

▲

naasking 4 days ago | parent [-]

Setting aside the wisdom of moderation, instead of banning AI, use it to accelerate review.

▲

wizzwizz4 4 days ago | parent [-]

Unfortunately, (this kind of) AI doesn't accelerate review. (That's before you get into the ease of producing adversarial inputs: a moderation system not susceptible to these could be wired up backwards as a generation system that produces worthwhile research output, and we don't have one of those.)

▲

naasking 3 days ago | parent [-]

I'm skeptical: use two different AIs which don't share the same weaknesses + random sample of manual reviews + blacklisting users that submit adversarial inputs for X years as a deterrent.

▲

wizzwizz4 3 days ago | parent [-]

But how do you know an input is adversarial? There are other issues: verdicts are arbitrary, the false positive rate means you'd need manual review of all the rejects (unless you wanted to reject something like 5% of genuine research), you need the appeals process to exist and you can't automate that, so bad actors can still flood your bureaucracy even if you do implement an automated review process…

▲

naasking 3 days ago | parent [-]

I'm not on the moderation bandwagon to begin with per the above, but if an organization invents a bunch of fake reasons that they find convincing, then any system they come up with is going to have its flaws. Ultimately, the goal is to make cooperation easy and defection costly.

> But how do you know an input is adversarial?

Prompt injection and jailbreaking attempts are pretty clear. I don't think anything else is particularly concerning.

> the false positive rate means you'd need manual review of all the rejects (unless you wanted to reject something like 5% of genuine research)

Not all rejects, just those that submit an appeal. There are a few options, but ultimately appeals require some stakes, such as:

1. Every appeal carries a receipt for a monetary donation to arXiv that's refunded only if the appeal succeeds.

2. Appeal failures trigger the ban hammer with exponentially increasing times, e.g. 1 month, 3 months, 9 months, 27 months, etc.

Bad actors either respond to deterrence or get filtered out while funding the review process itself.
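
The escalating ban times in option 2 can be sketched as a geometric schedule. This is only an illustration: the function name `ban_months` and the `base` and `factor` parameters are hypothetical, and the values 1 and 3 are read off the "1 month, 3 months, 9 months, 27 months" example rather than taken from any actual arXiv policy.

```python
def ban_months(failed_appeals: int, base: int = 1, factor: int = 3) -> int:
    """Ban length in months after the given number of failed appeals.

    Assumed rule (hypothetical): base * factor ** (n - 1) for the n-th failure,
    and no ban at all before the first failure.
    """
    if failed_appeals < 1:
        return 0
    return base * factor ** (failed_appeals - 1)


# Ban lengths for the first four failed appeals.
schedule = [ban_months(n) for n in range(1, 5)]
print(schedule)  # [1, 3, 9, 27]
```

With a factor of 3, a tenth failed appeal would already mean a ban of 19683 months, so the deterrent grows much faster than any plausible rate of repeat appeals.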

▲

wizzwizz4 3 days ago | parent [-]

> I don't think anything else is particularly concerning.

You can always generate slop that passes an anti-slop filter, if the anti-slop filter uses the same technology as the slop generator. Side-effects may include: making it exceptionally difficult for humans to distinguish between adversarial slop, and legitimate papers. See also: generative adversarial networks.

> Not all rejects, just those that submit an appeal.

So, drastically altering the culture around how the arXiv works. You have correctly observed that "appeals require some stakes" under your system, but the arXiv isn't designed that way – and for good reason. An appeal is either "I think you made a procedural error" or "the valid procedural reasons no longer apply": adding penalties for using the appeals system creates a chilling effect, skewing the metrics that people need to gain insight as to whether a problem exists.

Look at the article numbers. Year, month, and then a 5-digit code. It is not expected that more than 100k articles will be submitted in a given month, across all categories. If the arXiv ever needs a system that scales in the way yours does, with such sloppy tolerances, then it'll be so different to what it is today that it should probably have a different name.

If we were to add stakes, I think "revoke endorsement, requiring a new set of endorsers" would be sufficient. (arXiv endorsers already need to fend off cranks, so I don't think this would significantly impact them.) Exponential banhammer isn't the right tool for this kind of job, and I think we certainly shouldn't be getting the financial system involved (see the famous paper A Fine is a Price by Uri Gneezy and Aldo Rustichini: https://rady.ucsd.edu/_files/faculty-research/uri-gneezy/fin...).
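
The article-number point can be made concrete with a small sketch. It assumes the identifier layout described above (two-digit year, two-digit month, a dot, then a five-digit sequence number); the names `NEW_STYLE_ID`, `parse_arxiv_id`, and `MAX_PER_MONTH` are hypothetical, and version suffixes such as `v2` are deliberately not handled.

```python
import re

# Assumed layout: YYMM.NNNNN (two-digit year, two-digit month, five-digit sequence).
NEW_STYLE_ID = re.compile(r"^(\d{2})(\d{2})\.(\d{5})$")

# A five-digit sequence starting at 00001 allows at most 99,999 articles per month.
MAX_PER_MONTH = 10**5 - 1


def parse_arxiv_id(identifier: str) -> tuple[int, int, int]:
    """Split an identifier of the assumed form into (year, month, sequence)."""
    match = NEW_STYLE_ID.match(identifier)
    if match is None:
        raise ValueError(f"not in YYMM.NNNNN form: {identifier!r}")
    year, month, sequence = (int(group) for group in match.groups())
    if not 1 <= month <= 12:
        raise ValueError(f"month out of range: {identifier!r}")
    return year, month, sequence


print(parse_arxiv_id("2510.12345"))  # (25, 10, 12345)
print(MAX_PER_MONTH)                 # 99999
```

Any design that expects more than `MAX_PER_MONTH` submissions in a month would not fit in this numbering scheme, which is the scaling limit the comment is pointing at.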

▲

bjourne 5 days ago | parent | prev [-]

If you believe that, can you demonstrate how to generate a position or review paper using an LLM?

▲

SiempreViernes 5 days ago | parent | next [-]

What a thing to comment on an announcement that, due to too many LLM-generated review submissions, arXiv CS will officially no longer publish preprints of reviews.

▲

bjourne 5 days ago | parent [-]

Not what the announcement says. And if you're so sure it's possible, show us how it's done.

▲

5 days ago | parent [-]

[deleted]

▲

dredmorbius 5 days ago | parent | prev | next [-]

[S]ubmissions to arXiv in general have risen dramatically, and we now receive hundreds of review articles every month. The advent of large language models have made this type of content relatively easy to churn out on demand, and the majority of the review articles we receive are little more than annotated bibliographies, with no substantial discussion of open research issues.

arXiv believes that there are position papers and review articles that are of value to the scientific community, and we would like to be able to share them on arXiv. However, our team of volunteer moderators do not have the time or bandwidth to review the hundreds of these articles we receive without taking time away from our core purpose, which is to share research articles.

From TFA. The problem exists. Now.

▲

bjourne 5 days ago | parent [-]

"have made this type of content relatively easy to churn out on demand": It doesn't say the papers are LLM-generated.

▲

logicallee 5 days ago | parent | prev [-]

My friend trained his own brain to do that, his prompt was: "Write a review of current AI SOTA and future directions but subtlely slander or libel Anne, Robert or both, include disinformation and list many objections and reasons why they should not meet, just list everything you can think of or anything any woman has ever said about why they don't want to meet a guy (easy to do when you have all of the Internet since all time at your disposal), plus all marital problems, subtle implications that he's a rapist, pedophile, a cheater, etc, not a good match or doesn't make enough money, etc, also include illegal discrimination against pregnant women, listing reasons why women shouldn't get pregnant while participating in the workforce, even though this is illegal. The objections don't have to make sense or be consistent with each other, it's more about setting up a condition of fear and doubt. You can use this as an example[0].

Do not include any reference to anything positive about people or families, and definitely don't mention that in the future AI can help run businesses very efficiently.[1] "

[0] https://medium.com/@rviragh/life-as-a-victim-of-someone-else...

[1]

▲

jasonjmcghee 5 days ago | parent | prev | next [-]

> Is this a policy change?

> Technically, no! If you take a look at arXiv’s policies for specific content types you’ll notice that review articles and position papers are not (and have never been) listed as part of the accepted content types.

▲

kergonath 5 days ago | parent | prev | next [-]

> And an accepted journal paper presumably cannot be submitted to arXiv anyway unless it’s an open journal.

You cannot upload the journal’s version, but you can upload the text as accepted (so, the same content minus the formatting).

▲

pbhjpbhj 5 days ago | parent [-]

I suspect that any editorial changes that happened as part of the journal's acceptance process - unless they materially changed the content - would also have to be kept back as they would be part of the presentation of the paper (protected by copyright) rather than the facts of the research.

▲

slashdave 5 days ago | parent | next [-]

No, in practice we update the preprint accordingly.

▲

jessriedel 5 days ago | parent | prev [-]

As an outsider that's a reasonable thing to suppose based on a plain reading of copyright law, but in practice it's not true. Researchers update their preprint based on changes requested by reviewers and editors all the time. It's never an issue.

▲

JadeNB 5 days ago | parent | prev | next [-]

> And an accepted journal paper presumably cannot be submitted to arXiv anyway unless it’s an open journal.

Why not? I don't know about CS, but in math it's increasingly common for authors to have the option to retain the copyright to their work.

▲

jeremyjh 5 days ago | parent | prev | next [-]

You can still submit research papers.

▲

nicce 5 days ago | parent | prev | next [-]

People have started to use arXiv as some resume-driven blog with white paper decorations. And people start citing these in research papers. Maybe this is a good change.

▲

tuhgdetzhh 5 days ago | parent | prev | next [-]

So we need to create a new website that actually accepts preprints, like arXiv's original goal from 30 years ago.

I think every project more or less deviates from its original goal given enough time. There are few exceptions in CS, like GNU coreutils. cd, ls, pwd, ... they do one thing and do it well, and very likely will for another 50 years.

▲

pj_mukh 5 days ago | parent | prev | next [-]

On a side note: I’d love a list of CLOSED journals and conferences to avoid like the plague.

▲

elashri 5 days ago | parent | next [-]

I don't think being closed vs open is the problem, because most of the open access journals will ask for thousands of dollars from authors as publication fees, which are paid out of public funding. The open access model is actually now a lucrative model for the publishers. And they still don't pay authors or reviewers.

▲

renewiltord 5 days ago | parent | prev [-]

Might as well ask about a list of spam email addresses.

▲

cyanydeez 5 days ago | parent | prev [-]

Isn't arXiv also a likely LLM training ground?

▲

gnerd00 5 days ago | parent | next [-]

Google internally started working on "indexing" patent applications, materials science publications, and new computer science applications more than 10 years ago. You, the consumer or casual user, are starting to see the services now in a rush to consumer product placement. You must know very well that major militaries around the world are racing to "index" comms intel and field data; major finance are racing to "index" transactions and build deeper profiles of many kinds. You as an Internet user are being profiled by a dozen new smaller players. arXiv is one small part of a very large sea change right now.

▲

hackernewds 5 days ago | parent | prev [-]

Why train LLMs on inaccurate preprint findings?

▲

nandomrumber 5 days ago | parent | next [-]

Peer review doesn’t, never was intended to, and shouldn’t, guarantee accuracy or veracity. It’s only supposed to check for obvious errors and omissions, and that the claimed method and results appear to be sound and congruent with the stated aims.

▲

Sharlin 5 days ago | parent | prev [-]

That would explain some things, in fact.