I am quite puzzled how an LLM could even start to "write" a scientific paper.

Say you start with a set of findings, for example, western blots, data from a transgenic mouse engineered for the relevant gene, and some single cell sequencing data. Your manuscript describes the identification of a novel protein, editing the gene in a mouse and showing what pathways are affected in the mouse.

What material would you give the LLM? How would the LLM "know" which of these novel findings were in any way meaningful? As far as I'm aware, it is unlikely that the LLM would be able to do anything other than paraphrase what you instruct it to write. It would be a return to the days before word processing became common, when researchers would either dictate their manuscripts to a typist or hand the typist a stack of hand-written pages.

The actually hard part of writing scientific papers is not putting the words "down on paper" so to speak, but deciding what to say.

▲

nijuashi 2 days ago | parent | next [-]

When we go to grad school, we’re taught how to write a research paper. Each field has a more or less standard format, where different types of data go in specific sections. So if an LLM is trained on enough papers in that field, it can learn to plug in the information you provide according to those conventions.

In that sense, you’d give the LLM the purpose of the paper, the field you’re writing in, and the relevant data from your lab notebook. Personally, I never enjoyed writing manuscripts — most of the time goes into citing every claim and formatting everything correctly, which often feels more like clerical work than communicating discovery.

I don’t mind if LLMs help write these papers. I don’t think learning to mimic this stylistic form necessarily adds to the process of discovery. Scientists should absolutely be rigorous and clear, but I’d welcome offloading the unnecessary tedium of stylized writing to automation.

▲

pcrh 2 days ago | parent [-]

I am experienced in writing scientific papers, so I know what it takes.

I remain to be convinced that the tasks you propose for an LLM would contribute any more to the process of writing a paper than dictating to a typist did in the 1950s. It's impressive for a machine, but not particularly productivity-boosting. Tedious tasks such as correctly formatting references belong to the copy-editing stage (i.e. the very last stage of writing a paper), where indeed I have seen journals adopt "AI" approaches. But these processes are not a bottleneck in the scientist's workflow.

I certainly don't think the LLMs that I'm familiar with would be of any use at all in compiling the original data into scientifically accurate figures and text, and providing meaningful interpretations. Most likely they would simply throw out random "hallucinations" in grammatically correct prose.

▲

knappa a day ago | parent [-]

But correctly formatting references is pretty much a solved task through reference managers, possibly plus BibTeX. It's a well-defined task, after all, and well suited to traditional software techniques. [1] If someone used an LLM to format the references, you would still have to go back through them to check.

If there is any use for LLMs in paper writing, I would think that it is for tedious but not well-defined tasks. For example, asking if an already written paper conforms to a journal's guidelines and style. I don't know about you, but I spend a meaningful amount of time [2] getting my papers into journal page limits. That involves rephrasing to trim overhangs, etc. "Rephrase the following paragraph to reduce the number of words by at least 2" is the kind of thing that LLMs really do seem to be able to do reliably.

1: As usual, the input data can be wrong, but that would be a problem for LLMs too.

2: I don't actually know how much time. It probably isn't all that long, but it's tedious and sure does feel like a long time while I'm doing it.

▲

pcrh 17 hours ago | parent [-]

Re-phrasing to fit within word or character limits is certainly something I would pay for! I have often spent more time doing this than writing the original draft, especially for grant applications...

▲

dboreham 2 days ago | parent | prev | next [-]

The LLM can make a plan or outline first, which is also writing.

▲

pcrh 2 days ago | parent [-]

Any researcher already has this in their head long before any writing takes place.

▲

polairscience 2 days ago | parent [-]

I must be a bad researcher then, because every paper I've written starts as a very vague "here are the overarching implications and important results". But the detailed order of results and the nuts and bolts of how to argue out the conclusions get decided in drafting. Only the simplest results I've had were essentially pre-written.

▲

pcrh 2 days ago | parent [-]

> "here are the overarching implications and important results".

That's the outline. I doubt an LLM would help much in deciding how best to present the finer details, as they will be very specific to your particular manuscript.

▲

dist-epoch 2 days ago | parent | prev [-]

> How would the LLM "know" which of these novel findings were in any way meaningful

Given that they are trained on all of arXiv, ..., it's much more likely that they are aware of all the relevant public papers than your average researcher is.

▲

pcrh 2 days ago | parent [-]

A researcher on any particular topic is not supposed to be an "average" researcher, but already deeply familiar with their subject.