dekhn 6 hours ago

Maybe nature and cell and a few other journals should be exceptions: they should be the place that the most advanced scientists publish interesting ideas early for the consumption by their competitors. At that level of science, all the competitors can reproduce each other's experiments if necessary; the real value is expanding the knowledge of what seems possible quickly.

(I am not seriously proposing this, but it's interesting to think about distinguishing between the very small amount of truly innovative discovery versus the very long tail of more routine methods development and filling out gaps in knowledge)

Bratmon 6 hours ago | parent [-]

> At that level of science, all the competitors can reproduce each other's experiments if necessary

But they don't, and that's the problem!

godelski 3 hours ago | parent | next [-]

The problem is bigger. It even blocks research!

In my own experience, I was unable to publish a few works because I couldn't outperform a "competitor" (technically we're all on the same side, right?). So I dug deeper and deeper into their work and really tried to replicate it. I couldn't! Emailing the authors got me no further, only more questions. I submitted the papers anyway, adding a section about the replication efforts. You guessed it: rejected, with explicit comments from reviewers about lack of impact given the "competitor's" results.

It's an experience I've found a lot of colleagues share. And I don't understand it. Every failed replication should teach us something new, something about the bounds of where a method works.

It's odd. In our striving for novelty we sure do turn down a lot of novel results. In our striving to reduce redundancy we sure do create a lot of redundancy.

jltsiren 43 minutes ago | parent [-]

I've seen this from both sides.

Sometimes the result is wrong, or it's not as big or as general as claimed. Or maybe the provided instructions are insufficient to replicate the work. But sometimes the attempt to replicate a result fails, because the person doing it does not understand the topic well enough.

Maybe they are just doing the wrong things, because their general understanding of the situation is incorrect. Maybe they fail to follow the instructions correctly, because they have subtle misunderstandings. Or maybe they are trying to replicate the result with data they consider similar, but which is actually different in an important way.

The last one is often a particularly difficult situation to resolve. If you understand the topic well enough, you may be able to figure out how the data is different and what should be changed to replicate the result. But that requires access to the data. Very often, one side has the data and another side the understanding, but neither side has both.

Then there is the question of time. Very often, the person trying to replicate the result has a deadline. If they haven't succeeded by then, they will abandon the attempt and move on. But the deadline may be so tight that the authors can't be reasonably expected to figure out the situation by then. Maybe if there is a simple answer, the authors can be expected to provide it. But if the issue looks complex, it may take months before they have sufficient time to investigate it. Or if the initial request is badly worded or shows a lack of understanding, it may not be worth dealing with. (Consider all the bad bug reports and support requests you have seen.)

dekhn 6 hours ago | parent | prev [-]

Advanced groups usually replicate their competitors' results in their own hands shortly after publication (or they simply trust their competitors' competence). But they don't spend any time publishing the replication unless it fails and they can explain why. From their perspective, it's a waste of time. I think this has been shown to be a naive approach (given the high rate of image fraud in molecular biology), but people at the top of the field have strong incentives to focus on moving the state of the art forward without expending energy on improving the field as a whole.

MarkusQ 4 hours ago | parent | next [-]

"strong incentives to focus on moving the state of the art forward without expending energy on improving the field as a whole"

That sort of Orwellian doublethink is exactly the problem. They need to move it forward without improving it, contribute without adding anything, challenge accepted dogma without rocking the boat, and...blech!

godelski 3 hours ago | parent [-]

  > challenge accepted dogma without rocking the boat
I think the funniest part is how we have all these heroes of science who faced scrutiny from their peers but triumphed in the end. They struggled because they challenged the status quo. We celebrate their anti-authoritarian nature. We congratulate them for their pursuit of truth! And then we get mad when it happens. We pretend this is a thing of the past, but it's as common as ever[0,1].

You must create paradigm shifts without challenging the current paradigm!

[0] https://www.scientificamerican.com/article/katalin-karikos-n...

[1] https://www.globalperformanceinsights.com/post/how-a-rejecte...

Bratmon 5 hours ago | parent | prev | next [-]

All that makes it more important for top journals to reward replication, not less!

jltsiren 4 hours ago | parent [-]

Top journals are not inherently prestigious. They are prestigious because they try to publish only the most interesting and most significant results. If they started publishing successful replication studies, they would lose prestige, and more interesting journals would eventually rise to the top. (Replication studies that fail to replicate a major result in a spectacular way are another matter.)

godelski 3 hours ago | parent | prev [-]

Are you explaining this from experience or from speculation?

I can tell you that it doesn't match my own experience. I also think it doesn't match your example. Those cases of verified image fraud are typically part of replication efforts. The reason the fraud is able to persist is due to the lack of replication, not the abundance of it.

dekhn 3 hours ago | parent [-]

Mostly experience (based on being a PhD scientist, a postdoc, a National Lab scientist, and engineer at several bigtech companies), partly speculation (none of the groups/labs I worked in operated at "the highest level", but I worked adjacent to many of those).

I'm pretty sure most image fraud went completely undetected even in the case of replication failure. It looks like (pre AI) it was mostly a few folks who did it as a hobby, unrelated to their regular jobs/replication work.

godelski 3 hours ago | parent [-]

In most of the labs I've worked in, replication is not a common task[0].

  > I'm pretty sure most image fraud went completely undetected even in the case of replication failure
Part of my point is that being unable to publish replication efforts means we don't reduce ambiguity in the original experiments. I was taught that I should write a paper well enough that a PhD student (rather than a candidate) could reproduce the work. IME replication failures are often explained with "well, I must be doing something wrong." A reasonable conclusion, but even if true, it implies the original explanation was insufficiently clear.

  > It looks like (pre AI) it was mostly a few folks who did it as a hobby
I'm sorry, didn't you say

  >>> Advanced groups usually replicate their competitor's results in their own hands shortly after publication 
Because your current statement seems to completely contradict your previous one.

Or are you suggesting that the groups you didn't work with (and are thus speculating about) are the ones who replicate works, while the ones you did work with "just trust their competitor's competence"? Because if that's what you're saying, then I don't think this "mostly" matches your experience; rather, your experience more closely matches my own.

[0] I should take that back. I started in physics (undergrad) and moved to CS for grad school. Replication was often de facto in physics, as it was a necessary step toward progress: you often couldn't improve an idea without understanding/replicating it (both theoretical and experimental). But my experience in CS, including at national labs, was that people didn't even run the code. Even when code was provided as part of artifact review, I found that my fellow reviewers often didn't even look at it, let alone run it... This was common at tier 1 conferences, mind you... I only knew one other person who consistently ran code.

dekhn 2 hours ago | parent [-]

Note that my field is biophysics (quantitative biology) while yours is physics and CS. Those are done completely differently from biology; with the exception of some truly enormous/complex/delicate experiments that require unique hardware, physics tends to be much more reproducible than biology, and CS doubly so.

Replication of an experiment and finding image fraud are really two different things. If somebody publishes a paper with image fraud, it's still entirely possible to replicate their results(!), and if somebody publishes a paper without any image fraud, it's still entirely possible that others could fail to replicate it. Also, most image errors in papers are, imho, due to sloppy handling/individual error rather than intentional fraud (it's one of the reasons I worked so hard on automating my papers: if I did make an error, there should be an audit log demonstrating the problem, and the error should be rectified easily and quickly, the same way we fix bugs in production at big tech).

This came up a bunch when I was at LBL because of work done there by Mina Bissell on the extracellular matrix. She is actively rewriting the paradigm, but many people can't reproduce her results: complex molecular biology is notoriously fickle. Usually the answer is, "if you're a good researcher and can't reproduce my work, you come to my lab and reproduce it there," because the variables that affect this are usually things in the lab: the temperature, the reagents, the handling.

See https://www.nature.com/articles/503333a (written by Dr. Bissell).

godelski 35 minutes ago | parent [-]

  > physics tends to be much more reproducible than biology, and CS doubly-so.
With physics I think there is a better culture of reproduction, and that is, I believe, mostly cultural: it's acceptable to "be slow." There's a strong emphasis on being methodical and extremely precise. Prestige is built on making your work bulletproof, so you're genuinely encouraged to help others reproduce your work, since it strengthens it. You're also encouraged to analyze in detail and to faithfully reproduce, because finding cracks also yields prestige. I don't know if it's the money, but no one is in it for the money. Physics sure is a lot harder than anything else I've done and it pays like shit.

For CS the problem is wildly different. It should be easy to reproduce, as code is trivial to copy. Ignoring the issue of not publishing code alongside results, there are also often subtle things that can make or break a work. I've found many times in replication efforts that success can hinge on a single line that essentially comes from a work that was the reference to a reference of the work I'm trying to reproduce. The problem here is honestly more one of laziness. In contrast to physics, there's an extreme need for speed. In physics (like everyone else I knew) I often felt like I was not smart enough, and that encouraged people to dive deeper and keep improving, or to give up. In CS (like everyone else I knew) I often felt like I was not fast enough, and that encouraged people to chase sponsorships from labs that provided more compute, to take a "shotgun" approach (try everything), or to give up (aka "GPU poor").

The reason I'm saying this is because I think it is important to understand the different cultures and how replication efforts differ. In physics a replication failure was often assumed to be due to a lack of intelligence. In CS a replication effort is seen as a waste of time. Both are failures of the scientific process. Science is intended to be self-correcting. Replication is one means of this, but at its heart is the pursuit of counterfactual models. This gives us ways to validate, or invalidate, models through means other than direct replication. You can pursue the consequences of the results if you are unable to pursue the replication itself. This is almost always a good path to follow as it is the same one that leads to the extension and improvement of understanding.

There's a lot I agree and disagree with from Dr Bissell's article. Our perspectives may differ due to our different fields, but I do think it also serves as some a point of collaboration, if not on the subject of meta-science. Biology is not unique in having expensive experiments. I want to point out two famous and large physics projects: the LHC's discovery of the Higgs Boson[0] and LIGO's Observation of a Gravitational Wave[1]. The former has 9 full pages of authors (IIRC over 200) while the latter has about 3. These works are both too expensive to replicate while also demonstrating replication. Certainly we aren't going to take another 2 decades to build another CERN and replicate the experiments. But there's an easy to miss question that might also make apparent the existence of replication: who is qualified to review the paper and is not already an author of it? There's definitely some, but it really isn't that many. In these mega projects (and there are plenty more examples) the replication is done through collaboration. Independent teams examine the instruments that make the measurements. Independent teams make measurements, using the same device or different devices (ATLAS isn't the only detector at CERN), different teams independently analyze and process the information, and different teams model and simulate them. With LIGO this is also true. It would be impossible to locate those black holes without at least 2 facilities: one in Hanford (Washington) and the other in Livingston (Louisiana) (and now there's even more facilities). Astrophysics has a long history of this type of replication/collaboration as one team will announce an observation and it is a request for other observations. Observations that often were already made! In HEP (high energy particle physics) this may be less direct, but you'll notice other particle physics labs are in the author list of[0]. 
That's because despite the exact experiment not being replicatable in other facilities, there are still other experiments done. In the effort to find the Higgs there were many collisions performed at Fermi Lab.

I don't think this is the same in biophysics, but I think there are nuggets that may be fruitful. Bissell mentions at the end of her argument that she believes replication might have a higher success rate were labs to send scientists to the original labs. I fully agree! That would follow the practice we see in these mega experiments in physics. But I also think she's brushing off an important factor: it is far quicker and cheaper to replicate works than to produce them. You're a scientist; you know how the vast majority of time (and usually the vast majority of money) is "wasted" in failures (it'd be naive to call it waste). Much of this goes away with replication efforts. The greater the collaboration, the greater the reduction in time and money.

And I do agree with Bissell that we probably shouldn't replicate everything[2], at least if we want to optimize our progress. But I also want to stress that there is no perfect system, and there are many roadblocks to progress. Frankly, I'd argue that we waste far more time on things like grant writing and publication revisions. I don't know a single scientist who hasn't had a work rejected because reviewers either didn't give it enough care or were simply unqualified (often working in a different niche, so they don't understand the minutiae of the problem). As for grant writing, I think it's a necessary evil, but I'm also a firm believer in what Mervin Kelly (former director of Bell Labs) said when asked how you manage a bunch of geniuses: "you don't"[3]. You're a scientist, an expert in your domain. You already know which directions to look in. You've only gotten this far because you've been honing that skill. We don't have infinite money, so of course we have to have some bar, but we can already sniff out promising directions, and we're much better at sniffing out fraud. Science has been designed to be self-correcting.

[More of a side note]

  > Usually the answer is, "if you're a good researcher and can't reproduce my work, you come to my lab and reproduce it there" because the variables that affect this are usually things in the lab- the temperature, the reagents, the handling.
And we should not underestimate the importance of these variables. Failures based on them are still informative: they still tell us about the underlying causal structure that leads to success. If these variables were not specified in the paper, then a replication failure exposes a flaw in the writing. Alternatively, a failure can bound these variables by making them more explicit. I'm no expert in biophysics, but I'm fairly certain that understanding the bounds of the solution space is important for understanding how the processes actually work.

[0] https://arxiv.org/abs/1207.7214

[1] https://arxiv.org/abs/1602.03837

[2] I would also be very cautious about paid replication efforts. I am strongly against them, as well as against paywalls (both fees to publish and fees to access).

[3] https://1517.substack.com/p/why-bell-labs-worked