On reproducibility

This is my first blog post on the Garvan website. I hope this blog will serve as a bridge linking my published work with my personal thoughts on various issues underlying the work. Through this blog, you can also read my authentic (and free) opinions, expressed in a way that is not normally possible in the "dull" scientific media.

In this post, I would like to say a few words about the issue of reproducibility (or lack thereof). I have recently become interested in scientific reproducibility, because it is a bedrock principle of science. As a journal editor and expert reviewer, I have seen many instances of p-value hacking, abuse of statistical analyses, publication bias, conclusion bias, and coding errors, all of which contribute to the lack of reproducibility of clinical research.

Irreproducibility is of course not a new problem. In 1988, Professor Alvan Feinstein documented 56 findings (concerning, among others, dietary fat, vitamin D, vitamin E, folic acid, and hormone replacement therapy) from case-control and cohort studies that were not reproduced by randomised controlled trials. Since then, researchers have documented many instances of irreproducibility in almost all disciplines of scientific research.

In recent years, Professor John Ioannidis, through his masterly work, has made "reproducibility" one of the buzziest words in science. There is a lot of empirical evidence suggesting that a large proportion of published findings are probably false, in the sense that they are wrong and/or irreproducible.

But what is reproducibility? Well, an article just published in Science Translational Medicine (1) answers that question. The authors, Steve Goodman, Daniele Fanelli and John Ioannidis, propose that the concept of reproducibility should be understood in terms of three aspects: methods reproducibility, results reproducibility, and inferential reproducibility. They think that inferential reproducibility is the most important one.

The authors define methods reproducibility as "the ability to implement, as exactly as possible, the experimental and computational procedures, with the same data and tools, to obtain the same results."

Results reproducibility is "what was previously described as 'replication,' that is, the production of corroborating results in a new study, having followed the same experimental methods."

And inferential reproducibility "is the making of knowledge claims of similar strength from a study replication or reanalysis. This is not identical to results reproducibility, because not all investigators will draw the same conclusions from the same results, or they might make different analytic choices that lead to different inferences from the same data."

What is the magnitude of the problem? Well, a survey just published in Nature (2) suggests that "More than 70% of researchers have tried and failed to reproduce another scientist's experiments." The survey also asked participants (more than 1500 scientists) what factors contribute to irreproducible research, and the results point to the following leading factors: (i) selective reporting; (ii) publication pressure; and (iii) low study power or poor analyses. It is interesting to note that when participants were asked "what factors could boost reproducibility", the most common answer was "better understanding of statistics" (2).


(1) http://stm.sciencemag.org/lookup/doi/10.1126/scitranslmed.aaf5027

(2) http://www.nature.com/news/1-500-scientists-lift-the-lid-on-reproducibility-1.19970