Essay

Why Most Published Research Findings Are False

  • John P. A. Ioannidis
  • Published: August 30, 2005
  • DOI: 10.1371/journal.pmed.0020124

Reader Comments (31)

Why most published research findings are true but so many are useless

Posted by plosmedicine on 30 Mar 2009 at 23:51 GMT

Author: Yonatan Loewenstein
Position: No occupation was given
Institution: Howard Hughes Medical Institute and Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge MA 02139
E-mail: yonatanl@mit.edu
Submitted Date: March 01, 2006
Published Date: March 7, 2006
This comment was originally posted as a “Reader Response” on the publication date indicated above. All Reader Responses are now available as comments.

John P. A. Ioannidis makes an interesting claim that most research findings are false [1]. Much of his argument is based on the assumption that the pre-study probability that a single hypothesis is correct is small, and that the failure to incorporate this prior information in the estimation of the posterior probability that the hypothesis is correct leads to erroneous conclusions in the vast majority of studies. I argue that the prior probabilities for many of the hypotheses tested are large, and that therefore only a minority of the published findings are false. However, in many of these cases the published hypotheses are of little scientific or clinical interest.

Consider a hypothetical epidemiological study of an exploratory nature that attempts to find correlations between external factors and a specific medical condition. A positive, publishable, result in such a study is the finding of correlation between a specific external factor and a change in the medical condition: for example an improvement, suggesting a causal relation between the two. To support this hypothesis, a statistical test is presented to show that the likelihood that the external factor is uncorrelated with the improvement in the medical condition is low. What is the a priori probability that the hypothesis is correct? Biological systems, such as biochemical or genetic networks, are characterized by a high degree of interconnectivity. Thus, perturbations in one node of the network tend to propagate throughout the entire network.

Since the subject is a biological system, any external factor is almost sure to have some effect on any medical condition. It is the direction of the effect, whether it improves the condition or worsens it, and its magnitude that are the unknown variables. In the absence of any mechanistic model, the parsimonious assumption is that both of the alternatives, improvement and worsening, are equally likely and therefore the prior of the hypothesis that the external factor is correlated with an improvement of the condition is 0.5.

Thus, when any supporting experimental evidence is incorporated with the prior, the posterior probability that the hypothesis is true is larger than 0.5. Therefore, most published research hypotheses are correct, even if the stated p-value in these studies is overconfident as a result of the biases described by Ioannidis.
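To make this concrete, here is a rough sketch in Python of the Bayes calculation, assuming a simple two-outcome model in which the test has significance level alpha and power (1 - beta); the specific values are illustrative only, not taken from any study:

    # Posterior probability that "the factor improves the condition"
    # after a significant result, starting from a prior of 0.5.
    def posterior_given_significant(prior, alpha, power):
        true_positive = prior * power          # hypothesis true and detected
        false_positive = (1 - prior) * alpha   # hypothesis false but "detected"
        return true_positive / (true_positive + false_positive)

    for power in (0.2, 0.5, 0.8):
        p = posterior_given_significant(prior=0.5, alpha=0.05, power=power)
        print(f"power={power:.1f} -> posterior={p:.2f}")
    # With a prior of 0.5, the posterior exceeds 0.5 whenever power > alpha.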

What can be learned from a study that presents correlations between the external factor and the change in the medical condition with a high level of confidence p? Not much. The level of confidence depends as much on the strength of the true correlations as it does on the size of the dataset. The null hypothesis of no correlation can be ruled out with the same level of confidence if the correlations are strong and the dataset small, or if the correlations are weak and the dataset large. What is of clinical or scientific importance is the former type of correlation, that is, finding factors that change the medical condition by 'a lot'. The corresponding statistical question is: how likely is it that the external factor changes the medical condition by at least x%? This question is too seldom explicitly addressed in the medical and biological scientific literature.
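One way to address that question directly is sketched below, assuming (purely for illustration) that the estimated effect is approximately normal with a known standard error; the numbers are made up:

    # Estimate P(true effect >= x%) rather than merely testing "effect != 0".
    from scipy.stats import norm

    def prob_effect_at_least(estimate, std_error, threshold):
        # Probability that the true effect exceeds `threshold`, under a
        # normal approximation centred on the observed estimate.
        return norm.sf(threshold, loc=estimate, scale=std_error)

    # E.g. a 3% estimated improvement with a 2% standard error:
    print(prob_effect_at_least(estimate=3.0, std_error=2.0, threshold=2.0))  # ~0.69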

References

1. Ioannidis JPA (2005) Why most published research findings are false. PLoS Med 2(8): e124. doi:10.1371/journal.pmed.0020124

No competing interests declared.

RE: Why most published research findings are true but so many are useless

Clif_Carl replied to plosmedicine on 17 Jan 2010 at 08:14 GMT

A priori, before flipping a fair coin, we already know that one side is heads and one is tails, and therefore the probability of either result on a fair flip is .5. One could express the effect of external factors (coin manipulations, etc.) in the manner described.

But life is not a nickel or a quarter. In many situations it is not known a priori whether the situation, the "coin", is fair or how much it might be weighted. This is why people gather knowledge, apply it to their experience and make judgements. Whether a situation is weighted or not, or by how much, cannot be assumed. The a priori probabilities of outcomes in an unfamiliar situation are unknown. Always.

A priori, one cannot assume, in the absence of knowledge or experience, that the probabilities of improvement or worsening from some particular external factor are equal (.5). Instead, they are unknown.

No competing interests declared.

RE: RE: Why most published research findings are true but so many are useless

AngelaKennedy replied to Clif_Carl on 09 Mar 2011 at 15:57 GMT

Thank you Clif Carl for this sage caveat. It's obvious but often seems to be forgotten.

Another issue around correlation of 'external factors' and 'medical conditions' is the thorny problem of direction of causation, which I believe plosmedicine left out of his analysis. This is a key problem in all correlations found, especially in medical research. The tendency to incorrectly affirm the consequent appears very high, especially in research claiming psychogenic aetiology for somatic illness.

No competing interests declared.

R is the problem

oleary_t replied to plosmedicine on 04 Jul 2010 at 14:44 GMT


The claim that 'so many [findings] are useless' has the sympathy of my cynical side, but it is a hasty conclusion. Merely finding a correlation, by itself, is of little use when attempting to quantitatively relate the efficacy of a drug to dosage (to choose the example given at the end). However, it is a necessary first step towards a more detailed study of the system: i.e. 'essential' rather than 'useless'.

I would argue that the most cogent point this response makes is in relation to the true value of R in Ioannidis's analysis. The claim that we conduct research by blindly choosing variables and then testing correlations is preposterous, for, as the reply points out, biological systems are highly interconnected. Moreover, our knowledge of the correlations is cumulative and non-trivial to begin with, so any quantity relating to prior belief in a correlation should reflect this. The former fact means that a low value of R is an absurd assumption even in fields such as epidemiology; the latter suggests that the original article would be more useful if it modelled the effect of accumulating evidence.
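For reference, the quantity at stake is the positive predictive value from Ioannidis's essay, PPV = (1 - beta)R / (R - beta*R + alpha), where R is the pre-study odds; a quick sketch (with illustrative alpha and beta) shows how strongly the conclusion depends on R:

    # Ioannidis's PPV as a function of the pre-study odds R
    # (alpha and beta chosen only for illustration).
    def ppv(R, alpha=0.05, beta=0.2):
        return (1 - beta) * R / (R - beta * R + alpha)

    for R in (0.01, 0.1, 1.0, 10.0):
        print(f"R={R:>5} -> PPV={ppv(R):.2f}")
    # R=0.01 gives PPV ~ 0.14, while R=1 (even odds) gives PPV ~ 0.94,
    # so the assumed value of R drives the whole conclusion.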

No competing interests declared.

RE: R is the problem

rasraster replied to oleary_t on 18 Feb 2013 at 17:07 GMT

oleary_t, I think your point has some validity, but there are some big issues with it.

For one thing, there is a serious problem in the typical methodology used to interpret results and/or reject null hypotheses. I refer you to two excellent summaries of the issue, Cohen's "The Earth Is Round (p < .05)" and Hubbard and Bayarri's "P Values Are Not Error Probabilities." Both are easily available online. plosmedicine makes a comment that unfortunately shows the same misunderstanding of statistical methodology that most researchers also exhibit: "To support this hypothesis, a statistical test is presented to show that the likelihood that the external factor is uncorrelated with the improvement in the medical condition is low."

Second, there is something of a "house of cards" nature to conclusions drawn in a field. Attempts to replicate research results are no longer common. Researchers often take previous conclusions at face value and use them to formulate their own theories, when in fact the evidence of the previous conclusions may be very shaky.

Third, approaches that are close to being scattershot may not be as uncommon as you think. Biau et al., in "P Value and the Theory of Hypothesis Testing," note that by 1981 there were 246 factors reported as potentially predictive of cardiovascular disease, including slow beard growth, fingerprint patterns, and others. It has taken more than 25 years and a lot of effort to weed out the noise and get to the 9 or so factors that have finally been established as clinically relevant to risk.

This leads to my fourth point, which is consistent with plosmedicine's final paragraph (though I don't agree with his/her analysis in general): effect size is a very important factor to include in the design of a study. In other words, by choosing a large enough sample size you can show that noise (random variation that can look like a signal if you focus on it too finely) is statistically significant, but such a result is not really important when viewed in the proper perspective of treatment effects that consistently make a difference greater than random variation and experimental error. Statistical power analysis is needed to get a handle on this, and it is not usually done.
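As a minimal sketch of what such a check can look like, here is a simulation-based power estimate for a two-sample t-test at a chosen minimally important effect size; the effect and sample sizes are assumptions for illustration, not recommendations:

    import numpy as np
    from scipy.stats import ttest_ind

    rng = np.random.default_rng(0)

    def simulated_power(effect_size, n_per_group, alpha=0.05, n_sim=5000):
        # Fraction of simulated studies that reach p < alpha when the
        # true standardized effect equals `effect_size`.
        hits = 0
        for _ in range(n_sim):
            control = rng.normal(0.0, 1.0, n_per_group)
            treated = rng.normal(effect_size, 1.0, n_per_group)
            if ttest_ind(control, treated).pvalue < alpha:
                hits += 1
        return hits / n_sim

    print(simulated_power(effect_size=0.2, n_per_group=50))   # roughly 0.17
    print(simulated_power(effect_size=0.2, n_per_group=400))  # roughly 0.8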

Finally - and I don't mean to pick on plosmedicine, but the above comment illustrates my point well - too often researchers forget that correlation is not causation, to wit: "A positive, publishable, result in such a study is the finding of correlation between a specific external factor and a change in the medical condition: for example an improvement, suggesting a causal relation between the two." A positive result of correlation should be only HALF of a published full-length study. The other half should be a high-quality analysis of the factors involved in interpreting whether the correlation might represent causation, and how, or might represent a parallel effect of a deeper cause instead. Without this, the knowledge on which to form further hypotheses is incomplete.

And we get a bonus point in this quote too: the bias that arises because a lack of support for a finding of correlation is not usually published/publishable. So people don't get to see the 15 studies that find no correlation, only the 1 or 2 studies that do.
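A toy illustration of that filter, with made-up numbers: if a true null effect is tested by many independent groups and only "significant" results reach print, the published record looks uniformly positive.

    # Publication filter applied to a factor with no real effect.
    import numpy as np
    from scipy.stats import ttest_ind

    rng = np.random.default_rng(1)
    published = 0
    for _ in range(20):  # 20 independent studies of a null effect
        a = rng.normal(0.0, 1.0, 40)
        b = rng.normal(0.0, 1.0, 40)
        if ttest_ind(a, b).pvalue < 0.05:
            published += 1  # only "significant" studies get written up
    print(f"{published} of 20 null studies would be reported as positive")
    # On average about 1 in 20 clears the 0.05 bar; readers see that one,
    # not the other 19.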

No competing interests declared.

RE: Why most published research findings are true but so many are useless

optimalpolicies replied to plosmedicine on 17 Apr 2011 at 23:30 GMT

It is an excellent point. However, in practice, a priori, most researchers predict a minimum level of improvement, enough to be of some biological significance. For example: taking fish oils improves IQ by at least 2% (otherwise the study would not be conducted). Under these conditions, the events 'improvement' vs. 'no improvement or worsening' are not equally likely. Although researchers test the probability of the improvement being greater than zero, they will not publish unless the improvement is clinically valuable (of some substance). Under those conditions, the a priori probability was very small, and likely remains very small after the study. We can rephrase by saying that most published research findings are false when we consider the substance or magnitude of the improvement.

No competing interests declared.