This article originally appeared in Meta Science News/Brain and Behavior November 2015
"Inquiry is fatal to certainty." - Will Durant, historian and philosopher
The latest news in psychology is the uncertainty of it all. And that may be making researchers a little anxious. Reproducibility, a cornerstone of the scientific method, is under scrutiny, and it's not holding up, according to "Estimating the Reproducibility of Psychological Science." The article, recently published in Science, found that only 36% of replicated psychology studies reproduced the original results.
The short-term goal of the study was to establish an initial estimate of the reproducibility of psychological science. Long term, the study will provide an open data set for further analysis.
To gather evidence, the researchers designed a collaborative, large-scale study that ultimately replicated 100 correlational and experimental studies. The studies originally appeared in 2008 in three leading psychology journals: Psychological Science, Journal of Personality and Social Psychology, and Journal of Experimental Psychology: Learning, Memory, and Cognition. Replication teams with relevant expertise were recruited, and the original materials were used whenever possible.
The replication teams evaluated five factors from each original study.
For the replication studies, they assessed two additional factors: the challenge of conducting the replication and a self-assessment of the replication's quality. In addition to these quantitative assessments, the study asked the replication teams to provide a subjective assessment answering the question, "Did your results replicate the original effect?"
What they found was surprising. While 97% of the original studies had significant results, only 36% of the replications had statistically significant results, and only 47% of original effect sizes fell within the 95% confidence interval of the corresponding replication effect size. Rates were slightly higher on the subjective assessment, with 39% of effects judged to have replicated the original results.
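The confidence-interval criterion above can be illustrated with a short sketch. The function name and the numbers below are hypothetical, chosen only to show the check, and a normal approximation is assumed; they are not values from the study.

```python
def original_in_replication_ci(original_effect, replication_effect, replication_se):
    """Check whether the original effect size falls inside the
    replication's 95% confidence interval, one of the replication
    criteria described above. Assumes a normal approximation."""
    half_width = 1.96 * replication_se  # 95% CI half-width under normality
    lower = replication_effect - half_width
    upper = replication_effect + half_width
    return lower <= original_effect <= upper

# Hypothetical example: a large original effect, a small replication
# effect estimated with some uncertainty.
print(original_in_replication_ci(0.40, 0.18, 0.10))  # prints False
```

Under this criterion, an original effect of 0.40 would not count as replicated when the replication estimates 0.18 with a standard error of 0.10, since the replication's 95% interval tops out below 0.40.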
Brian Nosek, who led the project, offered many possible reasons for the striking lack of reproducibility, including selective reporting, selective analysis, and insufficient specification of the conditions needed to obtain the effect. He also cautions that the results in no way validate or invalidate the original research: "After this intensive effort to reproduce a sample of published psychological findings, how many of the effects have we established are true? Zero. And how many of the effects have we established are false? Zero," he says.
The underlying issue may be what Nosek calls “cultural practices” in scientific communication. “Low power research designs combined with publication bias favoring positive results together produce a literature with upwardly biased effect sizes,” he says. Scientific claims should gain credence, Nosek says, not from the status or authority of their originator but by the replicability of their supporting evidence.
Even though lack of reproducibility was the leading trend, the researchers could not identify a specific "it" factor that would ensure replication success. Correlational data did show that replication success was better predicted by the strength of the original evidence than by the characteristics of the original and replication teams.
Humans desire certainty, and science rarely provides it, says Nosek. "Scientific progress is a cumulative process of uncertainty reduction that can only succeed if science itself remains the greatest skeptic of its explanatory claims," he says.
Kimberly Hatfield