- The protocols of the study must be clear and impeccable so that all possibilities of error can be evaluated. The investigators, not the reviewers, carry the burden of identifying each possible source of error, explaining how it was minimized, and providing a quantitative estimate of the effect of each error. These errors can be systematic—attributable to biases in the experimental setup—or statistical—the result of chance fluctuations. No new effect can be claimed unless all the errors are small enough to make it highly unlikely that they are the source of the claimed effect.
- The hypotheses being tested must be established clearly and explicitly before data taking begins, and not changed midway through the process or after looking at the data. In particular, “data mining,” in which hypotheses are later changed to agree with some interesting but unanticipated results showing up in the data, is unacceptable. This may be likened to painting a bull’s-eye around wherever an arrow has struck. That is not to say that certain kinds of exploratory observations, in astronomy for example, cannot be examined for anomalous phenomena. But such observations are not used in hypothesis testing. They may lead to new hypotheses, but these hypotheses must then be independently tested according to the protocols I have outlined.
- The people performing the study, that is, those taking and analyzing the data, must do so without any prejudgment of how the results should come out. This is perhaps the most difficult condition to follow to the letter, since most investigators start out with the hope of making a remarkable discovery that will bring them fame and fortune. They are often naturally reluctant to accept the negative results that more typically characterize much of research. Investigators may then revert to data mining, continuing to look until they convince themselves they have found what they were looking for.3 To enforce this condition and avoid such biases, certain techniques such as “blinding” may be included in the protocol, where neither the investigators nor the data takers and analyzers know what sample of data they are dealing with. For example, in doing a study on the efficacy of prayer, the investigators should not know who is being prayed for or who is doing the praying until all the data are in and ready to be analyzed.
- The hypothesis being tested must be one that contains the seeds of its own destruction. Those making the hypothesis have the burden of providing examples of possible experimental results that would falsify the hypothesis. They must demonstrate that such a falsification has not occurred. A hypothesis that cannot be falsified is a hypothesis that has no value.
- Even after passing the above criteria, reported results must be of such a nature that they can be independently replicated. Not until they are repeated under similar conditions by different (preferably skeptical) investigators will they be finally accepted into the ranks of scientific knowledge.
These conditions are desirable in any claim of knowledge; there is a time for unrestricted creativity (preceding ‘the’ scientific method) and there is a time for rigor (practicing ‘the’ scientific method). Would it be a good idea, a bad idea, or simply impossible to expect such conditions to be met by any scientific discipline? Which claims or disciplines are unable to meet these conditions, and (how) do they provide reliable knowledge? Although I tend to agree with Paul Feyerabend’s statement in “Against Method” (1975) that “The idea that science can, and should, be run according to fixed and universal rules, is both unrealistic and pernicious” (page 295), I can’t imagine how to achieve reliable knowledge without at least falsification (condition 4) and reproducibility (condition 5).