Scientific Reasoning

By Joe Lau and Jonathan Chan

TUTORIAL S01: The hypothetical-deductive method
TUTORIAL S02: Choosing among theories
TUTORIAL S03: The Bayesian theory of confirmation
TUTORIAL S04: Some basic concepts about causation
TUTORIAL S05: Mill’s methods for identifying causes
TUTORIAL S06: Common causal relations
TUTORIAL S07: Cause and effect diagrams
TUTORIAL S08: Fallacies about causation

TUTORIAL S01: The hypothetical-deductive method

S01.1 The four components of science

Scientific research is a complex affair, and the various disciplines within science operate in rather different ways. However, in general we can identify four main components in scientific research:

  • Theories – The set of hypotheses, laws and facts about the empirical world.
  • The world – The objects, processes and properties of the real world that form the subject matter of the theories.
  • Predictions – We employ our theories to make predictions about the world. These might be predictions about the future, but they can also be predictions about the past. For example, a geological theory about the earth’s history might predict that certain rocks contain a high percentage of special metals. A crucial part of scientific research is to check the predictions of theories to determine which theory to accept and which to reject.
  • Data – The information gathered from empirical observations or experiments. Data provide the evidence to test theories. They might also inspire new directions in research.

S01.2 The hypothetical-deductive method

The hypothetical-deductive method (HD method) is a very important method for testing theories or hypotheses. It is sometimes said to be “the scientific method”. This is not quite correct, since there is not just one method used in science. However, the HD method is of central importance, because it is one of the more basic methods common to all scientific disciplines, whether economics, physics, or biochemistry. Its application can be divided into four stages:

  1. Identify the hypothesis to be tested.
  2. Generate predictions from the hypothesis.
  3. Use experiments to check whether predictions are correct.
  4. If the predictions are correct, then the hypothesis is confirmed. If not, then the hypothesis is disconfirmed.

Here is an illustration:

  1. Suppose your portable music player fails to switch on. You might then consider the hypothesis that perhaps the batteries are dead. So you decide to test whether this is true.
  2. Given this hypothesis, you predict that the music player should work properly if you replace the batteries with new ones.
  3. So you proceed to replace the batteries, which is the “experiment” for testing the prediction.
  4. If the player works again, then your hypothesis is confirmed, and so you throw away the old batteries. If the player still does not work, then the prediction is false, and the hypothesis is disconfirmed. So you might reject your original hypothesis and come up with an alternative one to test, e.g. the batteries are ok but your music player is broken.

This example helps us illustrate a few points about science and the HD method.

1. A scientific hypothesis must be testable

The HD method tells us how to test a hypothesis, and a scientific hypothesis must be one that is capable of being tested.

If a hypothesis cannot be tested, we cannot find evidence to show that it is probable or not. In that case it cannot be part of scientific knowledge. Consider the hypothesis that there are ghosts which we cannot see and can never interact with, and which can never be detected either directly or indirectly. This hypothesis is defined in such a way to exclude the possibility of testing. It might still be true and there might be such ghosts, but we would never be in a position to know and so this cannot be a scientific hypothesis.

2. Confirmation is not truth

In general, confirming the predictions of a theory increases the probability that a theory is correct. But in itself this does not prove conclusively that the theory is correct.

To see why this is the case, we might represent our reasoning as follows:

If H then P.
P.
Therefore H.

Here H is our hypothesis “the batteries are dead”, and P is the prediction “the player will function when the batteries are replaced”. This pattern of reasoning is of course not valid, since there might be reasons other than H that also bring about the truth of P. For example, it might be that the original batteries are actually fine, but they were not inserted properly. Replacing the batteries would then restore the loose connection. So the fact that the prediction is true does not prove that the hypothesis is true. We need to consider alternative hypotheses and see which is more likely to be true and which provides the best explanation of the prediction. (Or we can also do more testing!)
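This invalid pattern is known as affirming the consequent, and its invalidity can be checked mechanically. The following Python sketch (an illustration, not part of the original text) enumerates all truth-value assignments and looks for a counterexample where both premises are true but the conclusion is false:

```python
from itertools import product

# Affirming the consequent: "If H then P; P; therefore H."
# An argument form is invalid if some assignment makes all
# premises true while the conclusion is false.
def counterexamples():
    cases = []
    for h, p in product([True, False], repeat=2):
        premise1 = (not h) or p   # "If H then P" as material implication
        premise2 = p              # "P"
        conclusion = h            # "Therefore H"
        if premise1 and premise2 and not conclusion:
            cases.append((h, p))
    return cases

print(counterexamples())  # [(False, True)]
```

The single counterexample, H false and P true, is exactly the loose-connection scenario: the prediction comes true even though the batteries were fine.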

In the next tutorial we shall talk about the criteria that help us choose between alternative hypotheses.

3. Disconfirmation need not be falsity

Very often a hypothesis generates a prediction only when given additional assumptions (auxiliary hypotheses). In such cases, when a prediction fails the theory might still be correct.

Looking back at our example, when we predict that the player will work again once the batteries are replaced, we are assuming that there is nothing wrong with the player itself. But it might turn out that this assumption is wrong. In such situations the falsity of the prediction does not logically entail the falsity of the hypothesis. We might depict the situation by this argument (H = the batteries are dead, A = the player is not broken):

If ( H and A ) then P.
It is not the case that P.
Therefore, it is not the case that H.

This argument is of course not valid. When P is false, what follows is not that H is false, but only that the conjunction of H and A is false. So there are three possibilities: (a) H is false but A is true, (b) H is true but A is false, or (c) both H and A are false. So we should argue instead:

If ( H and A ) then P.
It is not the case that P.
Therefore, it is not the case that H and A are both true.

Returning to our earlier example, if the player still does not work when the batteries are replaced, this does not prove conclusively that the original batteries are dead. The lesson is that when we apply the HD method, we need to examine the auxiliary hypotheses invoked in deriving the predictions. If we are confident that these assumptions are correct, then the falsity of the prediction is a good reason to reject the hypothesis. On the other hand, if the theory we are testing has been extremely successful, we should be very cautious before rejecting it on the basis of a single false prediction.
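The point that a failed prediction refutes only the conjunction can likewise be checked by enumeration. This Python sketch (again just an illustration) lists the (H, A) assignments still consistent with the premises “If (H and A) then P” and “not P”:

```python
from itertools import product

# With an auxiliary hypothesis A, a failed prediction only rules out
# the conjunction (H and A), not H itself.  Enumerate the assignments
# consistent with "(H and A) -> P" and "not P":
def surviving_cases():
    cases = []
    for h, a in product([True, False], repeat=2):
        p = False                       # the prediction failed
        premise = (not (h and a)) or p  # "(H and A) -> P"
        if premise:
            cases.append((h, a))
    return cases

# Three possibilities survive -- H true with A false among them,
# so H may still be true despite the failed prediction.
print(surviving_cases())
```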

TUTORIAL S02: Choosing among theories

Scientific reasoning is often about choosing, from a set of alternatives, the theory that is most likely to be true. But how do we decide which theory is the best? Here are some relevant criteria.

Predictive power

The minimum requirement for a scientific theory is that it can help us make predictions and explain our observations. If a hypothesis generates no testable prediction, it fails the minimal requirement for a scientific hypothesis.

When we evaluate the predictive power of a theory, we consider both the quantity and the quality of the predictions. How many predictions can the theory make? How accurate and precise are they?


Mechanism

In general, we want theories that can explain the connections between events by revealing the underlying causal mechanisms. This can help us generate more predictions to test the theory and make other discoveries.


Fruitfulness

This is about whether a theory helps us make surprising or unexpected predictions which turn out to be correct, and whether the theory helps us detect and explain connections which we would not have noticed otherwise.


Simplicity

A simple theory is (roughly) one with fewer assumptions, and which posits fewer entities than its competitors. Many scientists believe strongly that we should search for simple theories if feasible.


Coherence

A theory should be internally coherent in the sense that it is logically consistent. If it is not, there is something wrong with the theory as it stands, and there is a need to revise it to come up with a better version.

The other aspect of coherence is that we should look for theories that fit together with other well-confirmed facts and scientific theories. Widely accepted theories are already well-confirmed, so if a hypothesis is incompatible with existing science, the default response should be that the hypothesis is mistaken. An extraordinary claim incompatible with scientific knowledge should require very strong evidence before it can be accepted.

TUTORIAL S03: The Bayesian theory of confirmation

[Image: Thomas Bayes]

Belief does not come in an all-or-nothing manner. If it has been raining heavily the past week, and the clouds have not cleared, you might believe it is going to rain today as well. But you might not be certain that your belief is true, as today might turn out to be sunny. Still, you might decide to bring an umbrella when you leave home, since you think it is more likely to rain than not. The Bayesian framework is a theory about how we should adjust our degrees of belief in a rational manner. In this theory, the probability of a statement, P(S), indicates the degree of belief an agent has in the truth of the statement S. If you are certain that S is true, then P(S) = 1. If you are certain that it is false, then P(S) = 0. If you think S is just as likely to be false as it is to be true, then P(S) = 0.5.

One important aspect of the theory is that rational degrees of belief should obey the laws of probability theory. For example, one law of probability is that P(S) = 1 – P(not-S). So if you are absolutely certain that S is true, then P(S) should be 1 and P(not-S) should be 0. It can be shown that if your degrees of belief deviate from the laws of probability, and you are willing to bet according to your beliefs, then you can be made to accept a combination of bets where you will lose money no matter what.

What is interesting, in the present context, is that the Bayesian framework provides a powerful theory of confirmation, and explains many aspects of scientific reasoning.

Here, P(H) measures your degree of belief in a hypothesis H when you do not know the evidence E, and the conditional probability P(H|E) measures your degree of belief in H when E is known. We might then adopt these definitions:

1. E confirms or supports H when P(H|E) > P(H).
2. E disconfirms H when P(H|E) < P(H).
3. E is neutral with respect to H when P(H|E) = P(H).

As an illustration, consider definition #1. Suppose you are asked whether Mary is married or not. Not knowing her very well, you don’t really know. So if H is the statement “Mary is married”, then P(H) is around 0.5. Now suppose you observe that she has kids, wears a ring on her finger, and is living with a man. This provides evidence supporting H, even though it does not prove that H is true. The evidence increases your confidence in H, so indeed P(H|E) > P(H). On the other hand, knowing that Mary likes ice-cream probably does not make a difference to your degree of belief in H. So P(H|E) is just the same as P(H), as in definition #3.

One possible measure of the amount of confirmation is the value of P(H|E) – P(H). The higher the value, the bigger the confirmation. The famous Bayes’ theorem says:

P(H|E) = P(E|H) × P(H) / P(E)

So, using Bayes’ theorem,

the amount of confirmation of hypothesis H by evidence E
= P(H|E) – P(H)
= P(E|H) × P(H) / P(E) – P(H)
= P(H) × [ P(E|H) / P(E) – 1 ]

Notice that all else being equal, the degree of confirmation increases when P(E) decreases. In other words, if the evidence is rather unlikely to happen, this provides a higher amount of confirmation. This accords with the intuition that surprising predictions provide more confirmation than commonplace predictions. So this intuition can actually be justified within the Bayesian framework. Bayesianism is the project of trying to make sense of scientific reasoning and confirmation using the Bayesian framework. This approach holds a lot of promise, but this is not to say that it is uncontroversial.
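As a rough numerical illustration (the probabilities below are made up, not from the text), the formula can be evaluated directly. With the same prior P(H) and likelihood P(E|H), a less probable piece of evidence yields a larger confirmation value:

```python
def confirmation(p_h, p_e_given_h, p_e):
    """Amount of confirmation, P(H|E) - P(H), computed via Bayes' theorem."""
    p_h_given_e = p_e_given_h * p_h / p_e
    return p_h_given_e - p_h

# Same prior (0.5) and likelihood (0.8), but different P(E):
surprising = confirmation(0.5, 0.8, 0.5)   # unlikely evidence, P(E) = 0.5
commonplace = confirmation(0.5, 0.8, 0.7)  # likelier evidence, P(E) = 0.7

print(surprising)   # 0.3
print(commonplace)  # roughly 0.07 -- commonplace evidence confirms less
```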


TUTORIAL S04: Some basic concepts about causation

There are two types of causation: singular and general. Singular causation is a relation between two particular events, where a particular event is some activity or occurrence at a particular time and place. Here are some examples of singular causation:

  • Her singing causes the windows to shatter.
  • The viral infection caused his death.

As for general causation, it is a relation between two types of events, as in :

  • Smoking causes cancer.
  • Pressing the button causes the bell to ring.

It seems reasonable to think that general causation is to be analysed in terms of singular causation. So “type X events cause type Y events” might be understood as something roughly like “particular events of type X are highly likely to cause particular events of type Y.”


Some useful terminology

The concept of a cause is quite vague, and sometimes it might be useful to distinguish between these three different concepts:

  • An event X is causally necessary for an event Y if and only if Y would not have happened if X had not occurred.
  • An event X is causally sufficient for an event Y if and only if the presence of X alone is enough to bring about Y.

So, for example, heating a gas is causally sufficient but not necessary to increase its pressure – you can also increase the pressure by compressing the gas. Pressing the light switch might be causally necessary to turn the light on, but it is not sufficient, since electricity is also required.

  • Sometimes, a causal factor can be salient or relevant to the effect even if it is neither necessary nor sufficient, e.g. hardwork might be a causally relevant factor that is part of the explanation of why a student has passed, but presumably it is neither necessary nor sufficient.
  • We can also draw a distinction between triggering and standing or structural causes. A triggering cause is a cause that sets into motion the chain of events that lead to an effect. Whereas a standing cause is some static condition that contributes to the effect only in conjunction with a triggering cause.

For example, suppose there was an explosion in a room full of flammable gases. The triggering cause might be the event of someone lighting a match in the room, and the presence of the gases would be the standing cause. Similarly, the standing cause of a particular riot might have to do with high unemployment, with the triggering cause being some particular event such as perhaps someone being beaten up by the police.


Explaining causation in terms of causal mechanisms

The universe contains objects and processes at various levels. Big objects such as societies are composed of smaller objects such as individual human beings, and high-level processes such as the conduction of electricity are composed of lower-level processes such as the movement of electrons. To explain causation, it is not enough just to know that A is the cause of B; we need a theory that explains how A causes B. What is needed is a theory of the lower-level causal mechanisms that lead from A to B.

For example, to explain why heating causes a piece of metal to expand, we cite the fact that heating gives energy to the metal atoms; as a result of the increased vibration due to the higher energy, the distance between the atoms increases, and this constitutes expansion. The structure of this explanation can be represented by a diagram:

What this diagram shows is that a high level physical causal process is explained in terms of a lower-level mechanism. Without lower-level mechanisms, we would not be able to understand how high-level causation can occur.

TUTORIAL S05: Mill’s methods for identifying causes

John Stuart Mill (1806-1873) was an English philosopher who wrote on a wide range of topics, from language and science to political philosophy. The so-called “Mill’s methods” are five rules for investigating causes that he proposed.

S05.1 The Method of Agreement

The best way to introduce Mill’s methods is perhaps through an example. Suppose your family went out together for a buffet dinner, but when you got home all of you started feeling sick and experienced stomach aches. How do you determine the cause of the illness? Suppose you draw up a table of the food taken by each family member:

Member   Oyster   Beef   Salad   Noodles   Fallen ill?
Mum      Yes      Yes    Yes     Yes       Yes
Dad      Yes      No     No      Yes       Yes
Sister   Yes      Yes    No      No        Yes
You      Yes      No     Yes     No        Yes
Mill’s rule of agreement says that if in all cases where an effect occurs, there is a single prior factor C that is common to all those cases, then C is the cause of the effect. According to the table in this example, the only thing that all of you have eaten is oyster. So applying the rule of agreement we infer that eating oyster is the cause of the illnesses.
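The rule of agreement is mechanical enough to be captured in a few lines of code. This Python sketch (data taken from the table above) intersects the factors present in every case where the effect occurred:

```python
# The food table, as a small dataset: factors per family member.
meals = {
    "Mum":    {"oyster", "beef", "salad", "noodles"},
    "Dad":    {"oyster", "noodles"},
    "Sister": {"oyster", "beef"},
    "You":    {"oyster", "salad"},
}
fell_ill = {"Mum", "Dad", "Sister", "You"}

def method_of_agreement(cases, affected):
    """Return the factors common to every case where the effect occurred."""
    common = None
    for member in affected:
        factors = cases[member]
        common = set(factors) if common is None else common & factors
    return common

print(method_of_agreement(meals, fell_ill))  # {'oyster'}
```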

S05.2 The Method of Difference

Now suppose the table had been different in the following way:

Member   Oyster   Beef   Salad   Noodles   Fallen ill?
Mum      Yes      Yes    Yes     Yes       Yes
Dad      Yes      Yes    Yes     Yes       Yes
Sister   Yes      Yes    Yes     Yes       Yes
You      Yes      Yes    No      Yes       No

In this particular case you are the only one who did not fall ill. The only difference between you and the others is that you did not take the salad. So that is probably the cause of the others’ illnesses. This is an application of the method of difference. This rule says that where we have one situation that leads to an effect, and another which does not, and the only difference between them is the presence of a single factor in the first situation, we can infer that this factor is the cause of the effect.
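A sketch of the method of difference on this second table: collect the factors common to all the ill cases, then discard any factor that also appears in a case without the effect:

```python
# Second table: everyone ate everything, except You, who skipped
# the salad and did not fall ill.
meals = {
    "Mum":    {"oyster", "beef", "salad", "noodles"},
    "Dad":    {"oyster", "beef", "salad", "noodles"},
    "Sister": {"oyster", "beef", "salad", "noodles"},
    "You":    {"oyster", "beef", "noodles"},
}
ill = {"Mum", "Dad", "Sister"}

def method_of_difference(cases, affected):
    """Factors present in every ill case but absent from some well case."""
    common_to_ill = set.intersection(*(cases[m] for m in affected))
    taken_by_well = set.union(*(cases[m] for m in cases if m not in affected))
    return common_to_ill - taken_by_well

print(method_of_difference(meals, ill))  # {'salad'}
```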

S05.3 The Joint Method

The joint method is a matter of applying both the method of agreement and the method of difference to the same data. Consider this table:

Member   Oyster   Beef   Salad   Noodles   Fallen ill?
Mum      Yes      Yes    Yes     Yes       Yes
Dad      Yes      Yes    No      Yes       Yes
Sister   Yes      Yes    Yes     No        Yes
You      Yes      No     No      Yes       No

The method of agreement narrows the candidates down to oyster and beef, the only factors common to everyone who fell ill. The method of difference then rules out oyster and noodles, since you took both without falling ill. So the joint method tells us that it is the beef which is the cause this time.

S05.4 The Method of Concomitant Variation

The method of concomitant variation says that if across a range of situations that lead to a certain effect, we find a certain property of the effect varying with variation in a factor common to those situations, then we can infer that factor as the cause.

Thus, using the same kind of example, we might find that you felt somewhat sick having eaten one oyster, whereas your sister felt rather unwell having eaten a few, and your father became critically ill having eaten ten in a row. Since the variation in the number of oysters eaten corresponds to variation in the severity of the illness, it would be rational to infer that the illnesses were caused by the oysters.
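This reasoning can be sketched as a simple monotonicity check. The dose-severity numbers below are made up purely for illustration:

```python
# Hypothetical observations: (oysters eaten, severity of illness),
# severity on an arbitrary 0-10 scale.
observations = [(1, 2), (4, 5), (10, 9)]

def varies_concomitantly(pairs):
    """True if the effect's severity rises whenever the factor rises."""
    ordered = sorted(pairs)
    return all(a[1] < b[1] for a, b in zip(ordered, ordered[1:]))

print(varies_concomitantly(observations))  # True
```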

S05.5 The Method of Residues

According to the method of residues, if we have a range of factors believed to be the causes of a range of effects, and we have reason to believe that all the factors, except one factor C, are causes for all the effects, except one, then we should infer that C is the cause of the remaining effect.

S05.6 General comments on Mill’s methods

Mill’s methods should come as no surprise, as these rules articulate some of the principles we use implicitly in causal reasoning in everyday life. But it is important to note the limitations of these rules.

  • First, the rules presuppose that we have a list of candidate causes to consider. But the rules themselves do not tell us how to come up with such a list. In reality this would depend on our knowledge or informed guesses about likely causes of the effects.
  • The other assumption presupposed by these methods is that, among the factors under consideration, exactly one is the cause of the effect. But there is no guarantee that this assumption always holds. Sometimes the cause might be some complicated combination of various factors.

TUTORIAL S06: Common causal relations

When we observe two events A and B, how do we find out if they are causally related? One possibility is that A is the cause of B. But there are many other alternatives to consider. Here we discuss some of the main ones. Please refer to the diagram below.

Case #1 – A and B are not causally related

  • The fact that A is followed by B does not make A the cause of B. Even when there seems to be a correlation between A and B, it is possible that they are not causally connected. Perhaps the correlation is accidental.
  • It is important to consider a control situation where A is absent, and see if B would still occur.
  • See also Simpson’s paradox.

Case #2 – A is the cause of B

  • For a particular event A to be the cause of B, it is necessary that A happens earlier than B.
  • If a type of event A is positively correlated with B, this is one relevant piece of evidence that A is the cause of B. But we need to rule out the other possibilities which are discussed here.
  • If we can change B by changing A, this also supports the hypothesis that A is the cause of B. See Mill’s method of concomitant variation.

Case #3 – B is the cause of A

  • Sometimes correlation goes both ways. The fact that A causes B can explain the correlation, but maybe the reality is that B is the cause of A. For example, people who are depressed tend to have low self-esteem. Perhaps the former is the cause of the latter, but it is also possible that low self-esteem causes depression by making a person socially withdrawn and lacking in motivation. We need further observations to determine which possibility it is.

Case #4 – A and B form a causal loop

  • In many cases two causal factors can reinforce each other by forming a causal loop. In the example above, it is more plausible to think that depression affects self-esteem, and a lower self-esteem can cause further depression.
  • Of course, causal loops happen only between types of events. If a particular event A is the cause of a particular event B, then A must happen earlier than B and so B cannot be the cause of A.

Case #5 – A is a minor cause of B

  • An effect can have more than one cause, and some may be more important than others.

Case #6 – A and B have a common cause

  • Young children with larger noses tend to be more intelligent, but this is not because nose size somehow accelerates cognitive development. Rather, young children with larger noses are children who are older, and older children are more intelligent than younger ones because their brains have developed further. So A and B are correlated not because A is the cause of B, but because there is an underlying common cause.

Case #7 – B is a side effect of A

  • These are cases where the effect might be wrongly attributed to A when in fact it is due to some side effect of A.
  • It has been shown that medicine can have a placebo effect. The subjective belief that one is being treated can bring about relief from an illness even if the medical treatment being given is not really effective against the illness. For example, a patient might report that his pain has decreased as a result of taking a pill, even though the pill is a sugar pill with no effect on pain.

TUTORIAL S07: Cause and effect diagrams

The world being a complicated place, events are often related by complex causal connections. Cause and effect diagrams can play a very important role in understanding such connections, and assist in the calculation of statistical and probabilistic dependencies. By laying out such connections, diagrams can help us identify the crucial factors in the explanation, prediction and control of events. Here we discuss briefly two popular types of cause and effect diagrams.

S07.1 Causal networks

Causal networks are diagrams that indicate causal connections using arrows. Here is a simple example where an arrow from A to B indicates that A is the cause of B.

Causal networks are particularly useful in showing causal processes which involve a number of different stages. Here is a diagram that shows the lifecycle of the parasite Myxobolus cerebralis, which passes between trout and mud worms, causing a so-called “whirling disease” in trout:

In more complicated cases, the arrows are given probability assignments to indicate how likely it is that one event would lead to another. Special algorithms or programs can then be used to calculate how likely it is for a particular effect to come about. These networks with probabilities are known as “Bayesian networks” or “belief nets”.
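As a minimal sketch of the idea (all probabilities invented for illustration), consider a two-node network with a single arrow from A to B. The arrow carries P(B|A) and P(B|not-A), from which we can compute the overall probability of the effect B, and even reason backwards along the arrow with Bayes' theorem:

```python
# Two-node belief net: A --> B, with illustrative probabilities.
p_a = 0.3              # prior probability of the cause A
p_b_given_a = 0.9      # probability of B when A occurs
p_b_given_not_a = 0.1  # probability of B without A

# Probability of the effect B, by the law of total probability:
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

# Reasoning backwards: given that B occurred, how likely was A?
p_a_given_b = p_b_given_a * p_a / p_b

print(p_b)         # 0.34
print(p_a_given_b) # roughly 0.79
```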

S07.2 Fishbone diagrams

Fishbone diagrams are so-called because they resemble fishbones. They are also called “Ishikawa diagrams”, named after Kaoru Ishikawa of Japan. A fishbone diagram is a graphical representation of the different factors that contribute to an effect. They are often used in business and management.

In a typical fishbone diagram, the effect is usually a problem to be resolved, and is placed at the “fish head”. The causes of the effect are then laid out along the “bones”, and classified into different types along the branches. Further causes can be laid out alongside further side branches. So the general structure of a fishbone diagram is something like this:

As an example, consider a manufacturing company that receives a lot of complaints from its customers. The customers complain that the staff are not very helpful and responsive. In order to resolve this issue, it is necessary to understand the sources of the problem. We might classify the causes of the problem into four kinds: manpower issues, issues relating to the machinery used, issues relating to methods, processes and management, and finally issues relating to the raw materials used. Under each of these four headings, we can identify the more detailed causes that contribute to the problem. A group discussion might end up producing a diagram like this one:

[From Quality Tools, 2007.5.5]


One advantage of these diagrams is that they give a big picture of the main causal factors leading to the effect. These diagrams are now often used in quality management, and in brainstorming sessions.

TUTORIAL S08: Fallacies about causation

Here are some not uncommon mistakes in reasoning about causation.

  • Post hoc fallacy – Inferring that X causes Y just because X is followed by Y. Example: “Last time I wore these red pants I got hit by a car. It must be because they bring bad luck.”
  • Mistaking correlation as causation – “Whenever I take this pill my cough clears up within a week, so this pill is very effective in curing coughs.” But perhaps mild coughs go away eventually even without taking medicine?
  • Reversing causal direction – Assuming that X causes Y without considering the possibility that Y is the cause of X. “Children who like violent video games are more likely to show violent behavior. This must be because they are copying the games.” But could it be that children who are more prone to violence are more fond of such video games?
  • Genetic fallacy – Thinking that if some item X is associated with a source with a certain property, then X must have the same property as well. But of course this might not be the case. Example: “Eugenics was practised by the Nazis so it is obviously disgusting and unacceptable.”
  • Fallacy of the single cause – Wrongly presupposing that an event has a single cause when there are many causally relevant factors involved. This is a fallacy where causal interactions are being over-simplified. For example, after tragedy such as a student committing suicide, people and the news media might start looking for “the cause”, and blame it on either the parents, the amount of school work, the society, etc. But there need not be a single cause that led to the suicide. Many factors might be at work.
  • Confusing good causal consequences with reasons for belief – Thinking that a claim C must be true because believing in C brings about some benefit. Example: “God exists because after I became a believer I am a lot happier and am now a better person.”
  • Dr. Joe Lau
    Department of Philosophy, University of Hong Kong
  • Dr. Jonathan Chan
    Department of Religion and Philosophy, Baptist University of Hong Kong

© 2004-2010 Joe Lau and Jonathan Chan