Bayes' Theorem Formula

Bayes' theorem gives the posterior probability of a hypothesis given evidence: $P(H \mid E) = \frac{P(E \mid H) \cdot P(H)}{P(E)}$.

The Formula

$$P(A \mid B) = \frac{P(B \mid A) \cdot P(A)}{P(B)}$$

When to use: Start with a prior belief, then reweight it by how likely the evidence is under each hypothesis.

Quick Example

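A disease test is 99% accurate (taken here to mean 99% sensitivity and 99% specificity); 1% of people have the disease. If you test positive, $P(\text{disease} \mid +) \approx 50\%$, not 99%, because the disease is rare: the true positives among the 1% who are sick are matched by false positives among the 99% who are healthy.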
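The arithmetic behind that figure can be sketched as follows, assuming "99% accurate" means both the sensitivity and the specificity are 99% (the example does not say which, so this reading is an assumption):

```latex
\begin{align*}
P(\text{disease} \mid +)
  &= \frac{P(+ \mid \text{disease}) \, P(\text{disease})}
          {P(+ \mid \text{disease}) \, P(\text{disease})
           + P(+ \mid \text{healthy}) \, P(\text{healthy})} \\
  &= \frac{0.99 \times 0.01}{0.99 \times 0.01 + 0.01 \times 0.99}
   = \frac{0.0099}{0.0198} = 0.5
\end{align*}
```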

Notation

$P(A)$ is the prior, $P(B \mid A)$ is the likelihood, $P(A \mid B)$ is the posterior, and $P(B)$ is the total evidence probability.

What This Formula Means

Bayes' theorem gives the posterior probability of a hypothesis given evidence: $P(H \mid E) = \frac{P(E \mid H) \cdot P(H)}{P(E)}$.

Start with a prior belief, then reweight it by how likely the evidence is under each hypothesis.

Formal View

For events $A$ and $B$ with $P(B) > 0$: $P(A \mid B) = \frac{P(B \mid A) \cdot P(A)}{P(B)}$, where $P(B) = P(B \mid A) P(A) + P(B \mid A^c) P(A^c)$ by the law of total probability.
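As a minimal sketch, the formal statement translates into a few lines of Python for the two-hypothesis case ($A$ versus $A^c$). The function name `bayes_posterior` and its parameter names are illustrative choices, not something defined in the text.

```python
def bayes_posterior(prior: float, likelihood: float, likelihood_if_not: float) -> float:
    """Return P(A|B) from P(A), P(B|A), and P(B|A^c)."""
    if not 0.0 <= prior <= 1.0:
        raise ValueError("prior must be a probability in [0, 1]")
    # Law of total probability: P(B) = P(B|A) P(A) + P(B|A^c) P(A^c)
    evidence = likelihood * prior + likelihood_if_not * (1.0 - prior)
    if evidence == 0.0:
        raise ValueError("P(B) must be > 0 for the posterior to be defined")
    # Bayes' theorem: P(A|B) = P(B|A) P(A) / P(B)
    return likelihood * prior / evidence


# Illustrative numbers (not from the text): P(A)=0.2, P(B|A)=0.9, P(B|A^c)=0.3
print(round(bayes_posterior(0.2, 0.9, 0.3), 4))  # prints 0.4286
```

The check on `evidence` mirrors the condition $P(B) > 0$: without it the posterior is undefined.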

Worked Examples

Example 1

medium
Email spam filter: $P(\text{spam}) = 0.3$. The word 'free' appears in 80% of spam emails and 10% of legitimate emails. An email contains 'free'. Find $P(\text{spam} \mid \text{free})$ using Bayes' theorem.

Answer

$P(\text{spam} \mid \text{free}) \approx 0.774$. There's a 77.4% chance the email is spam.

First step

1. Prior: $P(\text{spam}) = 0.3$, $P(\text{legit}) = 0.7$
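Starting from this prior, one way to lay out the remaining steps is sketched below. It uses only the numbers given in the problem and reproduces the stated answer:

```latex
\begin{align*}
P(\text{free})
  &= P(\text{free} \mid \text{spam}) \, P(\text{spam})
   + P(\text{free} \mid \text{legit}) \, P(\text{legit}) \\
  &= 0.8 \times 0.3 + 0.1 \times 0.7 = 0.24 + 0.07 = 0.31 \\
P(\text{spam} \mid \text{free})
  &= \frac{P(\text{free} \mid \text{spam}) \, P(\text{spam})}{P(\text{free})}
   = \frac{0.24}{0.31} \approx 0.774
\end{align*}
```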

Example 2

hard
Drug testing: $P(\text{user}) = 0.05$. Test sensitivity $P(+ \mid \text{user}) = 0.99$. Specificity $P(- \mid \text{non-user}) = 0.95$ (so $P(+ \mid \text{non-user}) = 0.05$). Find $P(\text{user} \mid +)$.
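One way to approach this problem is to count people rather than multiply probabilities. The sketch below assumes a hypothetical cohort of 10,000 people (the cohort size is an illustrative choice, not part of the problem; any size gives the same ratio) and uses integer arithmetic so the counts are exact.

```python
population = 10_000  # hypothetical cohort size, chosen for round numbers

users = population * 5 // 100           # P(user) = 0.05 -> 500 users
non_users = population - users          # 9,500 non-users

true_positives = users * 99 // 100      # sensitivity 0.99 -> 495 users test positive
false_positives = non_users * 5 // 100  # P(+|non-user) = 0.05 -> 475 non-users test positive

# P(user|+) = true positives / all positives
p_user_given_positive = true_positives / (true_positives + false_positives)
print(f"{p_user_given_positive:.3f}")  # prints 0.510
```

Under these numbers, barely more than half of the positive results belong to actual users, because the 5% false-positive rate applies to the much larger non-user group.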

Example 3

medium
A box contains 40% red and 60% blue marbles. Red marbles are 'shiny' 30% of the time; blue marbles are shiny 10% of the time. A drawn marble is shiny. Find $P(\text{red} \mid \text{shiny})$.
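A possible solution path, sketched with the numbers from the problem statement (the page itself does not give a worked answer for this example):

```latex
\begin{align*}
P(\text{shiny})
  &= P(\text{shiny} \mid \text{red}) \, P(\text{red})
   + P(\text{shiny} \mid \text{blue}) \, P(\text{blue}) \\
  &= 0.3 \times 0.4 + 0.1 \times 0.6 = 0.12 + 0.06 = 0.18 \\
P(\text{red} \mid \text{shiny})
  &= \frac{0.12}{0.18} = \frac{2}{3} \approx 0.667
\end{align*}
```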

Common Mistakes

  • Treating $P(H \mid E)$ as equal to $P(E \mid H)$ - Bayes flips them, and the prior makes the two differ.
  • Ignoring the base rate (prior) - a rare condition keeps the posterior low even after strong evidence.
  • Using the wrong denominator - $P(E)$ must total over ALL hypotheses (true and false), e.g. true positives plus false positives.
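The first two mistakes can be made concrete with a small sketch that holds the test fixed and varies only the prior. The sensitivity and false-positive rate below are assumed illustrative values, and the helper name `posterior` is hypothetical.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(H|E) for a yes/no hypothesis H and a positive result E."""
    true_pos = sensitivity * prior                  # P(E|H) P(H)
    false_pos = false_positive_rate * (1 - prior)   # P(E|not H) P(not H)
    return true_pos / (true_pos + false_pos)        # denominator sums over both hypotheses


sensitivity = 0.99          # P(E|H), assumed for illustration
false_positive_rate = 0.01  # P(E|not H), assumed for illustration

# Same test, different base rates: P(E|H) stays at 0.99 while P(H|E) moves widely.
for prior in (0.001, 0.01, 0.1, 0.5):
    result = posterior(prior, sensitivity, false_positive_rate)
    print(f"prior={prior:<6} P(E|H)={sensitivity}  P(H|E)={result:.3f}")
```

With a prior of 1 in 1,000 the posterior stays below 10% despite the 99% likelihood. Only at a 50% prior does $P(H \mid E)$ come out equal to $P(E \mid H)$, because this example's false-positive rate is the complement of its sensitivity.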

Why This Formula Matters

Real questions ask 'given a positive test, do I have the disease?' but data give you 'given the disease, how often does the test come back positive?' Bayes' theorem is the rule that flips one into the other, and ignoring the base rate (the prior) is the classic error behind wildly overstated test-result fears. It formalizes learning from evidence. The cue to look for is the question "Am I given $P(E \mid H)$ and a prior, and asked for the flipped $P(H \mid E)$?" rather than familiar-looking numbers; that cue is what lets a student tell Bayes' theorem apart from conditional probability, compound probability, and the law of total probability in a mixed problem set.

Frequently Asked Questions

What is the Bayes' Theorem formula?

Bayes' theorem gives the posterior probability of a hypothesis given evidence: $P(H \mid E) = \frac{P(E \mid H) \cdot P(H)}{P(E)}$.

How do you use the Bayes' Theorem formula?

Start with a prior belief, then reweight it by how likely the evidence is under each hypothesis.

What do the symbols mean in the Bayes' Theorem formula?

$P(A)$ is the prior, $P(B \mid A)$ is the likelihood, $P(A \mid B)$ is the posterior, and $P(B)$ is the total evidence probability.

Why is the Bayes' Theorem formula important in Math?

Real questions ask 'given a positive test, do I have the disease?' but data give you 'given the disease, how often does the test come back positive?' Bayes' theorem is the rule that flips one into the other, and ignoring the base rate (the prior) is the classic error behind wildly overstated test-result fears. It formalizes learning from evidence. The cue to look for is the question "Am I given $P(E \mid H)$ and a prior, and asked for the flipped $P(H \mid E)$?" rather than familiar-looking numbers; that cue is what lets a student tell Bayes' theorem apart from conditional probability, compound probability, and the law of total probability in a mixed problem set.

What do students get wrong about Bayes' Theorem?

The procedure for Bayes' theorem is the easy part; the trap is treating $P(H \mid E)$ as equal to $P(E \mid H)$. Asking "Am I given $P(E \mid H)$ and a prior, and asked for the flipped $P(H \mid E)$?" first keeps a correct-looking calculation from being attached to the wrong concept.

What should I learn before the Bayes' Theorem formula?

Before studying the Bayes' Theorem formula, you should understand: conditional probability, probability, sample space.