Conditional probability, Bayes' theorem, law of total probability, and independence.
Conditional probability is the backbone of probabilistic reasoning: it formalizes how information updates uncertainty. This topic covers the definition of $\mathbb{P}(A \mid B)$, the two fundamental formulas (total probability and Bayes), conditional expectation in all its forms, and the measure-theoretic definition, which serves as the foundation for martingales.
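Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space and $A, B \in \mathcal{F}$ with $\mathbb{P}(B) > 0$. The conditional probability of $A$ given $B$ is:

$$\mathbb{P}(A \mid B) = \frac{\mathbb{P}(A \cap B)}{\mathbb{P}(B)}$$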
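For a fixed event $B$ with $\mathbb{P}(B) > 0$, $\mathbb{P}(\cdot \mid B)$ is itself a valid probability measure on $(\Omega, \mathcal{F})$. In particular, $\mathbb{P}(\Omega \mid B) = 1$, and $\mathbb{P}(\cdot \mid B)$ satisfies all Kolmogorov axioms. The map $A \mapsto \mathbb{P}(A \mid B)$ is the original measure restricted to $B$ and renormalized by $\mathbb{P}(B)$.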
A card is drawn from a standard 52-card deck. Given that it is a face card (12 face cards), the probability it is a King is $\frac{4/52}{12/52} = \frac{4}{12} = \frac{1}{3}$.
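The card example can be checked by brute-force enumeration. This is a minimal sketch, assuming a uniform draw from the deck; the helper `prob` and the predicates `is_face` and `is_king` are illustrative names, not part of the original notes.

```python
from fractions import Fraction
from itertools import product

# Standard 52-card deck as (rank, suit) pairs.
ranks = ["A"] + [str(n) for n in range(2, 11)] + ["J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = list(product(ranks, suits))

def prob(event, given=lambda card: True):
    """P(event | given) under the uniform measure on the deck."""
    conditioning = [card for card in deck if given(card)]
    favorable = [card for card in conditioning if event(card)]
    return Fraction(len(favorable), len(conditioning))

def is_face(card):
    return card[0] in {"J", "Q", "K"}

def is_king(card):
    return card[0] == "K"

# Restrict to face cards and renormalize.
p_king_given_face = prob(is_king, given=is_face)

# Same value from the ratio P(King and face) / P(face).
p_ratio = prob(lambda c: is_king(c) and is_face(c)) / prob(is_face)

print(p_king_given_face, p_ratio)  # 1/3 1/3
```

Counting inside the conditioning set and taking the ratio of unconditional probabilities give the same answer, which is the "restrict and renormalize" reading of the definition.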
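Let $B_1, \dots, B_n$ be a partition of $\Omega$: pairwise disjoint events with $\bigcup_{i=1}^n B_i = \Omega$ and $\mathbb{P}(B_i) > 0$ for all $i$. Then for any event $A$:

$$\mathbb{P}(A) = \sum_{i=1}^n \mathbb{P}(A \mid B_i)\,\mathbb{P}(B_i)$$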
The partition must be exhaustive (covers $\Omega$) and pairwise disjoint. In practice, choose a partition that makes each $\mathbb{P}(A \mid B_i)$ easy to compute.
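A factory has two machines: machine 1 produces 60% of items with a 2% defect rate, machine 2 produces 40% with a 5% defect rate. Writing $D$ for "the item is defective" and $M_k$ for "the item came from machine $k$", the probability of a defective item is:

$$\mathbb{P}(D) = \mathbb{P}(D \mid M_1)\,\mathbb{P}(M_1) + \mathbb{P}(D \mid M_2)\,\mathbb{P}(M_2) = 0.02 \times 0.6 + 0.05 \times 0.4 = 0.032$$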
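The factory computation can be reproduced in a few lines. This is a sketch using exact fractions; the dictionary names `priors` and `defect_rates` and the helper `total_probability` are assumptions made for illustration.

```python
from fractions import Fraction

# Share of production per machine: P(M_k).
priors = {"machine_1": Fraction(60, 100), "machine_2": Fraction(40, 100)}
# Defect rate per machine: P(D | M_k).
defect_rates = {"machine_1": Fraction(2, 100), "machine_2": Fraction(5, 100)}

def total_probability(priors, likelihoods):
    """P(A) = sum_i P(A | B_i) P(B_i) over a partition {B_i}."""
    if sum(priors.values()) != 1:
        raise ValueError("priors must sum to 1 (the partition must be exhaustive)")
    return sum(likelihoods[k] * priors[k] for k in priors)

p_defect = total_probability(priors, defect_rates)
print(p_defect, float(p_defect))  # 4/125 0.032
```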
Whenever a problem involves multiple scenarios or causes, condition on them using the total probability formula. The key pattern is: identify a natural partition $\{B_i\}$, compute $\mathbb{P}(A \mid B_i)$ in each case, then average weighted by $\mathbb{P}(B_i)$.
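Let $B_1, \dots, B_n$ be a partition of $\Omega$ with $\mathbb{P}(B_i) > 0$ for all $i$. For any event $A$ with $\mathbb{P}(A) > 0$:

$$\mathbb{P}(B_i \mid A) = \frac{\mathbb{P}(A \mid B_i)\,\mathbb{P}(B_i)}{\sum_{j=1}^n \mathbb{P}(A \mid B_j)\,\mathbb{P}(B_j)}$$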
In Bayesian language: $\mathbb{P}(B_i)$ is the prior, $\mathbb{P}(A \mid B_i)$ is the likelihood, and $\mathbb{P}(B_i \mid A)$ is the posterior. Bayes' formula inverts the conditioning: it answers "given that $A$ occurred, which cause $B_i$ is most likely?"
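Using the factory example above (60% of items from machine 1 with a 2% defect rate, 40% from machine 2 with a 5% defect rate): given that an item is defective ($D$), the probability it came from machine 1 ($M_1$) is:

$$\mathbb{P}(M_1 \mid D) = \frac{\mathbb{P}(D \mid M_1)\,\mathbb{P}(M_1)}{\mathbb{P}(D)} = \frac{0.02 \times 0.6}{0.032} = 0.375$$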
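The posterior for every machine can be computed at once by normalizing likelihood times prior. This is a minimal sketch with exact fractions; `posterior` and the dictionary keys are hypothetical names chosen for this example.

```python
from fractions import Fraction

def posterior(priors, likelihoods):
    """Bayes' formula over a finite partition: P(B_i | A) for every i."""
    # The denominator is the total probability P(A).
    evidence = sum(likelihoods[k] * priors[k] for k in priors)
    if evidence == 0:
        raise ValueError("P(A) must be positive to condition on A")
    return {k: likelihoods[k] * priors[k] / evidence for k in priors}

priors = {"machine_1": Fraction(60, 100), "machine_2": Fraction(40, 100)}
defect_rates = {"machine_1": Fraction(2, 100), "machine_2": Fraction(5, 100)}

post = posterior(priors, defect_rates)
print(post["machine_1"], post["machine_2"])  # 3/8 5/8
```

Machine 1 makes most of the items, yet a defective item is more likely to come from machine 2, because its higher defect rate outweighs its smaller share of production.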
Bayes is the fundamental tool for inference: you observe data $A$ and update beliefs about a hidden cause $B_i$. Classic interview problems: medical tests (false positives), coin fairness, signal detection.
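Let $X \in L^1$ and $B \in \mathcal{F}$ with $\mathbb{P}(B) > 0$. The conditional expectation of $X$ given $B$ is:

$$\mathbb{E}[X \mid B] = \frac{\mathbb{E}[X\,\mathbf{1}_B]}{\mathbb{P}(B)}$$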
Roll a fair die and let $X$ be the result. Given the result is even: $\mathbb{E}[X \mid X \text{ even}] = \frac{2 + 4 + 6}{3} = 4$.
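The die example follows directly from the ratio $\mathbb{E}[X\,\mathbf{1}_B] / \mathbb{P}(B)$. This is a small sketch for a finite pmf; the function name `cond_expectation` is an illustrative choice.

```python
from fractions import Fraction

def cond_expectation(pmf, event):
    """E[X | B] = E[X 1_B] / P(B) for a finite pmf {value: probability}."""
    p_event = sum(p for x, p in pmf.items() if event(x))
    if p_event == 0:
        raise ValueError("cannot condition on an event of probability zero")
    return sum(x * p for x, p in pmf.items() if event(x)) / p_event

die = {face: Fraction(1, 6) for face in range(1, 7)}
e_given_even = cond_expectation(die, lambda x: x % 2 == 0)
print(e_given_even)  # 4
```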
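Let $X, Y$ be random variables with $\mathbb{E}|X| < \infty$. The conditional expectation $\mathbb{E}[X \mid Y]$ is the unique (a.s.) random variable measurable with respect to $\sigma(Y)$ such that for all $A \in \sigma(Y)$:

$$\mathbb{E}\big[\mathbb{E}[X \mid Y]\,\mathbf{1}_A\big] = \mathbb{E}[X\,\mathbf{1}_A]$$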
$\mathbb{E}[X \mid Y]$ is a random variable: a function of $Y$, not a number. It becomes a number only once $Y = y$ is observed. This distinction is crucial: $\mathbb{E}[X \mid Y = y]$ is a number, $\mathbb{E}[X \mid Y]$ is a random variable.
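To make the "function of $Y$" point concrete, here is a sketch on an assumed toy setup (not from the original notes): two fair dice, with $Y$ the first die and $X$ the sum of both. The dictionary `cond_exp` plays the role of the map $y \mapsto \mathbb{E}[X \mid Y = y]$.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes (first die, second die).
outcomes = list(product(range(1, 7), repeat=2))

def cond_exp_given_first(y):
    """E[X | Y = y], where Y is the first die and X the sum of both dice."""
    matching = [(a, b) for a, b in outcomes if a == y]
    return Fraction(sum(a + b for a, b in matching), len(matching))

# E[X | Y] as a function of Y: one number per observable value y.
cond_exp = {y: cond_exp_given_first(y) for y in range(1, 7)}
print(cond_exp[3])  # 13/2

# Defining property on the event A = {Y in {1, 2}}: E[E[X|Y] 1_A] = E[X 1_A].
A = {1, 2}
lhs = sum(cond_exp[a] for a, b in outcomes if a in A) * Fraction(1, 36)
rhs = sum(a + b for a, b in outcomes if a in A) * Fraction(1, 36)
print(lhs == rhs)  # True
```

Each entry of `cond_exp` is a number, while the whole mapping, evaluated at the random first die, is the random variable $\mathbb{E}[X \mid Y]$.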
For any random variables $X, Y$ with $\mathbb{E}|X| < \infty$:

$$\mathbb{E}\big[\mathbb{E}[X \mid Y]\big] = \mathbb{E}[X]$$
The tower property is the single most used tool for computing expectations in multi-stage problems. Pattern: condition on the first source of randomness, compute the inner expectation, then take the outer expectation. Essential for: compound distributions, sequential experiments, Wald's identity.
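The pattern "condition on the first source of randomness, then average" can be sketched on a hypothetical two-stage experiment: pick a 4-sided or a 6-sided fair die with equal probability, then roll it. The names `stage_one` and `inner_expectation` are assumptions made for this illustration.

```python
from fractions import Fraction

# Stage one: Y = number of sides of the chosen die, with P(Y = sides).
stage_one = {4: Fraction(1, 2), 6: Fraction(1, 2)}

def inner_expectation(sides):
    """E[X | Y = sides] for a fair die with faces 1..sides."""
    return Fraction(sum(range(1, sides + 1)), sides)

# Outer expectation of the inner one: E[E[X | Y]].
via_tower = sum(inner_expectation(s) * p for s, p in stage_one.items())

# Direct computation of E[X] from the joint distribution of (Y, X).
direct = sum(
    face * p * Fraction(1, s)
    for s, p in stage_one.items()
    for face in range(1, s + 1)
)

print(via_tower, direct)  # 3 3
```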
The pull-out property, $\mathbb{E}[g(Y)\,X \mid Y] = g(Y)\,\mathbb{E}[X \mid Y]$, is used constantly: anything that is already known given $Y$ can be factored out of the conditional expectation.
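The conditional variance of $X$ given $Y$ is:

$$\operatorname{Var}(X \mid Y) = \mathbb{E}\big[(X - \mathbb{E}[X \mid Y])^2 \mid Y\big] = \mathbb{E}[X^2 \mid Y] - \big(\mathbb{E}[X \mid Y]\big)^2$$

The law of total variance decomposes the unconditional variance as:

$$\operatorname{Var}(X) = \mathbb{E}\big[\operatorname{Var}(X \mid Y)\big] + \operatorname{Var}\big(\mathbb{E}[X \mid Y]\big)$$

The decomposition reads: total variance = average of conditional variances + variance of conditional means. The first term captures the within-group variability, the second captures the between-group variability.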
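The within-group and between-group split can be verified exactly on a small joint distribution. The pmf below is made up for illustration (two groups `"a"` and `"b"`), and the helpers `mean` and `variance` are hypothetical names.

```python
from fractions import Fraction

# Hypothetical joint pmf of (Y, X): Y is a group label, X a measurement.
joint = {
    ("a", 0): Fraction(1, 4), ("a", 2): Fraction(1, 4),
    ("b", 5): Fraction(1, 6), ("b", 6): Fraction(1, 6), ("b", 7): Fraction(1, 6),
}

def mean(pmf):
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    m = mean(pmf)
    return sum((x - m) ** 2 * p for x, p in pmf.items())

# Marginals of Y and X.
p_y, p_x = {}, {}
for (y, x), p in joint.items():
    p_y[y] = p_y.get(y, 0) + p
    p_x[x] = p_x.get(x, 0) + p

# Conditional pmf of X given Y = y.
cond = {
    y: {x: p / p_y[y] for (yy, x), p in joint.items() if yy == y}
    for y in p_y
}

within = sum(variance(cond[y]) * p_y[y] for y in p_y)  # E[Var(X | Y)]
between = sum((mean(cond[y]) - mean(p_x)) ** 2 * p_y[y] for y in p_y)  # Var(E[X | Y])
total = variance(p_x)  # Var(X)

print(within, between, total)  # 5/6 25/4 85/12
```

Here most of the variance is between groups, because the group means (1 and 6) are far apart compared with the spread inside each group.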
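Let $N$ be a nonnegative integer-valued random variable and, given $N$, let $S = \sum_{i=1}^{N} X_i$ with $X_i$ i.i.d. with mean $\mu$ and variance $\sigma^2$, independent of $N$. Then:

$$\mathbb{E}[S] = \mu\,\mathbb{E}[N], \qquad \operatorname{Var}(S) = \sigma^2\,\mathbb{E}[N] + \mu^2\,\operatorname{Var}(N)$$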
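The compound-sum formulas $\mathbb{E}[S] = \mu\,\mathbb{E}[N]$ and $\operatorname{Var}(S) = \sigma^2\,\mathbb{E}[N] + \mu^2 \operatorname{Var}(N)$ can be checked against an exact enumeration. This sketch assumes a specific toy case that is not in the original notes: $N$ is a fair die roll and the $X_i$ are fair 0/1 coin flips, so $S$ given $N = n$ is Binomial$(n, 1/2)$.

```python
from fractions import Fraction
from math import comb

# Assumed setup: N uniform on 1..6, X_i ~ Bernoulli(1/2), independent of N.
p_n = {n: Fraction(1, 6) for n in range(1, 7)}
mu, sigma2 = Fraction(1, 2), Fraction(1, 4)  # mean and variance of one X_i

# Exact pmf of S = X_1 + ... + X_N, mixing Binomial(n, 1/2) over N.
p_s = {}
for n, pn in p_n.items():
    for k in range(n + 1):
        p_s[k] = p_s.get(k, 0) + pn * Fraction(comb(n, k), 2 ** n)

def mean(pmf):
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    m = mean(pmf)
    return sum((x - m) ** 2 * p for x, p in pmf.items())

mean_enumerated, var_enumerated = mean(p_s), variance(p_s)
mean_formula = mu * mean(p_n)
var_formula = sigma2 * mean(p_n) + mu ** 2 * variance(p_n)

print(mean_enumerated, mean_formula)  # 7/4 7/4
print(var_enumerated, var_formula)    # 77/48 77/48
```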
The law of total variance is the key tool for compound/mixture distributions. Whenever $X$ depends on a hidden random variable $Y$ (number of events, regime, group), decompose $\operatorname{Var}(X)$ using this law.
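Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space, $\mathcal{G} \subseteq \mathcal{F}$ a sub-$\sigma$-algebra, and $X \in L^1(\Omega, \mathcal{F}, \mathbb{P})$. The conditional expectation $\mathbb{E}[X \mid \mathcal{G}]$ is the unique (a.s.) $\mathcal{G}$-measurable random variable satisfying:

$$\int_G \mathbb{E}[X \mid \mathcal{G}]\,d\mathbb{P} = \int_G X\,d\mathbb{P} \quad \text{for all } G \in \mathcal{G}$$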
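$\mathcal{G}$ represents the information available. The larger $\mathcal{G}$ is, the more information we have: at one extreme, $\mathcal{G} = \{\emptyset, \Omega\}$ carries no information and $\mathbb{E}[X \mid \mathcal{G}] = \mathbb{E}[X]$; at the other, $\mathcal{G} = \mathcal{F}$ carries full information and $\mathbb{E}[X \mid \mathcal{G}] = X$.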
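The tower property in this form, $\mathbb{E}\big[\mathbb{E}[X \mid \mathcal{H}] \mid \mathcal{G}\big] = \mathbb{E}[X \mid \mathcal{G}]$ for $\mathcal{G} \subseteq \mathcal{H}$, is the property that underlies martingales. An integrable process $(M_n)$ adapted to a filtration $(\mathcal{F}_n)$ is a martingale if $\mathbb{E}[M_{n+1} \mid \mathcal{F}_n] = M_n$ for all $n$.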
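The martingale condition can be checked by enumeration on a standard toy case that is not taken from the notes above: the symmetric random walk $S_n$ with independent $\pm 1$ steps, where conditioning on $\mathcal{F}_n$ means fixing the first $n$ steps. The names `paths` and `cond_exp_next` are illustrative.

```python
from fractions import Fraction
from itertools import product

n_steps = 4
# All step sequences of length n_steps; each has probability 2**-n_steps.
paths = list(product([-1, 1], repeat=n_steps))

def cond_exp_next(prefix):
    """E[S_{n+1} | first n steps equal `prefix`], averaging over consistent paths."""
    n = len(prefix)
    matching = [p for p in paths if p[:n] == prefix]
    return Fraction(sum(sum(p[: n + 1]) for p in matching), len(matching))

# Check E[S_{n+1} | F_n] = S_n for every possible history of every length n.
is_martingale = all(
    cond_exp_next(prefix) == sum(prefix)
    for n in range(n_steps)
    for prefix in product([-1, 1], repeat=n)
)
print(is_martingale)  # True
```

Averaging over all continuations of a fixed history is the finite version of integrating over the sets in $\mathcal{F}_n$.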
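| Concept | Formula |
|---|---|
| Conditional probability | $\mathbb{P}(A \mid B) = \dfrac{\mathbb{P}(A \cap B)}{\mathbb{P}(B)}$ |
| Total probability | $\mathbb{P}(A) = \sum_i \mathbb{P}(A \mid B_i)\,\mathbb{P}(B_i)$ |

| Situation | Tool |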
|---|---|
| Multiple causes / scenarios | Total probability formula |
| Observed effect, infer cause | Bayes' formula |
| Multi-stage expectation | Tower property |
| Compound / mixture distribution variance | Law of total variance |
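For a discrete random variable $X$ and an event $B$ with $\mathbb{P}(B) > 0$: $\mathbb{E}[X \mid B] = \sum_x x\,\mathbb{P}(X = x \mid B)$.
In the discrete case: $\mathbb{E}[X \mid Y = y] = \sum_x x\,\mathbb{P}(X = x \mid Y = y)$.
In the continuous case: $\mathbb{E}[X \mid Y = y] = \int x\,f_{X \mid Y}(x \mid y)\,dx$ where $f_{X \mid Y}(x \mid y) = \dfrac{f_{X,Y}(x, y)}{f_Y(y)}$ for $f_Y(y) > 0$.
More generally, the tower property holds for any sub-$\sigma$-algebras $\mathcal{G} \subseteq \mathcal{H}$: $\mathbb{E}\big[\mathbb{E}[X \mid \mathcal{H}] \mid \mathcal{G}\big] = \mathbb{E}[X \mid \mathcal{G}]$.
Existence and uniqueness of $\mathbb{E}[X \mid \mathcal{G}]$ are guaranteed by the Radon-Nikodym theorem.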
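| Concept | Formula |
|---|---|
| Bayes' formula | $\mathbb{P}(B_i \mid A) = \dfrac{\mathbb{P}(A \mid B_i)\,\mathbb{P}(B_i)}{\sum_j \mathbb{P}(A \mid B_j)\,\mathbb{P}(B_j)}$ |
| Conditional expectation (continuous) | $\mathbb{E}[X \mid Y = y] = \int x\,f_{X \mid Y}(x \mid y)\,dx$ |
| Conditional density | $f_{X \mid Y}(x \mid y) = \dfrac{f_{X,Y}(x, y)}{f_Y(y)}$ |
| Tower property | $\mathbb{E}\big[\mathbb{E}[X \mid Y]\big] = \mathbb{E}[X]$ |
| Pull-out property | $\mathbb{E}[g(Y)\,X \mid Y] = g(Y)\,\mathbb{E}[X \mid Y]$ |
| Conditional variance | $\operatorname{Var}(X \mid Y) = \mathbb{E}[X^2 \mid Y] - \big(\mathbb{E}[X \mid Y]\big)^2$ |
| Law of total variance | $\operatorname{Var}(X) = \mathbb{E}\big[\operatorname{Var}(X \mid Y)\big] + \operatorname{Var}\big(\mathbb{E}[X \mid Y]\big)$ |
| $\mathbb{E}[X \mid \mathcal{G}]$ defining property | $\int_G \mathbb{E}[X \mid \mathcal{G}]\,d\mathbb{P} = \int_G X\,d\mathbb{P}$ for all $G \in \mathcal{G}$ |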