New Videos: Videos #5 and #6 on Probability for Actuarial Exam 1/P
Mathematics is built on an axiomatic foundation. What does this mean? In a nutshell, it means that every topic in mathematics is based on both undefined terms and on statements that are assumed to be true.
Upon hearing this, many people wonder, “How can this be so?” Isn’t it necessary to prove all mathematical facts with rigorous logic? Shouldn’t all terms be precisely defined?
A few people might try to answer these questions by embarking on a quest to rid mathematics of undefined terms and axioms. However, it is a hopeless quest. There is no way to complete the task without using circular reasoning.
Ideally, it is a good thing to keep undefined terms and axioms intuitive, simple, and minimal in number. This is somewhat true for probability, which is the main topic of interest in this post. However, there is a caveat that the reader should be comfortable with sets, functions, and even with the basic motivations for, interpretations of, and examples in, probability.
The problem-solving content of Videos #5 and #6 in my series on Probability for Actuarial Exam 1/P uses some fundamental theorems that can be proved from the axioms of probability. The videos are directly below. I will discuss the axioms of probability, as well as foundational theorems and their proofs, further below.
An Axiomatic Approach to Basic Probability
Description of the Axioms
I will not be as abstract as possible in the axiomatic approach to probability that I am about to describe. First, I will assume the reader is familiar with the (informal) ideas of sets and functions. Essentially I am taking these to be undefined terms, though functions are informally described as “rules of assignment” and sets as “collections of objects”.
Second, I am also going to assume the reader is familiar with the idea of a sample space and of events from a random experiment, both as intuitive ideas and as sets. These are yet more undefined terms.
Finally, I am assuming that the events in this sample space have simple relationships with one another. To be more precise, I am assuming that the (set) complement of an event is another event and that a “countable” union of events is another event. You will be happy to know that I will not get into an abstract discussion of “sigma-algebras”.
The axioms for basic probability can now be described as follows. We start by assuming there is a “probability set function” $P$ defined on the events of a sample space $S$. The domain of $P$ is the set (collection) of all possible events. The codomain of $P$ is initially taken to be the interval $(-\infty, \infty)$ (later we will prove that the codomain of $P$ can actually be taken to be the interval $[0, 1]$). The output of $P$ for an arbitrary event $A$ will be denoted by $P(A)$. Furthermore, $P$ is assumed to satisfy the following properties (axioms), referred to below as axioms (1), (2), and (3) in the order listed:
- $P(S) = 1$, i.e., something in the sample space must occur.
- (Non-negativity) $P(A) \geq 0$ for any event $A$, i.e., negative probabilities do not make sense.
- (Additivity) For any finite or “countably infinite” collection of events $A_1, A_2, A_3, \ldots$ such that $A_i \cap A_j = \emptyset$ whenever $i \neq j$ (so the events are pairwise “disjoint” or “mutually exclusive”), we have $P(A_1 \cup A_2 \cup A_3 \cup \cdots) = P(A_1) + P(A_2) + P(A_3) + \cdots$. Note that we are implicitly assuming any such infinite sum (series) converges.
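To make the axioms concrete, here is a minimal Python sketch for a finite sample space. It assumes a fair six-sided die with equally likely outcomes; the die, the equally-likely model, and the names `SAMPLE_SPACE`, `prob`, and `all_events` are illustrative choices of mine, not part of the axioms. The sketch checks the three axioms by brute force over every event, with additivity checked only for pairs of disjoint events.

```python
from fractions import Fraction
from itertools import chain, combinations

# Hypothetical example: a fair six-sided die, all outcomes equally likely.
SAMPLE_SPACE = frozenset(range(1, 7))


def prob(event):
    """Probability of an event (a subset of SAMPLE_SPACE) under the equally-likely model."""
    event = frozenset(event)
    if not event <= SAMPLE_SPACE:
        raise ValueError("an event must be a subset of the sample space")
    return Fraction(len(event), len(SAMPLE_SPACE))


def all_events(space):
    """Return every subset of a finite sample space as a frozenset."""
    items = list(space)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [frozenset(s) for s in subsets]


events = all_events(SAMPLE_SPACE)

# Axiom (1): the whole sample space has probability 1.
axiom1_holds = prob(SAMPLE_SPACE) == 1

# Axiom (2): no event has negative probability.
axiom2_holds = all(prob(a) >= 0 for a in events)

# Axiom (3), checked for every pair of disjoint events.
axiom3_holds = all(
    prob(a | b) == prob(a) + prob(b)
    for a in events
    for b in events
    if not (a & b)
)

print(axiom1_holds, axiom2_holds, axiom3_holds)  # True True True
```

Exact fractions are used instead of floating-point numbers so that the equality checks are not affected by rounding.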
From these assumed truths, we can now prove some basic properties (proved truths).
Basic Consequences of the Axioms
The first fact to prove says there is no chance that nothing will happen. Seriously!
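Theorem 1: $P(\emptyset) = 0$, where $\emptyset$ is the empty set and $P$ is the probability function.
Proof: Let $S$ be the sample space. Note that $S = S \cup \emptyset$ and $S \cap \emptyset = \emptyset$. Therefore, by axioms (1) and (3), $1 = P(S) = P(S \cup \emptyset) = P(S) + P(\emptyset) = 1 + P(\emptyset)$. Canceling the “1” from both sides allows us to conclude that $P(\emptyset) = 0$. Q.E.D.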
The next fact is the complement law. It says, for instance, that if the chance of rain tomorrow is 30%, then the chance of no rain tomorrow is 70%.
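Theorem 2: For any event $A$, we have $P(A^c) = 1 - P(A)$, where $A^c$ denotes the complement of $A$ in the sample space $S$.
Proof: Note that $S = A \cup A^c$ and that $A \cap A^c = \emptyset$. By axioms (1) and (3), $1 = P(S) = P(A \cup A^c) = P(A) + P(A^c)$. Subtracting $P(A)$ from both sides leads to the final result that $P(A^c) = 1 - P(A)$. Q.E.D.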
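The next fact shows that the codomain of $P$ can be taken to be the interval $[0, 1]$.
Theorem 3: For any event $A$, we can say that $0 \leq P(A) \leq 1$.
Proof: First, by axiom (2), for any event $A$ we directly have $P(A) \geq 0$. Next, by Theorem 2 that we just proved, since $P(A^c) = 1 - P(A)$, we can say that $P(A) = 1 - P(A^c)$. But since the complement $A^c$ is another event, we know by axiom (2) that $P(A^c) \geq 0$, which means that $-P(A^c) \leq 0$. But this helps us see that $P(A) = 1 - P(A^c) \leq 1$. Hence, $0 \leq P(A) \leq 1$, and the result follows because $A$ is an arbitrary event. Q.E.D.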
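As a quick sanity check of Theorems 1 through 3, the following sketch evaluates them on a small example. It assumes a fair six-sided die with equally likely outcomes; the die and the helper names `prob` and `complement` are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

# Hypothetical example: a fair six-sided die, all outcomes equally likely.
SAMPLE_SPACE = frozenset(range(1, 7))


def prob(event):
    """Probability of an event under the equally-likely model."""
    return Fraction(len(frozenset(event) & SAMPLE_SPACE), len(SAMPLE_SPACE))


def complement(event):
    """Complement of an event relative to the sample space."""
    return SAMPLE_SPACE - frozenset(event)


six = {6}

p_empty = prob(set())              # Theorem 1: the empty event
p_six = prob(six)
p_not_six = prob(complement(six))  # Theorem 2: the complement law

print(p_empty)           # 0
print(p_six, p_not_six)  # 1/6 5/6

# Theorem 3: both values lie between 0 and 1.
in_unit_interval = all(0 <= p <= 1 for p in (p_empty, p_six, p_not_six))
print(in_unit_interval)  # True
```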
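The following theorem is sometimes described as saying that $P$ is “monotone increasing”: as the events get “bigger” (under a certain stipulation), the probabilities get bigger.
Theorem 4: For any two events $A$ and $B$ with $A \subseteq B$, it follows that $P(A) \leq P(B)$.
Proof: First note that, since $A \subseteq B$, we can say that $B = A \cup (B \cap A^c)$, where $B \cap A^c$ is the set of outcomes in $B$ that are not in $A$ (you should try verifying this on your own). We also know that $A \cap (B \cap A^c) = \emptyset$. Therefore, axiom (3) implies that $P(B) = P(A) + P(B \cap A^c)$. But $P(B \cap A^c) \geq 0$ by axiom (2). Therefore, $P(B) \geq P(A)$. This is what we wanted to prove. Q.E.D.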
Also note that the proof of Theorem 4 really leads to the truth of another theorem.
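Theorem 5: For any two events $A$ and $B$ with $A \subseteq B$, we can conclude that $P(B \cap A^c) = P(B) - P(A)$, where $B \cap A^c$ is the set of outcomes in $B$ that are not in $A$.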
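Theorems 4 and 5 can be illustrated numerically. The sketch below assumes a fair six-sided die (a hypothetical example of mine), takes the smaller event to be “roll a 2” and the larger event to be “roll an even number”, and checks both the inequality and the difference formula.

```python
from fractions import Fraction

# Hypothetical example: a fair six-sided die, all outcomes equally likely.
SAMPLE_SPACE = frozenset(range(1, 7))


def prob(event):
    """Probability of an event under the equally-likely model."""
    return Fraction(len(frozenset(event) & SAMPLE_SPACE), len(SAMPLE_SPACE))


smaller = frozenset({2})        # "roll a 2"
larger = frozenset({2, 4, 6})   # "roll an even number"
assert smaller <= larger        # the subset stipulation of Theorems 4 and 5

# Theorem 4: the smaller event has the smaller (or equal) probability.
monotone = prob(smaller) <= prob(larger)

# Theorem 5: the part of the larger event outside the smaller one.
difference = larger - smaller   # {4, 6}
p_difference = prob(difference)

print(monotone)                                      # True
print(p_difference, prob(larger) - prob(smaller))    # 1/3 1/3
```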
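In the general case where neither $A$ nor $B$ is necessarily a subset of the other, the following theorem can be stated and proved.
Theorem 6: If $A$ and $B$ are any two events, then $P(A \cap B^c) = P(A) - P(A \cap B)$ and $P(B \cap A^c) = P(B) - P(A \cap B)$.
Proof: We just prove the first equation. The second one is symmetric. Note that $A = (A \cap B) \cup (A \cap B^c)$, where $(A \cap B) \cap (A \cap B^c) = \emptyset$. Then, from axiom (3), $P(A) = P(A \cap B) + P(A \cap B^c)$. Hence, $P(A \cap B^c) = P(A) - P(A \cap B)$. Q.E.D.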
Finally, we prove the general addition rule.
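Theorem 7: For any two events $A$ and $B$, the following formula is true: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$.
Proof: Start by noting that $A \cup B = (A \cap B^c) \cup (A \cap B) \cup (B \cap A^c)$, where these three sets are all pairwise disjoint (mutually exclusive). Therefore, by axiom (3), $P(A \cup B) = P(A \cap B^c) + P(A \cap B) + P(B \cap A^c)$. But now Theorem 6 above gives $P(A \cup B) = [P(A) - P(A \cap B)] + P(A \cap B) + [P(B) - P(A \cap B)] = P(A) + P(B) - P(A \cap B)$.
We are done. Q.E.D.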
When $A \cap B = \emptyset$, Theorems 1 and 7 imply the truth of the “special addition rule”, that $P(A \cup B) = P(A) + P(B)$.
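The general addition rule of Theorem 7, along with the first equation of Theorem 6, can be checked on a concrete example. The sketch below assumes one card drawn at random from a standard 52-card deck and uses the events “the card is a heart” and “the card is a face card (J, Q, or K)”. These two events and all names in the code are my own illustrative choices.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["hearts", "diamonds", "clubs", "spades"]

# Sample space: the 52 distinct (rank, suit) cards, all equally likely.
DECK = frozenset(product(RANKS, SUITS))


def prob(event):
    """Probability of an event (a set of cards) under the equally-likely model."""
    return Fraction(len(event), len(DECK))


hearts = frozenset(card for card in DECK if card[1] == "hearts")
faces = frozenset(card for card in DECK if card[0] in {"J", "Q", "K"})

# Theorem 7: general addition rule.
union_direct = prob(hearts | faces)
union_by_rule = prob(hearts) + prob(faces) - prob(hearts & faces)
print(union_direct, union_by_rule)  # 11/26 11/26

# Theorem 6: hearts that are not face cards.
hearts_not_faces = prob(hearts - faces)
print(hearts_not_faces, prob(hearts) - prob(hearts & faces))  # 5/26 5/26
```

Simply adding the two probabilities would count the three heart face cards twice, which is why the rule subtracts the probability of the intersection.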
Conditional Probability and the General Multiplication Rule
There are many situations where knowing more information will cause a person to change their estimate of the likelihood of an event.
For example, if you live in Minnesota and are wondering about the chances of rain tomorrow, a look at the current radar in South Dakota can help you revise an initial probability estimate.
As another example, consider taking one card at random from a well-shuffled standard 52-card deck. The probability that it is a “heart” is $\frac{13}{52} = \frac{1}{4} = 0.25$.
But if someone you trust tells you that the card is “red”, that information will cause you to change your estimate of the previous probability to $\frac{13}{26} = \frac{1}{2} = 0.5$. This last probability is “conditioned” on the fact that you know the card is red.
Let us consider this same example in the context of axiomatic probability. The sample space $S$ can be taken to be the 52-element set of all the distinct cards. The event $H$ that the card is a “heart” is a 13-element set consisting of the distinct cards that are hearts. And the event $R$ that the card is “red” is a 26-element set consisting of the distinct cards that are red (hearts and diamonds).
The last calculation we did for the conditional probability of the card being a “heart” when it is known to be “red” could be rewritten as $\frac{13}{26} = \frac{13/52}{26/52} = \frac{P(H)}{P(R)}$.
Since $H \subseteq R$, we can say that $H \cap R = H$, so the last calculation can also be represented as $\frac{P(H \cap R)}{P(R)}$. It is this last expression that is the “true” formula for the conditional probability; it works even when neither event is a subset of the other. The only restriction on this formula is that we cannot divide by zero, so the “known” event cannot have probability zero.
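The card calculation above can be reproduced with a short sketch. It computes the conditional probability as the probability of the intersection divided by the probability of the known event, first for “heart” given “red”, and then for a case where neither event is a subset of the other. The second case, “face card” given “red”, and all the names in the code are my own hypothetical additions.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["hearts", "diamonds", "clubs", "spades"]

# Sample space: the 52 distinct (rank, suit) cards, all equally likely.
DECK = frozenset(product(RANKS, SUITS))


def prob(event):
    """Probability of an event (a set of cards) under the equally-likely model."""
    return Fraction(len(event), len(DECK))


def cond_prob(event, known):
    """Conditional probability of `event` given that `known` has occurred."""
    if prob(known) == 0:
        raise ZeroDivisionError("the known event must have nonzero probability")
    return prob(event & known) / prob(known)


hearts = frozenset(card for card in DECK if card[1] == "hearts")
red = frozenset(card for card in DECK if card[1] in {"hearts", "diamonds"})
faces = frozenset(card for card in DECK if card[0] in {"J", "Q", "K"})

print(prob(hearts))            # 1/4
print(cond_prob(hearts, red))  # 1/2

# Neither "face card" nor "red" is a subset of the other.
print(cond_prob(faces, red))   # 3/13
```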
Notationally, the symbol $P(A \mid B)$ represents the conditional probability of the event $A$ occurring if it is known, or given, that $B$ has occurred. Think of the vertical line as being shorthand for “given that”.
With these conventions, our definitional formula for conditional probability becomes $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$.
As long as $P(B) \neq 0$, we can multiply both sides of this formula by $P(B)$ to get $P(A \cap B) = P(B) \cdot P(A \mid B)$. The order of the symbols is arbitrary, so we also can say that $P(A \cap B) = P(A) \cdot P(B \mid A)$.
These equations are actually true in general, even when $P(A) = 0$ and/or $P(B) = 0$ (with the right-hand side interpreted as $0$ in that case). And the general statement of their truth is called the general multiplication rule.
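As an illustration of the general multiplication rule, consider an example of my own that is not from the videos: drawing two cards in order, without replacement, from a standard deck, and asking for the probability that both are aces. The sketch below enumerates all ordered pairs of distinct cards as the sample space and compares the directly counted probability of the intersection with the product given by the rule. All names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations, product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["hearts", "diamonds", "clubs", "spades"]
DECK = list(product(RANKS, SUITS))

# Sample space: all ordered pairs of distinct cards (52 * 51 = 2652 outcomes).
DRAWS = frozenset(permutations(DECK, 2))


def prob(event):
    """Probability of an event (a set of ordered draws), all draws equally likely."""
    return Fraction(len(event), len(DRAWS))


def cond_prob(event, known):
    """Conditional probability of `event` given that `known` has occurred."""
    if prob(known) == 0:
        raise ZeroDivisionError("the known event must have nonzero probability")
    return prob(event & known) / prob(known)


# Each draw is ((rank1, suit1), (rank2, suit2)).
first_ace = frozenset(d for d in DRAWS if d[0][0] == "A")
second_ace = frozenset(d for d in DRAWS if d[1][0] == "A")

direct = prob(first_ace & second_ace)
via_rule = prob(first_ace) * cond_prob(second_ace, first_ace)

print(prob(first_ace), cond_prob(second_ace, first_ace))  # 1/13 1/17
print(direct, via_rule)                                   # 1/221 1/221
```

The conditional factor 1/17 equals 3/51, which matches the intuition that three aces remain among the 51 cards left after an ace is drawn first.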
We will see that all these equations are very useful in future problems. They also lead to a definition of the very important idea of independent events and the corresponding (special) multiplication rule (for independent events).