Studying for Exam LTAM, Part 1.8
When somebody asks you how old you are, how do you answer? Are you like most people? Do you give your age from your previous birthday? Or do you get extra-precise and say something like “I am 21 years, 5 months, and 1 week old”?
If you answer the way that most people do, then you already know a little bit about curtate survival random variables. You have also made use of the “greatest integer” or “floor” function without even knowing it. Essentially, this is a function that takes an arbitrary real number $x$ as input and “rounds down” to the greatest integer less than or equal to $x$.
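In textbooks, this function is written either as $\lfloor x \rfloor$ or $[x]$. A sampling of its values includes $\lfloor 2.7 \rfloor = 2$ and $\lfloor 5 \rfloor = 5$, while $\lfloor -2.7 \rfloor = -3$. The graph of $y = \lfloor x \rfloor$ is shown below. Open circles are included at points that are not on the graph of $y = \lfloor x \rfloor$.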
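As a quick check of the rounding-down behavior, here is a minimal Python sketch. The sample inputs are illustrative choices, not values taken from the graph.

```python
import math

# math.floor returns the greatest integer less than or equal to its argument.
samples = [2.7, 5.0, -2.7]
floors = [math.floor(v) for v in samples]
print(floors)          # [2, 5, -3]

# Caution: int() truncates toward zero, so it disagrees with floor for negatives.
truncated = int(-2.7)
print(truncated)       # -2
```

For lifetimes this distinction never matters, since ages and remaining lifetimes are non-negative.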
The Discrete Curtate Lifetime Random Variable
We can apply the greatest integer function to a continuous survival random variable $T$ to get a discrete random variable $K = \lfloor T \rfloor$. If $T_0$ is the lifetime of a newborn, then $K_0 = \lfloor T_0 \rfloor$ will be the rounded-down age at death. For example, if the observed value of $T_0$ is 78.7591 years (about 78 years, 9 months, and 3 days), then the observed value of $K_0$ will be 78 years. That person did not make it to their 79th birthday.
Likewise, if we assume a newborn lives to age $x$ and has remaining lifetime $T_x$, then $K_x = \lfloor T_x \rfloor$ is the remaining lifetime, rounded down to the nearest integer. The random variable $K_x$ is discrete. In general, discrete random variables take on either a finite number or a countably infinite number of values. For this situation, $K_x$ can take on only the non-negative integer values $0, 1, 2, \ldots$.
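The rounding in the example above can be sketched in a few lines of Python. The helper name `curtate` is a hypothetical choice for illustration.

```python
import math

def curtate(t):
    """Return the curtate (rounded-down) value of a non-negative lifetime t, in years."""
    if t < 0:
        raise ValueError("a lifetime cannot be negative")
    return math.floor(t)

t0 = 78.7591                    # observed age at death, in years
k0 = curtate(t0)                # completed whole years
months = (t0 - k0) * 12         # leftover fraction of a year, expressed in months
print(k0, math.floor(months))   # 78 9
```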
Whereas probabilities for continuous random variables are found by integrating probability density functions (PDFs), probabilities for discrete random variables are found by summing values of probability mass functions (PMFs).
The probability mass function for $K_x$ can be written symbolically as $P(K_x = k)$, where $k = 0, 1, 2, \ldots$. This is a function of $k$. It is often written as $p_{K_x}(k)$ or $f_{K_x}(k)$.
Computing the Probability Mass Function
A question now naturally arises: how should the PMF be computed? Since $K_x$ is defined in terms of $T_x$, the answer should be related to the distribution functions of $T_x$.
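The fact that $K_x = \lfloor T_x \rfloor$ means that $K_x = k$ is equivalent to $k \le T_x < k + 1$. Therefore, $P(K_x = k) = P(k \le T_x < k + 1)$. But, since $T_x$ is a continuous random variable, we can say that $P(k \le T_x < k + 1) = P(k < T_x \le k + 1)$. This is the same as $F_x(k+1) - F_x(k)$, where $F_x$ is the cumulative distribution function (CDF) of $T_x$. On the other hand, if $S_x = 1 - F_x$ is the survival function (SF) of $T_x$, this also equals $S_x(k) - S_x(k+1)$.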
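To summarize, we have just found that the difference $S_x(k) - S_x(k+1)$ is the PMF of $K_x$. That is, we can say that $P(K_x = k) = S_x(k) - S_x(k+1) = {}_{k}p_x - {}_{k+1}p_x$ for $k = 0, 1, 2, \ldots$.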
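Alternatively, the general multiplication rule implies that $P(k \le T_x < k + 1) = P(T_x \ge k) \cdot P(T_x < k + 1 \mid T_x \ge k)$. Hence, we can also write the PMF of $K_x$ as $P(K_x = k) = {}_{k}p_x \cdot {}_{1}q_{x+k}$ for $k = 0, 1, 2, \ldots$.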
We should note that a pre-subscript equal to “1” is usually omitted. In other words, we usually write $p_x$ in place of ${}_{1}p_x$ and $q_x$ in place of ${}_{1}q_x$. Hence, it is common to write the PMF as

$$P(K_x = k) = {}_{k}p_x \, q_{x+k}.$$
This has an intuitive interpretation. If we assume that time is measured in years, the chance that $(x)$ lives between $k$ and $k+1$ more years is the probability that $(x)$ lives at least $k$ more years times the probability that, once $(x)$ attains age $x + k$, he or she dies in the next year.
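The two ways of computing the PMF (a difference of survival probabilities, and "survive $k$ years, then die within the next year") can be compared numerically. The sketch below is only an illustration. The function names are hypothetical, and the survival function $S(t) = e^{-0.05t}$ is an assumed example, not one taken from this post.

```python
import math

def curtate_pmf(survival, k):
    """P(K = k) computed as S(k) - S(k + 1)."""
    return survival(k) - survival(k + 1)

def curtate_pmf_product(survival, k):
    """P(K = k) computed as (k-year survival probability) * (one-year death probability)."""
    s_k = survival(k)
    if s_k == 0.0:
        return 0.0                           # nobody is alive at time k
    q_next = 1.0 - survival(k + 1) / s_k     # conditional probability of dying within a year
    return s_k * q_next

def S(t):
    """Assumed survival function for illustration only."""
    return math.exp(-0.05 * t)

# The two formulas should agree up to floating-point rounding.
max_gap = max(abs(curtate_pmf(S, k) - curtate_pmf_product(S, k)) for k in range(50))

# The probabilities telescope to S(0) - S(2000), which is essentially 1 here.
total = sum(curtate_pmf(S, k) for k in range(2000))
print(max_gap, total)
```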
Computations and Graphs for Specific Survival Models
Let us now consider the distribution of $K_x$ for the various specific survival models we have considered so far: 1) uniform (De Moivre’s Law), 2) exponential (constant force), 3) triangular, and 4) Gompertz-Makeham.
Uniform Lifetime (De Moivre’s Law)
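The SF of $T_x$ is $S_x(t) = 1 - \frac{t}{\omega - x}$ and the CDF of $T_x$ is $F_x(t) = \frac{t}{\omega - x}$ for $0 \le t \le \omega - x$ and $0 \le x < \omega$.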
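Hence, the PMF of $K_x$ is $P(K_x = k) = S_x(k) - S_x(k+1) = \frac{1}{\omega - x}$ for $k = 0, 1, \ldots, \lfloor \omega - x \rfloor - 1$. And, for $k = \lfloor \omega - x \rfloor$ the PMF value is $1 - \frac{\lfloor \omega - x \rfloor}{\omega - x}$ since $S_x(k+1) = 0$. Note that if $\omega - x$ is an integer, then $P(K_x = \lfloor \omega - x \rfloor) = 0$. Also note that $1 - \frac{\lfloor \omega - x \rfloor}{\omega - x} < \frac{1}{\omega - x}$ since $(\omega - x) - \lfloor \omega - x \rfloor < 1$ (think about this inequality for specific examples if that helps).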
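As a double-check on this, we note that if $k \le \lfloor \omega - x \rfloor - 1$, then ${}_{k}p_x \, q_{x+k} = \frac{\omega - x - k}{\omega - x} \cdot \frac{1}{\omega - x - k} = \frac{1}{\omega - x}$, as seen above. And if $k = \lfloor \omega - x \rfloor$, then $q_{x+k} = 1$ so ${}_{k}p_x \, q_{x+k} = {}_{k}p_x = 1 - \frac{\lfloor \omega - x \rfloor}{\omega - x}$, as seen above as well.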
An animation of the graph of this PMF is shown below. In the animation, $x$ ranges from 0 to 40 while $\omega$ ranges from 60 to 100. This discrete distribution is “almost uniform” except for the very last probability when $k = \lfloor \omega - x \rfloor$ (except in the case where $\omega - x$ is exactly an integer, in which case the discrete distribution is exactly uniform).
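For a concrete feel for the “almost uniform” shape, here is a small Python sketch of the curtate PMF under De Moivre’s Law. The function name and the parameter values ($x = 30$, $\omega = 100.5$) are assumptions chosen for illustration.

```python
import math

def uniform_curtate_pmf(x, omega):
    """List of P(K_x = k) for k = 0, ..., floor(omega - x) under De Moivre's Law."""
    if not 0 <= x < omega:
        raise ValueError("need 0 <= x < omega")
    n = omega - x                  # maximum possible remaining lifetime
    m = math.floor(n)              # largest possible curtate lifetime
    pmf = [1.0 / n] * m            # equal mass at k = 0, ..., m - 1
    pmf.append(1.0 - m / n)        # leftover mass at k = m (zero if n is an integer)
    return pmf

pmf = uniform_curtate_pmf(x=30, omega=100.5)
print(len(pmf), pmf[0], pmf[-1], sum(pmf))

# When omega - x is an integer, the final probability vanishes.
pmf_int = uniform_curtate_pmf(x=30, omega=100)
print(pmf_int[-1])   # 0.0
```

With $\omega - x = 70.5$, the first 70 probabilities equal $1/70.5$ and the last one is the smaller leftover $0.5/70.5$.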
Exponential Lifetime (Constant Force)
The SF of $T_x$ is $S_x(t) = e^{-\mu t}$ and the CDF of $T_x$ is $F_x(t) = 1 - e^{-\mu t}$ for $t \ge 0$ and $x \ge 0$. (Recall that this distribution is memoryless so that these formulas do not actually depend on $x$.)
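Therefore, the PMF of $K_x$ is $P(K_x = k) = e^{-\mu k} - e^{-\mu (k+1)} = e^{-\mu k}\left(1 - e^{-\mu}\right)$ for $k = 0, 1, 2, \ldots$.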
Double-checking this with the other formula gives ${}_{k}p_x \, q_{x+k} = e^{-\mu k}\left(1 - e^{-\mu}\right)$ for $k = 0, 1, 2, \ldots$.
Because the distribution is memoryless, this PMF does not vary with $x$. Hence, we make an animation showing what happens as $\mu$ varies instead. This means we are really looking at a collection of different exponential distributions rather than the family of distributions for $K_x$ at some particular value of $\mu$.
In the animation, $\mu$ varies from 0.1 to 10. Note that when $\mu$ increases to around 5 or so, “almost all” the probability is at $k = 0$. In other words, there is almost no chance of surviving at least one more year in that situation. In fact, if $\mu = 5$, then $P(K_x = 0) = 1 - e^{-5} \approx 0.9933$.
It is also interesting to note here that, for $k \ge 1$, the values of $P(K_x = k)$ do not increase or decrease monotonically in $\mu$. They go up and then down.
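For example, if $k = 1$, then $P(K_x = 1) = e^{-\mu}\left(1 - e^{-\mu}\right) = e^{-\mu} - e^{-2\mu}$. Taking the derivative of this with respect to $\mu$ gives $-e^{-\mu} + 2e^{-2\mu}$. This is equal to zero when $\mu = \ln 2 \approx 0.6931$.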
It turns out that the graph of $e^{-\mu} - e^{-2\mu}$, as a function of $\mu$, has a maximum value of $\frac{1}{4}$ at $\mu = \ln 2$. This is the value of $\mu$ where the “second spike” (at $k = 1$) has a maximum height (of $\frac{1}{4}$) in the animation above.
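These claims about the constant-force model are easy to check numerically. The following Python sketch is only an illustration. The function name and the grid of $\mu$ values are assumptions.

```python
import math

def exponential_curtate_pmf(mu, k):
    """P(K_x = k) = exp(-mu * k) * (1 - exp(-mu)) under a constant force mu."""
    return math.exp(-mu * k) * (1.0 - math.exp(-mu))

# Probability of dying within the first year when mu = 5.
p0 = exponential_curtate_pmf(5.0, 0)
print(round(p0, 4))   # 0.9933

# Search a grid of mu values for the one that maximizes P(K_x = 1).
mus = [0.01 * i for i in range(1, 1001)]
best_mu = max(mus, key=lambda mu: exponential_curtate_pmf(mu, 1))
peak = exponential_curtate_pmf(math.log(2), 1)
print(best_mu, math.log(2), peak)
```

The grid maximizer should land next to $\ln 2$, and the height of the spike there should be $\frac{1}{4}$ up to rounding.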
Triangular Lifetime
For this example, the formulas are piecewise and very complicated (see “Triangular Survival Models”). We will be content just to look at the animated graph of the PMF below. We see that it has a triangular shape to it, just like in the continuous case. In the animation varies from 0 to 20, varies from 40 to 60, and varies from 80 to 100.
Gompertz-Makeham Lifetime
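In the Gompertz-Makeham model, where the force of mortality is $\mu_x = A + Bc^x$, the SF of $T_x$ is $S_x(t) = \exp\left(-At - \frac{Bc^x}{\ln c}\left(c^t - 1\right)\right)$ for $t \ge 0$.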
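Therefore, the PMF of $K_x$ is $P(K_x = k) = \exp\left(-Ak - \frac{Bc^x}{\ln c}\left(c^k - 1\right)\right) - \exp\left(-A(k+1) - \frac{Bc^x}{\ln c}\left(c^{k+1} - 1\right)\right)$. This formula can be rewritten in various ways. But we will not bother doing so and just look at our animation.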
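Since the closed form is unwieldy, it is natural to evaluate this PMF numerically. The sketch below assumes the usual parameterization $\mu_x = A + Bc^x$. The function names, the age $x = 40$, and the parameter values are illustrative assumptions, not values taken from this post.

```python
import math

def makeham_survival(t, x, A, B, c):
    """S_x(t) = exp(-A t - B c^x (c^t - 1) / ln c) for the force A + B c^x."""
    return math.exp(-A * t - B * c**x * (c**t - 1.0) / math.log(c))

def makeham_curtate_pmf(k, x, A, B, c):
    """P(K_x = k) = S_x(k) - S_x(k + 1)."""
    return makeham_survival(k, x, A, B, c) - makeham_survival(k + 1, x, A, B, c)

# Illustrative parameter values (an assumption for this sketch).
A, B, c = 0.00022, 2.7e-6, 1.124
x = 40

pmf = [makeham_curtate_pmf(k, x, A, B, c) for k in range(120)]
total = sum(pmf)                                  # telescopes to 1 - S_x(120)
mode_k = max(range(120), key=lambda k: pmf[k])    # most likely curtate lifetime
print(total, mode_k)
```

Unlike the exponential case, the most likely value of $K_x$ here is not $k = 0$, because mortality starts low and rises with age.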
In the next post of this series on studying for Exam LTAM, we will explore the mean and standard deviation of $K_x$ for these different models.