Curtate Future Lifetime Random Variable

Studying for Exam LTAM, Part 1.8

Photo by Annie Spratt on Unsplash

When somebody asks you how old you are, how do you answer? Are you like most people? Do you give your age from your previous birthday? Or do you get extra-precise and say something like “I am 21 years, 5 months, and 1 week old”?

If you answer the way that most people do, then you already know a little bit about curtate survival random variables. You have also made use of the “greatest integer” or “floor” function without even knowing it. Essentially this is a function that takes an arbitrary real number input x and “rounds down” to the nearest integer less than or equal to x.

In textbooks, this function is written either as f(x)=\lfloor x\rfloor or f(x)=[[x]]. A sampling of its values includes f(3.1)=f(3.9)=f(3.9999)=3 and f(4)=4, while f(-3.1)=f(-3.9)=f(-3.9999)=f(-4)=-4. The graph of f is shown below. Open circles are included at points that are not on the graph of f.
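As a quick check of these values, here is a minimal Python sketch using the standard library's `math.floor`. The variable names are ours and are chosen only for illustration.

```python
import math

# Inputs quoted in the text; math.floor rounds toward negative infinity.
inputs = [3.1, 3.9, 3.9999, 4, -3.1, -3.9, -3.9999, -4]
floors = [math.floor(v) for v in inputs]
print(floors)  # [3, 3, 3, 4, -4, -4, -4, -4]

# By contrast, int() truncates toward zero, so it is NOT the floor
# function for negative non-integers.
truncated = int(-3.1)
print(truncated)  # -3
```

Since ages and remaining lifetimes are non-negative, the difference between flooring and truncating does not matter for the rest of this post, but it is worth knowing when checking values by hand.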

The graph of the greatest integer (floor) function f(x)=\lfloor x\rfloor=[[x]]. Note that the open circles are at points which are not on the graph.

The Discrete Curtate Lifetime Random Variable

We can apply the greatest integer function to a continuous survival random variable T_{0} to get a discrete random variable K_{0}=\lfloor T_{0}\rfloor. If T_{0} is the lifetime of a newborn, then K_{0} will be the rounded-down age at death. For example, if the observed value of T_{0} is 78.7591 years (about 78 years, 9 months, and 3 days), then the observed value of K_{0} will be 78 years, since that person did not make it to their 79th birthday.

Likewise, if we assume a newborn lives to age x>0 and has remaining lifetime T_{x}=T_{0}-x, then K_{x}=\lfloor T_{x}\rfloor is the remaining lifetime, rounded down to the nearest integer. The random variable K_{x} is discrete. In general, discrete random variables take on either a finite number or a countably infinite number of values. In this situation, K_{x} can take on only the non-negative integer values 0,1,2,3,\ldots.

Whereas probabilities for continuous random variables are found by integrating probability density functions (PDFs), probabilities for discrete random variables are found by summing values of probability mass functions (PMFs).

The probability mass function for K_{x} can be written symbolically as P[K_{x}=k], where k=0,1,2,3,\ldots. This is a function of k. It is often written as p(k) or f(k).

Computing the Probability Mass Function

A question now naturally arises: how should the PMF be computed? Since K_{x} is defined in terms of T_{x}, the answer should be related to the distribution functions of T_{x}.

The fact that K_{x}=\lfloor T_{x}\rfloor means that K_{x}=k is equivalent to k\leq T_{x}<k+1. Therefore, P[K_{x}=k]=P[k\leq T_{x}<k+1]. But, since T_{x} is a continuous random variable, we can say that P[k\leq T_{x}<k+1]=P[k<T_{x}\leq k+1]=P[T_{x}\leq k+1]-P[T_{x}\leq k]. This is the same as F_{x}(k+1)-F_{x}(k), where F_{x}(t)=\,_{t}q_{x} is the cumulative distribution function (CDF) of T_{x}. On the other hand, if \,_{t}p_{x}=1-\,_{t}q_{x} is the survival function (SF) of T_{x}, this also equals \,_{k}p_{x}-\,_{k+1}p_{x}.

To summarize, we have just found that the difference \,_{k}p_{x}-\,_{k+1}p_{x} is the PMF of K_{x}. That is, we can say that P[K_{x}=k]=\,_{k}p_{x}-\,_{k+1}p_{x} for k=0,1,2,3,\ldots.

Alternatively, the general multiplication rule implies that \,_{k+1}p_{x}=P[T_{x}>k+1]=P[T_{x}>k]\cdot P[T_{x}>k+1|T_{x}>k]=P[T_{x}>k]\cdot P[T_{x+k}>1]=\,_{k}p_{x}\cdot \,_{1}p_{x+k}. Here we have used the fact that, given survival to age x+k, the remaining lifetime has the distribution of T_{x+k}. Hence, we can also write the PMF of K_{x} as P[K_{x}=k]=\,_{k}p_{x}-\,_{k}p_{x}\cdot \,_{1}p_{x+k}=\,_{k}p_{x}\cdot (1-\,_{1}p_{x+k})=\,_{k}p_{x}\cdot \,_{1}q_{x+k} for k=0,1,2,3,\ldots.

We should note that a pre-subscript equal to “1” is usually omitted. In other words, we usually write p_{x+k} in place of \,_{1}p_{x+k} and q_{x+k} in place of \,_{1}q_{x+k}. Hence, it is common to write the PMF as P[K_{x}=k]=\,_{k}p_{x}-\,_{k+1}p_{x}=\,_{k}p_{x}\cdot q_{x+k}.
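The two forms of the PMF above can be compared numerically. The following is a minimal Python sketch, not a definitive implementation: the function names are ours, and `example_survival` is an arbitrary survival function chosen purely for illustration.

```python
import math

def curtate_pmf(survival, k):
    """P[K_x = k] as the difference kp_x - (k+1)p_x, where survival(t) = tp_x."""
    return survival(k) - survival(k + 1)

def curtate_pmf_product(survival, k):
    """The same probability written as kp_x * q_{x+k}."""
    kpx = survival(k)
    if kpx == 0.0:
        return 0.0  # nobody is alive at time k, so the probability is zero
    q_next = 1.0 - survival(k + 1) / kpx  # q_{x+k} = 1 - p_{x+k}
    return kpx * q_next

# Hypothetical survival function (an assumption for this example only).
def example_survival(t):
    return math.exp(-0.05 * t)

for k in range(5):
    a = curtate_pmf(example_survival, k)
    b = curtate_pmf_product(example_survival, k)
    print(k, round(a, 6), round(b, 6))

# The PMF values telescope, so they should add up to (essentially) 1.
total = sum(curtate_pmf(example_survival, k) for k in range(2000))
print(total)
```

Passing the survival function in as an argument keeps the sketch independent of any particular survival model.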

This has an intuitive interpretation. If we assume that time is measured in years, the chance that (x) lives at least k but fewer than k+1 more years is the probability that (x) lives at least k years times the probability that, once (x) attains age x+k, he or she dies within the next year.

Computations and Graphs for Specific Survival Models

Let us now consider the distribution of K_{x} for the various specific survival models we have considered so far: 1) uniform (De Moivre’s Law), 2) exponential (constant force), 3) triangular, and 4) Gompertz-Makeham.

Uniform Lifetime (De Moivre’s Law)

The SF of T_{x} is \,_{t}p_{x}=1-\frac{t}{\omega-x} and the CDF of T_{x} is \,_{t}q_{x}=\frac{t}{\omega-x} for 0\leq x<\omega and 0\leq t\leq \omega-x.

Hence, the PMF of K_{x} is \,_{k}p_{x}-\,_{k+1}p_{x}=\frac{(k+1)-k}{\omega-x}=\frac{1}{\omega-x} for k=0,1,2,\ldots,\lfloor \omega-x\rfloor-1. And, for k=\lfloor \omega-x\rfloor, the PMF value is \,_{\lfloor \omega-x\rfloor}p_{x}=1-\frac{\lfloor\omega-x\rfloor}{\omega-x} since \,_{\lfloor \omega-x\rfloor+1}p_{x}=0. Note that if \omega-x is an integer, then \,_{\lfloor \omega-x\rfloor}p_{x}=0. Also note that \,_{\lfloor\omega-x\rfloor}p_{x}=\frac{\omega-x-\lfloor\omega-x\rfloor}{\omega-x}<\frac{1}{\omega-x} since \omega-x-\lfloor\omega-x\rfloor<1 (think about this inequality for specific examples if that helps).

As a double-check on this, we note that if k=0,1,2,\ldots,\lfloor \omega-x\rfloor-1, then \,_{k}p_{x}\cdot q_{x+k}=\left(1-\frac{k}{\omega-x}\right)\cdot \frac{1}{\omega-x-k}=\frac{\omega-x-k}{(\omega-x)(\omega-x-k)}=\frac{1}{\omega-x}, as seen above. And if k=\lfloor\omega-x\rfloor, then q_{x+k}=1 so \,_{k}p_{x}\cdot q_{x+k}=\,_{\lfloor\omega-x\rfloor}p_{x}=\frac{\omega-x-\lfloor\omega-x\rfloor}{\omega-x}<\frac{1}{\omega-x} as seen above as well.
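The “almost uniform” shape can also be seen numerically. Below is a minimal Python sketch under De Moivre's Law; the function name and the sample values of x and \omega are our own assumptions for illustration.

```python
import math

def uniform_curtate_pmf(x, omega):
    """Return [P[K_x = k] for k = 0, ..., floor(omega - x)] under De Moivre's Law."""
    remaining = omega - x                 # maximum remaining lifetime
    n = math.floor(remaining)
    pmf = [1.0 / remaining] * n           # k = 0, ..., n - 1
    # Last value, at k = n; it is zero when omega - x is an integer.
    pmf.append((remaining - n) / remaining)
    return pmf

# Hypothetical values: omega - x = 60.5 is not an integer.
pmf = uniform_curtate_pmf(x=30, omega=90.5)
print(len(pmf), pmf[0], pmf[-1])
print(sum(pmf))

# Hypothetical values: omega - x = 60 is an integer, so the last value is 0.
pmf_integer = uniform_curtate_pmf(x=30, omega=90)
print(pmf_integer[-1])
```

In the first case the final probability is smaller than the others, and in the second case it vanishes, which matches the two situations described above.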

An animation of the graph of this PMF is shown below. In the animation, x ranges from 0 to 40 while \omega ranges from 60 to 100. This discrete distribution is “almost uniform” except for the very last probability when k=\lfloor \omega-x\rfloor (except in the case where \omega-x is exactly an integer, in which case the discrete distribution is exactly uniform).

The probability mass function (PMF) for K_{x}=\lfloor T_{x}\rfloor when T_{x} has a (continuous) uniform distribution over [0,\omega-x]. This discrete PMF is (usually) ‘almost uniform’. It is exactly uniform when \omega-x is an integer. In the animation x ranges from 0 to 40 while \omega ranges from 60 to 100.

Exponential Lifetime (Constant Force)

The SF of T_{x} is \,_{t}p_{x}=e^{-\lambda t} and the CDF of T_{x} is \,_{t}q_{x}=1-e^{-\lambda t} for \lambda>0, x\geq 0 and t\geq 0. (Recall that this distribution is memoryless so that these formulas do not actually depend on x.)

Therefore, the PMF of K_{x} is \,_{k}p_{x}-\,_{k+1}p_{x}=e^{-\lambda k}-e^{-\lambda (k+1)}=(1-e^{-\lambda})e^{-\lambda k}=\frac{e^{\lambda}-1}{e^{\lambda}}e^{-\lambda k} for k=0,1,2,3,\ldots.

Double-checking this with the other formula gives \,_{k}p_{x}\cdot q_{x+k}=e^{-\lambda k}\left(1-e^{-\lambda\cdot 1}\right)=(1-e^{-\lambda})e^{-\lambda k} for k=0,1,2,3,\ldots.
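Here is a minimal Python sketch of this PMF, with a check against the difference of survival probabilities. The function name and the value of \lambda are our own choices for illustration.

```python
import math

def exponential_curtate_pmf(lam, k):
    """P[K_x = k] = (1 - e^{-lam}) * e^{-lam * k} under a constant force lam."""
    return (1.0 - math.exp(-lam)) * math.exp(-lam * k)

lam = 0.2  # hypothetical force of mortality
for k in range(4):
    # Difference form kp_x - (k+1)p_x, for comparison.
    difference = math.exp(-lam * k) - math.exp(-lam * (k + 1))
    print(k, exponential_curtate_pmf(lam, k), difference)

# The probabilities form a geometric sequence and should sum to about 1.
total = sum(exponential_curtate_pmf(lam, k) for k in range(1000))
print(total)
```

Each value is e^{-\lambda} times the previous one, so K_{x} has a geometric distribution with “success” probability 1-e^{-\lambda}.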

This PMF does not vary with x, so an animation in x would show nothing changing. Instead, we make an animation showing what happens as \lambda varies. This means we are really looking at a bunch of different exponential distributions rather than the family of distributions for K_{x} at some particular value of \lambda.

In the animation, \lambda varies from 0.1 to 10. Note that when \lambda increases to around 5 or so, “almost all” the probability is at k=0. In other words, there is almost no chance of surviving at least one more year in that situation. In fact, if \lambda=5, then P[K_{x}=0]=\,_{0}p_{x}-\,_{1}p_{x}=q_{x}=1-e^{-5}\approx 0.993262.

The probability mass function (PMF) for K_{x}=\lfloor T_{x}\rfloor when T_{x} has a (continuous) exponential distribution over [0,\infty). This discrete PMF is constant in x. The animation instead shows what happens as the parameter \lambda varies from 0.1 to 10. Note that the mean of the corresponding continuous distribution is \frac{1}{\lambda}. Also note that once \lambda gets up to about 5 or so, there is almost no chance of surviving at least 1 more year.

It is also interesting to note here that the values of P[K_{x}=k] do not increase or decrease monotonically in \lambda. They go up and then down.

For example, if k=1, then P[K_{x}=1]=(1-e^{-\lambda})e^{-\lambda}=e^{-\lambda}-e^{-2\lambda}. Taking the derivative of this with respect to \lambda gives \frac{d}{d\lambda}(e^{-\lambda}-e^{-2\lambda})=-e^{-\lambda}+2e^{-2\lambda}=e^{-\lambda}(2e^{-\lambda}-1). This is equal to zero when e^{-\lambda}=\frac{1}{2}\Leftrightarrow \lambda=\ln(2)\approx 0.693.

It turns out that the graph of P[K_{x}=1]=(1-e^{-\lambda})e^{-\lambda}=e^{-\lambda}-e^{-2\lambda}, as a function of \lambda, has a maximum value of 0.25 at \lambda=\ln(2)\approx 0.693. This is the value of \lambda where the “second spike” (at k=1) has a maximum height (of 0.25) in the animation above.
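The location of this maximum can be double-checked with a crude grid search. This is only a sketch: the function name, the grid range, and the step size are our own assumptions.

```python
import math

def prob_k_equals_1(lam):
    """P[K_x = 1] = e^{-lam} - e^{-2 lam} under a constant force lam."""
    return math.exp(-lam) - math.exp(-2.0 * lam)

# Grid of lambda values from 0.0001 to 10 in steps of 0.0001.
grid = [i / 10000 for i in range(1, 100001)]
best_lam = max(grid, key=prob_k_equals_1)

print(best_lam, prob_k_equals_1(best_lam))          # grid-search estimate
print(math.log(2), prob_k_equals_1(math.log(2)))    # calculus answer
```

The grid-search estimate should agree with \ln(2) to within the grid spacing, and the maximum value should be very close to 0.25.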

Triangular Lifetime

For this example, the formulas are piecewise and very complicated (see “Triangular Survival Models”). We will be content just to look at the animated graph of the PMF below. We see that it has a triangular shape to it, just like in the continuous case. In the animation x varies from 0 to 20, d varies from 40 to 60, and \omega varies from 80 to 100.

The PMF of K_{x} when T_{x} has a triangular distribution. In the animation x varies from 0 to 20, d varies from 40 to 60, and \omega varies from 80 to 100.

Gompertz-Makeham Lifetime

In the Gompertz-Makeham model, the SF of T_{x} is \,_{t}p_{x}=\exp\left(\frac{Bc^{x}}{\ln(c)}\left(1-c^{t}\right)-At\right) for t\geq 0.

Therefore, the PMF P[K_{x}=k]=\,_{k}p_{x}-\,_{k+1}p_{x} of K_{x} is \exp\left(\frac{Bc^{x}}{\ln(c)}\left(1-c^{k}\right)-Ak\right)-\exp\left(\frac{Bc^{x}}{\ln(c)}\left(1-c^{k+1}\right)-A(k+1)\right). This formula can be rewritten in various ways, but we will not bother doing so and will just look at our animation.
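Although the formula is unwieldy by hand, it is easy to evaluate numerically. Here is a minimal Python sketch; the function names are ours, and the parameter values are hypothetical choices from within the ranges used in the animation.

```python
import math

def gm_survival(t, x, A, B, c):
    """tp_x under the Gompertz-Makeham model."""
    return math.exp(B * c**x / math.log(c) * (1.0 - c**t) - A * t)

def gm_curtate_pmf(k, x, A, B, c):
    """P[K_x = k] = kp_x - (k+1)p_x."""
    return gm_survival(k, x, A, B, c) - gm_survival(k + 1, x, A, B, c)

# Hypothetical parameter values (assumptions for illustration only).
A, B, c, x = 0.001, 0.0005, 1.1, 40

pmf = [gm_curtate_pmf(k, x, A, B, c) for k in range(150)]
print(sum(pmf))                                   # should be very close to 1
print(max(range(150), key=lambda k: pmf[k]))      # the most likely value of K_x
```

Cutting the sum off at k=149 is harmless for these parameter values because the survival probability is numerically zero long before then. For other parameters the cutoff may need to be larger.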

The PMF of K_{x} when T_{x} has a Gompertz-Makeham distribution. In the animation A varies from .0001 to .01, B varies from .0003 to .001, c varies from 1.07 to 1.12, and x varies from 0 to 60. It is perhaps most interesting to note the effect that A has on the height of the graph for small values of k: when A is relatively large, there is a relatively high chance of dying young.

In the next post of this series on studying for Exam LTAM, we will explore the mean and standard deviation of K_{x} for these different models.

Next: Curtate Expectation of Life