Families of Continuous Survival Random Variables

Studying for Exam LTAM, Part 1.1


Where do we begin? Assuming the reader’s knowledge of the fundamentals of (calculus-based) probability theory, the best place to start is with continuous survival models. This is also a common initial chapter in the textbooks on the subject.

As I mentioned in the post “Studying for Exam LTAM Series“, the textbook resources that I have and will make use of are: (1) “Actuarial Mathematics“, 2nd Edition, by Bowers, Gerber, Hickman, Jones, and Nesbitt; (2) “Models for Quantifying Risk“, 4th Edition, by Cunningham, Herzog, and London; and (3) “Actuarial Mathematics for Life Contingent Risks“, 2nd Edition, by Dickson, Hardy, and Waters.

Reference (3) is the one I will be referring to most often since it is the reference listed in the syllabus on the Exam LTAM page of the Society of Actuaries website in 2019. Most of the time I will use notation and ideas consistent with this reference, but sometimes I will deviate from it.

Non-negative Continuous (Lifetime) Random Variables

A continuous random variable X is a quantity whose value along a “continuum” is determined “by chance”. The word “continuum” is essentially just shorthand for an interval along the real line {\Bbb R}.

But what does it mean for such a variable to be determined “by chance”? This is actually a difficult question to answer because it is hard to define what “random” means. For the purposes of modeling, however, we assume that probabilities related to the values of X can be calculated by integrating an appropriate probability density function (PDF) f(x) over an appropriate interval.

Where the PDF f(x) is large, the “probability density” is large, and where f(x) is small, the density is small. When integrating over an x-interval of fixed length, we therefore expect a relatively large probability where f(x) is large and a relatively small probability where f(x) is small.

In the abstract setting, we typically assume that f is defined for all x\in {\Bbb R} (for all real x). Our key conditions for f to be a PDF in this situation are: (1) f(x)\geq 0 for all x\in {\Bbb R} and (2) \displaystyle\int_{-\infty}^{\infty}f(x)\, dx=1. These two conditions correspond to axioms (2) and (1), respectively, on probability set functions in my article on axiomatic probability theory.

The probability of X taking on a value in an interval of the form [a,b],(a,b),[a,b), or (a,b] is taken to be the value of \displaystyle\int_{a}^{b}f(x)\, dx. We would write, for instance, P[a\leq X<b]=\displaystyle\int_{a}^{b}f(x)\, dx.

The probability of X taking on any particular value is zero, since \displaystyle\int_{c}^{c}f(x)\, dx=0 for any number c.
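In code, such probabilities can be approximated by integrating the PDF numerically. The sketch below uses a hypothetical exponential density with rate \lambda=0.5 and a simple trapezoid rule; both choices are illustrative assumptions, not part of anything above.

```python
import math

LAM = 0.5  # rate of a hypothetical exponential PDF (illustrative assumption)

def pdf(x):
    """Exponential density f(x) = LAM * exp(-LAM * x) for x >= 0, else 0."""
    return LAM * math.exp(-LAM * x) if x >= 0 else 0.0

def integrate(f, a, b, n=100_000):
    """Approximate the integral of f over [a, b] with the composite trapezoid rule."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    total += sum(f(a + i * h) for i in range(1, n))
    return total * h

# P[a <= X < b] as an integral of the PDF, versus the exponential's closed form
approx = integrate(pdf, 1.0, 3.0)
exact = math.exp(-LAM * 1.0) - math.exp(-LAM * 3.0)
print(approx, exact)  # the two agree to several decimal places
```

Integrating over a very wide interval also recovers condition (2): the total probability is (approximately) 1.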

But what about if the random variable of interest never takes on negative values, for instance? Do we really need to define f when x<0 in such situations? And when might such a situation occur?

Let us answer the last question first. Such a situation occurs, quite often, when the random variable of interest is an amount of time. Of special interest in actuarial science is when the amount of time represents the lifetime of an individual. Of course, this is not the only situation where a variable will be non-negative. But it is common enough that many people call such a quantity a (continuous) lifetime random variable.

Let T be a continuous lifetime random variable. Then the conditions for a function f to be a PDF for T are usually written as: (1) f(t)\geq 0 for all real t\geq 0 and (2) \displaystyle\int_{0}^{\infty}f(t)\, dt=1. Probabilities are found in the same way as above, though we would assume 0\leq a.

Cumulative Distribution Functions and Survival Functions

For any kind of random variable X, the cumulative distribution function (CDF) is defined by F(x)=P[X\leq x] for any x\in {\Bbb R}. This represents the probability that the random variable X is less than or equal to the number x.

If T is a lifetime random variable, we would write F(t)=P[T\leq t] for t\geq 0. When T is continuous, this can be calculated as the integral \displaystyle\int_{0}^{t}f(\tau)\, d\tau.

Suppose T is the remaining lifetime of a person, in years. Then F(t) represents their likelihood of dying in the next t years. Since this is a “negative” way of looking at life, actuaries typically change their perspective. They typically focus on the probability of the person surviving at least another t years. This is the value of 1-F(t), and it has a name: the survival function (SF). This function is initially represented as S(t), though we will introduce a different notation later.

Note that the properties of F(t) and S(t) are “complementary” to each other. By definition, F(t)+S(t)=1 for all t\geq 0. In particular, F(0)+S(0)=1. But F(0)=P(T\leq 0)=0 (the life must live “at least a millisecond”). Therefore, S(0)=1.

Also note that \displaystyle\lim_{t\rightarrow\infty}F(t)=1. Hence, \displaystyle\lim_{t\rightarrow \infty}S(t)=0. Furthermore, it should make sense that F(t) is non-decreasing while S(t) is non-increasing.
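These complementary properties are easy to check for any concrete survival model. The sketch below uses a hypothetical exponential lifetime with constant rate \lambda=0.1 (a model chosen purely for illustration, not discussed in this section):

```python
import math

LAM = 0.1  # hypothetical constant rate (illustrative assumption; any positive value works)

def F(t):
    """CDF of an exponential lifetime: P[T <= t]."""
    return 1.0 - math.exp(-LAM * t) if t >= 0 else 0.0

def S(t):
    """Survival function: P[T > t] = 1 - F(t)."""
    return 1.0 - F(t)

print(S(0.0))    # 1.0: the life survives "at least a millisecond"
print(F(0.0))    # 0.0
print(F(100.0))  # close to 1 for large t, consistent with the limit above
```

Sampling F at increasing values of t also confirms that F is non-decreasing while S is non-increasing.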

Example: Uniform Distribution (Linear Survival), a.k.a. De Moivre’s Law

The simplest kind of probability distribution, continuous or discrete, is a uniform distribution. Let T be a continuous lifetime random variable, representing the time until death of an individual. If T has a uniform distribution, the PDF is constant, f(t)=c, for t in some interval of the form [0,\omega], for some \omega>0. In fact, since we want the integral over this interval to be 1, we have f(t)=\frac{1}{\omega} for t\in [0,\omega].

For such a uniform distribution, the probability of dying during a time interval [a,b]\subseteq [0,\omega] depends only on its length b-a, not on its location. In fact, the probability is \frac{b-a}{\omega}.

To be technical, since the PDF is defined for all t\geq 0, the PDF is actually the piecewise-defined (discontinuous) function f(t)=\begin{cases} \frac{1}{\omega} & \mbox{if }t\in [0,\omega] \\ 0 & \mbox{if } t>\omega\end{cases}.

Because of this, the CDF is F(t)=\displaystyle\int_{0}^{t}f(\tau)\, d\tau=\begin{cases}\frac{t}{\omega} & \mbox{if }t\in [0,\omega] \\ 1 & \mbox{if } t>\omega\end{cases}. And the SF is S(t)=1-F(t)=\begin{cases}1-\frac{t}{\omega} & \mbox{if }t\in [0,\omega] \\ 0 & \mbox{if } t>\omega\end{cases}.
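The piecewise PDF, CDF, and SF above translate directly into code. A minimal sketch, with \omega=100 chosen only for illustration:

```python
OMEGA = 100.0  # limiting age (illustrative assumption; any omega > 0 works)

def pdf(t):
    """Uniform (de Moivre) density: 1/OMEGA on [0, OMEGA], zero beyond."""
    return 1.0 / OMEGA if 0.0 <= t <= OMEGA else 0.0

def cdf(t):
    """F(t) = t/OMEGA on [0, OMEGA], then 1 for t > OMEGA."""
    if t < 0.0:
        return 0.0
    return t / OMEGA if t <= OMEGA else 1.0

def sf(t):
    """S(t) = 1 - F(t) = 1 - t/OMEGA on [0, OMEGA], then 0."""
    return 1.0 - cdf(t)

print(cdf(25.0), sf(25.0))    # 0.25 0.75
print(cdf(150.0), sf(150.0))  # 1.0 0.0
```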

Of course, quantities of great interest here are the mean, variance, and standard deviation of T.

The mean, also called the expected value, is \mu=E[T]=\displaystyle\int_{0}^{\infty}tf(t)\, dt=\displaystyle\int_{0}^{\omega}\frac{t}{\omega}\, dt.

Doing this integral gives \mu=\frac{1}{2\omega}t^{2}|_{0}^{\omega}=\frac{\omega}{2}. This makes intuitive sense because it says that the “center of mass” of the graph of f(t), which is a horizontal line over 0\leq t\leq \omega, is halfway between 0 and \omega.

To find the variance and standard deviation, we first find the “second moment”: E[T^{2}]=\displaystyle\int_{0}^{\infty}t^{2}f(t)\, dt=\displaystyle\int_{0}^{\omega}\frac{t^{2}}{\omega}\, dt.

Evaluation leads to E[T^{2}]=\frac{1}{3\omega}t^{3}|_{0}^{\omega}=\frac{\omega^{2}}{3}. Therefore, the variance is \sigma^{2}=V[T]=\mbox{Var}[T]=E[T^{2}]-(E[T])^{2}=\frac{\omega^{2}}{3}-\frac{\omega^{2}}{4}=\frac{\omega^{2}}{12}.

This implies that the standard deviation is \sigma=\sqrt{\sigma^{2}}=\frac{\omega}{\sqrt{12}}\approx 0.2887\omega.
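These closed-form moments can be sanity-checked numerically. A minimal sketch, using midpoint Riemann sums and a limiting age of \omega=90 (both choices are illustrative, not from the derivation above):

```python
OMEGA = 90.0   # hypothetical limiting age (the formulas hold for any omega > 0)
N = 100_000    # number of subintervals for the midpoint Riemann sums

h = OMEGA / N
# E[T] and E[T^2] for the uniform density f(t) = 1/OMEGA on [0, OMEGA]
mean = sum(((i + 0.5) * h) / OMEGA for i in range(N)) * h
second = sum(((i + 0.5) * h) ** 2 / OMEGA for i in range(N)) * h
var = second - mean ** 2
sd = var ** 0.5

print(mean, OMEGA / 2)         # matches omega/2
print(second, OMEGA ** 2 / 3)  # matches omega^2/3
print(var, OMEGA ** 2 / 12)    # matches omega^2/12
print(sd, OMEGA / 12 ** 0.5)   # matches omega/sqrt(12)
```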

There is a rule of thumb that says “almost all” the probability in any distribution will be within two standard deviations of the mean. For the uniform distribution, two standard deviations is 2\sigma\approx 0.5774\omega. Since this is greater than 0.5\omega, in fact “all of the probability” is within two standard deviations of the mean at 0.5\omega (recall that the PDF is nonzero only on the interval [0,\omega]).

The following animation shows the graph of the PDF, as well as the locations of \mu=\frac{\omega}{2} and \mu\pm 2\sigma\approx \frac{\omega}{2}\pm 0.5774\omega, as \omega increases from 50 to 100.

A uniform distribution PDF over the interval [0,\omega] as \omega increases from 50 to 100. Note the location of the mean \mu (blue dot) as a “center of mass” and the locations of \mu\pm 2\sigma (magenta dots). All the probability is within 2 standard deviations of the mean in this example.

For those who are interested, here is a picture of the Mathematica code that created this animation.

Mathematica code that creates the animation above.

The Family of RVs for a Given Survival RV

For concreteness, let us indeed imagine we are modeling human lifetimes. Suppose a baby has just been born a fraction of a second ago. No one but God knows how long that baby will live. From a human perspective, the baby’s length of life will be a continuous lifetime random variable (continuous because we imagine we are measuring to the nearest millisecond). Let us also assume time is measured in years.

Let T_{0} be the continuous lifetime random variable that represents how long that newborn baby will live, in years. Suppose x\geq 0 and, at birth, we make the assumption that the baby will live to at least age x years. Based on this assumption, after age x years, the baby will have T_{0}-x further years to live. Call this quantity T_{x}. In other words, define T_{x}=T_{0}-x when the newborn baby is assumed to make it to age x.

We have thus generated an infinite “family” of random variables, \{T_{x}\}_{x\geq 0}. We now determine how the PDF, CDF, and SF depend on x. We also introduce our first bit of (tricky) actuarial notation.

Given survival to age x, the SF, as a function of a future amount of time t, is defined to be P[T_{x}>t]. Since we are assuming survival to age x, this can be computed as a conditional probability based on the distribution of T_{0}. In other words, we assume/define that P[T_{x}>t]=P[T_{0}>x+t|T_{0}>x].

Since, in general, P[A|B]=\frac{P[A\cap B]}{P[B]}, we can write P[T_{x}>t]=P[T_{0}>x+t|T_{0}>x]=\frac{P[T_{0}>x+t]}{P[T_{0}>x]}.

It may not be strictly true that an adult who is currently age x=50 has the same probability of living at least another 10 years as the probability that a newborn baby who is assumed to live to age x=50 will live at least another 10 years after that. However, for simplicity, we will often assume this is true.

If S_{x}(t) represents, for x\geq 0, the SF of T_{x}, then we can write the equation P[T_{x}>t]=P[T_{0}>x+t|T_{0}>x]=\frac{P[T_{0}>x+t]}{P[T_{0}>x]} as S_{x}(t)=\frac{S_{0}(x+t)}{S_{0}(x)}. Also note that S_{0}(x+t)=S_{0}(x)\cdot S_{x}(t), which represents the general multiplication rule P[A\cap B]=P[A]\cdot P[B|A].

Now for the tricky notation. Most typically, actuaries denote the function S_{x}(t) by the symbol \,_{t}p_{x}. Furthermore, the CDF F_{x}(t)=1-S_{x}(t) is most commonly denoted by the symbol \,_{t}q_{x}. Based on this symbolism, we can write the identity \,_{t}p_{x}+\,_{t}q_{x}=1.

By convention we also write p_{x} for \,_{1}p_{x} and q_{x} for \,_{1}q_{x}.

The PDF will be denoted in the “usual” way, however. It is f_{x}(t)=F_{x}'(t)=\frac{\partial}{\partial t}(\,_{t}q_{x})=-\frac{\partial}{\partial t}(\,_{t}p_{x}).

The equation S_{x}(t)=\frac{S_{0}(x+t)}{S_{0}(x)}, which can also be written as \,_{t}p_{x}=\frac{\,_{x+t}p_{0}}{\,_{x}p_{0}}, has an important graphical interpretation. It shows that, to obtain the survival function for the remaining lifetime of a newborn baby when you assume that baby makes it to age x, you need to translate the graph of S_{0}(t) to the left by x units and then rescale it by multiplying by \frac{1}{S_{0}(x)}. This rescaling is done so that S_{x}(0)=1.
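The relation \,_{t}p_{x}=\frac{S_{0}(x+t)}{S_{0}(x)} is straightforward to compute for any newborn survival function. The sketch below plugs in a hypothetical exponential model S_{0}(t)=e^{-\mu t} with \mu=0.02 (an illustrative assumption); for this particular model, \,_{t}p_{x} turns out not to depend on x at all, the well-known “memoryless” property:

```python
import math

def t_p_x(S0, x, t):
    """Survival probability t_p_x = S0(x + t) / S0(x), conditional on reaching age x."""
    return S0(x + t) / S0(x)

# Hypothetical exponential survival model (constant force of mortality; illustrative)
MU = 0.02
S0 = lambda t: math.exp(-MU * t)

# Memoryless property: for the exponential model, t_p_x is the same for every x
print(t_p_x(S0, 0, 10), t_p_x(S0, 50, 10))  # both equal exp(-0.2)
```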

Back to the Uniform Distribution Example

For the uniform distribution example when 0\leq t\leq \omega, we have f_{0}(t)=\frac{1}{\omega}, F_{0}(t)=\frac{t}{\omega}, and S_{0}(t)=1-\frac{t}{\omega}=\frac{\omega-t}{\omega}.

Therefore, for an assumed attained age 0\leq x< \omega, we get \,_{t}p_{x}=S_{x}(t)=\frac{S_{0}(x+t)}{S_{0}(x)}=\frac{\omega-x-t}{\omega-x}=1-\frac{t}{\omega-x} and \,_{t}q_{x}=F_{x}(t)=1-S_{x}(t)=\frac{t}{\omega-x} (for 0\leq t\leq \omega-x in both cases). From this, we also have f_{x}(t)=\frac{1}{\omega-x} for 0\leq t\leq \omega-x. A person might object that F_{x}(t) is not differentiable at t=0 and t=\omega-x for this example, but we will not worry about this.

These calculations confirm that, if T_{0} is uniform over [0,\omega], then T_{x} is uniform over [0,\omega-x]. Note that the mean would then be \frac{\omega-x}{2} and the standard deviation would be \frac{\omega-x}{\sqrt{12}}.
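The claim that T_{x} is uniform over [0,\omega-x] can be checked directly from the conditional-survival formula. A small sketch with \omega=100 (a value chosen only for illustration):

```python
OMEGA = 100.0  # limiting age (illustrative assumption)

def S0(t):
    """Newborn survival function for the uniform model: 1 - t/OMEGA, floored at 0."""
    return max(0.0, 1.0 - t / OMEGA)

def Sx(x, t):
    """t_p_x = S0(x + t) / S0(x) for 0 <= x < OMEGA."""
    return S0(x + t) / S0(x)

x, t = 30.0, 20.0
# The conditional SF matches 1 - t/(OMEGA - x), so T_x is uniform on [0, OMEGA - x]
print(Sx(x, t), 1.0 - t / (OMEGA - x))
print((OMEGA - x) / 2, (OMEGA - x) / 12 ** 0.5)  # mean and standard deviation of T_x
```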

The animation below shows, for \omega=100, how the graphs of S_{x}(t)=\,_{t}p_{x} and F_{x}(t)=\,_{t}q_{x} change as x increases from 0 to 50.

Uniform Distribution Survival Function (SF) \,_{t}p_{x}=1-\frac{t}{100-x} and CDF \,_{t}q_{x}=F_{x}(t)=1-S_{x}(t)=\frac{t}{100-x} (these equations valid for 0\leq t\leq 100-x) as x increases from 0 to 50.

The Mathematica code for this animation is shown in the figure below. Note that \,_{t}p_{x} and \,_{t}q_{x} have been made large via keyboard shortcuts and are colored red and blue through Mathematica's “Writing Assistant” palette, rather than using the built-in functions Text and Style.

Mathematica code that creates the animation above.

In blog posts to come shortly, we will dive into more examples related to these ideas.

Next: The Force of Mortality (Hazard Rate Function), Studying for Exam LTAM, Part 1.2