Visual Linear Algebra Online, Section 1.9
Inverse functions are a kind of high technology in mathematics. They can help you solve infinitely many problems at once!
In linear algebra, some linear transformations on finite-dimensional Euclidean space have inverse functions. Those that do have an associated inverse matrix. In this section, our main goals are to explore how to calculate the inverse matrix and to see how it is useful.
A Basic Example
But let’s start with a basic example of an inverse function.
Suppose the height above the ground, in meters, of a falling object, as a function of time $t$, in seconds, is $h = f(t) = 150 - 4.9t^2$. The graph of this function is shown below. Note that the appropriate domain for this application consists of those values of $t$ where $f(t) \geq 0$. This is equivalent to $0 \leq t \leq \sqrt{150/4.9} \approx 5.53$ seconds.
This function is decreasing because the object is falling. The graph is also concave down because the object falls faster and faster over time.
How long will it take the object to reach a height of 100 meters? Just set $f(t) = 100$ and solve for $t$.

$100 = 150 - 4.9t^2 \implies 4.9t^2 = 50 \implies t = \sqrt{\dfrac{50}{4.9}} \approx 3.19 \text{ seconds}$.
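As a quick numerical check, here is a minimal Python sketch (assuming the height function $f(t) = 150 - 4.9t^2$ above) that solves $f(t) = 100$ for $t \geq 0$ and verifies the answer:

```python
import math

def f(t):
    """Height (meters) of the falling object after t seconds, assuming f(t) = 150 - 4.9t^2."""
    return 150 - 4.9 * t**2

# Solve f(t) = 100 algebraically: 4.9t^2 = 50, so t = sqrt(50 / 4.9) for t >= 0.
t_at_100 = math.sqrt(50 / 4.9)
print(t_at_100)      # approximately 3.19 seconds
print(f(t_at_100))   # approximately 100.0, confirming the answer
```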
Solving the General Problem
What if we want to solve this problem for an arbitrary height $h$? The same steps give us the inverse function of $f$. It will solve the general problem.

$t = f^{-1}(h) = \sqrt{\dfrac{150 - h}{4.9}}$.
The inverse function represents the solution to infinitely many problems: every problem where we solve $f(t) = h$ for $t$. In other words, if we are given an arbitrary height above the ground $h$, this function computes the amount of time it takes to fall to that height.
When the independent and dependent variables of a function have no real-life meaning, it is traditional to swap the variables when finding an inverse function. When this is done and the axes have the same scale, the graphs of the function and its inverse are reflections of each other across the diagonal line $y = x$ through the origin.
However, if the variables do have real-life meaning, the variables should not be swapped.
This is the case for our example. In this case, it is better to graph $t = f^{-1}(h)$ by swapping the axes of the original graph instead, as shown below. The domain of this function for this application is the interval $0 \leq h \leq 150$.
This function is decreasing and concave down as well. This also reflects the fact that the object falls at a faster and faster rate over time, though explaining why this is true is more difficult.
Composition of the Function and Its Inverse
In Section 1.8, “Matrix Multiplication and Composite Transformations”, we discussed function composition. This is where two functions are applied in sequence. It can also be thought of as “plugging one function into another”. A small circle, $\circ$, is used to represent the binary operation of function composition, where two functions are combined in this way to obtain a third function.
There is an intimate relationship between a function and its inverse with respect to function composition. Let’s see what happens when we compose the functions for our example above.
For $f^{-1}(f(t))$, we get

$f^{-1}(f(t)) = f^{-1}\!\left(150 - 4.9t^2\right) = \sqrt{\dfrac{150 - (150 - 4.9t^2)}{4.9}} = \sqrt{\dfrac{4.9t^2}{4.9}} = \sqrt{t^2} = |t|$.
However, since we are assuming that $t \geq 0$, it follows that $|t| = t$ and therefore $f^{-1}(f(t)) = t$ for all $t$ in the domain of $f$.
Here is the computation for $f\left(f^{-1}(h)\right)$:

$f\!\left(f^{-1}(h)\right) = f\!\left(\sqrt{\dfrac{150-h}{4.9}}\right) = 150 - 4.9\cdot\dfrac{150-h}{4.9} = 150 - (150 - h) = h$.

This is true for all $h$ in the domain of $f^{-1}$.
In both cases we see that $f$ and $f^{-1}$ “undo” each other. The output of the composite function is the same as the input.
It should also be clear that we need to be careful in discussing the domains of these functions. For example, if $t < 0$, then $|t| = -t$ instead. Also, if $h > 150$, then $f^{-1}(h)$ is an imaginary number, which we would want to avoid for this application.
You may have also noted that if we consider the domain of $f$ to be all of $\mathbb{R}$ instead of the interval $0 \leq t \leq \sqrt{150/4.9}$, then $f$ is no longer a one-to-one function. This would mean it has no inverse function. The domain must be restricted in order for an inverse function to exist.
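The “undoing” behavior, and the domain restrictions just mentioned, can also be checked numerically. Here is a small sketch (again assuming the height function above and its inverse) that tests both compositions at sample points inside the appropriate domains:

```python
import math

def f(t):
    # Height after t seconds (the assumed model from the example above).
    return 150 - 4.9 * t**2

def f_inv(h):
    # Time (in seconds) at which the falling object is at height h meters.
    return math.sqrt((150 - h) / 4.9)

# f_inv(f(t)) should return t for sample times in the domain of f ...
for t in [0.0, 1.0, 2.5, 5.0]:
    assert math.isclose(f_inv(f(t)), t)

# ... and f(f_inv(h)) should return h for sample heights in the domain of f_inv.
for h in [0.0, 25.0, 100.0, 150.0]:
    assert math.isclose(f(f_inv(h)), h, abs_tol=1e-9)

print("Both compositions act as the identity on the sample points.")
```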
Inverse Functions in General
Let $X$ and $Y$ be two nonempty sets. Suppose $f$ is a function with domain $X$ and codomain $Y$. Notationally, we have represented this situation as $f: X \rightarrow Y$ or $X \xrightarrow{f} Y$.
One-to-One (Injective) and Onto (Surjective) Functions
In previous sections, such as Section 1.5 “Matrices and Linear Transformations in Low Dimensions”, we have already discussed what it means for a function to be one-to-one (injective) and/or onto (surjective). However, here we will state precise definitions.
Definition of One-to-One (Injective) and Examples
Definition 1.9.1: A function $f: X \rightarrow Y$ is one-to-one (injective) if distinct inputs give distinct outputs. That is, $f$ is one-to-one if $x_1 \neq x_2$ implies that $f(x_1) \neq f(x_2)$. This is equivalent to saying that $f(x_1) = f(x_2)$ implies that $x_1 = x_2$.
As a quick example, consider the function $f: \mathbb{R} \rightarrow \mathbb{R}$ defined by $f(x) = 4x - 5$. If $f(x_1) = f(x_2)$, then $4x_1 - 5 = 4x_2 - 5$. Adding 5 to both sides of this equation gives $4x_1 = 4x_2$. But then, dividing both sides of this by 4 allows us to conclude that $x_1 = x_2$. This is sufficient to prove that this function is one-to-one.
On the other hand, the function $g: \mathbb{R} \rightarrow \mathbb{R}$ defined by $g(x) = x^2$ is not one-to-one since, for example, $g(-1) = g(1) = 1$.
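As an informal illustration (not a proof), a short sketch can search for repeated outputs among sample inputs; the helper function below is hypothetical and only for this illustration:

```python
def f(x):
    return 4 * x - 5   # one-to-one on the real numbers

def g(x):
    return x ** 2      # not one-to-one

def find_collision(func, inputs):
    """Return a pair of distinct sample inputs with equal outputs, if one exists."""
    seen = {}
    for x in inputs:
        y = func(x)
        if y in seen and seen[y] != x:
            return seen[y], x
        seen[y] = x
    return None

samples = range(-10, 11)
print(find_collision(f, samples))   # None: no repeated outputs among the samples
print(find_collision(g, samples))   # (-1, 1): two distinct inputs with the same output
```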
Definition of Onto (Surjective) and Examples
Definition 1.9.2: A function $f: X \rightarrow Y$ is onto (surjective) if every element of $Y$ is an output of some input from $X$. That is, $f$ is onto if for all $y \in Y$, there exists (at least one) $x \in X$ such that $f(x) = y$.
The function $f: \mathbb{R} \rightarrow \mathbb{R}$ defined by $f(x) = 4x - 5$ is surjective. Given $y \in \mathbb{R}$, the number $x = \frac{y+5}{4}$ has the property that $f(x) = y$. Here are the details: $f\!\left(\frac{y+5}{4}\right) = 4\cdot\frac{y+5}{4} - 5 = (y + 5) - 5 = y$.
It is no accident that $f^{-1}(y) = \frac{y+5}{4}$.
On the other hand, the function $g: \mathbb{R} \rightarrow \mathbb{R}$ defined by $g(x) = x^2$ is not onto since $g(x) = x^2 \geq 0$ for all $x \in \mathbb{R}$. There are no inputs in $\mathbb{R}$ that give negative outputs.
We can “force” $g$ to be onto by restricting its codomain. Do this by defining the codomain to be $[0, +\infty)$ instead of $\mathbb{R}$.
The function $g: \mathbb{R} \rightarrow [0, +\infty)$ defined by $g(x) = x^2$ is onto. Given $y \in [0, +\infty)$, the two numbers $x = \pm\sqrt{y}$ both have the property that $g(x) = y$ (of course, when $y = 0$ this is actually only one number).
Inverse Functions
Let $f: X \rightarrow Y$ be a one-to-one and onto function (a “bijection”). Then, given any $y \in Y$, there must be a unique element $x \in X$ such that $f(x) = y$. The existence of this element is guaranteed by the fact that $f$ is onto. The uniqueness of this element is guaranteed by the fact that $f$ is one-to-one.
We have essentially just defined $f^{-1}$. To be precise, define $f^{-1}: Y \rightarrow X$ by saying, for each $y \in Y$, that $f^{-1}(y)$ is the unique element of $X$ that gets mapped to $y$ by $f$. Alternatively, $f^{-1}(y)$ is the unique element $x$ of $X$ such that $f(x) = y$. In particular, $f\left(f^{-1}(y)\right) = y$ for all $y \in Y$.
Does this “undoing action” work the other way around? Given $x \in X$, we know that $f(x) \in Y$. By definition, $f^{-1}(f(x))$ is the unique element of $X$ that gets mapped to $f(x)$ by $f$. But we already know that $x$ gets mapped to $f(x)$! Therefore, $f^{-1}(f(x)) = x$.
When $f: X \rightarrow Y$ is one-to-one and onto, we have just seen that $f^{-1}: Y \rightarrow X$ can be defined. We say that $f$ is invertible in this situation.
Also note that if $id_X: X \rightarrow X$ and $id_Y: Y \rightarrow Y$ are the functions defined by $id_X(x) = x$ for all $x \in X$ and $id_Y(y) = y$ for all $y \in Y$, then $f^{-1} \circ f = id_X$ and $f \circ f^{-1} = id_Y$ when $f$ is invertible. The functions $id_X$ and $id_Y$ are called the identity mappings on $X$ and $Y$, respectively.
These last couple equations are analogous to the equations $a\cdot a^{-1} = a^{-1}\cdot a = 1$ for nonzero numbers $a$. The function $id_X$, for example, is analogous to the number 1 in the sense that $f \circ id_X = f$ for any function $f: X \rightarrow Y$. This is analogous to the fact that $a \cdot 1 = a$ for any $a \in \mathbb{R}$.
These ideas can certainly be confusing for many people. Make sure you completely understand everything before moving on.
The Inverse Matrix of an Invertible Linear Transformation
In Section 1.7, “High-Dimensional Linear Algebra”, we saw that a linear transformation $T: \mathbb{R}^n \rightarrow \mathbb{R}^m$ can be represented by an $m \times n$ matrix $A$. This means that, for each input $\mathbf{x} \in \mathbb{R}^n$, the output $T(\mathbf{x}) \in \mathbb{R}^m$ can be computed as the product $A\mathbf{x}$.
To do this, we define $A\mathbf{x}$ as a linear combination of the columns of $A$:

$A\mathbf{x} = x_1\mathbf{a}_1 + x_2\mathbf{a}_2 + \cdots + x_n\mathbf{a}_n$,

where $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$ are the columns of $A$ and $x_1, x_2, \ldots, x_n$ are the entries of $\mathbf{x}$.
Then, in Section 1.8, “Matrix Multiplication and Composite Transformations”, we saw that if $S \circ T$ is a composition of linear transformations $T: \mathbb{R}^n \rightarrow \mathbb{R}^m$ and $S: \mathbb{R}^m \rightarrow \mathbb{R}^p$, then $S \circ T$ can be represented by the product of two matrices.
Assuming that $B$ is the $p \times m$ matrix representing $S$ and $A$ is the $m \times n$ matrix representing $T$, then $(S \circ T)(\mathbf{x}) = BA\mathbf{x}$, where the product $BA$ is defined by:

$BA = \begin{bmatrix} B\mathbf{a}_1 & B\mathbf{a}_2 & \cdots & B\mathbf{a}_n \end{bmatrix}$.

This product is a $p \times n$ matrix.
Therefore, we can also write

$(S \circ T)(\mathbf{x}) = S(T(\mathbf{x})) = B(A\mathbf{x}) = (BA)\mathbf{x}$.
A Necessary Condition for Invertibility
Evidently, if $T$ and $T^{-1}$ are to have a “chance” to be inverse functions, and for their matrices $A$ and $B$ to be inverse matrices, it must be the case that $m = n$. This is indeed the case. In this situation, the resulting matrices are called square matrices because they have a “square shape” (rather than a general “rectangular shape”).
Let’s start by considering inverse linear transformations and inverse matrices in low dimensions.
Application of Inverse Transformations and Inverse Matrices
As with general inverse functions, inverse linear transformations and their corresponding inverse matrices can help us solve infinitely many problems at once. If $T: \mathbb{R}^n \rightarrow \mathbb{R}^n$ is an invertible linear transformation with invertible matrix representative $A$, then we can solve the general problem $T(\mathbf{x}) = A\mathbf{x} = \mathbf{b}$ for any $\mathbf{b} \in \mathbb{R}^n$.
In fact, we can write the answers to these infinitely many problems in one equation as $\mathbf{x} = T^{-1}(\mathbf{b}) = A^{-1}\mathbf{b}$. It is therefore useful to find a formula for $T^{-1}$ by finding $A^{-1}$.
The Inverse Matrix of an Invertible Linear Transformation $T: \mathbb{R} \rightarrow \mathbb{R}$
We have seen that a linear transformation $T: \mathbb{R} \rightarrow \mathbb{R}$ has a formula of the form $T(x) = ax$ for some scalar $a \in \mathbb{R}$. With matrix notation, we can also write this as $T(x) = [a][x]$, where $[a]$ is a $1 \times 1$ matrix.
Clearly such a function is one-to-one and onto, and hence invertible, if and only if $a \neq 0$. Just as clearly, the inverse function in that situation will be $T^{-1}(y) = \frac{1}{a}\,y$. This can also be written as $T^{-1}(y) = \left[\frac{1}{a}\right][y]$.
For confirmation of this fact, note that

$T^{-1}(T(x)) = \frac{1}{a}(ax) = x \qquad \text{and} \qquad T\!\left(T^{-1}(y)\right) = a\left(\frac{1}{a}\,y\right) = y$.
Because of this, when $a \neq 0$, we say that the inverse matrix of the $1 \times 1$ matrix $[a]$ is the $1 \times 1$ matrix $\left[\frac{1}{a}\right]$.
We also write this as $[a]^{-1} = \left[\frac{1}{a}\right]$ when $a \neq 0$. In words, this equation says that the inverse matrix of $[a]$ is the matrix $\left[\frac{1}{a}\right]$ when $a \neq 0$.
By definition of a $1 \times 1$ matrix times another $1 \times 1$ matrix, we can also see that $[a]\left[\frac{1}{a}\right] = \left[\frac{1}{a}\right][a] = [1]$. The matrix $[1]$ is called the $1 \times 1$ identity matrix. Note that $[1][a] = [a][1] = [a]$ for any $a \in \mathbb{R}$.
If $a = 0$, then $[a] = [0]$ (and the number $0$ itself) has no multiplicative inverse. We say that $[a]$ is noninvertible (or “not invertible”) when $a = 0$.
In the end, $[0]$ is the only noninvertible $1 \times 1$ matrix.
The Inverse Matrix of an Invertible Linear Transformation $T: \mathbb{R}^2 \rightarrow \mathbb{R}^2$
A linear transformation $T: \mathbb{R}^2 \rightarrow \mathbb{R}^2$ has a formula of the form $T(\mathbf{x}) = A\mathbf{x}$ for some $2 \times 2$ matrix $A$, say $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$, where $a, b, c, d \in \mathbb{R}$.
If $T$ is invertible, and assuming that $T^{-1}$ is a linear transformation (which we will prove in the general case further below), then $T^{-1}$ will have a matrix representative as well.
Let us suggestively call this matrix $A^{-1}$ and write $T^{-1}(\mathbf{y}) = A^{-1}\mathbf{y}$. One of our goals is to determine how the entries of $A^{-1}$ depend on the entries of $A$. Another goal is to see what conditions on the entries of $A$ are required for $A$ (and $T$) to be invertible.
Before doing so, we first note the truth of the following equation:

$\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$.
Because of this, we say that $I = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$ is the $2 \times 2$ identity matrix. It has the property that $I\mathbf{x} = \mathbf{x}$ for all $\mathbf{x} \in \mathbb{R}^2$. You can check that it also has the property that $IM = MI = M$ for any $2 \times 2$ matrix $M$.
If $T$ is invertible with inverse $T^{-1}$, then $T\left(T^{-1}(\mathbf{x})\right) = \mathbf{x}$ for all $\mathbf{x} \in \mathbb{R}^2$. The function $T \circ T^{-1}$ is the identity mapping on $\mathbb{R}^2$. Its matrix is $I$.
Since $T\left(T^{-1}(\mathbf{x})\right) = AA^{-1}\mathbf{x}$ and $I\mathbf{x} = \mathbf{x}$, it follows that we want $AA^{-1} = I$. This is the key equation to help us find $A^{-1}$.
Using the Key Equation
Below we show the equation $AA^{-1} = I$ in terms of the entries of the matrices, writing the unknown entries of $A^{-1}$ as $x_1, x_2, x_3, x_4$:

$\begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} x_1 & x_2 \\ x_3 & x_4 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$.
Now multiply the matrices on the left to get:

$\begin{bmatrix} ax_1 + bx_3 & ax_2 + bx_4 \\ cx_1 + dx_3 & cx_2 + dx_4 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$.
If we think of $a$, $b$, $c$, and $d$ as given, this is equivalent to a system of four linear equations in four unknowns ($x_1$, $x_2$, $x_3$, and $x_4$).
This system is “decoupled” (a.k.a. “uncoupled”). The first and third equations only involve $x_1$ and $x_3$, while the second and fourth only involve $x_2$ and $x_4$. We can therefore think of this as two separate systems of two equations and two unknowns:

$\begin{cases} ax_1 + bx_3 = 1 \\ cx_1 + dx_3 = 0 \end{cases} \qquad \text{and} \qquad \begin{cases} ax_2 + bx_4 = 0 \\ cx_2 + dx_4 = 1 \end{cases}$
Using Row Operations to RREF to Find $A^{-1}$
We could do row operations on the following two augmented matrices to solve these two systems (the first system for $x_1$ and $x_3$ and the second system for $x_2$ and $x_4$):

$\left[\begin{array}{cc|c} a & b & 1 \\ c & d & 0 \end{array}\right] \qquad \text{and} \qquad \left[\begin{array}{cc|c} a & b & 0 \\ c & d & 1 \end{array}\right]$
However, it is more efficient to perform row operations to reduced row echelon form (RREF) on the single “doubly-augmented” matrix shown below.

$\left[\begin{array}{cc|cc} a & b & 1 & 0 \\ c & d & 0 & 1 \end{array}\right]$
Note that the third column represents the numbers on the right-hand sides of the equations in the first system above. The fourth column represents the numbers on the right-hand sides of the equations in the second system above. This matrix is often written in “block” form as $[A \mid I]$, where $A$ and $I$ are thought of as “submatrices” of the entire matrix.
Just make sure you realize that the unknowns “switch” depending on which system you are focused on solving.
Details of the Row Operations
Here are the details of the elementary row operations. While doing these calculations, we assume, for the sake of convenience, that we are never dividing by zero. Obviously, there will be some situations where we would not be able to complete this calculation because of division by zero.

$\left[\begin{array}{cc|cc} a & b & 1 & 0 \\ c & d & 0 & 1 \end{array}\right] \xrightarrow{\frac{1}{a}R_1 \rightarrow R_1} \left[\begin{array}{cc|cc} 1 & \frac{b}{a} & \frac{1}{a} & 0 \\ c & d & 0 & 1 \end{array}\right] \xrightarrow{R_2 - cR_1 \rightarrow R_2} \left[\begin{array}{cc|cc} 1 & \frac{b}{a} & \frac{1}{a} & 0 \\ 0 & \frac{ad-bc}{a} & -\frac{c}{a} & 1 \end{array}\right]$

Continuing,

$\xrightarrow{\frac{a}{ad-bc}R_2 \rightarrow R_2} \left[\begin{array}{cc|cc} 1 & \frac{b}{a} & \frac{1}{a} & 0 \\ 0 & 1 & \frac{-c}{ad-bc} & \frac{a}{ad-bc} \end{array}\right]$.

Finally,

$\xrightarrow{R_1 - \frac{b}{a}R_2 \rightarrow R_1} \left[\begin{array}{cc|cc} 1 & 0 & \frac{d}{ad-bc} & \frac{-b}{ad-bc} \\ 0 & 1 & \frac{-c}{ad-bc} & \frac{a}{ad-bc} \end{array}\right]$.

This last step is dependent on the symbolic calculation $\frac{1}{a} - \frac{b}{a}\cdot\frac{-c}{ad-bc} = \frac{1}{a} + \frac{bc}{a(ad-bc)} = \frac{ad-bc+bc}{a(ad-bc)} = \frac{ad}{a(ad-bc)} = \frac{d}{ad-bc}$.
This implies that the solution of the first system of two equations and two unknowns is $x_1 = \frac{d}{ad-bc}$ and $x_3 = \frac{-c}{ad-bc}$, while the solution of the second system of two equations and two unknowns is $x_2 = \frac{-b}{ad-bc}$ and $x_4 = \frac{a}{ad-bc}$.
In other words,

$A^{-1} = \begin{bmatrix} \frac{d}{ad-bc} & \frac{-b}{ad-bc} \\ \frac{-c}{ad-bc} & \frac{a}{ad-bc} \end{bmatrix}$.
It is common to also write this as

$A^{-1} = \frac{1}{ad-bc}\begin{bmatrix} d & -b \\ -c & a \end{bmatrix}$.
In so doing we are assuming that we can multiply matrices by scalars (numbers) just as we can with vectors. This is indeed a valid operation.
Whereas scalar multiplication of a number and a vector is said to be done component-wise, scalar multiplication of a number times a matrix is said to be done entry-wise. These two concepts are essentially equivalent. In fact, matrices can even be thought of as vectors, if we want. For example, it is sometimes fruitful to think of a $2 \times 2$ matrix as a four-dimensional vector.
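As a sanity check on this derivation, symbolic row reduction of $[A \mid I]$ reproduces the same formula. Here is a sketch assuming the SymPy library is available:

```python
import sympy as sp

a, b, c, d = sp.symbols('a b c d')
A = sp.Matrix([[a, b], [c, d]])

# Row reduce the doubly-augmented matrix [A | I] to RREF
# (valid generically, i.e., assuming a != 0 and ad - bc != 0 as in the text).
rref_matrix, pivots = A.row_join(sp.eye(2)).rref()
right_block = rref_matrix[:, 2:]            # this block should be A^{-1}

formula = sp.Matrix([[d, -b], [-c, a]]) / (a * d - b * c)
difference = (right_block - formula).applyfunc(sp.simplify)
print(difference)    # the zero matrix, so the row reduction reproduces the formula
```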
Checking the Answer for the Inverse Matrix
We can always check the answer for the inverse matrix using matrix multiplication. Let’s avoid fractions initially by excluding the denominator $ad - bc$:

$\begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} d & -b \\ -c & a \end{bmatrix} = \begin{bmatrix} ad - bc & -ab + ba \\ cd - dc & -cb + da \end{bmatrix} = \begin{bmatrix} ad-bc & 0 \\ 0 & ad-bc \end{bmatrix}$.

Multiplying this matrix entry-wise by the scalar $\frac{1}{ad-bc}$ yields the identity matrix $I$, as desired. Our answer is confirmed.
Condition for Invertibility
The calculations above are dependent on not dividing by zero. In other words, to do them, we are implicitly assuming that $a \neq 0$ and $ad - bc \neq 0$.
However, in the end, it turns out that $a$ can be zero, as long as $ad - bc \neq 0$ (so that $b \neq 0$ and $c \neq 0$ when $a = 0$). The formula for $A^{-1}$ in that case turns out to be the same, as it should be because of our checking of the answer in the previous section.
Note that $ad - bc$ is a quantity we have seen before, in Section 1.5, “Matrices and Linear Transformations in Low Dimensions”. It is the determinant of the matrix $A$, denoted by $\det(A) = ad - bc$.
These observations are important enough to be summarized and labeled as a theorem.
Theorem 1.9.1: A $2 \times 2$ matrix $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ is invertible if and only if the determinant $\det(A) = ad - bc \neq 0$. Because of this, a linear transformation $T: \mathbb{R}^2 \rightarrow \mathbb{R}^2$ is invertible if and only if its (standard) matrix has a nonzero determinant.
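Here is a tiny sketch of the determinant test from Theorem 1.9.1; the matrices are made up purely for illustration:

```python
def det2(A):
    """Determinant ad - bc of a 2x2 matrix given as [[a, b], [c, d]]."""
    (a, b), (c, d) = A
    return a * d - b * c

print(det2([[1, 2], [3, 4]]))   # -2 (nonzero), so this matrix is invertible
print(det2([[1, 2], [2, 4]]))   # 0, so this matrix is noninvertible
```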
Using the Inverse Matrix
With the formula for our inverse matrix in hand, we can very quickly solve an arbitrary system of two equations and two unknowns when there is a unique solution. The arbitrary system can be written both in scalar form and in matrix/vector form:

$\begin{cases} ax + by = b_1 \\ cx + dy = b_2 \end{cases} \qquad \Longleftrightarrow \qquad \begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}, \quad \text{i.e., } A\mathbf{x} = \mathbf{b}$.
When $\det(A) = ad - bc \neq 0$, for any fixed vector $\mathbf{b} = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix} \in \mathbb{R}^2$, the unique solution of the system is

$\mathbf{x} = A^{-1}\mathbf{b} = \frac{1}{ad-bc}\begin{bmatrix} d & -b \\ -c & a \end{bmatrix}\begin{bmatrix} b_1 \\ b_2 \end{bmatrix} = \frac{1}{ad-bc}\begin{bmatrix} db_1 - bb_2 \\ -cb_1 + ab_2 \end{bmatrix}$.
This can be checked by substitution into the original system. In fact, now that we have notation for the identity matrix, we can also check it via a more abstract substitution and the associative property of matrix multiplication: $A(A^{-1}\mathbf{b}) = (AA^{-1})\mathbf{b} = I\mathbf{b} = \mathbf{b}$. For this example, $I$ would be the $2 \times 2$ identity matrix. However, this abstract calculation works in any dimension.
The last calculation confirms that the vector $A^{-1}\mathbf{b}$ is a solution of the equation $A\mathbf{x} = \mathbf{b}$ when $A$ is invertible. The following calculation confirms that $A^{-1}\mathbf{b}$ is the only possible solution when $A$ is invertible: if $A\mathbf{x} = \mathbf{b}$, then $A^{-1}(A\mathbf{x}) = A^{-1}\mathbf{b}$, so $(A^{-1}A)\mathbf{x} = A^{-1}\mathbf{b}$, so $I\mathbf{x} = A^{-1}\mathbf{b}$, and therefore $\mathbf{x} = A^{-1}\mathbf{b}$.
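Here is a brief numerical sketch of solving a $2 \times 2$ system with the inverse-matrix formula, assuming NumPy is available; the particular numbers are made up for illustration:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [5.0, 3.0]])       # det(A) = 2*3 - 1*5 = 1, so A is invertible
b = np.array([4.0, 11.0])

# Inverse formula for a 2x2 matrix: (1/(ad - bc)) * [[d, -b], [-c, a]]
a_, b_, c_, d_ = A[0, 0], A[0, 1], A[1, 0], A[1, 1]
A_inv = np.array([[d_, -b_], [-c_, a_]]) / (a_ * d_ - b_ * c_)

x = A_inv @ b
print(x)        # [1. 2.], the unique solution
print(A @ x)    # [ 4. 11.], reproducing b as a check
```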
The Inverse Matrix of an Invertible Linear Transformation $T: \mathbb{R}^3 \rightarrow \mathbb{R}^3$
For a linear transformation in three dimensions, let’s start with a couple of particular examples rather than the general case.
An Invertible Example
Suppose $T: \mathbb{R}^3 \rightarrow \mathbb{R}^3$ is defined by:
.
If it exists, the inverse matrix of $A$ would be of the form $A^{-1} = \begin{bmatrix} x_{11} & x_{12} & x_{13} \\ x_{21} & x_{22} & x_{23} \\ x_{31} & x_{32} & x_{33} \end{bmatrix}$, and we would seek to solve 3 uncoupled systems of 3 equations and 3 unknowns each.
These are most efficiently solved simultaneously by performing row operations on a “triply-augmented” matrix $[A \mid I]$, keeping in mind that the unknowns being solved for are different for each of the last three columns.
Here is the result for this example. The details are left to you as an exercise.
Focus on the fourth column. This corresponds to the solution of the first of the three systems of 3 equations and 3 unknowns. The unique solution of that system forms the first column of $A^{-1}$.
Likewise, if we focus on the fifth and sixth columns, we obtain the unique solutions of the second and third systems, respectively. These solutions form the second and third columns of $A^{-1}$.
The Answer and Confirmation Via Multiplication
Combining all this information leads to the conclusion that the inverse matrix of $A$ exists and is
.
We should check this. To help us avoid fractions as much as possible, notice that we can write our answer above as
Now multiply the original matrix $A$ times this new matrix, keeping the scalar factor in front. Make sure you use the dot product-related version of matrix multiplication to check this as quickly as possible.
.
And this simplifies to
.
As another check, you could confirm that $A^{-1}A = I$ as well. The details of the calculations are actually different.
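Since these row operations are tedious by hand, it may help to see the same procedure carried out symbolically on a made-up $3 \times 3$ matrix (not the matrix from this example), assuming SymPy is available:

```python
import sympy as sp

# A made-up invertible 3x3 matrix (NOT the matrix from the example above).
A = sp.Matrix([[2, 1, 1],
               [1, 3, 2],
               [1, 0, 0]])

# Row reduce the triply-augmented matrix [A | I] and read off the right block.
rref_matrix, _ = A.row_join(sp.eye(3)).rref()
A_inv = rref_matrix[:, 3:]

print(A_inv)         # the exact inverse, with rational entries
print(A * A_inv)     # the 3x3 identity matrix, confirming the answer
print(A_inv * A)     # also the identity, as a second check
```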
Using the Inverse Matrix
We can now use the inverse matrix to solve an arbitrary system of the form $A\mathbf{x} = \mathbf{b}$, where $\mathbf{b} \in \mathbb{R}^3$ is an arbitrary vector.
The answer is $\mathbf{x} = A^{-1}\mathbf{b}$.
A Non-Invertible Example
Suppose $T: \mathbb{R}^3 \rightarrow \mathbb{R}^3$ is defined by:
.
The “triply-augmented” matrix we need to row reduce is again of the general form $[A \mid I]$. Here is the result.
The third row has all zeros in the first three columns and nonzero numbers in the last three columns. The first three columns correspond to the coefficients of the unknowns of the three systems of 3 equations and 3 unknowns that we would solve for the entries of $A^{-1}$. This means that all three systems are inconsistent (the third equations that result from the row operations are contradictions of the form $0 = c$ for nonzero numbers $c$).
Therefore, $A^{-1}$ does not exist for this example! This means that $T^{-1}$ also does not exist. The linear transformation $T$ and its matrix $A$ are noninvertible.
Since $A$ is noninvertible, this also means the solutions of $A\mathbf{x} = \mathbf{b}$, if there are any, cannot be written in terms of an inverse matrix.
In the end, we will see that, for this example, such a system is consistent for some vectors $\mathbf{b}$ and inconsistent for other such vectors. When the system is consistent, there will be infinitely many solutions.
These facts correspond to the fact that, for this example, the linear transformation defined by $T(\mathbf{x}) = A\mathbf{x}$ is neither one-to-one nor onto. Its kernel (the null space of $A$) contains more than just the zero vector, and its image (the column space of $A$) is not all of $\mathbb{R}^3$.
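The same row-reduction test detects noninvertibility. Here is a sketch with a made-up singular matrix (again, not the matrix from this example), assuming SymPy is available:

```python
import sympy as sp

# A made-up noninvertible 3x3 matrix: its third row is the sum of the first two.
A = sp.Matrix([[1, 2, 3],
               [4, 5, 6],
               [5, 7, 9]])

rref_matrix, pivots = A.row_join(sp.eye(3)).rref()
print(rref_matrix)   # the left 3x3 block is not the identity: its last row is all zeros
print(pivots)        # fewer than 3 pivots land in the first 3 columns
print(A.det())       # 0, consistent with the determinant condition below
```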
Determinant Condition for Three-Dimensional Square Matrices
Is there a determinant condition for linear transformations $T: \mathbb{R}^3 \rightarrow \mathbb{R}^3$ and their $3 \times 3$ matrices? Yes, there is.
We start by stating the very complicated formula for the determinant of a $3 \times 3$ matrix $A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$. This formula should not be memorized. Instead, there are techniques for computation that are worth remembering. We will reserve those techniques for the next section.

$\det(A) = a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31} - a_{12}a_{21}a_{33} - a_{11}a_{23}a_{32}$
And here is the determinant condition.
Theorem 1.9.2: A $3 \times 3$ matrix $A$ is invertible if and only if the determinant $\det(A) \neq 0$. Because of this, a linear transformation $T: \mathbb{R}^3 \rightarrow \mathbb{R}^3$ is invertible if and only if its (standard) matrix has a nonzero determinant.
Referring to the examples above, notice that the determinant of the matrix in the invertible example is nonzero, while the determinant of the matrix in the non-invertible example is zero.
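Here is a short sketch of the $3 \times 3$ determinant condition, applied to two made-up matrices (one invertible, one not); the cofactor expansion used below is algebraically equivalent to the formula above:

```python
import numpy as np

def det3(A):
    """3x3 determinant via cofactor expansion along the first row."""
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
          - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
          + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

invertible    = [[2, 1, 1], [1, 3, 2], [1, 0, 0]]   # made-up invertible matrix
noninvertible = [[1, 2, 3], [4, 5, 6], [5, 7, 9]]   # made-up singular matrix

print(det3(invertible))      # -1, nonzero
print(det3(noninvertible))   # 0
print(np.linalg.det(np.array(invertible, dtype=float)))   # cross-check with NumPy
```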
The General Case
Suppose $T: \mathbb{R}^n \rightarrow \mathbb{R}^n$ is a linear transformation with (standard) matrix $A$, so that $T(\mathbf{x}) = A\mathbf{x}$ for some square $n \times n$ matrix $A$. There are three issues to consider.
1. If $T$ is an invertible function, is $T^{-1}$ a linear transformation? (We have been assuming this to be true so far.)
2. If $T$ is invertible (so that $A$ is invertible) and $T^{-1}$ is linear, how is $A^{-1}$ found so that $T^{-1}(\mathbf{x}) = A^{-1}\mathbf{x}$?
3. Is there a determinant condition on $A$ to decide the invertibility of both $A$ and $T$?
The answer to (3) is “yes”. We will leave our exploration of this fact for the next section.
The Inverse of an Invertible Linear Transformation is a Linear Transformation
The answer to (1) is “yes”. Verifying this is trickier than you might think. Given scalars $a$ and $b$ and vectors $\mathbf{u}, \mathbf{v} \in \mathbb{R}^n$, we need to prove that $T^{-1}$ is operation-preserving. That is, we need to prove that

$T^{-1}(a\mathbf{u} + b\mathbf{v}) = aT^{-1}(\mathbf{u}) + bT^{-1}(\mathbf{v})$.
To demonstrate this, we use the facts that $T$ and $T^{-1}$ are inverses, that $T$ is linear (operation-preserving), and that $T$ is one-to-one when it is invertible. First, since $T$ and $T^{-1}$ are inverses,

$T\!\left(T^{-1}(a\mathbf{u} + b\mathbf{v})\right) = a\mathbf{u} + b\mathbf{v}$.
Next, for the same reason, and since $T$ is linear (operation-preserving),

$T\!\left(aT^{-1}(\mathbf{u}) + bT^{-1}(\mathbf{v})\right) = aT\!\left(T^{-1}(\mathbf{u})\right) + bT\!\left(T^{-1}(\mathbf{v})\right) = a\mathbf{u} + b\mathbf{v}$.
Therefore, $T\!\left(T^{-1}(a\mathbf{u} + b\mathbf{v})\right) = T\!\left(aT^{-1}(\mathbf{u}) + bT^{-1}(\mathbf{v})\right)$.
But now the fact that $T$ is one-to-one implies that $T^{-1}(a\mathbf{u} + b\mathbf{v}) = aT^{-1}(\mathbf{u}) + bT^{-1}(\mathbf{v})$. This is exactly what we set out to prove.
Algorithm for Finding the Inverse Matrix of an Invertible Linear Transformation
The algorithm (method) for finding $A^{-1}$, as well as for determining whether $A$ is invertible, is completely analogous to what we did in the two- and three-dimensional cases above.
Form the augmented matrix $[A \mid I_n]$ and use elementary row operations to obtain its reduced row echelon form (RREF).
If the block form of the result is $[I_n \mid B]$, then $A$ is invertible and $A^{-1} = B$.
If the block form of the result contains a row of zeros in the first $n$ columns, then $A$ is noninvertible.
This method is justified because we are solving $n$ systems of $n$ linear equations in $n$ unknowns, and each column of $I_n$ on the right side of the original augmented matrix represents the right-hand side of one of these systems. These right-hand sides each consist of one “1” and $n-1$ “0”s, based on the key equation $AA^{-1} = I_n$.
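A minimal implementation sketch of this algorithm (Gauss-Jordan elimination on $[A \mid I_n]$, with partial pivoting added for numerical stability), assuming NumPy is available:

```python
import numpy as np

def inverse_via_rref(A, tol=1e-12):
    """Return the inverse of A by row reducing [A | I]; raise ValueError if A is noninvertible."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    M = np.hstack([A, np.eye(n)])              # the augmented matrix [A | I]

    for col in range(n):
        pivot = col + np.argmax(np.abs(M[col:, col]))   # partial pivoting
        if abs(M[pivot, col]) < tol:
            raise ValueError("Matrix is noninvertible (no usable pivot found).")
        M[[col, pivot]] = M[[pivot, col]]      # swap rows so the pivot is in place
        M[col] /= M[col, col]                  # scale the pivot row so the pivot equals 1
        for row in range(n):                   # clear the pivot column in every other row
            if row != col:
                M[row] -= M[row, col] * M[col]

    return M[:, n:]                            # the right block is now the inverse

A = np.array([[2.0, 1.0], [5.0, 3.0]])
print(inverse_via_rref(A))          # [[ 3. -1.], [-5.  2.]]
print(inverse_via_rref(A) @ A)      # the identity matrix (up to rounding)
```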
A Visual Example (Inverse of a Rotation)
We end this section with a visual two-dimensional example. It demonstrates that the inverse of a counterclockwise rotation about the origin by an angle $\theta$ is a clockwise rotation about the origin by the same angle. This can also be thought of as a counterclockwise rotation by the angle $-\theta$, but the first interpretation is more natural.
Let $T: \mathbb{R}^2 \rightarrow \mathbb{R}^2$ be a counterclockwise rotation about the origin by an angle $\theta$. In Section 1.8, “Matrix Multiplication and Composite Transformations”, we saw that $T(\mathbf{x}) = A\mathbf{x}$, where

$A = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$.
Here are the steps of the row-reduction algorithm on the augmented matrix $[A \mid I]$:
This is in the block form $[I \mid A^{-1}]$, so $A^{-1} = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}$.
The function $T^{-1}(\mathbf{x}) = A^{-1}\mathbf{x}$ does indeed represent a clockwise rotation about the origin by the angle $\theta$, since $\cos(-\theta) = \cos\theta$ and $\sin(-\theta) = -\sin\theta$ allow us to write $A^{-1} = \begin{bmatrix} \cos(-\theta) & -\sin(-\theta) \\ \sin(-\theta) & \cos(-\theta) \end{bmatrix}$.
This is visualized in the animation below. We see both the action of $T$ and the action of $T^{-1}$ in this animation.
It is also interesting to note that the inverse of a shear will be a shear in the “opposite direction” while the inverse of a reflection will be itself.
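A quick numerical sketch of the rotation fact above, assuming NumPy is available (the angle is arbitrary):

```python
import numpy as np

theta = 0.7    # an arbitrary angle, in radians

def rotation(angle):
    """Matrix of the counterclockwise rotation about the origin by the given angle."""
    return np.array([[np.cos(angle), -np.sin(angle)],
                     [np.sin(angle),  np.cos(angle)]])

A = rotation(theta)
A_inv = np.linalg.inv(A)

# The inverse of rotation by theta is rotation by -theta (a clockwise rotation by theta).
print(np.allclose(A_inv, rotation(-theta)))   # True
print(np.allclose(A_inv @ A, np.eye(2)))      # True
```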
Exercises
- (a) Show all the details of the row operations necessary on the block augmented matrix $[A \mid I]$ to confirm the inverse matrix $A^{-1}$ found for the invertible example above. (b) Confirm that $A^{-1}A = I$ by direct multiplication.
- Let $T: \mathbb{R}^2 \rightarrow \mathbb{R}^2$ be defined by $T(\mathbf{x}) = A\mathbf{x}$ for the given matrix $A$. Find a formula for $T^{-1}$ or confirm that $T^{-1}$ does not exist. If $T^{-1}$ exists, write a simplified formula for the solution of the general system $A\mathbf{x} = \mathbf{b}$, where $\mathbf{b} \in \mathbb{R}^2$.
Video for Section 1.9
Here’s a video overview of the content of Section 1.9.