Visual Linear Algebra Online, Section 1.9
![](https://i0.wp.com/infinityisreallybig.com/wp-content/uploads/2019/07/Rotation3Final.gif?resize=408%2C488&ssl=1)
Inverse functions are a kind of high technology in mathematics. They can help you solve infinitely many problems at once!
In linear algebra, some linear transformations on finite-dimensional Euclidean space have inverse functions. Those that do have an associated inverse matrix. In this section, our main goals are to explore how to calculate the inverse matrix and to see how it is useful.
A Basic Example
But let’s start with a basic example of an inverse function.
Suppose the height above the ground, in meters, of a falling object, as a function of time $t$, in seconds, is $h = f(t) = h_0 - 4.9t^{2}$, where $h_0$ is the initial height in meters (free fall without air resistance). The graph of this function is shown below. Note that the appropriate domain for this application consists of those values of $t$ where $f(t) \geq 0$. This is equivalent to $0 \leq t \leq \sqrt{h_0/4.9}$ seconds.
![The height of a falling object as a function of time. (Free fall without air resistance)](https://i0.wp.com/infinityisreallybig.com/wp-content/uploads/2019/07/HeightAsFunctionOfTime2.png?resize=525%2C299&ssl=1)
This function is decreasing because the object is falling. The graph is also concave down because the object falls faster and faster over time.
How long will it take the object to reach a height of 100 meters? Just set $f(t) = 100$ and solve for $t$:

$h_0 - 4.9t^{2} = 100 \quad\Longrightarrow\quad t = \sqrt{\dfrac{h_0 - 100}{4.9}} \text{ seconds}.$
Solving the General Problem
What if we want to solve this problem for an arbitrary height $h$? The same steps give us the inverse function $f^{-1}$ of $f$. It will solve the general problem:

$t = f^{-1}(h) = \sqrt{\dfrac{h_0 - h}{4.9}}.$

The inverse function represents the solution to infinitely many problems: every problem where we solve $f(t) = h$ for $t$. In other words, if we are given an arbitrary height above the ground $h$, this function computes the amount of time $t$ it takes to fall to that height.
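Here is a minimal Python sketch of this idea. It assumes the free-fall model above with a sample initial height of $h_0 = 300$ meters, a value chosen just for illustration: it defines the height function, its inverse, and checks that they undo each other.

```python
import math

H0 = 300.0    # assumed initial height in meters (illustrative value)
G_HALF = 4.9  # half the gravitational acceleration, in m/s^2

def f(t):
    """Height (in meters) of the falling object after t seconds."""
    return H0 - G_HALF * t**2

def f_inv(h):
    """Time (in seconds) needed to fall to a height of h meters."""
    return math.sqrt((H0 - h) / G_HALF)

# How long to reach a height of 100 meters?
print(f_inv(100.0))      # about 6.39 seconds for this choice of H0

# The function and its inverse undo each other on the restricted domain.
print(f(f_inv(100.0)))   # 100.0
print(f_inv(f(3.0)))     # 3.0
```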
When the independent and dependent variables of a function have no real-life meaning, it is traditional to swap the variables when finding an inverse function. When this is done and the axes have the same scale, the graphs of the function and its inverse are reflections of each other across the diagonal line $y = x$ through the origin.
However, if the variables do have real-life meaning, the variables should not be swapped.
This is the case for our example. In this case, it is better to graph $t = f^{-1}(h)$ by swapping the axes of the original graph instead, as shown below. The domain of this function for this application is the interval $0 \leq h \leq h_0$.
![](https://i0.wp.com/infinityisreallybig.com/wp-content/uploads/2019/07/TimeAsFunctionOfHeight.png?resize=525%2C278&ssl=1)
This function is decreasing and concave down as well. This also reflects the fact that the object falls at a faster and faster rate over time, though explaining why this is true is more difficult.
Composition of the Function and Its Inverse
In Section 1.8, “Matrix Multiplication and Composite Transformations”, we discussed function composition. This is where two functions are applied in sequence. It can also be thought of as “plugging one function into another”. A small circle is used to represent the binary operation of function composition, where two functions are combined in this way to obtain a third function.
There is an intimate relationship between a function and its inverse with respect to function composition. Let’s see what happens when we compose the functions for our example above.
For $(f^{-1} \circ f)(t) = f^{-1}(f(t))$, we get

$f^{-1}(f(t)) = \sqrt{\dfrac{h_0 - (h_0 - 4.9t^{2})}{4.9}} = \sqrt{t^{2}}.$

However, since we are assuming that $t \geq 0$, it follows that $\sqrt{t^{2}} = t$ and therefore $f^{-1}(f(t)) = t$ for all $t$ in the domain of $f$.
Here is the computation for $(f \circ f^{-1})(h) = f(f^{-1}(h))$:

$f(f^{-1}(h)) = h_0 - 4.9\left(\sqrt{\dfrac{h_0 - h}{4.9}}\right)^{2} = h_0 - (h_0 - h) = h.$

This is true for all $h$ in the domain of $f^{-1}$.
In both cases we see that $f$ and $f^{-1}$ “undo” each other. The output of the composite function is the same as the input.
It should also be clear that we need to be careful in discussing the domains of these functions. For example, if $t < 0$, then $\sqrt{t^{2}} = -t$ instead. Also, if $h > h_0$, then $\sqrt{(h_0 - h)/4.9}$ is an imaginary number, which we would want to avoid for this application.
You may have also noted that if we consider the domain of $f$ to be all of ${\Bbb R}$ instead of the interval $0 \leq t \leq \sqrt{h_0/4.9}$, then $f$ is no longer a one-to-one function. This would mean it has no inverse function. The domain must be restricted in order for an inverse function to exist.
Inverse Functions in General
Let $X$ and $Y$ be two nonempty sets. Suppose $f$ is a function with domain $X$ and codomain $Y$. Notationally, we have represented this situation as $f: X \longrightarrow Y$ or $X \stackrel{f}{\longrightarrow} Y$.
One-to-One (Injective) and Onto (Surjective) Functions
In previous sections, such as Section 1.5 “Matrices and Linear Transformations in Low Dimensions”, we have already discussed what it means for a function to be one-to-one (injective) and/or onto (surjective). However, here we will state precise definitions.
Definition of One-to-One (Injective) and Examples
Definition 1.9.1: A function $f: X \longrightarrow Y$ is one-to-one (injective) if distinct inputs give distinct outputs. That is, $f$ is one-to-one if $x_1 \neq x_2$ implies that $f(x_1) \neq f(x_2)$. This is equivalent to saying that $f(x_1) = f(x_2)$ implies that $x_1 = x_2$.
As a quick example, consider the function $f: {\Bbb R} \longrightarrow {\Bbb R}$ defined by $f(x) = 4x - 5$. If $f(x_1) = f(x_2)$, then $4x_1 - 5 = 4x_2 - 5$. Adding 5 to both sides of this equation gives $4x_1 = 4x_2$. But then, dividing both sides of this by 4 allows us to conclude that $x_1 = x_2$. This is sufficient to prove that this function is one-to-one.
On the other hand, the function $g: {\Bbb R} \longrightarrow {\Bbb R}$ defined by $g(x) = x^{2}$ is not one-to-one since, for example, $g(-1) = g(1) = 1$.
Definition of Onto (Surjective) and Examples
Definition 1.9.2: A function $f: X \longrightarrow Y$ is onto (surjective) if every element of $Y$ is an output of some input from $X$. That is, $f$ is onto if for all $y \in Y$, there exists (at least one) $x \in X$ such that $f(x) = y$.
The function $f: {\Bbb R} \longrightarrow {\Bbb R}$ defined by $f(x) = 4x - 5$ is surjective. Given $y \in {\Bbb R}$, the number $x = \frac{y + 5}{4}$ has the property that $f(x) = y$. Here are the details:

$f\left(\frac{y + 5}{4}\right) = 4 \cdot \frac{y + 5}{4} - 5 = (y + 5) - 5 = y.$

It is no accident that $x = \frac{y + 5}{4} = f^{-1}(y)$.
On the other hand, the function $g: {\Bbb R} \longrightarrow {\Bbb R}$ defined by $g(x) = x^{2}$ is not onto since $g(x) = x^{2} \geq 0$ for all $x \in {\Bbb R}$. There are no inputs in ${\Bbb R}$ that give negative outputs.

We can “force” $g$ to be onto by restricting its codomain. Do this by defining the codomain to be the interval $[0, \infty)$ instead of ${\Bbb R}$.
The function $g: {\Bbb R} \longrightarrow [0, \infty)$ defined by $g(x) = x^{2}$ is onto. Given $y \in [0, \infty)$, the two numbers $x = \pm\sqrt{y}$ both have the property that $g(x) = y$ (of course, when $y = 0$ this is actually only one number).
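The sketch below, which assumes the example functions $f(x) = 4x - 5$ and $g(x) = x^{2}$ used above, illustrates these definitions numerically: solving $f(x) = y$ always produces exactly one preimage, while $g$ sends $\pm x$ to the same output and never produces a negative output.

```python
import math

def f(x):
    return 4 * x - 5

def g(x):
    return x ** 2

# f is one-to-one and onto R: every y has exactly one preimage, (y + 5) / 4.
for y in [-7.0, 0.0, 3.5]:
    x = (y + 5) / 4
    print(y, x, f(x))    # f(x) recovers y exactly

# g is not one-to-one: distinct inputs can give the same output.
print(g(-3), g(3))       # 9 9

# g is not onto R: outputs are never negative, so y = -1 has no preimage.
# Restricting the codomain to [0, infinity) makes g onto; the preimages of
# any y >= 0 are +sqrt(y) and -sqrt(y).
y = 2.25
print(g(math.sqrt(y)), g(-math.sqrt(y)))   # 2.25 2.25
```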
Inverse Functions
Let $f: X \longrightarrow Y$ be a one-to-one and onto function (a “bijection”). Then, given any $y \in Y$, there must be a unique element $x \in X$ such that $f(x) = y$. The existence of this element is guaranteed by the fact that $f$ is onto. The uniqueness of this element is guaranteed by the fact that $f$ is one-to-one.
We have essentially just defined $f^{-1}: Y \longrightarrow X$. To be precise, define $f^{-1}$ by saying, for each $y \in Y$, that $f^{-1}(y)$ is the unique element of $X$ that gets mapped to $y$ by $f$. Alternatively, $f^{-1}(y)$ is the unique element $x$ of $X$ such that $f(x) = y$.
Does this “undoing action” work the other way around? Given $x \in X$, we know that $f(x) \in Y$. By definition, $f^{-1}(f(x))$ is the unique element of $X$ that gets mapped to $f(x)$. But we already know that $x$ gets mapped to $f(x)$! Therefore, $f^{-1}(f(x)) = x$.
When $f$ is one-to-one and onto, we have just seen that $f^{-1}: Y \longrightarrow X$ can be defined. We say that $f$ is invertible in this situation.
Also note that if $\mathrm{id}_X: X \longrightarrow X$ and $\mathrm{id}_Y: Y \longrightarrow Y$ are the functions defined by $\mathrm{id}_X(x) = x$ for all $x \in X$ and $\mathrm{id}_Y(y) = y$ for all $y \in Y$, then $f^{-1} \circ f = \mathrm{id}_X$ and $f \circ f^{-1} = \mathrm{id}_Y$ when $f$ is invertible. The functions $\mathrm{id}_X$ and $\mathrm{id}_Y$ are called the identity mappings on $X$ and $Y$, respectively.
These last couple of equations are analogous to the equations $a \cdot \frac{1}{a} = \frac{1}{a} \cdot a = 1$ for nonzero numbers $a$. The function $\mathrm{id}_X$, for example, is analogous to the number 1 in the sense that $f \circ \mathrm{id}_X = f$ for any function $f: X \longrightarrow Y$. This is analogous to the fact that $a \cdot 1 = a$ for any $a \in {\Bbb R}$.
These ideas can certainly be confusing for many people. Make sure you completely understand everything before moving on.
The Inverse Matrix of an Invertible Linear Transformation
In Section 1.7, “High-Dimensional Linear Algebra”, we saw that a linear transformation $T: {\Bbb R}^{n} \longrightarrow {\Bbb R}^{m}$ can be represented by an $m \times n$ matrix $A$. This means that, for each input $\mathbf{x} \in {\Bbb R}^{n}$, the output $T(\mathbf{x}) \in {\Bbb R}^{m}$ can be computed as the product $A\mathbf{x}$.

To do this, we define $A\mathbf{x}$ as a linear combination of the columns of $A$:

$A\mathbf{x} = x_{1}\mathbf{a}_{1} + x_{2}\mathbf{a}_{2} + \cdots + x_{n}\mathbf{a}_{n},$

where $\mathbf{a}_{1}, \mathbf{a}_{2}, \ldots, \mathbf{a}_{n}$ are the columns of $A$ and $x_{1}, x_{2}, \ldots, x_{n}$ are the entries of $\mathbf{x}$.
Then, in Section 1.8, “Matrix Multiplication and Composite Transformations”, we saw that if $S \circ T$ is a composition of linear transformations $T: {\Bbb R}^{n} \longrightarrow {\Bbb R}^{m}$ and $S: {\Bbb R}^{m} \longrightarrow {\Bbb R}^{p}$, then $S \circ T$ can be represented by the product of two matrices.

Assuming that $A$ is the $m \times n$ matrix representing $T$ and $B$ is the $p \times m$ matrix representing $S$, then $(S \circ T)(\mathbf{x}) = BA\mathbf{x}$, where the product $BA$ is defined by:

$BA = \begin{bmatrix} B\mathbf{a}_{1} & B\mathbf{a}_{2} & \cdots & B\mathbf{a}_{n} \end{bmatrix}.$

This product is a $p \times n$ matrix. Therefore, we can also write $S \circ T: {\Bbb R}^{n} \longrightarrow {\Bbb R}^{p}$.
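As a quick numerical illustration (a sketch with small sample matrices, not matrices taken from the text), the product $A\mathbf{x}$ can be computed either directly or as a linear combination of the columns of $A$, and applying two transformations in sequence matches multiplying by the product of their matrices.

```python
import numpy as np

A = np.array([[1.0, 2.0, 0.0],
              [3.0, -1.0, 4.0]])      # a sample 2 x 3 matrix (T: R^3 -> R^2)
B = np.array([[2.0, 1.0],
              [0.0, -1.0],
              [1.0, 1.0],
              [5.0, 0.0]])            # a sample 4 x 2 matrix (S: R^2 -> R^4)
x = np.array([1.0, -2.0, 3.0])

# A @ x equals the linear combination x1*a1 + x2*a2 + x3*a3 of the columns of A.
direct = A @ x
as_combination = x[0] * A[:, 0] + x[1] * A[:, 1] + x[2] * A[:, 2]
print(np.allclose(direct, as_combination))    # True

# Composition: S(T(x)) is computed by the 4 x 3 product BA.
print(np.allclose(B @ (A @ x), (B @ A) @ x))  # True
print((B @ A).shape)                          # (4, 3)
```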
A Necessary Condition for Invertibility
Evidently, if $T: {\Bbb R}^{n} \longrightarrow {\Bbb R}^{m}$ and $S = T^{-1}: {\Bbb R}^{m} \longrightarrow {\Bbb R}^{n}$ are to have a “chance” to be inverse functions, and for their matrices $A$ and $B$ to be inverse matrices, it must be the case that $m = n$. This is indeed the case. In this situation, the resulting matrices are called square matrices because they have a “square shape” (rather than a general “rectangular shape”).
Let’s start by considering inverse linear transformations and inverse matrices in low dimensions.
Application of Inverse Transformations and Inverse Matrices
As with general inverse functions, inverse linear transformations and their corresponding inverse matrices can help us solve infinitely many problems at once. If $T: {\Bbb R}^{n} \longrightarrow {\Bbb R}^{n}$ is an invertible linear transformation with invertible matrix representative $A$, then we can solve the general problem $A\mathbf{x} = \mathbf{b}$ for any $\mathbf{b} \in {\Bbb R}^{n}$.

In fact, we can write the answers to these infinitely many problems in one equation as $\mathbf{x} = A^{-1}\mathbf{b}$. It is therefore useful to find a formula for $T^{-1}$ by finding $A^{-1}$.
The Inverse Matrix of an Invertible Linear Transformation ${\Bbb R} \longrightarrow {\Bbb R}$
We have seen that a linear transformation $T: {\Bbb R} \longrightarrow {\Bbb R}$ has a formula of the form $T(x) = ax$ for some $a \in {\Bbb R}$. With matrix notation, we can also write this as $T(x) = [a][x]$, where $[a]$ is a $1 \times 1$ matrix.

Clearly such a function is one-to-one and onto, and hence invertible, if and only if $a \neq 0$. Just as clearly, the inverse function in that situation will be $T^{-1}(x) = \frac{1}{a}x$. This can also be written as $T^{-1}(x) = \left[\frac{1}{a}\right][x]$.
For confirmation of this fact, note that

$T^{-1}(T(x)) = \frac{1}{a}(ax) = x \quad \text{and} \quad T(T^{-1}(x)) = a\left(\frac{1}{a}x\right) = x$

for all $x \in {\Bbb R}$.
Because of this, when $a \neq 0$, we say that the inverse matrix of the $1 \times 1$ matrix $[a]$ is the $1 \times 1$ matrix $\left[\frac{1}{a}\right]$.

We also write this as $[a]^{-1} = \left[\frac{1}{a}\right]$ when $a \neq 0$. In words, this equation says that the inverse matrix of $[a]$ is the matrix $\left[\frac{1}{a}\right]$ when $a \neq 0$.
By definition of a $1 \times 1$ matrix times another $1 \times 1$ matrix, we can also see that $[a]\left[\frac{1}{a}\right] = \left[\frac{1}{a}\right][a] = [1]$. The matrix $[1]$ is called the $1 \times 1$ identity matrix. Note that $[1][x] = [x]$ for any $x \in {\Bbb R}$.
If $a = 0$, then $a$ (and $[a]$) has no multiplicative inverse. We say that $[a]$ is noninvertible (or “not invertible”) when $a = 0$.

In the end, $[0]$ is the only noninvertible $1 \times 1$ matrix.
The Inverse Matrix of an Invertible Linear Transformation ${\Bbb R}^{2} \longrightarrow {\Bbb R}^{2}$
A linear transformation $T: {\Bbb R}^{2} \longrightarrow {\Bbb R}^{2}$ has a formula of the form $T(\mathbf{x}) = A\mathbf{x}$ for some $2 \times 2$ matrix $A$, say $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$, where $a, b, c, d \in {\Bbb R}$.
If $T$ is invertible, and assuming that $T^{-1}$ is a linear transformation (which we will prove in the general case further below), then $T^{-1}$ will have a matrix representative as well.

Let us suggestively call this matrix $A^{-1}$ and write $A^{-1} = \begin{bmatrix} e & f \\ g & h \end{bmatrix}$. One of our goals is to determine how the entries of $A^{-1}$ depend on the entries of $A$. Another goal is to see what conditions on the entries of $A$ are required for $T$ (and $A$) to be invertible.
Before doing so, we first note the truth of the following equation:

$\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} x_{1} \\ x_{2} \end{bmatrix} = \begin{bmatrix} x_{1} \\ x_{2} \end{bmatrix}.$

Because of this, we say that $I = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$ is the $2 \times 2$ identity matrix. It has the property that $I\mathbf{x} = \mathbf{x}$ for all $\mathbf{x} \in {\Bbb R}^{2}$. You can check that it also has the property that $IM = MI = M$ for any $2 \times 2$ matrix $M$.
If $T$ is invertible with inverse $T^{-1}$, then $T(T^{-1}(\mathbf{x})) = \mathbf{x}$ for all $\mathbf{x} \in {\Bbb R}^{2}$. The function $T \circ T^{-1}$ is the identity mapping on ${\Bbb R}^{2}$. Its matrix is $I$.

Since $T(T^{-1}(\mathbf{x})) = AA^{-1}\mathbf{x}$ and $I\mathbf{x} = \mathbf{x}$, it follows that we want $AA^{-1} = I$. This is the key equation to help us find $A^{-1}$.
Using the Key Equation
Below we show the equation $AA^{-1} = I$ in terms of the entries of the matrices:

$\begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} e & f \\ g & h \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.$

Now multiply the matrices on the left to get:

$\begin{bmatrix} ae + bg & af + bh \\ ce + dg & cf + dh \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.$
If we think of $a$, $b$, $c$, and $d$ as given, this is equivalent to a system of four linear equations in four unknowns ($e$, $f$, $g$, and $h$).

This system is “decoupled” (a.k.a. “uncoupled”). The first and third equations only involve $e$ and $g$, while the second and fourth only involve $f$ and $h$. We can therefore think of this as two separate systems of two equations and two unknowns:

$\begin{cases} ae + bg = 1 \\ ce + dg = 0 \end{cases} \qquad \text{and} \qquad \begin{cases} af + bh = 0 \\ cf + dh = 1. \end{cases}$
Using Row Operations to RREF to Find $A^{-1}$
We could do row operations on the following two augmented matrices to solve these two systems (the first system for $e$ and $g$ and the second system for $f$ and $h$):

$\left[\begin{array}{cc|c} a & b & 1 \\ c & d & 0 \end{array}\right] \qquad \text{and} \qquad \left[\begin{array}{cc|c} a & b & 0 \\ c & d & 1 \end{array}\right].$

However, it is more efficient to perform row operations to reduced row echelon form (RREF) on the single “doubly-augmented” matrix shown below:

$\left[\begin{array}{cc|cc} a & b & 1 & 0 \\ c & d & 0 & 1 \end{array}\right].$

Note that the third column $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$ represents the numbers on the right-hand sides of the equations in the first system above. The fourth column $\begin{bmatrix} 0 \\ 1 \end{bmatrix}$ represents the numbers on the right-hand sides of the equations in the second system above. This matrix is often written in “block” form as $[A\ |\ I]$, where $A$ and $I$ are thought of as “submatrices” of the entire matrix.

Just make sure you realize that the unknowns “switch” depending on which system you are focused on solving.
Details of the Row Operations
Here are the details of the elementary row operations. While doing these calculations, we assume, for the sake of convenience, that we are never dividing by zero. Obviously, there will be some situations where we would not be able to complete this calculation because of division by zero.

$\left[\begin{array}{cc|cc} a & b & 1 & 0 \\ c & d & 0 & 1 \end{array}\right] \xrightarrow{\frac{1}{a}R_{1} \to R_{1}} \left[\begin{array}{cc|cc} 1 & \frac{b}{a} & \frac{1}{a} & 0 \\ c & d & 0 & 1 \end{array}\right] \xrightarrow{R_{2} - cR_{1} \to R_{2}} \left[\begin{array}{cc|cc} 1 & \frac{b}{a} & \frac{1}{a} & 0 \\ 0 & \frac{ad - bc}{a} & -\frac{c}{a} & 1 \end{array}\right].$

Continuing,

$\xrightarrow{\frac{a}{ad - bc}R_{2} \to R_{2}} \left[\begin{array}{cc|cc} 1 & \frac{b}{a} & \frac{1}{a} & 0 \\ 0 & 1 & \frac{-c}{ad - bc} & \frac{a}{ad - bc} \end{array}\right].$

Finally,

$\xrightarrow{R_{1} - \frac{b}{a}R_{2} \to R_{1}} \left[\begin{array}{cc|cc} 1 & 0 & \frac{d}{ad - bc} & \frac{-b}{ad - bc} \\ 0 & 1 & \frac{-c}{ad - bc} & \frac{a}{ad - bc} \end{array}\right].$

This last step is dependent on the symbolic calculation $\frac{1}{a} + \frac{bc}{a(ad - bc)} = \frac{ad - bc + bc}{a(ad - bc)} = \frac{ad}{a(ad - bc)} = \frac{d}{ad - bc}$.
This implies that the solution of the first system of two equations and two unknowns is $e = \frac{d}{ad - bc}$ and $g = \frac{-c}{ad - bc}$, while the solution of the second system of two equations and two unknowns is $f = \frac{-b}{ad - bc}$ and $h = \frac{a}{ad - bc}$.

In other words,

$A^{-1} = \begin{bmatrix} \frac{d}{ad - bc} & \frac{-b}{ad - bc} \\ \frac{-c}{ad - bc} & \frac{a}{ad - bc} \end{bmatrix}.$

It is common to also write this as

$A^{-1} = \frac{1}{ad - bc}\begin{bmatrix} d & -b \\ -c & a \end{bmatrix}.$
In so doing we are assuming that we can multiply matrices by scalars (numbers) just as we can with vectors. This is indeed a valid operation.
Whereas scalar multiplication of a number and a vector is said to be done component-wise, scalar multiplication of a number times a matrix is said to be done entry-wise. These two concepts are essentially equivalent. In fact, matrices can even be thought of as vectors, if we want. For example, it is sometimes fruitful to think of a $2 \times 2$ matrix as a four-dimensional vector.
Checking the Answer for the Inverse Matrix
We can always check the answer for the inverse matrix using matrix multiplication. Let’s avoid fractions initially by excluding the denominator:

$\begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} d & -b \\ -c & a \end{bmatrix} = \begin{bmatrix} ad - bc & 0 \\ 0 & ad - bc \end{bmatrix}.$

Multiplying this matrix entry-wise by the scalar $\frac{1}{ad - bc}$ yields the identity matrix $I$, as desired. Our answer is confirmed.
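The same check can be done numerically. This sketch uses an arbitrary sample matrix (any values with $ad - bc \neq 0$ would do), builds the inverse from the formula above, and confirms that the products are the identity.

```python
import numpy as np

a, b, c, d = 2.0, 5.0, 1.0, 3.0           # sample entries with ad - bc = 1 != 0
A = np.array([[a, b],
              [c, d]])

det = a * d - b * c
A_inv = (1 / det) * np.array([[d, -b],
                              [-c, a]])    # the 2x2 inverse formula

print(np.allclose(A @ A_inv, np.eye(2)))      # True
print(np.allclose(A_inv @ A, np.eye(2)))      # True
print(np.allclose(A_inv, np.linalg.inv(A)))   # matches NumPy's inverse
```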
Condition for Invertibility
The calculations above are dependent on not dividing by zero. In other words, to do them, we are implicitly assuming that $a \neq 0$ and $ad - bc \neq 0$.

However, in the end, it turns out that $a$ can be zero, as long as $ad - bc \neq 0$ (so $b \neq 0$ and $c \neq 0$ if $a = 0$). The formula for $A^{-1}$ in that case turns out to be the same, as it should be because of our checking of the answer in the previous section.

Note that $ad - bc$ is a quantity we have seen before, in Section 1.5, “Matrices and Linear Transformations in Low Dimensions”. It is the determinant of the matrix $A$, denoted by $\det(A) = ad - bc$.
These observations are important enough to be summarized and labeled as a theorem.
Theorem 1.9.1: A $2 \times 2$ matrix $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ is invertible if and only if the determinant $\det(A) = ad - bc \neq 0$. Because of this, a linear transformation $T: {\Bbb R}^{2} \longrightarrow {\Bbb R}^{2}$ is invertible if and only if its (standard) matrix has a nonzero determinant.
Using the Inverse Matrix
With the formula for our inverse matrix in hand, we can very quickly solve an arbitrary system of two equations and two unknowns when there is a unique solution. The arbitrary system can be written both in scalar form and in matrix/vector form:

$\begin{cases} ax_{1} + bx_{2} = b_{1} \\ cx_{1} + dx_{2} = b_{2} \end{cases} \qquad \Longleftrightarrow \qquad \begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} x_{1} \\ x_{2} \end{bmatrix} = \begin{bmatrix} b_{1} \\ b_{2} \end{bmatrix}, \text{ that is, } A\mathbf{x} = \mathbf{b}.$

When $\det(A) = ad - bc \neq 0$, for any fixed vector $\mathbf{b}$, the unique solution of the system is

$\mathbf{x} = A^{-1}\mathbf{b} = \frac{1}{ad - bc}\begin{bmatrix} d & -b \\ -c & a \end{bmatrix}\begin{bmatrix} b_{1} \\ b_{2} \end{bmatrix}.$

This can be checked by substitution into the original system. In fact, now that we have notation for the identity matrix, we can also check it via a more abstract substitution and the associative property of matrix multiplication: $A(A^{-1}\mathbf{b}) = (AA^{-1})\mathbf{b} = I\mathbf{b} = \mathbf{b}$. For this example, $I$ would be the $2 \times 2$ identity matrix. However, this abstract calculation works in any dimension.
The last calculation confirms that the vector $A^{-1}\mathbf{b}$ is a solution of the equation $A\mathbf{x} = \mathbf{b}$ when $A$ is invertible. The following calculation confirms that $A^{-1}\mathbf{b}$ is the only possible solution when $A$ is invertible: if $A\mathbf{x} = \mathbf{b}$, then

$\mathbf{x} = I\mathbf{x} = (A^{-1}A)\mathbf{x} = A^{-1}(A\mathbf{x}) = A^{-1}\mathbf{b}.$
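Here is a short numerical sketch of this solution method, using a sample $2 \times 2$ system chosen just for illustration: the inverse-matrix formula produces the unique solution, and substitution confirms it.

```python
import numpy as np

# A sample 2x2 system (illustrative numbers):
#   2*x1 + 5*x2 = 7
#   1*x1 + 3*x2 = 4
A = np.array([[2.0, 5.0],
              [1.0, 3.0]])
b = np.array([7.0, 4.0])

det = A[0, 0] * A[1, 1] - A[0, 1] * A[1, 0]
A_inv = (1 / det) * np.array([[ A[1, 1], -A[0, 1]],
                              [-A[1, 0],  A[0, 0]]])

x = A_inv @ b
print(x)                                       # the unique solution [1. 1.]
print(np.allclose(A @ x, b))                   # True: substitution checks out
print(np.allclose(x, np.linalg.solve(A, b)))   # agrees with a direct solver
```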
The Inverse Matrix of an Invertible Linear Transformation ${\Bbb R}^{3} \longrightarrow {\Bbb R}^{3}$
For a linear transformation $T: {\Bbb R}^{3} \longrightarrow {\Bbb R}^{3}$ in three dimensions, let’s start with a couple of particular examples rather than the general case.
An Invertible Example
Suppose $T: {\Bbb R}^{3} \longrightarrow {\Bbb R}^{3}$ is defined by $T(\mathbf{x}) = A\mathbf{x}$ for a specific $3 \times 3$ matrix $A$.

If it exists, the inverse matrix of $A$ would be a $3 \times 3$ matrix of nine unknown entries, and we would seek to solve 3 uncoupled systems of 3 equations and 3 unknowns each (one system for each column of $A^{-1}$).

These are most efficiently solved simultaneously by performing row operations on a “triply-augmented” matrix $[A\ |\ I]$, keeping in mind that the unknowns solved for are different for each of the last three columns.
When the row operations are carried out for this example (the details are left to you as an exercise), the left block reduces to the identity matrix. Focus on the fourth column of the result. This corresponds to the solution of the first of the three systems of 3 equations and 3 unknowns: the first column of $A^{-1}$. Likewise, if we focus on the fifth and sixth columns, we obtain the unique solutions of the second and third systems, respectively, which are the second and third columns of $A^{-1}$.
The Answer and Confirmation Via Multiplication
Combining all this information leads to the conclusion that the inverse matrix of $A$ exists: $A^{-1}$ is the $3 \times 3$ matrix whose columns are the three solutions just found.

We should check this. To help us avoid fractions as much as possible, notice that we can write the answer as a single scalar (the reciprocal of the common denominator of its entries) times a matrix of integers.

Now multiply the original matrix $A$ times this integer matrix, with the scalar factor out in front. Make sure you use the dot product-related version of matrix multiplication to check this as quickly as possible. The product of $A$ and the integer matrix is the common denominator times the identity matrix, so after multiplying by the scalar factor, the result simplifies to the $3 \times 3$ identity matrix $I$.

As another check, you could confirm that $A^{-1}A = I$ as well. The details of the calculations are actually different.
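Since the particular matrix of this example appears only in the original figures, here is a sketch with a sample invertible $3 \times 3$ matrix that carries out the same confirmation numerically.

```python
import numpy as np

# A sample invertible 3x3 matrix (illustrative values; its determinant is 1).
A = np.array([[1.0, 2.0, 3.0],
              [0.0, 1.0, 4.0],
              [5.0, 6.0, 0.0]])

A_inv = np.linalg.inv(A)
print(A_inv)

# Confirmation via multiplication, in both orders.
print(np.allclose(A @ A_inv, np.eye(3)))   # True
print(np.allclose(A_inv @ A, np.eye(3)))   # True
```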
Using the Inverse Matrix
We can now use the inverse matrix to solve an arbitrary system of the form $A\mathbf{x} = \mathbf{b}$. The answer is $\mathbf{x} = A^{-1}\mathbf{b}$.
A Non-Invertible Example
Suppose $T: {\Bbb R}^{3} \longrightarrow {\Bbb R}^{3}$ is defined by $T(\mathbf{x}) = A\mathbf{x}$ for a different specific $3 \times 3$ matrix $A$.

The “triply-augmented” matrix we need to row reduce is again of the general form $[A\ |\ I]$. In the result for this example, the third row has all zeros in the first three columns and nonzero numbers in the last three columns. The first three columns correspond to the coefficients of the unknowns of the three systems of 3 equations and 3 unknowns to solve for the entries of $A^{-1}$. This means that all three systems are inconsistent (the third equations that result from the row operations are contradictions of the form $0 = c$ for nonzero constants $c$).
Therefore, $A^{-1}$ does not exist for this example! This means that $T^{-1}$ also does not exist. The linear transformation $T$ (and its matrix $A$) are noninvertible.

Since $A$ is noninvertible, this also means the solutions of $A\mathbf{x} = \mathbf{b}$, if there are any, cannot be written in terms of an inverse matrix.

In the end, we will see that, for this example, such a system is consistent for some vectors $\mathbf{b}$ and inconsistent for other such vectors. When the system is consistent, there will be infinitely many solutions.
These facts correspond to the fact that, for this example, the linear transformation $T$ defined by $T(\mathbf{x}) = A\mathbf{x}$ is neither one-to-one nor onto. Its kernel (the null space of $A$) contains more than just the zero vector, and its image (the column space of $A$) is not all of ${\Bbb R}^{3}$.
Determinant Condition for Three-Dimensional Square Matrices
Is there a determinant condition for linear transformations $T: {\Bbb R}^{3} \longrightarrow {\Bbb R}^{3}$? Yes, there is.

We start by stating the very complicated formula for the determinant of a $3 \times 3$ matrix $A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$. This formula should not be memorized. Instead, there are techniques for computation that are worth remembering. We will reserve those techniques for the next section.

$\det(A) = a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31} - a_{11}a_{23}a_{32} - a_{12}a_{21}a_{33}.$
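As a quick sketch (using the same sample invertible matrix introduced in the code above, not a matrix from the original figures), the six-term formula can be evaluated directly and compared against NumPy's built-in determinant.

```python
import numpy as np

def det3(A):
    """3x3 determinant computed directly from the six-term formula."""
    (a11, a12, a13), (a21, a22, a23), (a31, a32, a33) = A
    return (a11 * a22 * a33 + a12 * a23 * a31 + a13 * a21 * a32
            - a13 * a22 * a31 - a11 * a23 * a32 - a12 * a21 * a33)

A = np.array([[1.0, 2.0, 3.0],
              [0.0, 1.0, 4.0],
              [5.0, 6.0, 0.0]])                 # sample invertible matrix

print(det3(A))                                  # 1.0, nonzero, so A is invertible
print(np.isclose(det3(A), np.linalg.det(A)))    # True
```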
And here is the determinant condition.
Theorem 1.9.2: A $3 \times 3$ matrix $A$ is invertible if and only if the determinant $\det(A) \neq 0$. Because of this, a linear transformation $T: {\Bbb R}^{3} \longrightarrow {\Bbb R}^{3}$ is invertible if and only if its (standard) matrix has a nonzero determinant.
Referring to the examples above, notice that the determinant of the matrix in the invertible example is nonzero, while the determinant of the matrix in the noninvertible example is zero.
The General Case
Suppose $T: {\Bbb R}^{n} \longrightarrow {\Bbb R}^{n}$ is a linear transformation with (standard) matrix representation $T(\mathbf{x}) = A\mathbf{x}$ for some $n \times n$ square matrix $A$. There are three issues to consider.
- (1) If $T$ is an invertible function, is $T^{-1}$ a linear transformation? (We have been assuming this to be true so far.)
- (2) If $T$ is invertible (so that $A$ is invertible) and $T^{-1}$ is linear, how is $A^{-1}$ found so that $T^{-1}(\mathbf{x}) = A^{-1}\mathbf{x}$ for all $\mathbf{x} \in {\Bbb R}^{n}$?
- (3) Is there a determinant condition on $A$ to decide the invertibility of both $T$ and $A$?
The answer to (3) is “yes”. We will leave our exploration of this fact for the next section.
The Inverse of an Invertible Linear Transformation is a Linear Transformation
The answer to (1) is “yes”. Verifying this is trickier than you might think. Given scalars $c_{1}, c_{2} \in {\Bbb R}$ and vectors $\mathbf{y}_{1}, \mathbf{y}_{2} \in {\Bbb R}^{n}$, we need to prove that $T^{-1}$ is operation-preserving. That is, we need to prove that $T^{-1}(c_{1}\mathbf{y}_{1} + c_{2}\mathbf{y}_{2}) = c_{1}T^{-1}(\mathbf{y}_{1}) + c_{2}T^{-1}(\mathbf{y}_{2})$.
To demonstrate this, we use the facts that $T$ and $T^{-1}$ are inverses, that $T$ is linear (operation-preserving), and that $T$ is one-to-one when it is invertible. First, since $T$ and $T^{-1}$ are inverses,

$T\!\left(T^{-1}(c_{1}\mathbf{y}_{1} + c_{2}\mathbf{y}_{2})\right) = c_{1}\mathbf{y}_{1} + c_{2}\mathbf{y}_{2}.$

Next, for the same reason, and since $T$ is linear (operation-preserving),

$T\!\left(c_{1}T^{-1}(\mathbf{y}_{1}) + c_{2}T^{-1}(\mathbf{y}_{2})\right) = c_{1}T\!\left(T^{-1}(\mathbf{y}_{1})\right) + c_{2}T\!\left(T^{-1}(\mathbf{y}_{2})\right) = c_{1}\mathbf{y}_{1} + c_{2}\mathbf{y}_{2}.$

Therefore, $T\!\left(T^{-1}(c_{1}\mathbf{y}_{1} + c_{2}\mathbf{y}_{2})\right) = T\!\left(c_{1}T^{-1}(\mathbf{y}_{1}) + c_{2}T^{-1}(\mathbf{y}_{2})\right)$.

But now the fact that $T$ is one-to-one implies that $T^{-1}(c_{1}\mathbf{y}_{1} + c_{2}\mathbf{y}_{2}) = c_{1}T^{-1}(\mathbf{y}_{1}) + c_{2}T^{-1}(\mathbf{y}_{2})$. This is exactly what we set out to prove.
Algorithm for Finding the Inverse Matrix of an Invertible Linear Transformation
The algorithm (method) for finding $A^{-1}$, as well as for determining whether $A$ is invertible, is completely analogous to what we did in the two and three-dimensional cases above.

Form the augmented matrix $[A\ |\ I]$ and use elementary row operations to obtain its reduced row echelon form (RREF).

If the block form of the result is $[I\ |\ B]$, then $A$ is invertible and $A^{-1} = B$.

If the block form of the result contains a row of zeros in the first $n$ columns, then $A$ is noninvertible.

This method is justified because we are solving $n$ systems of $n$ linear equations in $n$ unknowns, and each column of $I$ on the right side of the original augmented matrix $[A\ |\ I]$ represents the right-hand side of one of these systems. These right-hand sides each consist of one “1” and $n - 1$ “0”s, based on the key equation $AA^{-1} = I$.
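Here is a minimal Python sketch of this algorithm, written from the description above: it row reduces $[A\ |\ I]$ (with partial pivoting for numerical stability) and either returns the right block as $A^{-1}$ or reports that $A$ is noninvertible.

```python
import numpy as np

def inverse_by_row_reduction(A, tol=1e-12):
    """Row reduce [A | I] to RREF; return A^{-1}, or None if A is noninvertible."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    M = np.hstack([A, np.eye(n)])         # the augmented matrix [A | I]
    for col in range(n):
        # Choose the largest available pivot in this column (partial pivoting).
        pivot = col + np.argmax(np.abs(M[col:, col]))
        if abs(M[pivot, col]) < tol:
            return None                   # a row of zeros appears in the left block
        M[[col, pivot]] = M[[pivot, col]] # swap rows
        M[col] /= M[col, col]             # scale the pivot row so the pivot is 1
        for row in range(n):
            if row != col:
                M[row] -= M[row, col] * M[col]  # clear the rest of the column
    return M[:, n:]                       # the right block is now A^{-1}

A = np.array([[1.0, 2.0, 3.0],
              [0.0, 1.0, 4.0],
              [5.0, 6.0, 0.0]])
A_inv = inverse_by_row_reduction(A)
print(np.allclose(A @ A_inv, np.eye(3)))   # True

singular = np.array([[1.0, 2.0], [2.0, 4.0]])
print(inverse_by_row_reduction(singular))  # None
```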
A Visual Example (Inverse of a Rotation)
We end this section with a visual two-dimensional example. It demonstrates that the inverse of a counterclockwise rotation about the origin through an angle $\theta$ is a clockwise rotation about the origin through the same angle $\theta$. This can also be thought of as a counterclockwise rotation through the angle $360^{\circ} - \theta$, but the first interpretation is more natural.

Let $T: {\Bbb R}^{2} \longrightarrow {\Bbb R}^{2}$ be a counterclockwise rotation about the origin through the angle $\theta$. In Section 1.8, “Matrix Multiplication and Composite Transformations”, we saw that $T(\mathbf{x}) = A\mathbf{x}$, where

$A = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}.$
Row reducing the augmented matrix $[A\ |\ I]$ (the simplifications use the identity $\cos^{2}\theta + \sin^{2}\theta = 1$) gives a result in the block form $[I\ |\ A^{-1}]$, so

$A^{-1} = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} = \begin{bmatrix} \cos(-\theta) & -\sin(-\theta) \\ \sin(-\theta) & \cos(-\theta) \end{bmatrix}.$

The function $T^{-1}(\mathbf{x}) = A^{-1}\mathbf{x}$ does indeed represent a clockwise rotation about the origin through the angle $\theta$.
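A quick numerical sketch of this fact, using an arbitrary sample angle: the inverse matrix agrees with rotation through $-\theta$ (and with the transpose), and rotating and then applying the inverse returns every point to where it started.

```python
import numpy as np

def rotation(theta):
    """Matrix of the counterclockwise rotation about the origin by theta radians."""
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

theta = np.pi / 3                            # a sample angle (60 degrees)
A = rotation(theta)
A_inv = np.linalg.inv(A)

print(np.allclose(A_inv, rotation(-theta)))  # True: the inverse rotates clockwise
print(np.allclose(A_inv, A.T))               # True: for rotations, the inverse is the transpose

x = np.array([2.0, 1.0])
print(np.allclose(A_inv @ (A @ x), x))       # True: T^{-1} undoes T
```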
This is visualized in the animation below, which shows the actions of both $T$ and $T^{-1}$.
![](https://i0.wp.com/infinityisreallybig.com/wp-content/uploads/2019/07/Rotation3Final.gif?resize=408%2C488&ssl=1)
It is also interesting to note that the inverse of a shear is a shear in the “opposite direction”, while the inverse of a reflection is the reflection itself.
Exercises
- (a) Show all the details of the row operations on the block augmented matrix $[A\ |\ I]$ needed to confirm the inverse matrix $A^{-1}$ found in the invertible $3 \times 3$ example above. (b) Confirm that $A^{-1}A = I$ by direct multiplication.
- Let $T: {\Bbb R}^{2} \longrightarrow {\Bbb R}^{2}$ be defined by $T(\mathbf{x}) = A\mathbf{x}$ for a given $2 \times 2$ matrix $A$. Find a formula for $T^{-1}$ or confirm that $T^{-1}$ does not exist. If $T^{-1}$ exists, write a simplified formula for the solution of the general system $A\mathbf{x} = \mathbf{b}$, where $\mathbf{b} \in {\Bbb R}^{2}$.
Video for Section 1.9
Here’s a video overview of the content of Section 1.9.