Matrices and Linear Transformations in Low Dimensions

Visual Linear Algebra Online, Section 1.5

The null space (kernel) of the linear transformation defined by T\left(\begin{array}{c} x \\ y \end{array}\right)=2x+y is a straight line through the origin in the plane {\Bbb R}^{2}.

In mathematics, a matrix is not a simulated reality, but instead just a plain-old rectangular array of numbers. This does not mean, however, that mathematical matrices are uninteresting.

In fact, we will ultimately see that matrices have a ton of fascinating interpretations and applications.

Matrices can be of any “size”. In general, an m\times n real matrix is an array of real numbers with m rows and n columns. For example, the matrix A shown below is a 3\times 4 real matrix.

A=\left(\begin{array}{cccc} 9 & -2 & -4 & -7 \\ -1 & -5 & 6 & 8 \\ \sqrt{2} & 0 & \sqrt[3]{7} & -\pi  \end{array}\right)

Some people think matrices look nicer if we use square brackets [ ] rather than parentheses ( ) to enclose them. The object below represents the exact same matrix as the one above.

A=\left[\begin{array}{cccc} 9 & -2 & -4 & -7 \\ -1 & -5 & 6 & 8 \\ \sqrt{2} & 0 & \sqrt[3]{7} & -\pi  \end{array}\right]

We will use both notations in Visual Linear Algebra Online. However, in this section, we will stick with parentheses because our matrices will be relatively “small”. In fact, we could even use the label “small matrices”. In this section, we explore the relationship between such small matrices and linear transformations in two dimensions, as well as in one dimension. Starting in the next section of this online textbook, we will explore higher-dimensional situations.

Before we continue, you should realize that a two-dimensional vector {\bf u}=\left(\begin{array}{c} a \\ b \end{array}\right) can certainly be thought of as a 2\times 1 matrix. In fact, you should also realize that a real number a can even be thought of either as a one-dimensional vector or as a 1\times 1 matrix \left(a\right) (or [a])!!!

Low-Dimensional Linear Transformations

Recall from Section 1.4, “Linear Transformations in Two Dimensions” that a transformation (function) T is said to be linear if T(s{\bf u}+t{\bf v})=sT({\bf u})+tT({\bf v}) for all vectors {\bf u} and {\bf v} in the domain of T and all scalars (real numbers) s and t. This will essentially be the definition of linearity we use throughout this online textbook, no matter what the situation is.

Also recall that {\Bbb R} represents the real line (the set of real numbers) and {\Bbb R}^{2} represents the plane.

Linear Transformations {\Bbb R}\longrightarrow {\Bbb R}

As explored in the exercises of Section 1.4, “Linear Transformations in Two Dimensions”, a function T:{\Bbb R}\longrightarrow {\Bbb R} defined by the formula T(x)=ax+b is a linear transformation if and only if b=0.

In fact, the converse is also true: if T:{\Bbb R}\longrightarrow {\Bbb R} is a linear transformation, it must be of the form T(x)=ax for some constant a. To see this, let a=T(1) and use linearity: T(x)=T(x\cdot 1)=xT(1)=ax.

Of course, in rectangular coordinates, the graph of such a function is a straight line through the origin (with slope a).

Since numbers can be thought of as 1\times 1 matrices, this equation T(x)=ax can be written as T(x)=(a)(x). In other words, to find the output T(x) at the input x, multiply the matrix (a) times the matrix (x). Do this by multiplying the number a by the number x.

In fact, if we let A=(a) and {\bf x}=(x), we can then write T(x)=A{\bf x}. (Read this as “T of x equals A times {\bf x}“.) We could also write this as T({\bf x})=A{\bf x}. (Read this in the same kind of way.)
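This idea is easy to play with on a computer. Here is a minimal Python sketch (the function name make_transformation is ours, purely for illustration) of a 1\times 1 matrix (a) acting as a linear transformation on numbers:

```python
# A linear transformation R -> R as multiplication by a 1x1 matrix (a).
def make_transformation(a):
    """Return T(x) = a * x, i.e. the 1x1 matrix (a) times the 1x1 matrix (x)."""
    def T(x):
        return a * x
    return T

T = make_transformation(3)  # the 1x1 matrix A = (3)

# Linearity check: T(s*x + t*y) == s*T(x) + t*T(y)
s, x, t, y = 2, 5, -1, 4
assert T(s * x + t * y) == s * T(x) + t * T(y)
print(T(7))  # prints 21, since T(x) = 3x
```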

Yes, this may seem weird. But the weirdness is worth it!!!

For what reason? Because it leads to natural generalizations.

We now generalize.

Linear Transformations {\Bbb R}\longrightarrow {\Bbb R}^{2}

What is the form of a linear transformation T:{\Bbb R}\longrightarrow {\Bbb R}^{2}?

Based on the more basic notion of linearity from calculus, as well as the “new” version of linearity from Section 1.4, “Linear Transformations in Two Dimensions”, we might hope the statement below is true for T:{\Bbb R}\longrightarrow {\Bbb R}^{2}. This statement is indeed an important mathematical statement of fact, so we label it with the word “Theorem”.

Theorem 1.5.1: Let a, b, c, and d be (real) constants. Then a function T:{\Bbb R}\longrightarrow {\Bbb R}^{2} defined by T(x)=\left(\begin{array}{c} ax+b \\ cx+d\end{array}\right) is a linear transformation if and only if b=d=0.

Theorems Require Proof

Any mathematical statement of fact like this technically requires a proof before we should believe it. Proofs are, in fact, an essential part of any course on linear algebra. We will emphasize them less than most textbooks. But we will still do them sometimes, including in this section.

This particular theorem is an “if and only if” statement. Because of this, there are really two logical implications to prove.

  1. If T(x)=\left(\begin{array}{c} ax+b \\ cx+d\end{array}\right) is a linear transformation, then b=d=0, and
  2. If b=d=0, then T(x)=\left(\begin{array}{c} ax+b \\ cx+d\end{array}\right) is a linear transformation.
The Proof

We now commence with the proof.

Proof:

Suppose first that T(x)=\left(\begin{array}{c} ax+b \\ cx+d\end{array}\right) is a linear transformation. This means that T(sx+ty)=sT(x)+tT(y) for any numbers s,x,t,y\in {\Bbb R}. (Our input “vectors” here are actually numbers x and y, as well as the linear combination sx+ty. The output of T is a “true” vector.)

Now calculate using the formula for T and vector operations. We have

T(sx+ty)=\left(\begin{array}{c} a(sx+ty)+b \\ c(sx+ty)+d\end{array}\right)=\left(\begin{array}{c} asx+aty+b \\ csx+cty+d\end{array}\right) and

sT(x)+tT(y)=s\left(\begin{array}{c} ax+b \\ cx+d\end{array}\right)+t\left(\begin{array}{c} ay+b \\ cy+d\end{array}\right)=\left(\begin{array}{c} asx+aty+sb+tb \\ csx+cty+sd+td \end{array}\right).

Since \left(\begin{array}{c} asx+aty+b \\ csx+cty+d\end{array}\right)=\left(\begin{array}{c} asx+aty+sb+tb \\ csx+cty+sd+td \end{array}\right) for any choice of s,x,t,y\in {\Bbb R}, it must be the case that b=(s+t)b and d=(s+t)d for any choices of s,t\in {\Bbb R}. Choosing s=t=1, for instance, gives b=2b and d=2d. The only way this can happen for all such choices is if b=d=0, so this is what we conclude.

To prove the other (reverse) implication, start by assuming that b=d=0. We must then prove that T(x)=\left(\begin{array}{c} ax \\ cx\end{array}\right) is a linear transformation. In other words, we must show that T(sx+ty)=sT(x)+tT(y) for any numbers s,x,t,y\in {\Bbb R}.

But for such arbitrary numbers, this is easily done with a couple of vector calculations as follows.

T(sx+ty)=\left(\begin{array}{c} a(sx+ty) \\ c(sx+ty)\end{array}\right)=\left(\begin{array}{c} asx+aty \\ csx+cty\end{array}\right)

and

sT(x)+tT(y)=s\left(\begin{array}{c} ax \\ cx\end{array}\right)+t\left(\begin{array}{c} ay \\ cy\end{array}\right)=\left(\begin{array}{c} asx+aty \\ csx+cty\end{array}\right)

Q.E.D.

Side Note: when a mathematician finishes a proof, they often get very excited and start talking in Latin. The abbreviation Q.E.D. stands for “Quod Erat Demonstrandum“. It basically means “that which was to be demonstrated“. In other words, “We are done! Yay!”

The Form of the Formula

The formula T(x)=\left(\begin{array}{c} ax \\ cx\end{array}\right) can be thought of as the scalar x times the vector \left(\begin{array}{c} a \\ c\end{array}\right). In other words, T(x)=x\left(\begin{array}{c} a \\ c\end{array}\right).

If, instead, we want to think of this as a “matrix multiplication”, it will be standard to put the x on the right and use parentheses (x) or square brackets [x]. If we let A=\left(\begin{array}{c} a \\ c\end{array}\right), then we can write:

T(x)=A(x)=\left(\begin{array}{c} a \\ c\end{array}\right)(x)=\left(\begin{array}{c} ax \\ cx\end{array}\right).

Once again, make sure you realize the subtle point that T(x) is “T of x” while A(x) is “A times (x)“. In fact, this is a point where it might be more clear to write A[x] rather than A(x).

If we let {\bf x}=(x), this can also be written as T({\bf x})=A{\bf x}.

We are therefore multiplying a 2\times 1 matrix A by a 1\times 1 matrix {\bf x}=(x) to get a 2\times 1 matrix (vector) T(x).
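If you want to see this multiplication in action, here is a short Python sketch (the helper apply_column_matrix is our own name, not standard terminology) of a 2\times 1 matrix times a 1\times 1 matrix:

```python
# T: R -> R^2 defined by a 2x1 matrix A = (a, c)^T acting on the 1x1 matrix (x).
def apply_column_matrix(A, x):
    """Multiply the 2x1 matrix A = [a, c] by the 1x1 matrix (x): result [a*x, c*x]."""
    a, c = A
    return [a * x, c * x]

A = [2, 1]  # the column (2, 1)^T used in the example below
print(apply_column_matrix(A, 3))  # [6, 3]
```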

Once again, this may seem weird. But the weirdness is worth it!!!

We continue generalizing further below. For the moment, we discuss how to graph such a linear transformation.

The Image of Such a Linear Transformation Is a Parameterized Line Through the Origin

The name of this text is Visual Linear Algebra Online. So let’s start visualizing!

A linear transformation T:{\Bbb R}\longrightarrow {\Bbb R}^{2} with formula T(x)=\left(\begin{array}{c} ax \\ cx\end{array}\right) (or, as a point rather than a vector, T(x)=(ax,cx)) can be visualized as a “parametric curve” as the “parameter” x varies over {\Bbb R}.

If either a\not=0 or c\not=0, this will be a straight line through the origin parallel to the nonzero vector \left(\begin{array}{c} a \\ c\end{array}\right). If a=c=0, it will just be the origin.

In the animation below, a=2 and c=1. The parameter x varies from -4 to 4. If we let x vary over {\Bbb R}, the resulting line (going on forever and ever) would be the image T({\Bbb R})=\{T(x)\, |\, x\in {\Bbb R}\} of the linear transformation over its entire domain. The image is a one-dimensional subset of the two-dimensional space {\Bbb R}^{2}. Clearly, such a linear transformation cannot be an onto function. There are plenty of points in {\Bbb R}^{2} that are not in the image T({\Bbb R}).

The image of the linear transformation T:{\Bbb R}\longrightarrow {\Bbb R}^{2} defined by T(x)=(2x,x) (or T(x)=\left(\begin{array}{c} 2x \\ x\end{array}\right)) is the entire blue line (going on forever and ever in both directions). This is a one-dimensional subset of two-dimensional space. Therefore, T is not an onto function.

Such a function is one-to-one, however (when either a\not=0 or c\not=0). If T(x)=T(y), then \left(\begin{array}{c} ax \\ cx\end{array}\right)=\left(\begin{array}{c} ay \\ cy\end{array}\right). But this implies that ax=ay and cx=cy. Since either a\not=0 or c\not=0, we can conclude that x=y. Therefore, distinct inputs correspond to distinct outputs.

Linear Transformations {\Bbb R}^{2}\longrightarrow {\Bbb R}

What is the form of a linear transformation T:{\Bbb R}^{2}\longrightarrow {\Bbb R}? We state the answer as a theorem.

Theorem 1.5.2: Let a, b, and c be (real) constants. Then a function T:{\Bbb R}^{2}\longrightarrow {\Bbb R} defined by T\left(\begin{array}{c} x \\ y \end{array}\right)=ax+by+c is a linear transformation if and only if c=0.

Once again, this is an “if and only if” statement, so the proof requires two parts.

Proof:

Suppose that a function T:{\Bbb R}^{2}\longrightarrow {\Bbb R} defined by T\left(\begin{array}{c} x \\ y \end{array}\right)=ax+by+c is a linear transformation. Then, for any scalars s,t\in {\Bbb R} and any vectors {\bf u}=\left(\begin{array}{c} x_{1} \\ y_{1} \end{array}\right), {\bf v}=\left(\begin{array}{c} x_{2} \\ y_{2} \end{array}\right)\in {\Bbb R}^{2}, we can say that T(s{\bf u}+t{\bf v})=sT({\bf u})+tT({\bf v}).

Using the formula for T, we compute

T(s{\bf u}+t{\bf v})=T\left(\begin{array}{c} sx_{1}+tx_{2} \\ sy_{1}+ty_{2} \end{array}\right)=a(sx_{1}+tx_{2})+b(sy_{1}+ty_{2})+c

=asx_{1}+atx_{2}+bsy_{1}+bty_{2}+c.

On the other hand,

sT({\bf u})+tT({\bf v})=sT\left(\begin{array}{c} x_{1} \\ y_{1} \end{array}\right)+tT\left(\begin{array}{c} x_{2} \\ y_{2} \end{array}\right) =s(ax_{1}+by_{1}+c)+t(ax_{2}+by_{2}+c)=asx_{1}+bsy_{1}+cs+atx_{2}+bty_{2}+ct.

Rearranging terms, this becomes

=asx_{1}+atx_{2}+bsy_{1}+bty_{2}+c(s+t).

Since these two final expressions must be equal for any choice of s,t\in {\Bbb R}, we must have c=c(s+t) for all s and t (choosing s=t=1, for example, gives c=2c). This forces c=0.

To prove the other (reverse) implication, start by assuming that c=0. We must then prove that T\left(\begin{array}{c} x \\ y\end{array}\right)=ax+by is a linear transformation.

Let s,t\in {\Bbb R} and {\bf u}=\left(\begin{array}{c} x_{1} \\ y_{1} \end{array}\right), {\bf v}=\left(\begin{array}{c} x_{2} \\ y_{2} \end{array}\right)\in {\Bbb R}^{2} be given. The proof is finished with a couple computations.

T(s{\bf u}+t{\bf v})=T\left(\begin{array}{c} sx_{1}+tx_{2} \\ sy_{1}+ty_{2} \end{array}\right)=a(sx_{1}+tx_{2})+b(sy_{1}+ty_{2}) =asx_{1}+atx_{2}+bsy_{1}+bty_{2},

and

sT({\bf u})+tT({\bf v})=sT\left(\begin{array}{c} x_{1} \\ y_{1} \end{array}\right)+tT\left(\begin{array}{c} x_{2} \\ y_{2} \end{array}\right)=asx_{1}+bsy_{1}+atx_{2}+bty_{2}.

These last two expressions are equal. We are done.

Q.E.D.

The Form of the Formula

The formula T\left(\begin{array}{c} x \\ y \end{array}\right)=ax+by can be thought of as the dot product of the vector {\bf u}=\left(\begin{array}{c} a \\ b \end{array}\right) and the vector {\bf x}=\left(\begin{array}{c} x \\ y \end{array}\right) (see information about dot products in Section 1.2, “Vectors in Two Dimensions”). In other words, we can write T({\bf x})={\bf u}\cdot {\bf x}.

Alternatively, if we think of ax+by as a one-dimensional vector (1\times 1 matrix) (ax+by), we can, by the definition of vector addition and scalar multiplication, write it as x(a)+y(b) (or as (x)(a)+(y)(b), but we are writing it as x(a)+y(b) for a good reason).

The expression x(a)+y(b) is a “linear combination” of the one-dimensional vectors (1\times 1 matrices) (a) and (b). The values x,y\in {\Bbb R} are called the weights in this linear combination (they are sometimes called the scalars or coefficients in the linear combination as well).

By convention (and it is a good convention), we now also think of (ax+by) as the product of the 1\times 2 matrix \left(\begin{array}{cc} a & b \end{array}\right) with the 2\times 1 matrix \left(\begin{array}{c} x \\ y \end{array}\right).

Symbolically, we write:

T\left(\begin{array}{c} x \\ y\end{array}\right)=\left(\begin{array}{cc} a & b \end{array}\right)\left(\begin{array}{c} x \\ y \end{array}\right).

If A=\left(\begin{array}{cc} a & b \end{array}\right) and {\bf x}=\left(\begin{array}{c} x \\ y \end{array}\right), this can also be written as T({\bf x})=A{\bf x}.
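In code, this product of a 1\times 2 matrix with a 2\times 1 matrix is just the familiar dot product. A minimal Python sketch (the helper apply_row_matrix is ours, just for illustration):

```python
# T: R^2 -> R as a 1x2 matrix (a  b) times a 2x1 matrix (x, y)^T.
def apply_row_matrix(A, x_vec):
    """Multiply the 1x2 matrix A = (a  b) by the vector (x, y)^T: a*x + b*y."""
    a, b = A
    x, y = x_vec
    return a * x + b * y

A = [2, 1]
assert apply_row_matrix(A, [3, 4]) == 10  # 2*3 + 1*4
```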

Yet again, this may seem weird. But the weirdness is worth it!!!

Such a Linear Transformation Has a Graph Which Is a Plane Through the Origin in Three-Dimensional Space

We have not yet discussed rectangular coordinates in three-dimensional space {\Bbb R}^{3}. That will be rectified right now.

First of all, you should realize that three-dimensional space is essentially just the space we live in — minus any objects “in” it at all. It is the space of height, width, and depth. It is an idealized three-dimensional “world of the mind”.

Rectangular Coordinates in 3D

As with the plane, we can impose a rectangular (Cartesian) coordinate system on three-dimensional space. We first choose a point to be the origin {\mathcal O} and then choose three mutually-perpendicular lines to be our axes. Orientations of the positive directions for these axes should also be chosen.

The resulting picture might look something like what is shown below, with axes labeled x, y, and z. Of course, this must be imagined with perspective to truly “see” its three-dimensional nature.

Three-dimensional space {\Bbb R}^{3} with axes chosen in a mutually-perpendicular manner to define a three-dimensional rectangular coordinate system.

Given a point P\in {\Bbb R}^{3}, we define its three-dimensional rectangular coordinates (x,y,z) in the following way. Let x be the signed distance of P to the yz-plane (the plane containing the y-axis and z-axis). Let y be the signed distance of P to the xz-plane (the plane containing the x-axis and z-axis). And let z be the signed distance of P to the xy-plane (the plane containing the x-axis and y-axis). We illustrate this below. The rectangular coordinates of the (red) point in question are (x,y,z)=(-50,70,90). Notice that the point is “behind” the yz-plane, where x<0. Of course, the meaning of “behind” depends on your viewing perspective.

The red point has rectangular coordinates (x,y,z)=(-50,70,90). The viewing angle is changing by 90^{\circ} in the animation. The red line has signed ‘length’ that is negative, -50. The blue line has signed length 70 and the green line has signed length 90.
Three-Dimensional “Graph Paper”

We could also think of this in terms of three-dimensional “graph paper” with “boxes” (cubes), as illustrated below.

The point with rectangular coordinates (-50,70,90) shown in three-dimensional ‘graph paper’. Each ‘box’ (cube) in this ‘paper’ has dimensions 50\times 50\times 50.
Graphs of Linear Transformations {\Bbb R}^{2}\longrightarrow {\Bbb R} in 3D

For any function f:{\Bbb R}^{2}\longrightarrow {\Bbb R}, linear or not, its graph is defined as the set \{(x,y,z)\, |\, z=f(x,y)\mbox{ and } (x,y)\in {\Bbb R}^{2}\}\subseteq {\Bbb R}^{3}.

To draw this graph, it is usually a matter of plotting a bunch of points in 3D and “connecting” these points with a two-dimensional “surface” that fits them. The graph will typically be a surface because the domain is two-dimensional.

For a linear transformation (function), when rectangular coordinates are used, this will result in a graph which is a plane through the origin in {\Bbb R}^{3}.

For example, suppose A=\left(\begin{array}{cc} 2 & 1 \end{array}\right), {\bf x}=\left(\begin{array}{c} x \\ y \end{array}\right), and T({\bf x})=T\left(\begin{array}{c} x \\ y \end{array}\right)=A{\bf x}=\left(\begin{array}{cc} 2 & 1 \end{array}\right)\left(\begin{array}{c} x \\ y \end{array}\right)=2x+y. Examples of points that can be plotted from this formula include (x,y,z)=(3,4,2\cdot 3+4)=(3,4,10), (x,y,z)=(-2,1,2\cdot (-2)+1)=(-2,1,-3), (x,y,z)=(1,-2,2\cdot 1+(-2))=(1,-2,0), and (x,y,z)=(0,0,2\cdot 0+0)=(0,0,0).
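These sample points can be generated with a short Python sketch (our own helper, not part of the text), which evaluates z=T(x,y)=2x+y at each input:

```python
# Points on the graph z = T(x, y) = 2x + y, a plane through the origin in R^3.
def T(x, y):
    return 2 * x + y

samples = [(3, 4), (-2, 1), (1, -2), (0, 0)]
points = [(x, y, T(x, y)) for x, y in samples]
print(points)  # [(3, 4, 10), (-2, 1, -3), (1, -2, 0), (0, 0, 0)]
```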

Here is the resulting graph, with the computed points highlighted. The first three are blue and the one at the origin is red. Make sure you understand which blue point has which coordinates. Note that the positive y-axis goes “into” the screen.

The graph of the linear transformation T\left(\begin{array}{c} x \\ y \end{array}\right)=2x+y is a plane that goes through the origin. This is true for any linear transformation T:{\Bbb R}^{2}\longrightarrow {\Bbb R}.
Discussion of Properties
Is T Onto?

Suppose T:{\Bbb R}^{2}\longrightarrow {\Bbb R} has the formula T\left(\begin{array}{c} x \\ y\end{array}\right)=\left(\begin{array}{cc} a & b \end{array}\right)\left(\begin{array}{c} x \\ y \end{array}\right)=ax+by. If either a\not=0 or b\not=0, then T will definitely be onto. If a=b=0, then T is definitely not onto.

To confirm the first statement of the previous paragraph, let c\in {\Bbb R} be arbitrary. If a\not=0, then T\left(\begin{array}{c} \frac{c}{a} \\ 0\end{array}\right)=a\cdot \frac{c}{a}+0=c. If b\not=0, then T\left(\begin{array}{c} 0 \\ \frac{c}{b}\end{array}\right)=0+b\cdot \frac{c}{b}=c.

Is T One-to-One?

On the other hand, no matter what the values of a and b are, T will most definitely not be one-to-one.

If a=b=0, then every point gets mapped to the number 0.

On the other hand, if, for example, b\not=0, then the set \{(x,y)\, |\, ax+by=0\}=\{(x,-\frac{a}{b}x)\, |\, x\in {\Bbb R}\} is the set of all points/vectors that get mapped to zero. This is a (parametrically-defined) line through the origin parallel to the vector \left(\begin{array}{c} 1 \\ -\frac{a}{b} \end{array}\right).

This last set has two common names. We call it either a) the null space of T (the word “null” emphasizes that it is the set of all points/vectors that get mapped to zero) or b) the kernel of T.

Actually, the first name is more commonly used when referring to the matrix A=\left(\begin{array}{cc} a & b \end{array}\right) that defines T by the formula T({\bf x})=A{\bf x}, but we will use both names.

The null space (kernel) of the example T\left(\begin{array}{c} x \\ y \end{array}\right)=\left(\begin{array}{cc} 2 & 1 \end{array}\right)\left(\begin{array}{c} x \\ y \end{array}\right)=2x+y is shown below. Note that it is a line through the origin.

The null space (kernel) of the linear transformation T:{\Bbb R}^{2}\longrightarrow {\Bbb R} defined by T\left(\begin{array}{c} x \\ y \end{array}\right)=\left(\begin{array}{cc} 2 & 1 \end{array}\right)\left(\begin{array}{c} x \\ y \end{array}\right)=2x+y is a straight line through the origin (with slope -2) in {\Bbb R}^{2}.
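We can verify the null space parameterization numerically. The sketch below (our own code, using exact fractions to avoid rounding) samples points on the line (x, -\frac{a}{b}x) and checks that each one maps to zero:

```python
from fractions import Fraction

# Null space of T(x, y) = 2x + y: the line y = -2x through the origin.
a, b = 2, 1

def T(x, y):
    return a * x + b * y

# Sample points on the parameterized line (x, -(a/b)*x); each should map to 0.
for x in [Fraction(-3), Fraction(0), Fraction(5, 2)]:
    y = -Fraction(a, b) * x
    assert T(x, y) == 0

# A point off the line does not map to 0.
assert T(1, 1) != 0
```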

Linear Transformations {\Bbb R}^{2}\longrightarrow {\Bbb R}^{2}

By now, you might be able to guess the formula for a linear transformation {\Bbb R}^{2}\longrightarrow {\Bbb R}^{2}.

the form of the formula

The general formula is of the form T\left(\begin{array}{c} x \\ y \end{array}\right)=\left(\begin{array}{c} ax+by \\ cx+dy \end{array}\right) for some constants a,b,c,d\in {\Bbb R}. We will leave a proof of this to the exercises.

Notice that the output is a linear combination of two vectors.

\left(\begin{array}{c} ax+by \\ cx+dy \end{array}\right)=\left(\begin{array}{c} ax \\ cx \end{array}\right)+\left(\begin{array}{c} by \\ dy \end{array}\right)=x\left(\begin{array}{c} a \\ c \end{array}\right)+y\left(\begin{array}{c} b \\ d\end{array}\right).

Now we already know that \left(\begin{array}{cc} a & b \end{array}\right)\left(\begin{array}{c} x \\ y \end{array}\right)=ax+by=xa+yb. This is a linear combination of a and b with x and y as the weights.

Likewise, \left(\begin{array}{cc} c & d \end{array}\right)\left(\begin{array}{c} x \\ y \end{array}\right)=cx+dy=xc+yd. This is a linear combination of c and d with x and y as the weights.

Now imagine “stacking” the 1\times 2 matrices \left(\begin{array}{cc} a & b \end{array}\right) and \left(\begin{array}{cc} c & d \end{array}\right) on top of each other. This gives a 2\times 2 matrix A=\left(\begin{array}{cc} a & b \\ c & d \end{array}\right). Now write the following equation, when {\bf x}=\left(\begin{array}{c} x \\ y \end{array}\right):

A{\bf x}=\left(\begin{array}{cc} a & b \\ c & d \end{array}\right)\left(\begin{array}{c} x \\ y \end{array}\right)=\left(\begin{array}{c} ax+by \\ cx+dy \end{array}\right)=x\left(\begin{array}{c} a \\ c \end{array}\right)+y\left(\begin{array}{c} b \\ d\end{array}\right).

For an arbitrary 2\times 2 matrix A and an arbitrary two-dimensional vector (2\times 1 matrix) {\bf x}, it therefore makes sense to define the product A{\bf x} as follows. It is the two-dimensional vector obtained by forming a linear combination of the columns of A with the entries of {\bf x} as the weights.

This last statement is the key definition that will generalize to higher dimensions. It is also a key to understanding the relationships between matrix multiplication and many applications.
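Since this definition is the one that generalizes, it is worth sketching in code. Here is a minimal Python version (the name mat_vec is ours) that computes A{\bf x} exactly as a linear combination of the columns of A with the entries of {\bf x} as weights:

```python
# Matrix/vector product A x as a linear combination of the columns of A.
def mat_vec(A, x_vec):
    """A is a 2x2 matrix [[a, b], [c, d]] stored by rows; x_vec = (x, y).
    Returns x * (first column of A) + y * (second column of A)."""
    (a, b), (c, d) = A
    x, y = x_vec
    return [x * a + y * b, x * c + y * d]

A = [[1, 2], [4, 5]]
# x*(1,4) + y*(2,5) with x=3, y=6:
assert mat_vec(A, [3, 6]) == [1*3 + 2*6, 4*3 + 5*6]  # [15, 42]
```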

Summary

To summarize, suppose T:{\Bbb R}^{2}\longrightarrow {\Bbb R}^{2} is a linear transformation. Then T\left(\begin{array}{c} x \\ y \end{array}\right)=\left(\begin{array}{c} ax+by \\ cx+dy \end{array}\right) for some constants a,b,c,d\in {\Bbb R}. If A=\left(\begin{array}{cc} a & b \\ c & d \end{array}\right) and {\bf x}=\left(\begin{array}{c} x \\ y \end{array}\right), then T({\bf x})=A{\bf x}.

Always remember! Read this as: T of {\bf x} equals A times {\bf x}!

The matrix/vector product A{\bf x} is called the (standard) matrix representation of the output of linear transformation T({\bf x}). The matrix A itself is called the (standard) matrix of T.

And, once again, the output of T is a linear combination of the columns of A. The entries of {\bf x} are the weights of the linear combination. That is, the output of T is:

\left(\begin{array}{c} ax+by \\ cx+dy \end{array}\right)=x\left(\begin{array}{c} a \\ c \end{array}\right)+y\left(\begin{array}{c} b \\ d \end{array}\right).

Discussion of Properties

In the previous section, Section 1.4, “Linear Transformations in Two Dimensions”, we considered many examples of transformations T:{\Bbb R}^{2}\longrightarrow {\Bbb R}^{2}. We visualized these examples in two ways: 1) as mappings and 2) as vector fields.

For the linear transformation examples we considered in Section 1.4, all but one were both one-to-one and onto functions. Only the projection mapping was neither one-to-one nor onto.

In mathematics, it is dangerous to reason inductively from examples to general principles. Mathematics is based on deductive reasoning. However, it does turn out to be true that “most” linear transformations T:{\Bbb R}^{2}\longrightarrow {\Bbb R}^{2} are both one-to-one and onto. We can make this precise by stating the following theorem.

Theorem 1.5.3: Let T:{\Bbb R}^{2}\longrightarrow {\Bbb R}^{2} be a linear transformation defined by T\left(\begin{array}{c} x \\ y \end{array}\right)=A{\bf x}=\left(\begin{array}{cc} a & b \\ c & d \end{array}\right)\left(\begin{array}{c} x \\ y \end{array}\right)=\left(\begin{array}{c} ax+by \\ cx+dy \end{array}\right). Then T is one-to-one and onto if and only if ad-bc\not=0.

Recall from Section 1.3, “Systems of Linear Equations in Two Dimensions” that the quantity ad-bc is called the determinant of the 2\times 2 matrix A=\left(\begin{array}{cc} a & b \\ c & d \end{array}\right). Our shorthand symbol for this quantity will be \det(A).

Therefore, the statement of this theorem is equivalent to saying the following. A linear transformation T:{\Bbb R}^{2}\longrightarrow {\Bbb R}^{2} is one-to-one and onto if and only if its matrix A satisfies \det(A)\not=0.
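The determinant test is a one-line computation. A quick Python sketch (det2 is our own helper name):

```python
def det2(A):
    """Determinant ad - bc of a 2x2 matrix [[a, b], [c, d]]."""
    (a, b), (c, d) = A
    return a * d - b * c

# det != 0: the transformation is one-to-one and onto.
assert det2([[1, 2], [4, 5]]) == -3   # invertible
# det == 0: the columns are parallel, so T is neither one-to-one nor onto.
assert det2([[2, 4], [1, 2]]) == 0
```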

The following statement is also a theorem.

Theorem 1.5.4: Let A be the matrix for a linear transformation T:{\Bbb R}^{2}\longrightarrow {\Bbb R}^{2}. Then the null space (kernel) of T is just the origin {\bf 0} if and only if \det(A)\not=0.

Partial Proof of Theorem 1.5.4

We will leave a proof of Theorem 1.5.3 for the exercises. Instead, we will provide a (partial) proof of Theorem 1.5.4 by using Theorem 1.5.3 (assuming it is true). It will also be necessary to use the linearity of T.

Partial Proof:

Suppose first that \det(A)\not=0. Then Theorem 1.5.3 implies that T:{\Bbb R}^{2}\longrightarrow {\Bbb R}^{2} defined by T({\bf x})=A{\bf x} is one-to-one and onto.

Now clearly T({\bf 0})=A{\bf 0}=\left(\begin{array}{cc} a & b \\ c & d \end{array}\right)\left(\begin{array}{c} 0 \\ 0 \end{array}\right)=\left(\begin{array}{c} 0 \\ 0 \end{array}\right)={\bf 0}. This means that {\bf 0} is an element of the null space (kernel) of T.

But there can be no other elements in the null space of T. For if there were some other nonzero element {\bf x}, this would contradict the fact that T is one-to-one.

The reverse implication will only be done partially. Suppose that T has a null space that is only the zero vector {\bf 0}.

We want to prove first that T is one-to-one. Towards this end, assume that T({\bf u})=T({\bf v}). The linearity of T allows us to then conclude that T({\bf u}-{\bf v})=T({\bf u})-T({\bf v})={\bf 0}. But now our assumption about the null space of T helps us see that {\bf u}-{\bf v}={\bf 0} so that {\bf u}={\bf v}.

It is at this point that this writing becomes a “partial” proof. We will not prove the claim that T is onto.

This is because doing so would involve calculations that would prove, more directly, that \det(A)=ad-bc\not=0. Our purpose is to make use of Theorem 1.5.3 instead.

So, we finish this partial proof by just stating that T is onto. Thus, by Theorem 1.5.3, \det(A)\not=0.

Q.E.D.

Row Operations on Small Matrices

In Section 1.3, we discussed (elementary) “row operations” on systems of two equations and two unknowns. We finish off this section by describing elementary row operations on small matrices. These operations will allow us to solve systems of linear equations more quickly. They can also be generalized to higher dimensions (larger matrices) and can be used to answer questions about linear transformations. These questions include determining when linear transformations are one-to-one and/or onto.

Before we get into these row operations, we look at the relationships between all the main concepts discussed so far in this online textbook.

Systems of Linear Equations, Matrices, Vectors, and Linear Transformations

Systems of Linear Equations, Matrices, and Vectors

Suppose the numbers a, b, c, d, u, and v are given. Consider the following system of two linear equations and two unknowns x and y.

\begin{cases}\begin{array}{rcl} ax+by & = & u \\ cx+dy & = & v \end{array} \end{cases}

In Section 1.3, we saw that elimination (of unknowns) through the use of row operations could be used to transform this system to the following equivalent system when the determinant ad-bc\not=0.

\begin{cases}\begin{array}{rcl} x\hspace{.5in} & = & \frac{d}{ad-bc}u-\frac{b}{ad-bc}v \\ \hspace{.5in} y & = & -\frac{c}{ad-bc}u+\frac{a}{ad-bc}v \end{array} \end{cases}

Using vector notation, these two systems of linear equations can be written as two vector equations.

\left(\begin{array}{c} ax+by \\ cx+dy \end{array}\right)=x\left(\begin{array}{c} a \\ c \end{array}\right)+y\left(\begin{array}{c} b \\ d \end{array}\right)=\left(\begin{array}{c} u \\ v \end{array}\right), and

\left(\begin{array}{c} x \\ y \end{array}\right)=\left(\begin{array}{c}\frac{d}{ad-bc}u-\frac{b}{ad-bc}v \\ -\frac{c}{ad-bc}u+\frac{a}{ad-bc}v \end{array}\right)=u\left(\begin{array}{c} \frac{d}{ad-bc} \\ -\frac{c}{ad-bc} \end{array}\right)+v\left(\begin{array}{c} -\frac{b}{ad-bc} \\ \frac{a}{ad-bc} \end{array}\right)

Based on our definition of matrix/vector multiplication, we can also write the following.

\left(\begin{array}{cc} a & b \\ c & d \end{array}\right)\left(\begin{array}{c} x \\ y \end{array}\right)=\left(\begin{array}{c} u \\ v \end{array}\right), and

\left(\begin{array}{c} x \\ y \end{array}\right)=\left(\begin{array}{cc} \frac{d}{ad-bc} & -\frac{b}{ad-bc} \\ -\frac{c}{ad-bc} & \frac{a}{ad-bc} \end{array}\right)\left(\begin{array}{c} u \\ v \end{array}\right)

If {\bf x}=\left(\begin{array}{c} x \\ y \end{array}\right), {\bf u}=\left(\begin{array}{c} u \\ v \end{array}\right), and A=\left(\begin{array}{cc} a & b \\ c & d \end{array}\right), then the first of these matrix/vector equations can be shortened to A{\bf x}={\bf u}.

As a convention, the second will be shortened to {\bf x}=A^{-1}{\bf u}, where A^{-1}=\left(\begin{array}{cc} \frac{d}{ad-bc} & -\frac{b}{ad-bc} \\ -\frac{c}{ad-bc} & \frac{a}{ad-bc} \end{array}\right) is the so-called “inverse matrix” of A. Clearly, it exists when the determinant ad-bc\not=0.
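The inverse formula can be checked directly by multiplying: A^{-1}A should be the identity matrix. Here is a Python sketch (the helper names inverse2 and mat_mul2 are ours; exact fractions avoid rounding error):

```python
from fractions import Fraction

def inverse2(A):
    """Inverse of a 2x2 matrix [[a, b], [c, d]] via the formula
    (1/(ad - bc)) * [[d, -b], [-c, a]]; requires ad - bc != 0."""
    (a, b), (c, d) = A
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def mat_mul2(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[1, 2], [4, 5]]
identity = [[1, 0], [0, 1]]
assert mat_mul2(inverse2(A), A) == identity
```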

Systems of Linear Equations and Linear Transformations

Let T:{\Bbb R}^{2}\longrightarrow {\Bbb R}^{2} be the linear transformation defined by T({\bf x})=A{\bf x}=\left(\begin{array}{cc} a & b \\ c & d \end{array}\right)\left(\begin{array}{c} x \\ y \end{array}\right).

If we are given u,v\in {\Bbb R}, then solving the system

\begin{cases}\begin{array}{rcl} ax+by & = & u \\ cx+dy & = & v \end{array} \end{cases}

is equivalent to determining a vector {\bf x}=\left(\begin{array}{c} x \\ y \end{array}\right) such that T({\bf x})={\bf u}=\left(\begin{array}{c} u \\ v \end{array}\right).

Here are the punchlines. 1) The solution of the system exists for any u and v exactly when T is onto. And 2) For a given u and v where the solution exists, it will be unique exactly when T is one-to-one.

When the matrix A of T is 2\times 2, existence and uniqueness are equivalent. This is not the case in general, however.

Elementary Row Operations on Augmented Matrices

Let us consider a couple examples where elementary row operations on matrices are used to answer questions about linear transformations.

Example 1

The Linear Transformation

Suppose T:{\Bbb R}^{2}\longrightarrow {\Bbb R}^{2} is the linear transformation defined by T({\bf x})=A{\bf x}, where {\bf x}=\left(\begin{array}{c} x \\ y \end{array}\right) and A=\left(\begin{array}{cc} 1 & 2 \\ 4 & 5 \end{array}\right). Suppose {\bf u}=\left(\begin{array}{c} 3 \\ 6 \end{array}\right). We wish to answer the following questions: 1) Does a solution {\bf x} of T({\bf x})={\bf u} exist? And 2) If it exists, is it unique? And 3) What is/are the solution(s)?

The Corresponding System

This is the same as asking essentially the same set of questions about the linear system

\begin{cases}\begin{array}{rcl} x+2y & = & 3 \\ 4x+5y & = & 6\end{array}\end{cases}

elementary row operations applied to the augmented matrix

Instead of applying row operations to the system of equations, we apply them to the following 2\times 3 augmented matrix, where the numbers on the right-hand sides of the equations form a third column. This is justified because the variables in the system of equations just serve as “placeholders” in the calculations. The “real action” of elimination is in the coefficients and the numbers on the right-hand sides.

\left(\begin{array}{ccc} 1 & 2 & 3 \\ 4 & 5 & 6 \end{array}\right)

Now we proceed with elementary row operations that will give us an equivalent system that has the unknowns x and y explicitly solved for.

First, multiply row 1 by -4 and add it to row 2 to obtain a “new” row 2 that has a “0” in the lower left corner (representing elimination of the x-term in the second equation).

Symbolically we write this as -4R_{1}+R_{2}\longrightarrow R_{2}.

\left(\begin{array}{ccc} 1 & 2 & 3 \\ 4 & 5 & 6 \end{array}\right)\xrightarrow{-4R_{1}+R_{2}\longrightarrow R_{2}}\left(\begin{array}{ccc} 1 & 2 & 3 \\ 0 & -3 & -6 \end{array}\right)

Next, make the -3 in the second row become a 1 by multiplying the second row by -\frac{1}{3}. Represent this symbolically as -\frac{1}{3}R_{2}\longrightarrow R_{2}.

\left(\begin{array}{ccc} 1 & 2 & 3 \\ 0 & -3 & -6 \end{array}\right)\xrightarrow{-\frac{1}{3}R_{2}\longrightarrow R_{2}}\left(\begin{array}{ccc} 1 & 2 & 3 \\ 0 & 1 & 2 \end{array}\right)

The second row now represents the equation y=2, but we don’t stop here. Our next step is to eliminate the 2 from the first row above the 1 in the second row. Do this by multiplying the second row by -2, adding it to the first row, and replacing the first row. This is the operation -2R_{2}+R_{1}\longrightarrow R_{1}.

\left(\begin{array}{ccc} 1 & 2 & 3 \\ 0 & 1 & 2 \end{array}\right)\xrightarrow{-2R_{2}+R_{1}\longrightarrow R_{1}}\left(\begin{array}{ccc} 1 & 0 & -1 \\ 0 & 1 & 2 \end{array}\right)
the unique solution

From this, we see the final answers are x=-1 and y=2. In other words, the solution exists (the system is consistent), the solution is unique, and the solution is the point (x,y)=(-1,2). You should check this in the original system.

There will be a unique solution for any {\bf u}=\left(\begin{array}{c} u \\ v \end{array}\right) in Example 1 since \det(A)=1\cdot 5-2\cdot 4=5-8=-3\not=0.
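The three row operations above translate directly into arithmetic on the rows of the augmented matrix. Here is a minimal Python sketch (not part of the text) that mirrors the steps -4R_{1}+R_{2}\longrightarrow R_{2}, -\frac{1}{3}R_{2}\longrightarrow R_{2}, and -2R_{2}+R_{1}\longrightarrow R_{1} and recovers the solution x=-1, y=2.

```python
# Sketch: Example 1's elementary row operations applied to the
# augmented matrix [[1, 2, 3], [4, 5, 6]].

aug = [[1.0, 2.0, 3.0],
       [4.0, 5.0, 6.0]]

# -4*R1 + R2 -> R2 (eliminate the x-term in the second equation)
aug[1] = [r2 - 4.0 * r1 for r1, r2 in zip(aug[0], aug[1])]

# (-1/3)*R2 -> R2 (make the leading entry of row 2 equal to 1)
aug[1] = [(-1.0 / 3.0) * v for v in aug[1]]

# -2*R2 + R1 -> R1 (eliminate the y-term in the first equation)
aug[0] = [r1 - 2.0 * r2 for r1, r2 in zip(aug[0], aug[1])]

x, y = aug[0][2], aug[1][2]  # the third column now holds the solution
```

The final augmented matrix represents the equations x=-1 and y=2, matching the elimination carried out by hand above.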

Example 2

For example 2, we consider the linear transformation T:{\Bbb R}^{2}\longrightarrow {\Bbb R}^{2} defined by T({\bf x})=A{\bf x}=\left(\begin{array}{cc} 1 & 2 \\ -2 & -4 \end{array}\right)\left(\begin{array}{c} x \\ y \end{array}\right).

Note that \det(A)=1\cdot (-4)-2\cdot (-2)=-4+4=0. Hence, T will be neither one-to-one nor onto.

Instead of picking a specific {\bf u}=\left(\begin{array}{c} u \\ v \end{array}\right) and doing what we did in Example 1, we let {\bf u} be arbitrary. We seek conditions on the values of u and v for when the solution will or will not exist and when it will or will not be unique.

Here is the first elementary row operation on an arbitrary augmented matrix.

\left(\begin{array}{ccc} 1 & 2 & u \\ -2 & -4 & v\end{array}\right)\xrightarrow{2R_{1}+R_{2}\longrightarrow R_{2}}\left(\begin{array}{ccc} 1 & 2 & u \\ 0 & 0 & v+2u \end{array}\right)

interpreting the equivalent (reduced) form of the system

The second row represents the equation 0x+0y=v+2u, that is, 0=v+2u. If v+2u\not=0 (equivalently, v\not=-2u), then this equation cannot be satisfied and the system is inconsistent. Therefore, when v\not= -2u, the equation T({\bf x})={\bf u} has no solution.

On the other hand, if v=-2u, then the second row represents 0=0, which is always true, so there are solutions. In fact, there are infinitely many: all those points (x,y) satisfying x+2y=u\Leftrightarrow y=-\frac{1}{2}x+\frac{1}{2}u.

In parametric vector form, the solution set when v=-2u can be written:

\left\{x\left(\begin{array}{c} 1 \\ -\frac{1}{2}\end{array}\right)+\left(\begin{array}{c} 0 \\ \frac{1}{2}u\end{array}\right)\, |\, x\in {\Bbb R}\right\}.

geometric interpretation

For any fixed number u, this represents a line parallel to the vector \left(\begin{array}{c} 1 \\ -\frac{1}{2}\end{array}\right) going through the point (0,\frac{1}{2}u). All such points in the xy-plane get mapped to the point (u,-2u) on the line v=-2u in the uv-plane.

The linear transformation T is neither one-to-one nor onto. The null space (kernel) of T is a line through the origin parallel to \left(\begin{array}{c} 1 \\ -\frac{1}{2}\end{array}\right):

\left\{x\left(\begin{array}{c} 1 \\ -\frac{1}{2}\end{array}\right)\, |\, x\in {\Bbb R}\right\}.
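These claims can be spot-checked numerically. In this short Python sketch (not part of the text), we apply the matrix A=\left(\begin{array}{cc} 1 & 2 \\ -2 & -4 \end{array}\right) of Example 2 to sample points: points on the parametric solution line for a given u map to (u,-2u), and multiples of \left(\begin{array}{c} 1 \\ -\frac{1}{2}\end{array}\right) map to the origin.

```python
# Sketch: checking Example 2. Points on the solution line map to
# (u, -2u); points in the null space (kernel) map to (0, 0).

def T(x, y):
    """Apply A = [[1, 2], [-2, -4]] to the vector (x, y)."""
    return (1 * x + 2 * y, -2 * x + (-4) * y)

u = 6.0
for x in (-3.0, 0.0, 5.0):
    y = -0.5 * x + 0.5 * u          # a point on the solution line for this u
    assert T(x, y) == (u, -2 * u)   # every such point maps to (u, -2u)

for x in (-3.0, 0.0, 5.0):
    assert T(x, -0.5 * x) == (0.0, 0.0)  # the null space maps to the origin
```

Since distinct points on each line share one image, T is not one-to-one; since nothing maps off the line v=-2u, T is not onto.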

We end Section 1.5 by visualizing T as a mapping through the use of an animation. The domain of T is the xy-plane on the left and the codomain of T is the uv-plane on the right.

In the animation, the value of u changes, and the red point (u,v)=(u,-2u) on the right represents a point in the image T({\Bbb R}^{2}). In fact, for any fixed value of u, the entire red line on the left gets mapped onto that one red point (u,v)=(u,-2u) on the right. Note that the coordinates of the black point are (x,y)=\left(0,-\frac{1}{2}u\right) and that the components of the black vector are \left(\begin{array}{c} 1 \\ -\frac{1}{2}\end{array}\right).

When u=0, the red dot on the right is at the origin and the entire red line going through the origin on the left is the null space (kernel) of T.

Visualizing the mapping T:{\Bbb R}^{2}\longrightarrow {\Bbb R}^{2} for Example 2 that is neither one-to-one nor onto. For any fixed value of u, the entire red line on the left gets mapped onto the red point on the right with coordinates (u,-2u). The entire plane on the left gets mapped onto (just) the entire blue line on the right.

This visually emphasizes that the linear transformation T is neither one-to-one nor onto.

Basic Exercises

  1. Graph (in rectangular coordinates) the linear transformation T:{\Bbb R}\longrightarrow {\Bbb R} defined by the formula T(x)=-3x. Is T one-to-one? Is T onto? Solve the equation T(x)=12. (Yes, this is all supposed to be easy.)
  2. Graph (in rectangular coordinates) the linear transformation T:{\Bbb R}\longrightarrow {\Bbb R}^{2} defined by the formula T(x)=(x,-3x) (or T(x)=\left(\begin{array}{c} x \\ -3x \end{array}\right)). Is T one-to-one? Is T onto? Solve the equation T(x)=(-4,12) (or T(x)=\left(\begin{array}{c} -4 \\ 12 \end{array}\right)). Note the relationships between the answers in #1 and #2.
  3. Do your best to make a decent graph of the linear transformation T:{\Bbb R}^{2}\longrightarrow {\Bbb R} defined by the formula T(x,y)=-x+2y (or T\left(\begin{array}{c} x \\ y \end{array}\right)=\left(\begin{array}{cc} -1 & 2 \end{array}\right)\left(\begin{array}{c} x \\ y \end{array}\right)). Is T one-to-one? Is T onto? Solve the equation T(x,y)=11. Is there a solution? If so, is it unique? Write the answer in parametric vector form if necessary.

Row Operations Exercises

  1. Let T:{\Bbb R}^{2}\longrightarrow {\Bbb R}^{2} be the linear transformation defined by T({\bf x})=T\left(\begin{array}{c} x \\ y \end{array}\right)=\left(\begin{array}{c} x-3y \\ -3x+5y \end{array}\right)=\left(\begin{array}{cc} 1 & -3 \\ -3 & 5\end{array}\right)\left(\begin{array}{c} x \\ y \end{array}\right). Is T one-to-one? Is T onto? Use elementary row operations to solve the equation T\left(\begin{array}{c} x \\ y \end{array}\right)=\left(\begin{array}{c} 16 \\ -32 \end{array}\right). Is there a solution? If so, is it unique? Write the answer in parametric vector form if necessary. Describe the null space (kernel) and image of T.
  2. Answer the same questions as in the previous exercise for T({\bf x})=T\left(\begin{array}{c} x \\ y \end{array}\right)=\left(\begin{array}{c} 2x-5y \\ -4x+10y \end{array}\right)=\left(\begin{array}{cc} 2 & -5 \\ -4 & 10\end{array}\right)\left(\begin{array}{c} x \\ y \end{array}\right).

Challenge Exercises

  1. Prove Theorem 1.5.3.
  2. Suppose that T:{\Bbb R}^{2}\longrightarrow {\Bbb R}^{2} is defined by T\left(\begin{array}{c} x \\ y \end{array}\right)=\left(\begin{array}{c} ax+by+e \\ cx+dy+f \end{array}\right). Prove that T is a linear transformation if and only if e=f=0.

Video for Section 1.5

Here is a video overview of the content of this section.

Next: Section 1.6, Linear Algebra in Three Dimensions