The Derivative of y = x^2

Calculus 1, Lectures 4A through 5B

The derivative of y=f(x)=x^{2} at an arbitrary value of x is equal to 2x. This is the slope of the tangent line at that point, and it is approximated by 2x+\Delta x, the slope of a secant line, when \Delta x is close to zero.

Lectures 4A through 5B of my series of Calculus 1 lectures have a lot of content. I will summarize the content in this post, but I want my main focus to be on just one part.

That part is to understand the meaning of the derivative of x^2, as well as what it equals. This will be the topic of the last section of this blog post.

But before we home in on this idea, let’s summarize the other content of the lectures.

Lecture 4A

I start Lecture 4A with some problem-solving involving exponential equations.

Calc 1, Lec 4A: Exponential Problems, Linear vs Exponential Modeling of GDP, Function Composition

The general gist of how to solve such problems is this. Given two data points (x_{1},y_{1}) and (x_{2},y_{2}), we seek numbers a and b such that the graph of the exponential function y=f(x)=a\cdot b^{x} goes through these points.

That means we want y_{1}=a\cdot b^{x_{1}} and y_{2}=a\cdot b^{x_{2}} to both be true. Dividing the first equation by the second gives \frac{y_{1}}{y_{2}}=b^{x_{1}-x_{2}}, assuming y_{2}\not=0. As long as x_{1}\not=x_{2}, this equation is then solved for b by raising both sides to the power \frac{1}{x_{1}-x_{2}} (that is, by taking a root of both sides). Then you can use one of the original equations to solve for a.
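The two-step recipe above can be sketched in a few lines of Python. This is only an illustration, not code from the lecture: the function name fit_exponential and the sample points are made up, and the sketch assumes the two x-values differ and the two y-values are nonzero with the same sign (so that the root is a real number).

```python
def fit_exponential(x1, y1, x2, y2):
    """Return (a, b) such that y = a * b**x passes through both points.

    Assumes x1 != x2 and that y1, y2 are nonzero with the same sign.
    """
    if x1 == x2:
        raise ValueError("the two x-values must differ")
    if y1 == 0 or y2 == 0 or (y1 > 0) != (y2 > 0):
        raise ValueError("y1 and y2 must be nonzero with the same sign")
    ratio = y1 / y2
    b = ratio ** (1.0 / (x1 - x2))  # solve y1/y2 = b**(x1 - x2) for b
    a = y1 / b ** x1                # back-substitute into y1 = a * b**x1
    return a, b


# Made-up example: the points (1, 6) and (3, 54) lie on y = 2 * 3**x.
a, b = fit_exponential(1, 6, 3, 54)
print(a, b)  # approximately 2 and 3, up to floating-point rounding
```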

Another point of emphasis in Lecture 4A is the idea of data modeling. I look at real-life data on the United States Gross Domestic Product (GDP). Then, I graphically compare whether a linear or an exponential model is a better “fit” for the data. The exponential model is seen to be better because its residuals (errors) are more random, while the residuals of the linear model follow a clear pattern. This is a very important perspective to have if you want to use mathematical models in real life.
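To see what a pattern in the residuals looks like, here is a rough Python sketch. It does not use the GDP data from the lecture. Instead it uses synthetic, noise-free data that grows by 7% per step (an arbitrary choice), and the helper name least_squares_line is my own. The exponential model is fitted by the common shortcut of fitting a line to log(y), which is an assumption about method, not necessarily what is done in the lecture.

```python
import math


def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic data (NOT real GDP figures): exact 7% growth per step.
xs = list(range(11))
ys = [100 * 1.07 ** x for x in xs]

# Linear model: y is approximately m*x + c.
m, c = least_squares_line(xs, ys)
linear_residuals = [y - (m * x + c) for x, y in zip(xs, ys)]

# Exponential model: fit a line to log(y), then exponentiate.
k, log_a = least_squares_line(xs, [math.log(y) for y in ys])
exp_residuals = [y - math.exp(log_a + k * x) for x, y in zip(xs, ys)]

print([round(r, 2) for r in linear_residuals])
print([round(r, 2) for r in exp_residuals])
```

The linear residuals are positive at both ends and negative in the middle, a systematic pattern that signals the wrong kind of model. Because this synthetic data has no noise, the exponential residuals are essentially zero; with real data they would scatter around zero instead.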

The final point of emphasis in Lecture 4A is function composition. Function composition means we apply two functions in sequence. Given two functions f and g with appropriate domains and codomains, we can form a new function f\circ g. This function is defined by the formula (f\circ g)(x)=f(g(x)). In other words, the output of g is plugged into f. It is an important concept, both for understanding functions more deeply, and for the idea of an inverse function.
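Here is a small Python sketch of composition. The helper compose and the two sample functions are my own illustrative choices, not examples from the lecture.

```python
def compose(f, g):
    """Return the composite function x -> f(g(x))."""
    return lambda x: f(g(x))


def square(x):
    return x ** 2


def add_one(x):
    return x + 1


square_after_add = compose(square, add_one)  # x -> (x + 1)**2
add_after_square = compose(add_one, square)  # x -> x**2 + 1

print(square_after_add(3))  # 16
print(add_after_square(3))  # 10
```

The two outputs differ, which shows that the order of composition matters: f\circ g and g\circ f are generally different functions.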

Lecture 4B

That leads us into Lecture 4B, where I begin describing inverse functions and their properties.

Calculus 1, Lecture 4B: Function Composition, Inverse Functions, Quadratic Functions

I begin by giving more detail about the idea of function composition through an example.

From there, inverse functions are defined for one-to-one functions. Given a one-to-one (injective) function y=f(x), its inverse function x=f^{-1}(y) is defined to satisfy two properties with respect to composition. It must satisfy (f^{-1}\circ f)(x)=f^{-1}(f(x))=x for all x in the domain of f. And it must satisfy (f\circ f^{-1})(y)=f(f^{-1}(y))=y for all y in the domain of f^{-1} (which is the range of f).
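The two composition properties can be checked numerically. In this Python sketch, the one-to-one function f(x)=2x+3 and its inverse f^{-1}(y)=\frac{y-3}{2} are my own example, chosen only because the arithmetic is exact.

```python
def f(x):
    return 2 * x + 3


def f_inverse(y):
    return (y - 3) / 2


# First property: f_inverse(f(x)) == x for x in the domain of f.
for x in [-2, 0, 1.5, 10]:
    assert f_inverse(f(x)) == x

# Second property: f(f_inverse(y)) == y for y in the range of f.
for y in [-1, 3, 4, 23]:
    assert f(f_inverse(y)) == y

print("both composition properties hold on the sample values")
```

Checking a handful of sample values is not a proof, of course, but it is a quick way to catch a wrong formula for an inverse.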

Another fact I emphasize in this lecture is that functions that are not one-to-one can sometimes be made one-to-one by restricting their domains appropriately. In particular, the domains of quadratic functions can be appropriately restricted by using the method of completing the square to write their formulas in vertex form.
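As a sketch of this idea in Python: completing the square on ax^{2}+bx+c gives a(x-h)^{2}+k with h=-\frac{b}{2a} and k=c-\frac{b^{2}}{4a}. The function name vertex_form and the sample quadratic x^{2}-4x+7 are my own choices for illustration.

```python
def vertex_form(a, b, c):
    """Return (h, k) with a*x**2 + b*x + c == a*(x - h)**2 + k."""
    if a == 0:
        raise ValueError("not a quadratic: a must be nonzero")
    h = -b / (2 * a)
    k = c - b ** 2 / (4 * a)
    return h, k


def q(x):
    return x ** 2 - 4 * x + 7


h, k = vertex_form(1, -4, 7)
print(h, k)  # 2.0 3.0, so q(x) = (x - 2)**2 + 3

# On all real numbers q is not one-to-one: two inputs share an output.
print(q(1) == q(3))  # True

# Restricted to x >= h, the sampled outputs are strictly increasing.
samples = [q(x) for x in [2, 2.5, 3, 4, 10]]
print(samples == sorted(set(samples)))  # True
```

The vertex x=h is exactly where the domain should be cut: on either side of it, the quadratic is one-to-one.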

Lectures 5A and 5B

I start Lecture 5A by using the computer algebra system (CAS) Mathematica to make a plot of y=f(x)=x^2. Recall that we are ultimately interested in understanding the derivative of x^2 in this post.

Calc 1, Lec 5A: Mathematica Plot, Inverse of Quadratic (Completing the Square), Intro to Derivatives

I also use a function called “Manipulate” in Mathematica to make an animation of the graph of y=f(x-h)=(x-h)^{2} as h increases. This corresponds to a horizontal translation (shift) of the “parent” function y=f(x)=x^{2}.
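The lecture does this in Mathematica. As a rough stand-in, here is a Python sketch (without any animation) that tabulates the shifted function near x=h for a few values of h. The names parent and shifted are mine.

```python
def parent(x):
    return x ** 2


def shifted(x, h):
    """The parent function translated horizontally by h."""
    return parent(x - h)


for h in [0, 1, 2]:
    xs = [h - 1, h, h + 1]
    print(h, [shifted(x, h) for x in xs])  # always [1, 0, 1]
```

The same values [1, 0, 1] appear each time, just centered at x=h. In other words, the vertex of the parabola moves from x=0 to x=h, which is a shift to the right when h>0.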

From there, I solve a physics problem about the height of an object under the influence of gravity by using an inverse function of a quadratic. As mentioned already, the method of completing the square is used to put the quadratic in vertex form. Then the domain is restricted and an appropriate inverse function is found with the quadratic formula.
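Here is a Python sketch of that kind of problem. The numbers are hypothetical and are not the ones from the lecture: an object is thrown upward at 64 feet per second from a height of 80 feet, so its height after t seconds is s(t)=-16t^{2}+64t+80. The sketch inverts the vertex form directly, which gives the same result as applying the quadratic formula and choosing the larger root.

```python
import math

# Hypothetical coefficients of s(t) = A*t**2 + B*t + C (feet, seconds).
A, B, C = -16.0, 64.0, 80.0


def height(t):
    return A * t ** 2 + B * t + C


# Completing the square: s(t) = -16*(t - 2)**2 + 144.
T_VERTEX = -B / (2 * A)      # 2.0 seconds, when the object peaks
S_VERTEX = height(T_VERTEX)  # 144.0 feet, the maximum height


def time_on_descent(s):
    """Inverse of height restricted to t >= T_VERTEX (the way down)."""
    if s > S_VERTEX:
        raise ValueError("the object never reaches that height")
    return T_VERTEX + math.sqrt((S_VERTEX - s) / -A)


print(time_on_descent(0))   # 5.0: the object hits the ground after 5 seconds
print(time_on_descent(80))  # 4.0: it passes its starting height on the way down
```

Restricting the domain to t\geq 2 is what makes the answer unique. Without the restriction, a height of 80 feet would correspond to two times, t=0 and t=4.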

After a brief discussion of infinite limits of y=f(x)=x^{2}, I move on to introducing the derivative. In particular, I focus on the derivative of y=f(x)=x^{2}. This discussion continues, with visuals, into Lecture 5B.

Calculus 1, Lecture 5B: Visualize the Derivative of x^2, Logarithm Definition, Graphs, & Properties

In the last part of Lecture 5B, I talk about the definition of logarithms, and then discuss their graphs and properties.

But now let’s get back to looking at the derivative of x^2.

The Derivative of x^2

We want the derivative of x^2 at an arbitrary value of x to measure the (instantaneous) rate of change of the function at that point. This is also described as the slope of the tangent line to the graph of y=f(x)=x^{2} at that point.

Start by picking an arbitrary value of x. Then let \Delta x be a nonzero number.

This is standard notation for the change in x. We are indeed imagining that the independent variable (input) for the function f is changing from x to x+\Delta x. If \Delta x>0, then the input is increasing and x<x+\Delta x. If \Delta x<0, then the input is decreasing and x+\Delta x<x.

For simplicity, assume that \Delta x>0. The graph of f is not a line, so it does not have a constant slope. However, the average rate of change of f over the closed interval [x,x+\Delta x] can still be defined. It is defined to be the slope of the line connecting the points (x,f(x)) and (x+\Delta x,f(x+\Delta x)).

This line is called a secant line to the graph of f. Its slope is \frac{\Delta y}{\Delta x}=\frac{f(x+\Delta x)-f(x)}{\Delta x}. For f(x)=x^{2}, we can do the following simplification of this expression.

\frac{f(x+\Delta x)-f(x)}{\Delta x}=\frac{(x+\Delta x)^{2}-x^{2}}{\Delta x}=\frac{x^{2}+2x\cdot \Delta x+(\Delta x)^{2}-x^{2}}{\Delta x}=\frac{\Delta x(2x+\Delta x)}{\Delta x}=2x+\Delta x.

Visual of a Secant Line Approaching a Tangent Line

The last equality is true as long as \Delta x\not=0, which we are assuming. What we have just done can be visualized as shown below.

The derivative of x^2 is approximated by the slope of a secant line as delta x goes to zero.
The slope of the secant line to the graph of y=f(x)=x^{2} between (x,f(x)) and (x+\Delta x,f(x+\Delta x)) is 2x+\Delta x. In this visual, x=1.5 so the slope simplifies to 3+\Delta x. As \Delta x gets closer and closer to 0, this slope gets closer and closer to 3.
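The same trend can be seen numerically. This Python sketch is my own illustration (the function name secant_slope is not from the lecture). It computes the secant slope at x=1.5 straight from the definition for a shrinking list of \Delta x values, and it also checks the simplified formula 2x+\Delta x. The \Delta x values are powers of \frac{1}{2} so that the floating-point arithmetic is exact.

```python
def secant_slope(f, x, dx):
    """Slope of the secant line through (x, f(x)) and (x + dx, f(x + dx))."""
    if dx == 0:
        raise ValueError("dx must be nonzero")
    return (f(x + dx) - f(x)) / dx


def square(x):
    return x ** 2


x = 1.5
for dx in [1.0, 0.5, 0.25, 0.125, 0.0625]:
    slope = secant_slope(square, x, dx)
    assert slope == 2 * x + dx  # agrees with the simplified formula
    print(dx, slope)            # slopes: 4.0, 3.5, 3.25, 3.125, 3.0625
```

Negative values of \Delta x work too. For example, secant_slope(square, 1.5, -0.25) is 2.75, which approaches 3 from below.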

So what is the derivative then? It is the limit of the slopes of these secant lines as \Delta x “goes to” zero. This limit, if it exists, is then defined to be the slope of the tangent line to the graph at the given value of x. In other words, at the given point (x,f(x)) on the graph of f, the tangent line is the line through that point with slope equal to the limit just described.

The Derivative is a Limit

But what is a limit?

We will get more precise about limits in a later post. For the moment, in the given situation, just think of it as the single value that \frac{f(x+\Delta x)-f(x)}{\Delta x} gets closer and closer to as \Delta x gets closer and closer to zero, without actually equalling zero (because then we would be dividing by zero).

Since \frac{f(x+\Delta x)-f(x)}{\Delta x}=2x+\Delta x when \Delta x\not=0, this limit can be found by thinking about the values of 2x+\Delta x when \Delta x is close to zero.

How is this thinking done? Put simply, since 2x+\Delta x is a “nice” (continuous) function of \Delta x (thinking of x as “fixed”), we can just plug in \Delta x=0 to get the answer 2x.

This is the derivative of y=f(x)=x^{2}! Notationally, we write \frac{dy}{dx}=f'(x)=2x.
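Putting the pieces of this section together in one line, the computation can be written in standard limit notation. The \lim symbol is not used elsewhere in this post, so treat this as a preview of notation that will be made precise later.

```latex
f'(x)=\lim_{\Delta x\to 0}\frac{f(x+\Delta x)-f(x)}{\Delta x}
     =\lim_{\Delta x\to 0}\frac{(x+\Delta x)^{2}-x^{2}}{\Delta x}
     =\lim_{\Delta x\to 0}\,(2x+\Delta x)
     =2x.
```

For example, at x=1.5 this gives f'(1.5)=3, matching the slope that the secant lines approach in the visual above.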

Now you may very well be scratching your head and saying: but wait a minute! I thought we were saying \Delta x\not=0! But we plugged in zero at the end! How can that be?!?!?

That, my friend, is part of the confusing subtlety of limits. Like I said, we will delve more deeply into this in another post.

Don’t be too down on yourself if you don’t understand it right away. After all, even the geniuses Isaac Newton and Gottfried Wilhelm Leibniz didn’t fully understand it in the 1600s. Full understanding had to wait for Augustin-Louis Cauchy and Karl Weierstrass to come along in the 1800s.

But don’t worry! You can “stand on the shoulders of these giants”. You can indeed “get it” with enough help and thinking time!