How does the chain rule work in derivatives

I think it is worth revisiting what it means to differentiate a function. Differentiation (i.e., calculating derivatives) is one of the most basic operations in calculus, and arguably in mathematics as a whole. In this Math Crack tutorial I will try to shed some light on what a derivative is, what it does, and how to interpret it.

To clarify the scope of this tutorial: we are not going to practice solving specific exercises with derivatives. Instead, we will try to understand what we are actually doing when we operate with derivatives. Once we understand what we are doing, we have a WAY better chance of solving problems.


To begin, we should at least write down the definition of a derivative. Assume \(f\) is a function and \(x_0 \in \operatorname{dom}\left(f\right)\). Ok, technical details already? All we're saying is that \(f\) is a function. Imagine a function \(f\) like the one in the graph below:

When we say "\(x_0 \in \operatorname{dom}\left(f\right)\)", we are only saying that \(x_0\) is a point at which the function is well defined (i.e., it belongs to the function's domain). But hold on, is it possible for a function NOT to be well defined at a point \(x_0\)? Certainly! Consider the following function:

\[f\left(x\right) = \frac{1}{x-1}\]

This function is NOT well defined at \(x_0 = 1\). Why is it not well defined at \(x_0 = 1\)? Because if we plug the value \(x_0 = 1\) into the function, we get

\[f\left(1\right) = \frac{1}{1-1} = \frac{1}{0}\]

This is an INVALID operation (as you know from elementary school, you can't divide by zero, at least not under the usual rules of arithmetic), so the function is not well defined at \(x_0 = 1\). Saying that a function is well defined at a point simply means that the function can be evaluated at that point without any illegal operations.
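To make "well defined" concrete, here is a minimal Python sketch of the example function. The helper name `is_well_defined` is made up for illustration, and it only treats division by zero as an illegal operation; other kinds of failure are not covered.

```python
def f(x):
    """The example function f(x) = 1 / (x - 1)."""
    return 1 / (x - 1)


def is_well_defined(func, x0):
    """Return True if func can be evaluated at x0 without dividing by zero.

    Hypothetical helper for illustration only.
    """
    try:
        func(x0)
    except ZeroDivisionError:
        return False
    return True


print(is_well_defined(f, 2))  # f(2) = 1 / (2 - 1) can be evaluated
print(is_well_defined(f, 1))  # f(1) would require dividing by zero
```

In other words, \(2\) belongs to the domain of \(f\), while \(1\) does not.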

Now we can say it again, because now you know what we mean: assume \(f\) is a function and \(x_0 \in \operatorname{dom}\left(f\right)\). The derivative at the point \(x_0\) is defined as

\[f'\left(x_0\right) = \lim_{x \to x_0} \frac{f\left(x\right) - f\left(x_0\right)}{x - x_0}\]

whenever this limit exists.
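One way to get a feel for this limit is to evaluate the quotient for values of \(x\) closer and closer to \(x_0\). The sketch below does that in Python for the function \(f\left(x\right) = \frac{1}{x-1}\) at \(x_0 = 2\). The helper name `difference_quotient`, the point, and the step sizes are assumptions made only for illustration.

```python
def f(x):
    """The example function f(x) = 1 / (x - 1)."""
    return 1 / (x - 1)


def difference_quotient(func, x0, x):
    """The ratio inside the limit: (f(x) - f(x0)) / (x - x0), for x != x0."""
    return (func(x) - func(x0)) / (x - x0)


x0 = 2.0
for h in (0.1, 0.01, 0.001, 0.0001):
    # Evaluate the quotient at x = x0 + h for shrinking h
    print(h, difference_quotient(f, x0, x0 + h))
```

The printed values get closer and closer to \(-1\), which suggests (but of course does not prove) that \(f'\left(2\right) = -1\).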

Ok, that definition is the heart of the matter, and we're going to discuss it in a moment. I want you to have a few things EXTREMELY clear here:

• If the above limit exists, we call it \(f'\left(x_0\right)\), and we refer to it as the "derivative of the function \(f\left(x\right)\) at the point \(x_0\)". So \(f'\left(x_0\right)\) is simply a symbol we use to refer to the derivative of the function \(f\left(x\right)\) at the point \(x_0\) (when it exists). We could have used any other symbol, e.g. "\(deriv{\left(f\right)}_{x_0}\)" or "\(derivative\_f\_{x_0}\)". For aesthetic reasons we prefer "\(f'\left(x_0\right)\)".

The point is that it is a MADE-UP symbol for the derivative of the function \(f\left(x\right)\) at the point \(x_0\). The funny thing about math is that notation matters. Although a concept exists regardless of the notation used to express it, a logical, flexible, compact notation can make an idea take off, whereas a cumbersome, uninspired notation can hold it back.

The role of notation

(Historically, the two independent developers of a workable version of the derivative concept, Leibniz and Newton, used radically different notations. Newton used \(\dot{y}\) while Leibniz used \(\frac{dy}{dx}\). Leibniz's notation caught on and made it easier to develop calculus fully, while Newton's notation caused more than one headache. Really, it was that important.)

• Differentiation is a POINTWISE operation. This means that it is performed on a function at a specific point, and it has to be checked point by point. Of course, there are infinitely many points in a typical domain like the real line \(\mathbb{R}\), so checking by hand that a derivative is defined at each point could take a while. BUT there are rules that simplify the work considerably: we calculate the derivative at a generic point \(x_0\) and then analyze for which values of \(x_0\) the limit that defines the derivative exists. So you can relax: the manual labor isn't too strenuous, provided, of course, that you know what you are doing.

• If the derivative of a function \(f\) exists at a point \(x_0\), we say that the function is differentiable at \(x_0\). We can also say that a function is differentiable on a REGION (a region is a set of points) if the function is differentiable at EVERY point of that region. Although the derivative is a pointwise concept (defined at a particular point), it can be understood as a global concept when it is defined at each point of a region.

• If we let \(D\) be the set of all points of the real line at which the derivative of a function is defined, we can define the derivative function \(f'\) as follows:

\[\begin{aligned} & f' : D \subseteq \mathbb{R} \to \mathbb{R} \\ & x \mapsto f'\left(x\right) \end{aligned}\]

This is a function because each \(x\) in \(D\) is uniquely associated with the value \(f'\left(x\right)\). The set of all pairs \(\left(x, f'\left(x\right)\right)\) for \(x \in D\) forms a function, and you can do with it anything you can do with functions, e.g. graph it.

This should clear up a question many students have about derivatives: how can there be a derivative "function" when the derivative is something calculated at a single point? The answer is that we compute the derivative at many points, and that collection of values is what defines the derivative as a function.
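As a worked example of computing the derivative at a generic point, take the function \(f\left(x\right) = \frac{1}{x-1}\) and any \(x_0 \neq 1\). This is only a sketch of the routine algebra:

\[\begin{aligned} f'\left(x_0\right) &= \lim_{x \to x_0} \frac{\frac{1}{x-1} - \frac{1}{x_0-1}}{x - x_0} \\ &= \lim_{x \to x_0} \frac{\left(x_0 - 1\right) - \left(x - 1\right)}{\left(x-1\right)\left(x_0-1\right)\left(x - x_0\right)} \\ &= \lim_{x \to x_0} \frac{-1}{\left(x-1\right)\left(x_0-1\right)} = -\frac{1}{\left(x_0-1\right)^2} \end{aligned}\]

The limit exists for every \(x_0 \neq 1\), so in this case \(D = \mathbb{R} \setminus \left\{1\right\}\) and the derivative function is \(f'\left(x\right) = -\frac{1}{\left(x-1\right)^2}\) on \(D\).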

Last words: notation hell

When the concept of the derivative was brought into its modern form by Newton and Leibniz (I stress "modern form" because many of the ideas behind calculus had already been developed, in a more intuitive and less formal way, by the Greeks and others long before), they chose radically different notations. Newton chose \(\dot{y}\), while Leibniz chose \(\frac{dy}{dx}\). So far, so good. However, the concept of the derivative means much less without powerful theorems about derivatives.

With their respective notations, both had little trouble proving basic differentiation theorems such as linearity and the product rule, but Newton saw no need to state the chain rule formally, possibly because his notation did not lend itself to it, whereas in Leibniz's notation the chain rule looks like a "duh" rule. To be more precise, assume that \(y = y\left(u\right)\) is one function and \(u = u\left(x\right)\) is another.

A natural question is whether we can easily compute the derivative of the composition \(y\left(u\left(x\right)\right)\) from the derivatives of \(y\) and \(u\). The answer to this question is the chain rule. In Leibniz notation, the rule reads

\[\frac{dy}{dx} = \frac{dy}{du} \frac{du}{dx}\]
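If you want to convince yourself that the rule is at least plausible, you can check it numerically. The sketch below uses \(y\left(u\right) = \sin\left(u\right)\) and \(u\left(x\right) = x^2\), which are just illustrative choices (as are the function names, the point \(x = 1.5\) and the step \(h\)). It compares a finite-difference estimate of \(\frac{dy}{dx}\) with the product \(\frac{dy}{du}\frac{du}{dx}\) computed from the known derivatives \(\cos\left(u\right)\) and \(2x\).

```python
import math


def u(x):
    """Inner function (illustrative choice): u(x) = x^2."""
    return x ** 2


def y(v):
    """Outer function (illustrative choice): y(u) = sin(u)."""
    return math.sin(v)


def composite(x):
    """The composition y(u(x)) = sin(x^2)."""
    return y(u(x))


def chain_rule_value(x):
    """dy/du evaluated at u(x), times du/dx evaluated at x."""
    dy_du = math.cos(u(x))  # derivative of sin is cos
    du_dx = 2 * x           # derivative of x^2 is 2x
    return dy_du * du_dx


def finite_difference(func, x, h=1e-6):
    """Approximate func'(x) by the difference quotient with a small step h."""
    return (func(x + h) - func(x)) / h


x = 1.5
print(finite_difference(composite, x))  # numerical estimate of dy/dx
print(chain_rule_value(x))              # dy/du * du/dx from the chain rule
```

The two printed numbers should agree to several decimal places; the small gap comes from using a finite step \(h\) instead of a true limit.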

In the Leibniz form of the rule, it's almost as if you could cancel the \(du\)'s, like this:

\[\require{cancel} \frac{dy}{dx} = \frac{dy}{\cancel{du}} \frac{\cancel{du}}{dx}\]

but that's not exactly how it works. Still, that's the nice thing about Leibniz notation. It has a strong intuitive appeal (and the "cancellation" of the \(du\)'s is almost a reality: it just takes place at the level of \(\Delta u\), and there are limits involved), but you still have to understand what Leibniz's rule is saying. It says:

"The derivative of the composite function \(y\left(u\left(x\right)\right)\) is equal to the derivative of \(y\) at the point \(u\left(x\right)\) multiplied by the derivative of \(u\) at the point \(x\)"

The chain rule with Newton's notation takes the following form:

\[\overset{\bullet}{\mathop{\left(y \circ u\right)}}\, = \overset{\bullet}{\mathop{y}}\,\left(u\left(x\right)\right)\,\overset{\bullet}{\mathop{u}}\,\left(x\right)\]

A little less pretty, right? But guess what, Newton's chain rule says exactly the same thing as

\[\frac{dy}{dx} = \frac{dy}{du} \frac{du}{dx}\]

However, the latter notation caught on and contributed enormously to the rapid development of modern calculus, while Newton's form was far less popular. Although the two statements said exactly the same thing, one was gold and the other, not so much. Why? NOTATION, my friend.