1
00:00:01 --> 00:00:03
The following content is
provided under a Creative
2
00:00:03 --> 00:00:05
Commons license.
Your support will help MIT
3
00:00:05 --> 00:00:08
OpenCourseWare continue to offer
high quality educational
4
00:00:08 --> 00:00:13
resources for free.
To make a donation or to view
5
00:00:13 --> 00:00:18
additional materials from
hundreds of MIT courses,
6
00:00:18 --> 00:00:23
visit MIT OpenCourseWare at
ocw.mit.edu.
7
00:00:23 --> 00:00:27
So far we have learned about
partial derivatives and how to
8
00:00:27 --> 00:00:31
use them to find minima and
maxima of functions of two
9
00:00:31 --> 00:00:35
variables or several variables.
And now we are going to try to
10
00:00:35 --> 00:00:38
study, in more detail,
how functions of several
11
00:00:38 --> 00:00:41
variables behave,
how to compute their
12
00:00:41 --> 00:00:44
variations.
How to estimate the variation
13
00:00:44 --> 00:00:50
in arbitrary directions.
And so for that we are going to
14
00:00:50 --> 00:00:56
need some more tools actually to
study these things.
15
00:00:56 --> 00:01:00
More tools to study functions.
16
00:01:00 --> 00:01:15
17
00:01:15 --> 00:01:26
Today's topic is going to be
differentials.
18
00:01:26 --> 00:01:34
And, just to motivate that,
let me remind you about one
19
00:01:34 --> 00:01:43
trick that you probably know
from single variable calculus,
20
00:01:43 --> 00:01:48
namely implicit
differentiation.
21
00:01:48 --> 00:01:56
Let's say that you have a
function y equals f of x then
22
00:01:56 --> 00:02:05
you would sometimes write dy
equals f prime of x times dx.
23
00:02:05 --> 00:02:17
And then maybe you would -- We
use implicit differentiation to
24
00:02:17 --> 00:02:29
actually relate infinitesimal
changes in y with infinitesimal
25
00:02:29 --> 00:02:35
changes in x.
And one thing we can do with
26
00:02:35 --> 00:02:39
that, for example,
is actually figure out the rate
27
00:02:39 --> 00:02:43
of change dy by dx,
but also the reciprocal dx by
28
00:02:43 --> 00:02:48
dy.
And so, for example,
29
00:02:48 --> 00:02:58
let's say that we have y equals
inverse sin(x).
30
00:02:58 --> 00:03:03
Then we can write x equals
sin(y).
31
00:03:03 --> 00:03:08
And, from there,
we can actually find out what
32
00:03:08 --> 00:03:13
is the derivative of this
function if we didn't know the
33
00:03:13 --> 00:03:18
answer already by writing dx
equals cosine y dy.
34
00:03:18 --> 00:03:28
That tells us that dy over dx
is going to be one over cosine
35
00:03:28 --> 00:03:32
y.
And now cosine y, from its relation to
36
00:03:32 --> 00:03:40
sine y, is square root of one minus x^2, so we get
one over square root of one minus x^2.
37
00:03:40 --> 00:03:44
And that is how you find the
formula for the derivative of
38
00:03:44 --> 00:03:50
the inverse sine function.
A formula that you probably
39
00:03:50 --> 00:03:54
already knew,
but that is one way to derive
40
00:03:54 --> 00:03:57
it.
Now we are going to use also
41
00:03:57 --> 00:03:59
these kinds of notations,
dx, dy and so on,
42
00:03:59 --> 00:04:03
but use them for functions of
several variables.
43
00:04:03 --> 00:04:05
And, of course,
we will have to learn what the
44
00:04:05 --> 00:04:08
rules of manipulation are and
what we can do with them.
45
00:04:08 --> 00:04:17
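Written out, the inverse sine computation above looks roughly like this. It is a sketch that assumes y is taken between -pi/2 and pi/2 (the usual convention, so that cosine y is not negative) and that x is strictly between -1 and 1.

```latex
\begin{align*}
y &= \sin^{-1}(x) \quad\Longleftrightarrow\quad x = \sin y, \\
dx &= \cos y \, dy, \\
\frac{dy}{dx} &= \frac{1}{\cos y}
  = \frac{1}{\sqrt{1 - \sin^2 y}}
  = \frac{1}{\sqrt{1 - x^2}}.
\end{align*}
```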
46
00:04:17 --> 00:04:20
The actual name of that is the
total differential,
47
00:04:20 --> 00:04:23
as opposed to the partial
derivatives.
48
00:04:23 --> 00:04:28
The total differential includes
all of the various causes that
49
00:04:28 --> 00:04:33
can change -- Sorry.
All the contributions that can
50
00:04:33 --> 00:04:38
cause the value of your function
f to change.
51
00:04:38 --> 00:04:43
Namely, let's say that you have
a function maybe of three
52
00:04:43 --> 00:04:44
variables, x,
y, z,
53
00:04:44 --> 00:04:56
then you would write df equals
f sub x dx plus f sub y dy plus
54
00:04:56 --> 00:05:02
f sub z dz.
Maybe, just to remind you of
55
00:05:02 --> 00:05:07
the other notation,
partial f over partial x dx
56
00:05:07 --> 00:05:14
plus partial f over partial y dy
plus partial f over partial z
57
00:05:14 --> 00:05:18
dz.
Now, what is this object?
58
00:05:18 --> 00:05:22
What are the things on either
side of this equality?
59
00:05:22 --> 00:05:24
Well, they are called
differentials.
60
00:05:24 --> 00:05:26
And they are not numbers,
they are not vectors,
61
00:05:26 --> 00:05:29
they are not matrices,
they are a different kind of
62
00:05:29 --> 00:05:32
object.
These things have their own
63
00:05:32 --> 00:05:36
rules of manipulations,
and we have to learn what we
64
00:05:36 --> 00:05:40
can do with them.
So how do we think about them?
65
00:05:40 --> 00:05:51
First of all,
how do we not think about them?
66
00:05:51 --> 00:05:55
Here is an important thing to
know.
67
00:05:55 --> 00:06:07
Important.
df is not the same thing as
68
00:06:07 --> 00:06:12
delta f.
That is meant to be a number.
69
00:06:12 --> 00:06:16
It is going to be a number once
you have a small variation of x,
70
00:06:16 --> 00:06:19
a small variation of y,
a small variation of z.
71
00:06:19 --> 00:06:21
These are numbers.
Delta x, delta y and delta z
72
00:06:21 --> 00:06:24
are actual numbers,
and this becomes a number.
73
00:06:24 --> 00:06:26
This guy actually is not a
number.
74
00:06:26 --> 00:06:30
You cannot give it a particular
value.
75
00:06:30 --> 00:06:33
All you can do with a
differential is express it in
76
00:06:33 --> 00:06:36
terms of other differentials.
In fact, this dx,
77
00:06:36 --> 00:06:38
dy and dz, well,
they are mostly symbols out
78
00:06:38 --> 00:06:42
there.
But if you want to think about
79
00:06:42 --> 00:06:46
them, they are the differentials
of x, y and z.
80
00:06:46 --> 00:06:52
In fact, you can think of these
differentials as placeholders
81
00:06:52 --> 00:06:57
where you will put other things.
Of course, they represent,
82
00:06:57 --> 00:07:02
you know, there is this idea of
changes in x,
83
00:07:02 --> 00:07:05
y, z and f.
One way that one could explain
84
00:07:05 --> 00:07:09
it, and I don't really like it,
is to say they represent
85
00:07:09 --> 00:07:12
infinitesimal changes.
Another way to say it,
86
00:07:12 --> 00:07:14
and I think that is probably
closer to the truth,
87
00:07:14 --> 00:07:19
is that these things are
somehow placeholders to put
88
00:07:19 --> 00:07:22
values and get a tangent
approximation.
89
00:07:22 --> 00:07:25
For example,
if I do replace these symbols
90
00:07:25 --> 00:07:30
by delta x, delta y and delta z
numbers then I will actually get
91
00:07:30 --> 00:07:33
a numerical quantity.
And that will be an
92
00:07:33 --> 00:07:39
approximation formula for delta f.
It will be the linear
93
00:07:39 --> 00:07:44
approximation,
a tangent plane approximation.
94
00:07:44 --> 00:07:52
What we can do -- Well,
let me start first with maybe
95
00:07:52 --> 00:08:00
something even before that.
The first thing that it does is
96
00:08:00 --> 00:08:10
it can encode how changes in x,
y, z affect the value of f.
97
00:08:10 --> 00:08:15
I would say that is the most
general answer to what is this
98
00:08:15 --> 00:08:18
formula, what are these
differentials.
99
00:08:18 --> 00:08:24
It is a relation between x,
y, z and f.
100
00:08:24 --> 00:08:36
And this is a placeholder for
small variations,
101
00:08:36 --> 00:08:53
delta x, delta y and delta z to
get an approximation formula.
102
00:08:53 --> 00:09:00
Which is delta f is
approximately equal to f sub x delta
103
00:09:00 --> 00:09:06
x plus f sub y delta y plus f sub z delta z.
It is getting cramped,
104
00:09:06 --> 00:09:11
but I am sure you know what is
going on here.
105
00:09:11 --> 00:09:15
And observe how this one is
actually equal while that one is
106
00:09:15 --> 00:09:19
approximately equal.
So they are really not the same.
107
00:09:19 --> 00:09:22
Another thing that the notation
suggests we can do,
108
00:09:22 --> 00:09:26
and I claim we can do,
is divide everything by some
109
00:09:26 --> 00:09:29
variable that everybody depends
on.
110
00:09:29 --> 00:09:33
Say, for example,
that x, y and z actually depend
111
00:09:33 --> 00:09:39
on some parameter t then they
will vary, at a certain rate,
112
00:09:39 --> 00:09:42
dx over dt, dy over dt,
dz over dt.
113
00:09:42 --> 00:09:46
And what the differential will
tell us then is the rate of
114
00:09:46 --> 00:09:51
change of f as a function of t,
when you plug in these values
115
00:09:51 --> 00:09:57
of x, y, z,
you will get df over dt by
116
00:09:57 --> 00:10:05
dividing everything by dt in
here.
117
00:10:05 --> 00:10:21
The first thing we can do is
divide by something like dt to
118
00:10:21 --> 00:10:30
get infinitesimal rate of
change.
119
00:10:30 --> 00:10:43
Well, let me just say rate of
change.
120
00:10:43 --> 00:10:52
df over dt equals f sub x dx
over dt plus f sub y dy over dt
121
00:10:52 --> 00:11:00
plus f sub z dz over dt.
And that corresponds to the
122
00:11:00 --> 00:11:09
situation where x is a function
of t, y is a function of t and z
123
00:11:09 --> 00:11:14
is a function of t.
That means you can plug in
124
00:11:14 --> 00:11:18
these values into f to get,
well, the value of f will
125
00:11:18 --> 00:11:23
depend on t,
and then you can find the rate
126
00:11:23 --> 00:11:27
of change with t of a value of
f.
127
00:11:27 --> 00:11:35
These are the basic rules.
And this is known as the chain
128
00:11:35 --> 00:11:38
rule.
It is one instance of a chain
129
00:11:38 --> 00:11:40
rule,
which tells you when you have a
130
00:11:40 --> 00:11:42
function that depends on
something,
131
00:11:42 --> 00:11:45
and that something in turn
depends on something else,
132
00:11:45 --> 00:11:51
how to find the rate of change
of a function on the new
133
00:11:51 --> 00:11:56
variable in terms of the
derivatives of a function and
134
00:11:56 --> 00:12:01
also the dependence between the
various variables.
135
00:12:01 --> 00:12:08
Any questions so far?
No.
136
00:12:08 --> 00:12:11
OK.
A word of warning,
137
00:12:11 --> 00:12:15
in particular,
about what I said up here.
138
00:12:15 --> 00:12:19
It is kind of unfortunate,
but the textbook actually has a
139
00:12:19 --> 00:12:23
serious mistake on that.
I mean they do have a couple of
140
00:12:23 --> 00:12:29
formulas where they mix a d with
a delta, and I warn you not to
141
00:12:29 --> 00:12:32
do that, please.
I mean there are d's and there
142
00:12:32 --> 00:12:34
are delta's, and basically they
don't live in the same world.
143
00:12:34 --> 00:12:53
They don't see each other.
The textbook is lying to you.
144
00:12:53 --> 00:12:59
Let's see.
The first and the second
145
00:12:59 --> 00:13:01
claims,
I don't really need to justify
146
00:13:01 --> 00:13:05
because the first one is just
stating some general principle,
147
00:13:05 --> 00:13:08
but I am not making a precise
mathematical claim.
148
00:13:08 --> 00:13:11
The second one,
well, we know the approximation
149
00:13:11 --> 00:13:14
formula already,
so I don't need to justify it
150
00:13:14 --> 00:13:16
for you.
But, on the other hand,
151
00:13:16 --> 00:13:20
this formula here,
I mean, you probably have a
152
00:13:20 --> 00:13:24
right to expect some reason for
why this works.
153
00:13:24 --> 00:13:27
Why is this valid?
After all, I first told you we
154
00:13:27 --> 00:13:29
have these new mysterious
objects.
155
00:13:29 --> 00:13:32
And then I am telling you we
can do that, but I kind of
156
00:13:32 --> 00:13:44
pulled it out of my hat.
I mean I don't have a hat.
157
00:13:44 --> 00:13:53
Why is this valid?
How can I get to this?
158
00:13:53 --> 00:14:06
Here is a first attempt at
justifying how to get there.
159
00:14:06 --> 00:14:13
Let's see.
Well, we said df is f sub x dx
160
00:14:13 --> 00:14:25
plus f sub y dy plus f sub z dz.
But we know if x is a function
161
00:14:25 --> 00:14:37
of t then dx is x prime of t dt,
dy is y prime of t dt,
162
00:14:37 --> 00:14:47
dz is z prime of t dt.
If we plug these into that
163
00:14:47 --> 00:14:58
formula, we will get that df is
f sub x times x prime of t dt plus
164
00:14:58 --> 00:15:08
f sub y y prime of t dt plus f
sub z z prime of t dt.
165
00:15:08 --> 00:15:14
And now I have a relation
between df and dt.
166
00:15:14 --> 00:15:17
See, I got df equals something
times dt.
167
00:15:17 --> 00:15:23
That means the rate of change
of f with respect to t should be
168
00:15:23 --> 00:15:38
that coefficient.
If I divide by dt then I get
169
00:15:38 --> 00:15:46
the chain rule.
That kind of works,
170
00:15:46 --> 00:15:49
but that shouldn't be
completely satisfactory.
171
00:15:49 --> 00:15:53
Let's say that you are a true
skeptic and you don't believe in
172
00:15:53 --> 00:15:57
differentials yet then it is
maybe not very good that I
173
00:15:57 --> 00:16:01
actually used more of these
differential notations in
174
00:16:01 --> 00:16:05
deriving the answer.
That is actually not how it is
175
00:16:05 --> 00:16:08
proved.
The way in which you prove the
176
00:16:08 --> 00:16:13
chain rule is not this way
because we shouldn't have too
177
00:16:13 --> 00:16:16
much trust in differentials just
yet.
178
00:16:16 --> 00:16:18
I mean at the end of today's
lecture, yes,
179
00:16:18 --> 00:16:20
probably we should believe in
them,
180
00:16:20 --> 00:16:26
but so far we should be a
little bit reluctant to believe
181
00:16:26 --> 00:16:32
these kinds of strange objects
telling us weird things.
182
00:16:32 --> 00:16:39
Here is a better way to think
about it.
183
00:16:39 --> 00:16:43
One thing that we have trust in
so far are approximation
184
00:16:43 --> 00:16:48
formulas.
We should have trust in them.
185
00:16:48 --> 00:16:54
We should believe that if we
change x a little bit,
186
00:16:54 --> 00:17:02
if we change y a little bit
then we are actually going to
187
00:17:02 --> 00:17:11
get a change in f that is
approximately given by these
188
00:17:11 --> 00:17:13
guys.
And this is true for any
189
00:17:13 --> 00:17:14
changes in x,
y, z,
190
00:17:14 --> 00:17:20
but in particular let's look at
the changes that we get if we
191
00:17:20 --> 00:17:26
just take these formulas as
functions of time and change time
192
00:17:26 --> 00:17:32
a little bit by delta t.
We will actually use the
193
00:17:32 --> 00:17:39
changes in x,
y, z in a small time delta t.
194
00:17:39 --> 00:17:47
Let's divide everybody by delta
t.
195
00:17:47 --> 00:17:52
Here I am just dividing numbers
so I am not actually playing any
196
00:17:52 --> 00:17:54
tricks on you.
I mean we don't really know
197
00:17:54 --> 00:17:57
what it means to divide
differentials,
198
00:17:57 --> 00:17:59
but dividing numbers is
something we know.
199
00:17:59 --> 00:18:11
And now, if I take delta t very
small, this guy tends to the
200
00:18:11 --> 00:18:19
derivative, df over dt.
Remember, the definition of df
201
00:18:19 --> 00:18:23
over dt is the limit of this
ratio when the time interval
202
00:18:23 --> 00:18:28
delta t tends to zero.
That means if I choose smaller
203
00:18:28 --> 00:18:32
and smaller values of delta t
then these ratios of numbers
204
00:18:32 --> 00:18:35
will actually tend to some
value,
205
00:18:35 --> 00:18:41
and that value is the
derivative.
206
00:18:41 --> 00:18:51
Similarly, here delta x over
delta t, when delta t is really
207
00:18:51 --> 00:18:59
small, will tend to the
derivative dx/dt.
208
00:18:59 --> 00:19:00
And similarly for the others.
209
00:19:00 --> 00:19:18
210
00:19:18 --> 00:19:28
That means, in particular,
we take the limit as delta t
211
00:19:28 --> 00:19:35
tends to zero and we get df over
dt on one side and on the other
212
00:19:35 --> 00:19:42
side we get f sub x dx over dt
plus f sub y dy over dt plus f
213
00:19:42 --> 00:19:46
sub z dz over dt.
And the approximation becomes
214
00:19:46 --> 00:19:49
better and better.
Remember when we write
215
00:19:49 --> 00:19:53
approximately equal that means
it is not quite the same,
216
00:19:53 --> 00:19:57
but if we take smaller
variations then actually we will
217
00:19:57 --> 00:20:01
end up with values that are
closer and closer.
218
00:20:01 --> 00:20:04
When we take the limit,
as delta t tends to zero,
219
00:20:04 --> 00:20:06
eventually we get an equality.
220
00:20:06 --> 00:20:21
221
00:20:21 --> 00:20:24
I mean mathematicians have more
complicated words to justify
222
00:20:24 --> 00:20:28
this statement.
I will spare them for now,
223
00:20:28 --> 00:20:36
and you will see them when you
take analysis if you go in that
224
00:20:36 --> 00:20:42
direction.
Any questions so far?
225
00:20:42 --> 00:20:46
No.
OK.
226
00:20:46 --> 00:20:47
Let's check this with an
example.
227
00:20:47 --> 00:20:58
Let's say that we really don't
have any faith in these things
228
00:20:58 --> 00:21:06
so let's try to do it.
Let's say I give you a function
229
00:21:06 --> 00:21:14
w that is x^2 y plus z.
And let's say that maybe x will
230
00:21:14 --> 00:21:20
be t, y will be e^t and z will
be sin(t).
231
00:21:20 --> 00:21:34
232
00:21:34 --> 00:21:40
What does the chain rule say?
Well, the chain rule tells us
233
00:21:40 --> 00:21:46
that dw/dt is,
we start with partial w over
234
00:21:46 --> 00:21:51
partial x, well,
what is that?
235
00:21:51 --> 00:21:58
That is 2xy,
and maybe I should point out
236
00:21:58 --> 00:22:08
that this is w sub x,
times dx over dt plus -- Well,
237
00:22:08 --> 00:22:21
w sub y is x squared times dy
over dt plus w sub z,
238
00:22:21 --> 00:22:28
which is going to be just one,
dz over dt.
239
00:22:28 --> 00:22:33
And so now let's plug in the
actual values of these things.
240
00:22:33 --> 00:22:38
x is t and y is e^t,
so that will be 2t e to the t,
241
00:22:38 --> 00:22:47
dx over dt is one plus x
squared is t squared,
242
00:22:47 --> 00:23:00
dy over dt is e to the t,
plus dz over dt is cosine t.
243
00:23:00 --> 00:23:06
At the end of calculation we
get 2t e to the t plus t squared
244
00:23:06 --> 00:23:11
e to the t plus cosine t.
That is what the chain rule
245
00:23:11 --> 00:23:16
tells us.
How else could we find that?
246
00:23:16 --> 00:23:20
Well, we could just plug in
values of x, y and z,
247
00:23:20 --> 00:23:23
express w as a function of t,
and take its derivative.
248
00:23:23 --> 00:23:26
Let's do that just for
verification.
249
00:23:26 --> 00:23:30
It should be exactly the same
answer.
250
00:23:30 --> 00:23:32
And, in fact,
in this case,
251
00:23:32 --> 00:23:35
the two calculations are
roughly equal in complication.
252
00:23:35 --> 00:23:39
But say that your function of
x, y, z was much more
253
00:23:39 --> 00:23:43
complicated than that,
or maybe you actually didn't
254
00:23:43 --> 00:23:45
know a formula for it,
you only knew its partial
255
00:23:45 --> 00:23:48
derivatives,
then you would need to use the
256
00:23:48 --> 00:23:51
chain rule.
So, sometimes plugging in
257
00:23:51 --> 00:23:54
values is easier but not always.
258
00:23:54 --> 00:24:13
259
00:24:13 --> 00:24:18
Let's just check quickly.
The other method would be to
260
00:24:18 --> 00:24:23
substitute.
W as a function of t.
261
00:24:23 --> 00:24:36
Remember w was x^2 y plus z.
x was t, so you get t squared,
262
00:24:36 --> 00:24:41
y is e to the t,
plus z was sine t.
263
00:24:41 --> 00:24:47
dw over dt, we know how to take
the derivative using single
264
00:24:47 --> 00:24:50
variable calculus.
Well, we should know.
265
00:24:50 --> 00:24:55
If we don't know then we should
take a look at 18.01 again.
266
00:24:55 --> 00:25:02
By the product rule, that will be
derivative of t squared is 2t
267
00:25:02 --> 00:25:08
times e to the t plus t squared
times the derivative of e to the
268
00:25:08 --> 00:25:16
t is e to the t plus cosine t.
And that is the same answer as
269
00:25:16 --> 00:25:19
over there.
I ended up writing,
270
00:25:19 --> 00:25:23
you know, maybe I wrote
slightly more here,
271
00:25:23 --> 00:25:28
but actually the amount of
calculations really was pretty
272
00:25:28 --> 00:25:32
much the same.
Any questions about that?
273
00:25:32 --> 00:25:39
Yes?
What kind of object is w?
274
00:25:39 --> 00:25:43
Well, you can think of w as
just another variable that is
275
00:25:43 --> 00:25:47
given as a function of x,
y and z, for example.
276
00:25:47 --> 00:25:51
You would have a function of x,
y, z defined by this formula,
277
00:25:51 --> 00:25:57
and I call it w.
I call its value w so that I
278
00:25:57 --> 00:26:04
can substitute t instead of x,
y, z.
279
00:26:04 --> 00:26:07
Well, let's think of w as a
function of three variables.
280
00:26:07 --> 00:26:12
And then, when I plug in the
dependence of these three
281
00:26:12 --> 00:26:17
variables on t,
then it becomes just a function
282
00:26:17 --> 00:26:19
of t.
I mean, really,
283
00:26:19 --> 00:26:23
my w here is pretty much what I
called f before.
284
00:26:23 --> 00:26:31
There is no major difference
between the two.
285
00:26:31 --> 00:26:38
Any other questions?
No.
286
00:26:38 --> 00:26:45
OK.
Let's see.
287
00:26:45 --> 00:26:49
Here is an application of what
we have seen.
288
00:26:49 --> 00:26:53
Let's say that you want to
understand actually all these
289
00:26:53 --> 00:26:57
rules about taking derivatives
in single variable calculus.
290
00:26:57 --> 00:27:00
What I showed you at the
beginning, and then erased,
291
00:27:00 --> 00:27:04
basically justifies how to take
the derivative of an inverse
292
00:27:04 --> 00:27:06
function.
And for that you didn't need
293
00:27:06 --> 00:27:10
multivariable calculus.
But let's try to justify the
294
00:27:10 --> 00:27:12
product rule,
for example,
295
00:27:12 --> 00:27:21
for the derivative.
An application of this actually
296
00:27:21 --> 00:27:31
is to justify the product and
quotient rules.
297
00:27:31 --> 00:27:33
Let's think,
for example,
298
00:27:33 --> 00:27:39
of a function f of two variables,
u and v, that is just the
299
00:27:39 --> 00:27:44
product uv.
And let's say that u and v are
300
00:27:44 --> 00:27:48
actually functions of one
variable t.
301
00:27:48 --> 00:28:00
Then, well, d of uv over dt is
given by the chain rule applied
302
00:28:00 --> 00:28:04
to f.
This is df over dt.
303
00:28:04 --> 00:28:15
So df over dt should be f sub u
du over dt plus f sub v dv
304
00:28:15 --> 00:28:19
over dt.
But now what is the partial of
305
00:28:19 --> 00:28:23
f with respect to u?
It is v.
306
00:28:23 --> 00:28:31
That is v du over dt.
And partial of f with respect
307
00:28:31 --> 00:28:38
to v is going to be just u,
dv over dt.
308
00:28:38 --> 00:28:42
So you get back the usual
product rule.
309
00:28:42 --> 00:28:46
That is a slightly complicated
way of deriving it,
310
00:28:46 --> 00:28:50
but that is a valid way of
understanding how to take the
311
00:28:50 --> 00:28:54
derivative of a product by
thinking of the product first as
312
00:28:54 --> 00:28:57
a function of variables,
which are u and v.
313
00:28:57 --> 00:29:00
And then say,
oh, but u and v were actually
314
00:29:00 --> 00:29:03
functions of a variable t.
And then you do the
315
00:29:03 --> 00:29:08
differentiation in two stages
using the chain rule.
316
00:29:08 --> 00:29:16
Similarly, you can do the
quotient rule just for practice.
317
00:29:16 --> 00:29:21
If I give you the function g
equals u over v.
318
00:29:21 --> 00:29:25
Right now I am thinking of it
as a function of two variables,
319
00:29:25 --> 00:29:29
u and v.
U and v themselves are actually
320
00:29:29 --> 00:29:39
going to be functions of t.
Then, well, dg over dt is going
321
00:29:39 --> 00:29:44
to be partial g,
partial u.
322
00:29:44 --> 00:29:48
How much is that?
How much is partial g,
323
00:29:48 --> 00:29:53
partial u?
One over v times du over dt
324
00:29:53 --> 00:29:58
plus -- Well,
next we need to have partial g
325
00:29:58 --> 00:30:01
over partial v.
Well, what is the derivative of
326
00:30:01 --> 00:30:04
this with respect to v?
Here we need to know how to
327
00:30:04 --> 00:30:11
differentiate the inverse.
It is minus u over v squared
328
00:30:11 --> 00:30:20
times dv over dt.
And that is actually the usual
329
00:30:20 --> 00:30:28
quotient rule just written in a
slightly different way.
330
00:30:28 --> 00:30:30
I mean, just in case you really
want to see it,
331
00:30:30 --> 00:30:36
if you clear denominators for v
squared then you will see
332
00:30:36 --> 00:30:41
basically u prime times v minus
v prime times u.
333
00:30:41 --> 00:31:25
334
00:31:25 --> 00:31:32
Now let's go to something even
more crazy.
335
00:31:32 --> 00:31:45
I claim we can do chain rules
with more variables.
336
00:31:45 --> 00:31:50
Let's say that I have a
quantity.
337
00:31:50 --> 00:31:55
Let's call it w for now.
Let's say I have quantity w as
338
00:31:55 --> 00:31:58
a function of say variables x
and y.
339
00:31:58 --> 00:32:02
And so in the previous setup x
and y depended on some
340
00:32:02 --> 00:32:04
parameter t.
But, actually,
341
00:32:04 --> 00:32:07
let's now look at the case
where x and y themselves are
342
00:32:07 --> 00:32:10
functions of several variables.
Let's say of two more variables.
343
00:32:10 --> 00:32:25
Let's call them u and v.
I am going to stay with these
344
00:32:25 --> 00:32:27
abstract letters,
but if it bothers you,
345
00:32:27 --> 00:32:31
if it sounds completely
unmotivated think about it maybe
346
00:32:31 --> 00:32:33
in terms of something you might
know.
347
00:32:33 --> 00:32:36
Say, polar coordinates.
Let's say that I have a
348
00:32:36 --> 00:32:40
function that is defined in terms
of the polar coordinate
349
00:32:40 --> 00:32:43
variables r and theta.
And then I know I want to
350
00:32:43 --> 00:32:45
switch to usual coordinates x
and y.
351
00:32:45 --> 00:32:49
Or, the other way around,
I have a function of x and y
352
00:32:49 --> 00:32:53
and I want to express it in
terms of the polar coordinates r
353
00:32:53 --> 00:32:57
and theta.
Then I would want to know maybe
354
00:32:57 --> 00:33:02
how the derivatives,
with respect to the various
355
00:33:02 --> 00:33:07
sets of variables,
relate to each other.
356
00:33:07 --> 00:33:10
One way I could do it is,
of course,
357
00:33:10 --> 00:33:16
to say now if I plug the
formula for x and the formula
358
00:33:16 --> 00:33:23
for y into the formula for f
then w becomes a function of u
359
00:33:23 --> 00:33:27
and v,
and I can try to take partial
360
00:33:27 --> 00:33:29
derivatives.
If I have explicit formulas,
361
00:33:29 --> 00:33:32
well, that could work.
But maybe the formulas are
362
00:33:32 --> 00:33:35
complicated.
Typically, if I switch between
363
00:33:35 --> 00:33:37
rectangular and polar
coordinates,
364
00:33:37 --> 00:33:41
there might be inverse trig,
there might be maybe arctangent
365
00:33:41 --> 00:33:45
to express the polar angle in
terms of x and y.
366
00:33:45 --> 00:33:51
And when I don't really want to
actually substitute arctangents
367
00:33:51 --> 00:33:56
everywhere, maybe I would rather
deal with the derivatives.
368
00:33:56 --> 00:34:03
How do I do that?
The question is what are
369
00:34:03 --> 00:34:11
partial w over partial u and
partial w over partial v in
370
00:34:11 --> 00:34:17
terms of, let's see,
what do we need to know to
371
00:34:17 --> 00:34:22
understand that?
Well, probably we should know
372
00:34:22 --> 00:34:28
how w depends on x and y.
If we don't know that then we
373
00:34:28 --> 00:34:32
are probably toast.
Partial w over partial x,
374
00:34:32 --> 00:34:36
partial w over partial y should
be required.
375
00:34:36 --> 00:34:39
What else should we know?
Well, it would probably help to
376
00:34:39 --> 00:34:42
know how x and y depend on u and
v.
377
00:34:42 --> 00:34:46
If we don't know that then we
don't really know how to do it.
378
00:34:46 --> 00:34:55
We need also x sub u,
x sub v, y sub u,
379
00:34:55 --> 00:35:00
y sub v.
We have a lot of partials in
380
00:35:00 --> 00:35:07
there.
Well, let's see how we can do
381
00:35:07 --> 00:35:13
that.
Let's start by writing dw.
382
00:35:13 --> 00:35:19
We know that dw is partial f,
well, I don't know why I have
383
00:35:19 --> 00:35:25
two names, w and f.
I mean w and f are really the
384
00:35:25 --> 00:35:30
same thing here,
but let's say f sub x dx plus f
385
00:35:30 --> 00:35:35
sub y dy.
So far that is our new friend,
386
00:35:35 --> 00:35:39
the differential.
Now what do we want to do with
387
00:35:39 --> 00:35:42
it?
Well, we would like to get rid
388
00:35:42 --> 00:35:47
of dx and dy because we like to
express things in terms of,
389
00:35:47 --> 00:35:50
you know, the question we are
asking ourselves is let's say
390
00:35:50 --> 00:35:55
that I change u a little bit,
how does w change?
391
00:35:55 --> 00:35:58
Of course, what happens,
if I change u a little bit,
392
00:35:58 --> 00:36:01
is x and y will change.
How do they change?
393
00:36:01 --> 00:36:05
Well, that is given to me by
the differential.
394
00:36:05 --> 00:36:13
dx is going to be,
well, I can use the
395
00:36:13 --> 00:36:19
differential again.
Well, x is a function of u and
396
00:36:19 --> 00:36:24
v.
That will be x sub u times du
397
00:36:24 --> 00:36:28
plus x sub v times dv.
That is, again,
398
00:36:28 --> 00:36:31
taking the differential of a
function of two variables.
399
00:36:31 --> 00:36:37
Does that make sense?
And then we have the other guy,
400
00:36:37 --> 00:36:39
f sub y times,
what is dy?
401
00:36:39 --> 00:36:49
Well, similarly dy is y sub u
du plus y sub v dv.
402
00:36:49 --> 00:36:54
And now we have a relation
between dw and du and dv.
403
00:36:54 --> 00:37:00
We are expressing how w reacts
to changes in u and v,
404
00:37:00 --> 00:37:04
which was our goal.
Now, let's actually collect
405
00:37:04 --> 00:37:08
terms so that we see it a bit
better.
406
00:37:08 --> 00:37:19
It is going to be f sub x times
x sub u plus f sub y times y
407
00:37:19 --> 00:37:28
sub u du plus f sub x,
x sub v plus f sub y y sub v
408
00:37:28 --> 00:37:32
dv.
Now we have dw equals something
409
00:37:32 --> 00:37:38
du plus something dv.
Well, the coefficient here has
410
00:37:38 --> 00:37:44
to be partial f over partial u.
What else could it be?
411
00:37:44 --> 00:37:49
That's the rate of change of w
with respect to u if I forget
412
00:37:49 --> 00:37:54
what happens when I change v.
That is the definition of a
413
00:37:54 --> 00:37:58
partial.
Similarly, this one has to be
414
00:37:58 --> 00:38:04
partial f over partial v.
That is because it is the rate
415
00:38:04 --> 00:38:09
of change with respect to v,
if I keep u constant,
416
00:38:09 --> 00:38:13
so that these guys are
completely ignored.
417
00:38:13 --> 00:38:16
Now you see how the total
differential accounts for,
418
00:38:16 --> 00:38:21
somehow, all the partial
derivatives that come as
419
00:38:21 --> 00:38:27
coefficients of the individual
variables in these expressions.
420
00:38:27 --> 00:38:33
Let me maybe rewrite these
formulas in a more visible way
421
00:38:33 --> 00:38:40
and then re-explain them to you.
Here is the chain rule for this
422
00:38:40 --> 00:38:46
situation, with two intermediate
variables and two variables that
423
00:38:46 --> 00:38:50
you express these in terms of.
In our setting,
424
00:38:50 --> 00:38:56
we get partial f over partial u
equals partial f over partial x
425
00:38:56 --> 00:39:02
times partial x over partial u
plus partial f over partial y
426
00:39:02 --> 00:39:08
times partial y over partial u.
And the other one,
427
00:39:08 --> 00:39:15
the same thing with v instead
of u,
428
00:39:15 --> 00:39:22
partial f over partial x times
partial x over partial v plus
429
00:39:22 --> 00:39:28
partial f over partial y times partial
y over partial v.
430
00:39:28 --> 00:39:31
I have to explain various
things about these formulas
431
00:39:31 --> 00:39:34
because they look complicated.
And, actually,
432
00:39:34 --> 00:39:39
they are not that complicated.
A couple of things to know.
433
00:39:39 --> 00:39:42
The first thing,
how do we remember a formula
434
00:39:42 --> 00:39:44
like that?
Well, that is easy.
435
00:39:44 --> 00:39:47
We want to know how f depends
on u.
436
00:39:47 --> 00:39:51
Well, what does f depend on?
It depends on x and y.
437
00:39:51 --> 00:39:55
So we will put partial f over
partial x and partial f over
438
00:39:55 --> 00:39:59
partial y.
Now, x and y, why are they here?
439
00:39:59 --> 00:40:01
Well, they are here because
they actually depend on u as
440
00:40:01 --> 00:40:04
well.
How does x depend on u?
441
00:40:04 --> 00:40:06
Well, the answer is partial x
over partial u.
442
00:40:06 --> 00:40:10
How does y depend on u?
The answer is partial y over
443
00:40:10 --> 00:40:12
partial u.
See, the structure of this
444
00:40:12 --> 00:40:16
formula is simple.
To find the partial of f with
445
00:40:16 --> 00:40:20
respect to some new variable you
use the partials with respect to
446
00:40:20 --> 00:40:24
the variables that f was
initially defined in terms of x
447
00:40:24 --> 00:40:28
and y.
And you multiply them by the
448
00:40:28 --> 00:40:33
partials of x and y in terms of
the new variable that you want
449
00:40:33 --> 00:40:37
to look at, v here,
and you sum these things
450
00:40:37 --> 00:40:40
together.
That is the structure of the
451
00:40:40 --> 00:40:42
formula.
Why does it work?
452
00:40:42 --> 00:40:45
Well, let me explain it to you
in a slightly different
453
00:40:45 --> 00:40:49
language.
This asks us how does f change
454
00:40:49 --> 00:40:54
if I change u a little bit?
Well, why would f change if u
455
00:40:54 --> 00:40:57
changes a little bit?
Well, it would change because f
456
00:40:57 --> 00:41:00
actually depends on x and y and
x and y depend on u.
457
00:41:00 --> 00:41:03
If I change u,
how quickly does x change?
458
00:41:03 --> 00:41:06
Well, the answer is partial x
over partial u.
459
00:41:06 --> 00:41:09
And now, if I change x at this
rate, how does that cause f to
460
00:41:09 --> 00:41:13
change?
Well, the answer is partial f
461
00:41:13 --> 00:41:17
over partial x times this guy.
Well, at the same time,
462
00:41:17 --> 00:41:21
y is also changing.
How fast is y changing if I
463
00:41:21 --> 00:41:24
change u?
Well, at the rate of partial y
464
00:41:24 --> 00:41:27
over partial u.
But now if I change this how
465
00:41:27 --> 00:41:30
does f change?
Well, the rate of change is
466
00:41:30 --> 00:41:34
partial f over partial y.
The product is the effect of
467
00:41:34 --> 00:41:37
y changing when we change u,
and therefore
468
00:41:37 --> 00:41:40
changing f.
Now, what happens in real life,
469
00:41:40 --> 00:41:43
if I change u a little bit?
Well, both x and y change at
470
00:41:43 --> 00:41:46
the same time.
So how does f change?
471
00:41:46 --> 00:41:50
Well, it is the sum of the two
effects.
472
00:41:50 --> 00:41:54
Does that make sense?
Good.
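
The "sum of the two effects" argument can be checked numerically. The sketch below is not from the lecture: the functions f(x, y) = x^2 y, x = uv and y = u + v are hypothetical choices made only for illustration. It compares the chain rule value of partial f over partial u with a finite-difference estimate obtained by nudging u while holding v fixed.

```python
# Hypothetical example, not from the lecture:
# f(x, y) = x^2 * y, with x = u * v and y = u + v.

def f(x, y):
    return x * x * y

def x_of(u, v):
    return u * v

def y_of(u, v):
    return u + v

def chain_rule_df_du(u, v):
    """partial f / partial u via the chain rule: f_x * x_u + f_y * y_u."""
    x, y = x_of(u, v), y_of(u, v)
    f_x = 2 * x * y   # partial f / partial x
    f_y = x * x       # partial f / partial y
    x_u = v           # partial x / partial u
    y_u = 1.0         # partial y / partial u
    return f_x * x_u + f_y * y_u

def numeric_df_du(u, v, h=1e-6):
    """Central-difference estimate: change u a little, keep v constant."""
    def g(uu):
        return f(x_of(uu, v), y_of(uu, v))
    return (g(u + h) - g(u - h)) / (2 * h)

if __name__ == "__main__":
    u, v = 1.0, 2.0
    print("chain rule:       ", chain_rule_df_du(u, v))
    print("finite difference:", numeric_df_du(u, v))
```

At u = 1, v = 2 both values should agree closely. Substituting directly gives f = u^3 v^2 + u^2 v^3, whose partial with respect to u is 3u^2 v^2 + 2u v^3, and that matches the same number.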
473
00:41:54 --> 00:42:00
Of course, if f depends on more
variables then you just have
474
00:42:00 --> 00:42:02
more terms in here.
OK.
475
00:42:02 --> 00:42:05
Here is another thing that may
be a little bit confusing.
476
00:42:05 --> 00:42:09
What is tempting?
Well, what is tempting here
477
00:42:09 --> 00:42:12
would be to simplify these
formulas by removing these
478
00:42:12 --> 00:42:15
partial x's.
Let's simplify by partial x.
479
00:42:15 --> 00:42:18
Let's simplify by partial y.
We get partial f over partial u
480
00:42:18 --> 00:42:21
equals partial f over partial u
plus partial f over partial u.
481
00:42:21 --> 00:42:25
Something is not working
properly.
482
00:42:25 --> 00:42:28
Why doesn't it work?
The answer is precisely because
483
00:42:28 --> 00:42:32
these are partial derivatives.
These are not total derivatives.
484
00:42:32 --> 00:42:36
And so you cannot simplify them
in that way.
485
00:42:36 --> 00:42:39
And that is actually the reason
why we use this curly d rather
486
00:42:39 --> 00:42:41
than a straight d.
It is to remind us,
487
00:42:41 --> 00:42:44
beware, there are these
simplifications that we can do
488
00:42:44 --> 00:42:47
with straight d's that are not
legal here.
489
00:42:47 --> 00:42:52
Somehow, when you have a
partial derivative,
490
00:42:52 --> 00:42:57
you must resist the urge of
simplifying things.
491
00:42:57 --> 00:43:02
No simplifications in here.
That is the simplest formula
492
00:43:02 --> 00:43:10
you can get.
Any questions at this point?
493
00:43:10 --> 00:43:21
No.
Yes?
494
00:43:21 --> 00:43:23
When would you use this and
what does it describe?
495
00:43:23 --> 00:43:26
Well, it is basically when you
have a function given in terms
496
00:43:26 --> 00:43:29
of a certain set of variables
because maybe there is a simple
497
00:43:29 --> 00:43:31
expression in terms of those
variables.
498
00:43:31 --> 00:43:35
But ultimately what you care
about is not those variables,
499
00:43:35 --> 00:43:39
x and y, but another set of
variables, here u and v.
500
00:43:39 --> 00:43:42
So x and y are giving you a
nice formula for f,
501
00:43:42 --> 00:43:46
but actually the relevant
variables for your problem are u
502
00:43:46 --> 00:43:48
and v.
And you know x and y are
503
00:43:48 --> 00:43:50
related to u and v.
So, of course,
504
00:43:50 --> 00:43:53
what you could do is plug the
formulas the way that we did
505
00:43:53 --> 00:43:55
substituting.
But maybe that will give you
506
00:43:55 --> 00:43:59
very complicated expressions.
And maybe it is actually easier
507
00:43:59 --> 00:44:02
to just work with the derivatives.
The important claim here is
508
00:44:02 --> 00:44:05
basically we don't need to know
the actual formulas.
509
00:44:05 --> 00:44:07
All we need to know are the
rates of change.
510
00:44:07 --> 00:44:11
If we know all these rates of
change then we know how to take
511
00:44:11 --> 00:44:14
these derivatives without
actually having to plug in
512
00:44:14 --> 00:44:22
values.
Yes?
513
00:44:22 --> 00:44:25
Yes, you could certainly do the
same things in terms of t.
514
00:44:25 --> 00:44:29
If x and y were functions of t
instead of being functions of u
515
00:44:29 --> 00:44:31
and v then it would be the same
thing.
516
00:44:31 --> 00:44:34
And you would have the same
formulas that I had,
517
00:44:34 --> 00:44:37
well, over there I still have
it.
518
00:44:37 --> 00:44:39
Why does that one have straight
d's?
519
00:44:39 --> 00:44:42
Well, the answer is I could put
curly d's if I wanted,
520
00:44:42 --> 00:44:45
but I end up with a function of
a single variable.
521
00:44:45 --> 00:44:48
If you have a single variable
then the partial,
522
00:44:48 --> 00:44:50
with respect to that variable,
is the same thing as the usual
523
00:44:50 --> 00:44:53
derivative.
We don't actually need to worry
524
00:44:53 --> 00:44:57
about curly in that case.
But that one is indeed a special
525
00:44:57 --> 00:45:00
case of this one where instead
of x and y depending on two
526
00:45:00 --> 00:45:03
variables, u and v,
they depend on a single
527
00:45:03 --> 00:45:04
variable t.
Now, of course,
528
00:45:04 --> 00:45:06
you can call variables any name
you want.
529
00:45:06 --> 00:45:12
It doesn't matter.
This is just a slight
530
00:45:12 --> 00:45:16
generalization of that.
Well, not quite because here I
531
00:45:16 --> 00:45:18
also had a z.
See, I am trying to just
532
00:45:18 --> 00:45:21
confuse you by giving you
functions that depend on various
533
00:45:21 --> 00:45:25
numbers of variables.
If you have a function of 30
534
00:45:25 --> 00:45:28
variables, things work the same
way, just longer,
535
00:45:28 --> 00:45:33
and you are going to run out of
letters in the alphabet before
536
00:45:33 --> 00:45:38
the end.
Any other questions?
537
00:45:38 --> 00:45:43
No.
What?
538
00:45:43 --> 00:45:51
Yes?
If u and v themselves depended
539
00:45:51 --> 00:45:55
on another variable then you
would continue with your chain
540
00:45:55 --> 00:45:58
rules.
Maybe you would know to express
541
00:45:58 --> 00:46:02
partial x over partial u in
terms using that chain rule.
542
00:46:02 --> 00:46:05
Sorry.
If u and v are dependent on yet
543
00:46:05 --> 00:46:08
another variable then you could
get the derivative with respect
544
00:46:08 --> 00:46:11
to that using first the chain
rule to pass from u v to that
545
00:46:11 --> 00:46:14
new variable,
and then you would plug in
546
00:46:14 --> 00:46:17
these formulas for partials of f
with respect to u and v.
547
00:46:17 --> 00:46:19
In fact, if you have several
substitutions to do,
548
00:46:19 --> 00:46:21
you can always arrange to use
one chain rule at a time.
549
00:46:21 --> 00:46:25
You just have to do them in
sequence.
550
00:46:25 --> 00:46:28
That's why we don't actually
learn that, but you can just do
551
00:46:28 --> 00:46:32
it by repeating the process.
I mean, probably at that stage,
552
00:46:32 --> 00:46:35
the easiest way to not get confused
actually is to manipulate
553
00:46:35 --> 00:46:38
differentials because that is
probably easier.
554
00:46:38 --> 00:46:47
Yes?
Curly f does not exist.
555
00:46:47 --> 00:46:50
That's easy.
Curly f makes no sense by
556
00:46:50 --> 00:46:52
itself.
It doesn't exist alone.
557
00:46:52 --> 00:46:58
What exists is only curly df
over curly d some variable.
558
00:46:58 --> 00:47:02
And then that accounts only for
the rate of change with respect
559
00:47:02 --> 00:47:05
to that variable leaving the
others fixed,
560
00:47:05 --> 00:47:11
while straight df is somehow a
total variation of f.
561
00:47:11 --> 00:47:16
It accounts for all of the
partial derivatives and their
562
00:47:16 --> 00:47:25
combined effects.
OK. Any more questions? No.
563
00:47:25 --> 00:47:29
Let me just finish up very
quickly by telling you again one
564
00:47:29 --> 00:47:33
example where concretely you
might want to do this.
565
00:47:33 --> 00:47:40
You have a function that you
want to switch between
566
00:47:40 --> 00:47:45
rectangular and polar
coordinates.
567
00:47:45 --> 00:47:48
To make things a little bit
concrete.
568
00:47:48 --> 00:47:55
If you have polar coordinates
that means in the plane,
569
00:47:55 --> 00:48:00
instead of using x and y,
you will use coordinates r,
570
00:48:00 --> 00:48:05
distance to the origin,
and theta, the angle from the
571
00:48:05 --> 00:48:08
x-axis.
The change of variables for
572
00:48:08 --> 00:48:14
that is x equals r cosine theta
and y equals r sine theta.
573
00:48:14 --> 00:48:21
And so that means if you have a
function f that depends on x and
574
00:48:21 --> 00:48:29
y, in fact, you can plug these
in as a function of r and theta.
575
00:48:29 --> 00:48:34
Then you can ask yourself,
well, what is partial f over
576
00:48:34 --> 00:48:37
partial r?
And that is going to be,
577
00:48:37 --> 00:48:42
well, you want to take partial
f over partial x times partial x
578
00:48:42 --> 00:48:48
over partial r plus partial f over
partial y times partial y over
579
00:48:48 --> 00:48:53
partial r.
That will end up being actually
580
00:48:53 --> 00:48:59
f sub x times cosine theta plus
f sub y times sine theta.
581
00:48:59 --> 00:49:02
And you can do the same thing
to find partial f,
582
00:49:02 --> 00:49:05
partial theta.
And so you can express
583
00:49:05 --> 00:49:10
derivatives either in terms of
x, y or in terms of r and theta
584
00:49:10 --> 00:49:13
with simple relations between
them.
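
For reference, here is how the polar-coordinate case works out when the chain rule is applied with x = r cos(theta) and y = r sin(theta). The first line is the formula stated in the lecture. The second, for partial f over partial theta, is not spelled out in the transcript and is filled in here by the same rule:

```latex
\begin{align*}
\frac{\partial f}{\partial r}
  &= f_x \,\frac{\partial x}{\partial r} + f_y \,\frac{\partial y}{\partial r}
   = f_x \cos\theta + f_y \sin\theta, \\
\frac{\partial f}{\partial \theta}
  &= f_x \,\frac{\partial x}{\partial \theta} + f_y \,\frac{\partial y}{\partial \theta}
   = -r \sin\theta \, f_x + r \cos\theta \, f_y.
\end{align*}
```
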
585
00:49:13 --> 00:49:20
And the one last thing I should
say.
586
00:49:20 --> 00:49:23
On Thursday we will learn about
more tricks we can play with
587
00:49:23 --> 00:49:27
variations of functions.
And one that is important,
588
00:49:27 --> 00:49:29
because you need to know it
actually to do the p-set,
589
00:49:29 --> 00:49:38
is the gradient vector.
The gradient vector is simply a
590
00:49:38 --> 00:49:41
vector.
You use this downward pointing
591
00:49:41 --> 00:49:44
triangle as the notation for the
gradient.
592
00:49:44 --> 00:49:49
It is simply a vector whose
components are the partial
593
00:49:49 --> 00:49:53
derivatives of a function.
I mean, in a way,
594
00:49:53 --> 00:49:56
you can think of a differential
as a way to package partial
595
00:49:56 --> 00:49:59
derivatives together into some
weird object.
596
00:49:59 --> 00:50:01
Well, the gradient is also a
way to package partials
597
00:50:01 --> 00:50:04
together.
We will see on Thursday what it
598
00:50:04 --> 00:50:07
is good for, but some of the
problems on the p-set use it.
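
In symbols, the definition just given would read as below. Writing it for a function of three variables x, y, z is an assumption made here for illustration. For a function of two variables the last component is simply absent:

```latex
\[
\nabla f = \left\langle
  \frac{\partial f}{\partial x},\;
  \frac{\partial f}{\partial y},\;
  \frac{\partial f}{\partial z}
\right\rangle
\]
```
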
599
00:50:07 --> 00:50:09