1
00:00:00,060 --> 00:00:01,780
The following
content is provided
2
00:00:01,780 --> 00:00:04,019
under a Creative
Commons license.
3
00:00:04,019 --> 00:00:06,870
Your support will help MIT
OpenCourseWare continue
4
00:00:06,870 --> 00:00:10,730
to offer high quality
educational resources for free.
5
00:00:10,730 --> 00:00:13,330
To make a donation or
view additional materials
6
00:00:13,330 --> 00:00:17,217
from hundreds of MIT courses,
visit MIT OpenCourseWare
7
00:00:17,217 --> 00:00:17,842
at ocw.mit.edu.
8
00:00:20,870 --> 00:00:25,050
PROFESSOR: We established that,
essentially, what we want to do
9
00:00:25,050 --> 00:00:29,500
is to describe the
properties of a system that
10
00:00:29,500 --> 00:00:30,253
is in equilibrium.
11
00:00:33,340 --> 00:00:39,060
And a system in equilibrium
is characterized
12
00:00:39,060 --> 00:00:41,920
by a certain number
of parameters.
13
00:00:41,920 --> 00:00:48,090
We discussed
displacement and forces
14
00:00:48,090 --> 00:00:52,200
that are used for
mechanical properties.
15
00:00:52,200 --> 00:00:56,570
We described how when systems
are in thermal equilibrium,
16
00:00:56,570 --> 00:01:01,800
the exchange of heat requires
that there is temperature
17
00:01:01,800 --> 00:01:04,590
that will be the
same between them.
18
00:01:04,590 --> 00:01:07,000
So that was where the
Zeroth Law came and told us
19
00:01:07,000 --> 00:01:11,410
that there is another
function of state.
20
00:01:11,410 --> 00:01:15,040
Then, we saw that,
from the First Law,
21
00:01:15,040 --> 00:01:18,020
there was energy, which is
another important function
22
00:01:18,020 --> 00:01:19,180
of state.
23
00:01:19,180 --> 00:01:26,445
And from the Second Law,
we arrived at entropy.
24
00:01:29,920 --> 00:01:33,000
And then by
manipulating these, we
25
00:01:33,000 --> 00:01:37,100
generated a whole set of
other functions, free energy,
26
00:01:37,100 --> 00:01:41,830
enthalpy, Gibbs free energy, the
grand potential, and the list
27
00:01:41,830 --> 00:01:43,850
goes on.
28
00:01:43,850 --> 00:01:46,670
And when the system
is in equilibrium,
29
00:01:46,670 --> 00:01:49,920
it has well-defined
values of these quantities.
30
00:01:49,920 --> 00:01:52,670
You go from one equilibrium
to another equilibrium,
31
00:01:52,670 --> 00:01:54,740
and these quantities change.
32
00:01:54,740 --> 00:01:57,790
But of course, we saw that the
number of degrees of freedom
33
00:01:57,790 --> 00:02:00,700
that you have to
describe the system
34
00:02:00,700 --> 00:02:08,030
is indicated through looking
at the changes in energy, which
35
00:02:08,030 --> 00:02:10,740
if you were only
doing mechanical work,
36
00:02:10,740 --> 00:02:15,620
you would write as sum over all
possible ways of introducing
37
00:02:15,620 --> 00:02:17,755
mechanical work into the system.
38
00:02:20,296 --> 00:02:21,670
Then, we saw that
it was actually
39
00:02:21,670 --> 00:02:24,330
useful to separate
out the chemical work.
40
00:02:24,330 --> 00:02:26,410
So we could also
write this as a sum
41
00:02:26,410 --> 00:02:31,040
of an alpha chemical
potential number of particles.
42
00:02:31,040 --> 00:02:34,800
But there were also
ways of changing
43
00:02:34,800 --> 00:02:40,270
the energy of the system
through addition of heat.
44
00:02:40,270 --> 00:02:44,040
And so ultimately,
we saw that if there
45
00:02:44,040 --> 00:02:49,530
were n ways of doing
chemical and mechanical work,
46
00:02:49,530 --> 00:02:56,400
and one way of introducing heat
into the system, essentially
47
00:02:56,400 --> 00:03:00,440
n plus 1 variables are
sufficient to determine
48
00:03:00,440 --> 00:03:02,430
where you are in
this phase space.
49
00:03:02,430 --> 00:03:05,370
Once you have n
plus 1 of that list,
50
00:03:05,370 --> 00:03:08,990
you can, in
principle, determine others
51
00:03:08,990 --> 00:03:11,380
as long as you have
not chosen things
52
00:03:11,380 --> 00:03:13,470
that are really
dependent on each other.
53
00:03:13,470 --> 00:03:16,060
So you have to choose
independent ones,
54
00:03:16,060 --> 00:03:19,650
and we had some discussion
of how that comes into play.
55
00:03:22,190 --> 00:03:26,080
So I said that today
we will briefly
56
00:03:26,080 --> 00:03:29,140
conclude with the
last, or the Third Law.
57
00:03:33,580 --> 00:03:35,790
This is the statement
about trying
58
00:03:35,790 --> 00:03:39,440
to calculate the
behavior of entropy
59
00:03:39,440 --> 00:03:41,880
as a function of temperature.
60
00:03:41,880 --> 00:03:46,650
And in principle,
you can imagine
61
00:03:46,650 --> 00:03:51,545
as a function of some coordinate
of your system-- capital X
62
00:03:51,545 --> 00:03:55,520
could indicate pressure,
volume, anything.
63
00:03:55,520 --> 00:03:58,590
You calculate that at
some particular value
64
00:03:58,590 --> 00:04:03,620
of temperature, T, the
difference in entropy
65
00:04:03,620 --> 00:04:07,255
that you would have
between two points
66
00:04:07,255 --> 00:04:08,895
parametrized by X1 and X2.
67
00:04:11,590 --> 00:04:15,220
And in principle,
what you need to do
68
00:04:15,220 --> 00:04:21,149
is to find some kind of a
path for changing parameters
69
00:04:21,149 --> 00:04:28,340
from X1 to X2 and calculate,
in a reversible process, how
70
00:04:28,340 --> 00:04:31,370
much heat you have to
put into the system.
71
00:04:31,370 --> 00:04:35,955
Let's say at this fixed
temperature, T, divide by T. T
72
00:04:35,955 --> 00:04:43,350
is not changing along the
process from say X1 to X2.
73
00:04:43,350 --> 00:04:46,680
And this would be a
difference between the entropy
74
00:04:46,680 --> 00:04:51,490
that you would have
between these two
75
00:04:51,490 --> 00:04:56,540
quantities, between
these two points.
76
00:04:56,540 --> 00:04:59,690
You could, in principle,
then repeat this process
77
00:04:59,690 --> 00:05:06,540
at some lower temperature and
keep going all the way down
78
00:05:06,540 --> 00:05:09,530
to 0 temperature.
79
00:05:09,530 --> 00:05:20,120
What Nernst observed was that as
he went through this procedure
80
00:05:20,120 --> 00:05:28,250
to lower and lower temperatures,
this difference-- Let's
81
00:05:28,250 --> 00:05:35,840
call it delta S of T going
from X1 to X2 goes to 0.
82
00:05:42,320 --> 00:05:48,580
So it looks like, certainly
at this temperature,
83
00:05:48,580 --> 00:05:52,270
there is a change in entropy
going from one to another.
84
00:05:52,270 --> 00:05:53,680
There's also a change.
85
00:05:53,680 --> 00:05:56,530
This change gets
smaller and smaller
86
00:05:56,530 --> 00:05:59,330
as if, when you get
to 0 temperature,
87
00:05:59,330 --> 00:06:03,240
the value of your entropy is
independent of X. Whatever
88
00:06:03,240 --> 00:06:06,600
X you choose, you'll have
the same value of entropy.
89
00:06:09,390 --> 00:06:16,030
Now, that led to, after a while,
to a more ambitious version
90
00:06:16,030 --> 00:06:19,520
statement of the Third Law
that I will write down,
91
00:06:19,520 --> 00:06:31,515
which is that the entropy of
all substances at the zero
92
00:06:31,515 --> 00:06:44,680
of thermodynamic temperature is
the same and can be set to 0.
93
00:06:44,680 --> 00:06:52,500
Same universal
constant, set to 0.
94
00:06:56,200 --> 00:06:59,810
It's, in principle,
through these integrations
95
00:06:59,810 --> 00:07:01,940
from one point to another
point, the only thing
96
00:07:01,940 --> 00:07:07,520
that you can calculate is the
difference between entropies.
97
00:07:07,520 --> 00:07:10,920
And essentially, this
suggests that the difference
98
00:07:10,920 --> 00:07:14,610
between entropies goes to 0,
but let's be more ambitious
99
00:07:14,610 --> 00:07:18,000
and say that even if you
look at different substances
100
00:07:18,000 --> 00:07:26,240
and you go to 0 temperature,
all of them have a unique value.
101
00:07:26,240 --> 00:07:30,920
And so there's more
evidence for being
102
00:07:30,920 --> 00:07:35,830
able to do this for
different substances via what
103
00:07:35,830 --> 00:07:41,930
is called an allotropic state.
104
00:07:44,830 --> 00:07:50,450
So for example, some materials
can exist potentially
105
00:07:50,450 --> 00:07:53,440
in two different
crystalline states that
106
00:07:53,440 --> 00:07:58,390
are called allotropes,
for example, sulfur
107
00:07:58,390 --> 00:08:01,200
as a function of temperature.
108
00:08:01,200 --> 00:08:10,130
If you lower its
temperature very slowly,
109
00:08:10,130 --> 00:08:19,740
it stays in some form all
the way down to 0 temperature.
110
00:08:19,740 --> 00:08:24,480
So if you change its temperature
rapidly, it stays in one form
111
00:08:24,480 --> 00:08:29,760
all the way to 0 temperature
in crystalline structure
112
00:08:29,760 --> 00:08:32,559
that is called monoclinic.
113
00:08:32,559 --> 00:08:35,500
If you cool it
very, very slowly,
114
00:08:35,500 --> 00:08:41,110
there is a temperature
around 40 degrees Celsius
115
00:08:41,110 --> 00:08:45,860
at which it makes a transition
to a different crystal
116
00:08:45,860 --> 00:08:48,140
structure.
117
00:08:48,140 --> 00:08:49,390
That is rhombohedral.
118
00:08:53,300 --> 00:08:57,116
And the thing that
I am plotting here,
119
00:08:57,116 --> 00:08:59,240
as a function of temperature,
is the heat capacity.
120
00:09:06,460 --> 00:09:11,320
And so if you are, let's
say, around room temperature,
121
00:09:11,320 --> 00:09:16,910
in principle you can say there's
two different forms of sulfur.
122
00:09:16,910 --> 00:09:20,670
One of them is truly stable,
and the other is metastable.
123
00:09:20,670 --> 00:09:25,950
That is, in principle, if
you wait sufficiently long,
124
00:09:25,950 --> 00:09:27,700
of the order of
[? centuries ?],
125
00:09:27,700 --> 00:09:31,900
you can get the transition from
this form to the stable form.
126
00:09:31,900 --> 00:09:34,770
But for our purposes,
at room temperature,
127
00:09:34,770 --> 00:09:37,470
you would say that
at the scale of times
128
00:09:37,470 --> 00:09:40,260
that I'm observing things, there
are these 2 possible states
129
00:09:40,260 --> 00:09:45,560
that are both equilibrium
states of the same substance.
130
00:09:45,560 --> 00:09:47,650
Now using these two
equilibrium states,
131
00:09:47,650 --> 00:09:52,780
I can start to test this
Nernst theorem generalized
132
00:09:52,780 --> 00:09:54,910
to different substances.
133
00:09:54,910 --> 00:09:57,500
If you, again, regard
these two different things
134
00:09:57,500 --> 00:09:59,440
as different substances.
135
00:09:59,440 --> 00:10:03,580
You could say that if I want
to calculate the entropy just
136
00:10:03,580 --> 00:10:08,270
slightly above the transition,
I can come from two paths.
137
00:10:08,270 --> 00:10:12,610
I can either come
from path number one.
138
00:10:12,610 --> 00:10:15,930
Along path number
one, I would say
139
00:10:15,930 --> 00:10:22,200
that the entropy at
this Tc plus is obtained
140
00:10:22,200 --> 00:10:27,520
by integrating
the heat capacity,
141
00:10:27,520 --> 00:10:39,020
so integral dT Cx
of T divided by T.
142
00:10:39,020 --> 00:10:42,760
This combination is
none other than dQ.
143
00:10:42,760 --> 00:10:47,180
Basically, the combination
of heat capacity dT
144
00:10:47,180 --> 00:10:49,240
is the amount of
heat that you have
145
00:10:49,240 --> 00:10:52,040
to put into the substance to
change its temperature.
146
00:10:52,040 --> 00:10:58,100
And you do this all the
way from 0 to Tc plus.
147
00:10:58,100 --> 00:11:00,130
Let's say we go
along this path that
148
00:11:00,130 --> 00:11:05,290
corresponds to this
monoclinic way.
149
00:11:05,290 --> 00:11:11,260
And I'm using this Cm that
corresponds to this as opposed
150
00:11:11,260 --> 00:11:15,400
to Cr that corresponds to this.
151
00:11:15,400 --> 00:11:19,850
Another thing that I can
do-- and I made a mistake
152
00:11:19,850 --> 00:11:24,050
because what I really need
to do is to, in principle,
153
00:11:24,050 --> 00:11:27,710
add to this some
entropy that I would
154
00:11:27,710 --> 00:11:32,140
assign to this green state at 0
because this is the difference.
155
00:11:32,140 --> 00:11:36,270
So this is the
entropy that I would
156
00:11:36,270 --> 00:11:41,080
assign to the monoclinic
state at T close to 0.
157
00:11:41,080 --> 00:11:44,740
Going along the
orange path, I would
158
00:11:44,740 --> 00:11:49,820
say that S evaluated
at Tc plus is
159
00:11:49,820 --> 00:11:54,300
obtained by integrating from 0.
160
00:11:54,300 --> 00:11:59,110
Let's say to Tc
minus dT, the heat
161
00:11:59,110 --> 00:12:01,730
capacity of this rhombic phase.
162
00:12:06,580 --> 00:12:10,160
But when I get to just
below the transition,
163
00:12:10,160 --> 00:12:13,650
I want to go to just
above the transition.
164
00:12:13,650 --> 00:12:18,080
I have to actually put in
a certain amount of latent heat.
165
00:12:18,080 --> 00:12:25,130
So here I have to add
latent heat L, always
166
00:12:25,130 --> 00:12:29,120
at the temperature Tc, to
gradually make the substance
167
00:12:29,120 --> 00:12:31,650
transition from
one to the other.
168
00:12:31,650 --> 00:12:35,470
So I have to add here L of Tc.
169
00:12:35,470 --> 00:12:40,650
This would be the
integration of dQ,
170
00:12:40,650 --> 00:12:46,160
but then I would have to add
the entropy that I would assign
171
00:12:46,160 --> 00:12:49,910
to the orange state
at 0 temperature.
172
00:12:49,910 --> 00:12:53,040
So this is something that
you can do experimentally.
173
00:12:53,040 --> 00:12:55,930
You can evaluate these
integrals, and what you'll find
174
00:12:55,930 --> 00:12:59,970
is that these two
things are the same.
175
00:12:59,970 --> 00:13:03,250
So this is yet
another justification
176
00:13:03,250 --> 00:13:09,210
of this entropy
being independent
177
00:13:09,210 --> 00:13:13,000
of where you start
at 0 temperature.
178
00:13:13,000 --> 00:13:16,030
Again at this
point, if you like,
179
00:13:16,030 --> 00:13:18,100
you can by [INAUDIBLE]
state that this
180
00:13:18,100 --> 00:13:21,215
is 0 for everything
will start with 0.
181
00:13:27,730 --> 00:13:35,130
So this is a supposed new
law of thermodynamics.
182
00:13:35,130 --> 00:13:36,000
Is it useful?
183
00:13:36,000 --> 00:13:38,770
What can we deduce from that?
184
00:13:38,770 --> 00:13:40,830
So let's look at
the consequences.
185
00:13:46,280 --> 00:13:48,740
First thing is so what
I have established
186
00:13:48,740 --> 00:13:54,890
is that the limit
as T goes to 0 of S,
187
00:13:54,890 --> 00:13:57,440
irrespective of whatever
set of parameters
188
00:13:57,440 --> 00:14:02,920
I have-- so I pick T as one
of my n plus one coordinates,
189
00:14:02,920 --> 00:14:06,490
and I put some other
bunch of coordinates here.
190
00:14:06,490 --> 00:14:08,820
I take the limit
of this going to 0.
191
00:14:08,820 --> 00:14:09,870
This becomes 0.
192
00:14:13,410 --> 00:14:17,790
So that means, almost
by construction,
193
00:14:17,790 --> 00:14:24,540
that if I take the derivative of
S with respect to any of these
194
00:14:24,540 --> 00:14:30,650
coordinates-- if I take then
the limit as T goes to 0,
195
00:14:30,650 --> 00:14:33,790
this would be
fixed T. This is 0.
196
00:14:39,690 --> 00:14:40,190
Fine.
197
00:14:40,190 --> 00:14:44,220
So basically, this
is another way
198
00:14:44,220 --> 00:14:49,290
of stating that entropy
differences go to 0.
199
00:14:49,290 --> 00:14:52,890
But it does have a consequence
because one thing that you will
200
00:14:52,890 --> 00:14:55,797
frequently measure
are quantities,
201
00:14:55,797 --> 00:14:56,630
such as expansivity.
202
00:15:01,160 --> 00:15:03,700
What do I mean by that?
203
00:15:03,700 --> 00:15:05,050
Let's pick a displacement.
204
00:15:05,050 --> 00:15:07,650
Could be the length of a wire.
205
00:15:07,650 --> 00:15:11,210
Could be the volume of a gas.
206
00:15:11,210 --> 00:15:17,860
And we can ask if I were
to change temperature,
207
00:15:17,860 --> 00:15:21,030
how does that quantity change?
208
00:15:21,030 --> 00:15:26,230
So these are quantities
typically called alpha.
209
00:15:26,230 --> 00:15:29,020
Actually, usually
you would also divide
210
00:15:29,020 --> 00:15:35,115
by x to make them intensive
because otherwise x
211
00:15:35,115 --> 00:15:36,910
being extensive,
the whole quantity
212
00:15:36,910 --> 00:15:39,960
would have been extensive.
213
00:15:39,960 --> 00:15:44,840
Let's say we do this at fixed
corresponding force.
214
00:15:44,840 --> 00:15:48,430
So something that
is very relevant
215
00:15:48,430 --> 00:15:51,550
is you take the volume of
a gas, you change temperature
216
00:15:51,550 --> 00:15:54,350
at fixed pressure, and
the volume of the gas
217
00:15:54,350 --> 00:15:59,070
will shrink or expand
according to this expansivity.
218
00:15:59,070 --> 00:16:04,970
Now, this can be related to this
through a Maxwell relation.
219
00:16:04,970 --> 00:16:07,620
So let's see what I have to do.
220
00:16:07,620 --> 00:16:13,520
I have that dE is
something like Jdx plus,
221
00:16:13,520 --> 00:16:17,990
according to what I
have over there, TdS.
222
00:16:17,990 --> 00:16:21,940
I want to be able to write
a Maxwell relation that
223
00:16:21,940 --> 00:16:25,420
relates a derivative of x.
224
00:16:25,420 --> 00:16:28,510
So I want to make x
into a first derivative.
225
00:16:28,510 --> 00:16:31,640
So I look at E minus Jx.
226
00:16:31,640 --> 00:16:36,240
And this Jdx becomes minus xdJ.
227
00:16:36,240 --> 00:16:40,070
But I want to take a derivative
of x with respect not to
228
00:16:40,070 --> 00:16:44,590
S, but with respect
to T. So I'll do that.
229
00:16:44,590 --> 00:16:48,130
This becomes a minus SdT.
230
00:16:48,130 --> 00:16:52,070
So now, I immediately see
that I will have a Maxwell
231
00:16:52,070 --> 00:16:58,110
relation that says dx
by dT at constant J
232
00:16:58,110 --> 00:17:02,850
is the same thing as
dS by dJ at constant T.
233
00:17:02,850 --> 00:17:09,321
So this is the same thing by
the Maxwell relation as dS
234
00:17:09,321 --> 00:17:23,000
by dJ at constant T. All right?
235
00:17:23,000 --> 00:17:27,599
This is one of these quantities,
therefore, as T goes to 0,
236
00:17:27,599 --> 00:17:28,590
this goes to 0.
237
00:17:28,590 --> 00:17:33,370
And therefore, the
expansivity should go to 0.
238
00:17:33,370 --> 00:17:38,700
So any quantity that measures
expansion, contraction,
239
00:17:38,700 --> 00:17:42,070
or some other change as a
function of temperature,
240
00:17:42,070 --> 00:17:45,230
according to this law, as
you go to 0 temperature,
241
00:17:45,230 --> 00:17:46,350
should go to 0.
242
00:17:49,780 --> 00:17:54,100
There's one other quantity
that also goes to 0,
243
00:17:54,100 --> 00:17:57,810
and that's the heat capacity.
244
00:17:57,810 --> 00:18:04,160
So if I want to calculate the
difference between entropy
245
00:18:04,160 --> 00:18:10,820
at some temperature T
and some temperature
246
00:18:10,820 --> 00:18:15,570
at 0 along some particular
path corresponding
247
00:18:15,570 --> 00:18:18,810
to some constant
x for example, you
248
00:18:18,810 --> 00:18:20,640
would say that what
I need to do is
249
00:18:20,640 --> 00:18:27,310
to integrate from 0 to
T the heat that I have
250
00:18:27,310 --> 00:18:31,370
to put into the
system at constant x.
251
00:18:31,370 --> 00:18:34,700
And so if I do
that slowly enough,
252
00:18:34,700 --> 00:18:38,720
this heat I can write as CxdT.
253
00:18:38,720 --> 00:18:41,370
Cx, potentially,
is a function of T.
254
00:18:41,370 --> 00:18:44,750
Actually, since I'm indicating
T as the other point
255
00:18:44,750 --> 00:18:48,710
of integration, let me call
the variable of integration T
256
00:18:48,710 --> 00:18:50,610
prime.
257
00:18:50,610 --> 00:18:56,620
So I take a path in which
I change temperature.
258
00:18:56,620 --> 00:19:00,320
I calculate the heat
capacity at constant x.
259
00:19:00,320 --> 00:19:02,560
Integrate it.
260
00:19:02,560 --> 00:19:06,434
Multiply by dT to convert
it to T, and get the result.
261
00:19:09,220 --> 00:19:13,380
So all of these results that
we have been formulating
262
00:19:13,380 --> 00:19:18,630
suggest that the result that you
would get as a function of T,
263
00:19:18,630 --> 00:19:27,600
for entropy, is something that
as T goes to 0, approaches 0.
264
00:19:27,600 --> 00:19:31,640
So it should be a perfectly
nice, well-defined value
265
00:19:31,640 --> 00:19:35,210
at any finite temperature.
266
00:19:35,210 --> 00:19:38,170
Now, if you integrate
a constant divided
267
00:19:38,170 --> 00:19:42,420
by T, dT, then
essentially the constant
268
00:19:42,420 --> 00:19:44,020
would give you a logarithm.
269
00:19:44,020 --> 00:19:48,100
And the logarithm would blow
up as we go to 0 temperature.
270
00:19:48,100 --> 00:19:52,240
So the only way that
this integral does not
271
00:19:52,240 --> 00:20:01,590
blow up on you-- so this is
finite only if the limit as T
272
00:20:01,590 --> 00:20:08,390
goes to 0 of the heat
capacities should also go to 0.
273
00:20:08,390 --> 00:20:12,550
So any heat capacity should
also essentially vanish
274
00:20:12,550 --> 00:20:14,890
as you go to lower
and lower temperature.
275
00:20:14,890 --> 00:20:17,940
This is something that you
will see many, many times when
276
00:20:17,940 --> 00:20:20,880
you look at different
heat capacities
277
00:20:20,880 --> 00:20:22,490
in the rest of the course.
278
00:20:25,030 --> 00:20:27,150
There is one other
aspect of this
279
00:20:27,150 --> 00:20:30,900
that I will not really explain,
but you can go and look
280
00:20:30,900 --> 00:20:36,410
at the notes or elsewhere, which
is that another consequence is
281
00:20:36,410 --> 00:20:47,222
unattainability of T equals
to 0 by any finite set
282
00:20:47,222 --> 00:20:47,805
of operations.
283
00:20:55,900 --> 00:21:00,130
Essentially, if you want
to get to 0 temperature,
284
00:21:00,130 --> 00:21:03,840
you'll have to do something
that cools you step by step.
285
00:21:03,840 --> 00:21:05,880
And the steps become
smaller and smaller,
286
00:21:05,880 --> 00:21:08,680
and you have to repeat
that many times.
287
00:21:08,680 --> 00:21:13,290
But that is another consequence.
288
00:21:13,290 --> 00:21:15,055
We'll leave that
for the time being.
289
00:21:18,780 --> 00:21:25,430
I would like to, however, end
by discussing some distinctions
290
00:21:25,430 --> 00:21:28,350
that are between
these different laws.
291
00:21:31,390 --> 00:21:36,140
So if you think
about whatever could
292
00:21:36,140 --> 00:21:42,210
be the microscopic
origin, after all, I
293
00:21:42,210 --> 00:21:46,240
have emphasized
that thermodynamics
294
00:21:46,240 --> 00:21:49,960
is a set of rules that you look
at substances as black boxes
295
00:21:49,960 --> 00:21:53,550
and you try to deduce a
certain number of things
296
00:21:53,550 --> 00:21:58,210
based on observations, such
as what Nernst did over here.
297
00:21:58,210 --> 00:22:00,730
But you say, these
black boxes, I
298
00:22:00,730 --> 00:22:03,750
know what is inside
them in principle.
299
00:22:03,750 --> 00:22:07,820
It's composed of atoms,
molecules, light, quarks,
300
00:22:07,820 --> 00:22:09,670
whatever the
microscope theory is
301
00:22:09,670 --> 00:22:14,380
that you want to assign to
the components of that box.
302
00:22:14,380 --> 00:22:16,530
And I know the
dynamics that governs
303
00:22:16,530 --> 00:22:18,770
these microscopic
degrees of freedom.
304
00:22:18,770 --> 00:22:22,740
I should be able to get
the laws of thermodynamics
305
00:22:22,740 --> 00:22:27,000
starting from the
microscopic laws.
306
00:22:27,000 --> 00:22:30,170
Eventually, we will do
that, and as we do that,
307
00:22:30,170 --> 00:22:34,560
we will find the origin
of these different laws.
308
00:22:34,560 --> 00:22:39,670
Now, you won't be surprised
that the First Law is intimately
309
00:22:39,670 --> 00:22:44,390
connected to the fact that
any microscopic set of rules
310
00:22:44,390 --> 00:22:48,120
that you write down embodies
the conservation of energy.
311
00:22:52,570 --> 00:22:56,780
And all you have to make
sure is to understand
312
00:22:56,780 --> 00:23:01,750
precisely what heat is
as a form of energy.
313
00:23:01,750 --> 00:23:07,060
And then if we regard heat
as another form of energy,
314
00:23:07,060 --> 00:23:10,250
another component, it's
really the conservation law
315
00:23:10,250 --> 00:23:10,870
that we have.
316
00:23:14,630 --> 00:23:17,190
Then, you have the Zeroth
Law and the Second Law.
317
00:23:21,650 --> 00:23:25,530
The Zeroth Law and Second Law
have to do with equilibrium
318
00:23:25,530 --> 00:23:28,980
and being able to go in
some particular direction.
319
00:23:28,980 --> 00:23:35,170
And that always runs afoul of
the microscopic laws of motion
320
00:23:35,170 --> 00:23:39,370
that are typically things
that are time reversible,
321
00:23:39,370 --> 00:23:42,360
whereas the Zeroth Law and
Second Law are not.
322
00:23:42,360 --> 00:23:46,390
And what we will see later on,
through statistical mechanics,
323
00:23:46,390 --> 00:23:49,630
is that the origin
of these laws is
324
00:23:49,630 --> 00:24:00,180
that we are dealing with large
numbers of degrees of freedom.
325
00:24:03,760 --> 00:24:09,580
And once we adopt the
proper perspective
326
00:24:09,580 --> 00:24:12,960
to looking at properties
of large numbers of degrees
327
00:24:12,960 --> 00:24:15,330
of freedom, which
we will start
328
00:24:15,330 --> 00:24:19,000
to do the elements of that
[? prescription ?] today,
329
00:24:19,000 --> 00:24:20,870
the Zeroth Law and
Second Law emerge.
330
00:24:23,990 --> 00:24:32,790
Now the Third Law, you
all know that once we
331
00:24:32,790 --> 00:24:37,590
go through this process,
eventually for example,
332
00:24:37,590 --> 00:24:41,560
we get things for the
description of entropy, which
333
00:24:41,560 --> 00:24:44,970
is related to some
number of states
334
00:24:44,970 --> 00:24:48,740
that the system
has indicated by g.
335
00:24:48,740 --> 00:24:55,880
And if you then want to
have S going to 0,
336
00:24:55,880 --> 00:24:59,690
you would require that
g goes to something
337
00:24:59,690 --> 00:25:04,600
that is order of 1-- of 1 if
you like-- as T goes to 0.
338
00:25:08,120 --> 00:25:10,520
And typically, you
would say that systems
339
00:25:10,520 --> 00:25:16,890
adopt their ground state, lowest
energy state, at 0 temperature.
340
00:25:16,890 --> 00:25:19,340
And so this is
somewhat a statement
341
00:25:19,340 --> 00:25:23,470
about the uniqueness of the
state of all possible systems
342
00:25:23,470 --> 00:25:26,120
at low temperature.
343
00:25:26,120 --> 00:25:31,420
Now, if you think about
the gas in this room,
344
00:25:31,420 --> 00:25:36,520
and let's imagine that
the particles of this gas
345
00:25:36,520 --> 00:25:40,830
either don't interact, which is
maybe a little bit unrealistic,
346
00:25:40,830 --> 00:25:42,640
but maybe repel each other.
347
00:25:42,640 --> 00:25:44,810
So let's say you have
a bunch of particles
348
00:25:44,810 --> 00:25:46,910
that just repel each other.
349
00:25:46,910 --> 00:25:51,180
Then, there is
really no reason why,
350
00:25:51,180 --> 00:25:54,850
as I go to lower and
lower temperatures,
351
00:25:54,850 --> 00:25:59,930
the number of configurations of
the molecules should decrease.
352
00:25:59,930 --> 00:26:03,180
All configurations that I
draw where they don't overlap
353
00:26:03,180 --> 00:26:05,900
have roughly the same energy.
354
00:26:05,900 --> 00:26:11,250
And indeed, if I look at say
any one of these properties,
355
00:26:11,250 --> 00:26:15,330
like the expansivity of a
gas at constant pressure
356
00:26:15,330 --> 00:26:19,230
which is given in fact
with a minus sign.
357
00:26:19,230 --> 00:26:23,850
dV by dT at constant pressure
would be the analog of one
358
00:26:23,850 --> 00:26:25,820
of these expansivities.
359
00:26:25,820 --> 00:26:32,290
If I use the Ideal Gas
Law-- So for ideal gas,
360
00:26:32,290 --> 00:26:35,530
we've seen that PV is
proportional to let's
361
00:26:35,530 --> 00:26:38,480
say some temperature.
362
00:26:38,480 --> 00:26:42,290
Then, dV by dT at
constant pressure
363
00:26:42,290 --> 00:26:46,390
is none other than V over
T. So this would give me
364
00:26:46,390 --> 00:26:59,270
1 over V, V over T. Probably
don't need it on this.
365
00:26:59,270 --> 00:27:03,000
This is going to
give me 1 over T.
366
00:27:03,000 --> 00:27:07,410
So not only doesn't it
go to 0 at 0 temperature,
367
00:27:07,410 --> 00:27:10,960
if the Ideal Gas
Law was satisfied,
368
00:27:10,960 --> 00:27:13,140
the expansivity would
actually diverge
369
00:27:13,140 --> 00:27:17,590
at 0 temperature as
different as you want.
370
00:27:17,590 --> 00:27:21,760
So clearly the Ideal Gas
Law, if it was applicable
371
00:27:21,760 --> 00:27:24,210
all the way down
to 0 temperature,
372
00:27:24,210 --> 00:27:27,330
would violate the Third
Law of thermodynamics.
373
00:27:27,330 --> 00:27:30,560
Again, not surprising
given that I have told you
374
00:27:30,560 --> 00:27:33,770
that a gas of classical
particles with repulsion
375
00:27:33,770 --> 00:27:36,880
has many states.
376
00:27:36,880 --> 00:27:39,850
Now, we will see
later on in the course
377
00:27:39,850 --> 00:27:48,220
that once we include quantum
mechanics, then as you
378
00:27:48,220 --> 00:27:50,910
go to 0 temperature,
these particles
379
00:27:50,910 --> 00:27:54,472
will have a unique state.
380
00:27:54,472 --> 00:27:59,177
If they are bosons, they will be
together in one wave function.
381
00:27:59,177 --> 00:28:01,260
If they are fermions, they
will arrange themselves
382
00:28:01,260 --> 00:28:05,190
appropriately so that,
because of quantum mechanics,
383
00:28:05,190 --> 00:28:08,530
all of these laws would
certainly break down at T equals
384
00:28:08,530 --> 00:28:09,750
to 0.
385
00:28:09,750 --> 00:28:13,080
You will get 0 entropy, and
you would get consistency
386
00:28:13,080 --> 00:28:16,010
with all of these things.
387
00:28:16,010 --> 00:28:18,650
So somehow, the nature
of the Third Law
388
00:28:18,650 --> 00:28:22,470
is different from the other
laws because its validity
389
00:28:22,470 --> 00:28:28,365
rests on being able to
be living in a world
390
00:28:28,365 --> 00:28:31,010
where quantum mechanics applies.
391
00:28:31,010 --> 00:28:34,350
So in principle, you could have
imagined some other universe
392
00:28:34,350 --> 00:28:36,700
where h-bar equals
to 0, and then
393
00:28:36,700 --> 00:28:39,580
the Third Law of thermodynamics
would not hold there
394
00:28:39,580 --> 00:28:41,990
whereas the Zeroth Law
and Second Law would hold.
395
00:28:41,990 --> 00:28:42,839
Yes?
396
00:28:42,839 --> 00:28:45,616
AUDIENCE: Are there any known
exceptions to the Third Law?
397
00:28:45,616 --> 00:28:47,240
Are we going to
[? account for them? ?]
398
00:28:52,330 --> 00:28:55,160
PROFESSOR: For
equilibrium-- So this
399
00:28:55,160 --> 00:28:59,010
is actually an
interesting question.
400
00:28:59,010 --> 00:29:03,050
What do I know
about-- classically,
401
00:29:03,050 --> 00:29:07,550
I can certainly come up with
lots of examples that violate.
402
00:29:07,550 --> 00:29:11,320
So your question then amounts to, if
I say that quantum mechanics is
403
00:29:11,320 --> 00:29:15,130
necessary, do I
know that the ground
404
00:29:15,130 --> 00:29:18,790
state of a quantum
mechanical system is unique.
405
00:29:18,790 --> 00:29:22,830
And I don't know of a proof of
that for an interacting system.
406
00:29:22,830 --> 00:29:27,380
I don't know of a case where it's
violated, but as far as I know,
407
00:29:27,380 --> 00:29:32,240
there is no proof that I give
you an interacting Hamiltonian
408
00:29:32,240 --> 00:29:36,550
for a quantum system, and
there's a unique ground state.
409
00:29:36,550 --> 00:29:39,110
And I should say
that there'd be no--
410
00:29:39,110 --> 00:29:42,350
and I'm sure you know of cases
where the ground state is not
411
00:29:42,350 --> 00:29:44,880
unique like a ferromagnet.
412
00:29:44,880 --> 00:29:50,040
But the point is not that
g should be exactly one,
413
00:29:50,040 --> 00:29:54,460
but that the limit
of log g divided
414
00:29:54,460 --> 00:29:56,460
by the number of
degrees of freedom
415
00:29:56,460 --> 00:30:02,280
that you have should go to
0 as n goes to infinity.
416
00:30:02,280 --> 00:30:07,620
So something like a ferromagnet
may have many ground states,
417
00:30:07,620 --> 00:30:09,590
but the number of
ground states is not
418
00:30:09,590 --> 00:30:13,440
proportional to the number of
sites, the number of spins,
419
00:30:13,440 --> 00:30:15,770
and this entity will go to 0.
420
00:30:15,770 --> 00:30:19,330
So all the cases that we
know, the ground state
421
00:30:19,330 --> 00:30:24,100
is either unique or has
a degeneracy of order one.
422
00:30:24,100 --> 00:30:27,300
But I don't know a theorem that
says that should be the case.
423
00:30:34,600 --> 00:30:36,550
So this is the last
thing that I wanted
424
00:30:36,550 --> 00:30:39,100
to say about thermodynamics.
425
00:30:39,100 --> 00:30:40,910
Are there any
questions in general?
426
00:30:45,850 --> 00:30:50,170
So I laid out the
necessity of having
427
00:30:50,170 --> 00:30:56,240
some kind of a description of
microscopic degrees of freedom
428
00:30:56,240 --> 00:31:00,680
that ultimately will
allow us to prove
429
00:31:00,680 --> 00:31:03,140
the laws of thermodynamics.
430
00:31:03,140 --> 00:31:08,570
And that will come through
statistical mechanics, which
431
00:31:08,570 --> 00:31:12,410
as the name implies, has
to have certain amount
432
00:31:12,410 --> 00:31:16,010
of statistical character to it.
433
00:31:16,010 --> 00:31:18,230
What does that mean?
434
00:31:18,230 --> 00:31:22,240
It means that you have to
abandon a description of motion
435
00:31:22,240 --> 00:31:26,360
that is fully
deterministic for one
436
00:31:26,360 --> 00:31:27,725
that is based on probability.
437
00:31:30,860 --> 00:31:34,760
Now, I could have told you
first the degrees of freedom
438
00:31:34,760 --> 00:31:37,090
and what is the
description that we
439
00:31:37,090 --> 00:31:40,610
need for them to
be probabilistic,
440
00:31:40,610 --> 00:31:43,740
but I find it more
useful to first lay out
441
00:31:43,740 --> 00:31:45,720
what the language
of probability is
442
00:31:45,720 --> 00:31:48,915
that we will be
using and then bring
443
00:31:48,915 --> 00:31:52,200
in the description of the
microscopic degrees of freedom
444
00:31:52,200 --> 00:31:55,690
within this language.
445
00:31:55,690 --> 00:32:06,590
So if we go first
with definitions--
446
00:32:06,590 --> 00:32:12,610
and you could, for example, go
to the branch of mathematics
447
00:32:12,610 --> 00:32:16,290
that deals with probability,
and you will encounter something
448
00:32:16,290 --> 00:32:21,970
like this that what probability
describes is a random variable.
449
00:32:27,050 --> 00:32:36,470
Let's call it X, which has a
number of possible outcomes,
450
00:32:36,470 --> 00:32:49,740
which we put together
into a set of outcomes, S.
451
00:32:49,740 --> 00:33:00,410
And this set can be discrete as
would be the case if you were
452
00:33:00,410 --> 00:33:05,030
tossing a coin, and
the outcomes would
453
00:33:05,030 --> 00:33:11,680
be either a head or a tail,
or we were throwing a dice,
454
00:33:11,680 --> 00:33:15,215
and the outcomes would
be the faces 1 through 6.
455
00:33:19,050 --> 00:33:23,180
And we will encounter
mostly actually cases
456
00:33:23,180 --> 00:33:25,150
where S is continuous.
457
00:33:28,610 --> 00:33:34,300
Like for example, if I want to
describe the velocity of a gas
458
00:33:34,300 --> 00:33:39,950
particle in this room, I need
to specify the three components
459
00:33:39,950 --> 00:33:44,530
of velocity that can
be anywhere, let's say,
460
00:33:44,530 --> 00:33:46,210
in the range of real numbers.
461
00:33:51,130 --> 00:34:10,929
And again, mathematicians
would say that to each event,
462
00:34:10,929 --> 00:34:20,080
which is a subset of
possible outcomes,
463
00:34:20,080 --> 00:34:30,480
is assigned a value which
must satisfy the following
464
00:34:30,480 --> 00:34:30,980
properties.
465
00:34:35,360 --> 00:34:41,210
First thing is the
probability of anything
466
00:34:41,210 --> 00:34:44,440
is a positive number.
467
00:34:44,440 --> 00:34:46,763
And so this is positivity.
468
00:34:55,100 --> 00:34:57,550
The second thing is additivity.
469
00:35:01,360 --> 00:35:08,180
That is the probability
of two events, A or B,
470
00:35:08,180 --> 00:35:13,360
is the sum total of
the probabilities
471
00:35:13,360 --> 00:35:17,395
if A and B are
disjoint or distinct.
472
00:35:23,389 --> 00:35:24,930
And finally, there's
a normalization.
473
00:35:31,110 --> 00:35:33,980
That if your event is
that something should happen
474
00:35:33,980 --> 00:35:38,888
the entire set, the probability
that you assign to that is 1.
475
00:35:42,310 --> 00:35:45,100
So these are formal statements.
476
00:35:45,100 --> 00:35:49,200
And if you are a mathematician,
you start from there,
477
00:35:49,200 --> 00:35:52,020
and you prove theorems.
478
00:35:52,020 --> 00:35:59,900
But from our perspective,
the first question to ask
479
00:35:59,900 --> 00:36:08,960
is how to determine this
quantity probability
480
00:36:08,960 --> 00:36:12,070
that something should happen.
481
00:36:12,070 --> 00:36:18,670
If it is useful and I want to do
something real world about it,
482
00:36:18,670 --> 00:36:23,340
I should be able to measure
it or assign values to it.
483
00:36:23,340 --> 00:36:28,290
And very roughly
again, in theory,
484
00:36:28,290 --> 00:36:34,240
we can assign probabilities
in two different ways.
485
00:36:34,240 --> 00:36:36,940
One way is called objective.
486
00:36:41,010 --> 00:36:45,300
And from the perspective
of us as physicists
487
00:36:45,300 --> 00:36:48,390
corresponds to what would be
an experimental procedure.
488
00:36:51,700 --> 00:37:06,020
And it is assigning p of E
as the frequency of outcomes
489
00:37:06,020 --> 00:37:16,400
in a large number of
trials, i.e. you
490
00:37:16,400 --> 00:37:22,770
would say that the probability
that event A is obtained
491
00:37:22,770 --> 00:37:27,940
is the number of times you
would get outcome A divided
492
00:37:27,940 --> 00:37:34,040
by the total number of
trials as n goes to infinity.
493
00:37:34,040 --> 00:37:40,790
So for example, if you want to
assign a probability that when
494
00:37:40,790 --> 00:37:48,120
you throw a dice that face 1
comes up, what you could do
495
00:37:48,120 --> 00:37:52,590
is you could make a table
of the number of times
496
00:37:52,590 --> 00:37:57,690
1 shows up divided by the number
of times you throw the dice.
497
00:37:57,690 --> 00:38:04,050
Maybe you throw it 100
times, and you get 15.
498
00:38:04,050 --> 00:38:08,767
You throw it 200
times, and you get--
499
00:38:08,767 --> 00:38:09,850
that is probably too much.
500
00:38:09,850 --> 00:38:14,340
Let's say 15-- you get 35.
501
00:38:14,340 --> 00:38:19,765
And you do it 300 times, and
you get something close to 48.
502
00:38:19,765 --> 00:38:24,640
The ratio of these
things, as the number
503
00:38:24,640 --> 00:38:27,330
gets larger and
larger, hopefully
504
00:38:27,330 --> 00:38:30,130
will converge to something that
you would call the probability.
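A minimal sketch of this objective (frequency) procedure, assuming a fair die simulated with Python's standard `random` module. The function name `frequency_estimate` and the seed are illustrative choices, not something from the lecture. It counts how often face 1 comes up and divides by the number of throws.

```python
import random

def frequency_estimate(num_throws, face=1, seed=0):
    """Fraction of simulated fair-die throws that show `face`."""
    rng = random.Random(seed)  # seeded so that a run is reproducible
    hits = sum(1 for _ in range(num_throws) if rng.randint(1, 6) == face)
    return hits / num_throws

# The ratio N_A / N for a growing number of throws N.
for n in (100, 10_000, 100_000):
    print(n, frequency_estimate(n))
```

For a fair die the printed ratios should drift toward 1/6 as the number of throws grows, but any finite run only approximates the limit.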
505
00:38:32,770 --> 00:38:38,380
Now, it turns out that
in statistical physics,
506
00:38:38,380 --> 00:38:42,660
we will assign things through
a totally different procedure
507
00:38:42,660 --> 00:38:46,330
which is subjective.
508
00:38:46,330 --> 00:38:51,140
If you like, it's
more theoretical,
509
00:38:51,140 --> 00:39:07,020
which is based on uncertainty
among all outcomes.
510
00:39:11,340 --> 00:39:16,224
Because if I were to
subjectively assign
511
00:39:16,224 --> 00:39:21,740
to throwing the dice and
coming up with value of 1,
512
00:39:21,740 --> 00:39:28,010
I would say, well, there's six
possible faces for the dice.
513
00:39:28,010 --> 00:39:30,900
I don't know anything about
this dice being loaded,
514
00:39:30,900 --> 00:39:34,750
so I will say they
are all equally likely.
515
00:39:34,750 --> 00:39:37,570
Now, that may or may not
be a correct assumption.
516
00:39:37,570 --> 00:39:38,520
You could test it.
517
00:39:38,520 --> 00:39:40,180
You could maybe
throw it many times.
518
00:39:40,180 --> 00:39:43,250
You will find that either the
dice is loaded or not loaded
519
00:39:43,250 --> 00:39:45,050
and this is correct or not.
520
00:39:45,050 --> 00:39:48,830
But you begin by
making this assumption.
521
00:39:48,830 --> 00:39:52,620
And this is actually, we
will see later on, exactly
522
00:39:52,620 --> 00:39:54,890
the type of assumption
that you would
523
00:39:54,890 --> 00:39:57,210
be making in
statistical physics.
524
00:40:05,979 --> 00:40:07,520
Any question about
these definitions?
525
00:40:10,980 --> 00:40:17,020
So let's again
proceed slowly to get
526
00:40:17,020 --> 00:40:25,430
some definitions established by
looking at one random variable.
527
00:40:25,430 --> 00:40:31,890
So this is the next section
on one random variable.
528
00:40:34,770 --> 00:40:38,770
And I will assume that
I'll look at the case
529
00:40:38,770 --> 00:40:40,670
of the continuous
random variable.
530
00:40:40,670 --> 00:40:49,270
So x can be any real number
minus infinity to infinity.
531
00:40:49,270 --> 00:40:52,170
Now, a number of definitions.
532
00:40:52,170 --> 00:40:59,260
I will use the term
Cumulative-- make
533
00:40:59,260 --> 00:41:25,420
sure I'll use the-- Cumulative
Probability Function, CPF,
534
00:41:25,420 --> 00:41:28,690
that for this one
random variable,
535
00:41:28,690 --> 00:41:32,460
I will indicate
by capital P of x.
536
00:41:35,070 --> 00:41:37,710
And the meaning of
this is that capital P
537
00:41:37,710 --> 00:41:46,700
of x is the probability
of outcome less than x.
538
00:41:54,130 --> 00:42:00,390
So generically,
we say that x can
539
00:42:00,390 --> 00:42:04,020
take all values
along the real line.
540
00:42:04,020 --> 00:42:05,820
And there is this
function that I
541
00:42:05,820 --> 00:42:12,710
want to plot that I will
call big P of x. Now big P
542
00:42:12,710 --> 00:42:16,290
of x is a probability,
therefore,
543
00:42:16,290 --> 00:42:20,030
it has to be positive according
to the first item that we
544
00:42:20,030 --> 00:42:21,840
have over there.
545
00:42:21,840 --> 00:42:27,020
And it will be less than 1
because the net probability
546
00:42:27,020 --> 00:42:30,440
for everything toward
here is equal to 1.
547
00:42:30,440 --> 00:42:33,320
So asymptotically,
where I go all the way
548
00:42:33,320 --> 00:42:35,450
to infinity, the
probability that I
549
00:42:35,450 --> 00:42:41,590
will get some number along the
line-- I have to get something,
550
00:42:41,590 --> 00:42:45,530
so it should automatically
go to 1 here.
551
00:42:45,530 --> 00:42:51,450
And every element of
probability is positive,
552
00:42:51,450 --> 00:42:55,590
so it's a function that
should gradually go down.
553
00:42:55,590 --> 00:42:58,454
And presumably, it
will behave something
554
00:42:58,454 --> 00:42:59,370
like this generically.
555
00:43:06,170 --> 00:43:10,110
Once we have the Cumulative
Probability Function,
556
00:43:10,110 --> 00:43:24,530
we can immediately construct the
Probability Density Function,
557
00:43:24,530 --> 00:43:33,350
PDF, which is the
derivative of the above.
558
00:43:33,350 --> 00:43:42,290
P of x is the derivative of big
P of x with respect to the x.
559
00:43:42,290 --> 00:43:48,730
And so if I just take here
the curve that I have above
560
00:43:48,730 --> 00:43:51,850
and take its derivative,
the derivative
561
00:43:51,850 --> 00:43:54,360
will look something like this.
562
00:44:01,470 --> 00:44:07,340
Essentially, clearly by the
definition of the derivative,
563
00:44:07,340 --> 00:44:13,260
this quantity is
the probability
564
00:44:13,260 --> 00:44:23,120
of outcome in the
interval x to x plus dx
565
00:44:23,120 --> 00:44:25,274
divided by the size
of the interval dx.
566
00:44:29,050 --> 00:44:34,680
A couple of things to
remind you of, one of them
567
00:44:34,680 --> 00:44:39,690
is that the Cumulative
Probability is a probability.
568
00:44:39,690 --> 00:44:43,540
It's a dimensionless
number between 0 and 1.
569
00:44:43,540 --> 00:44:48,440
Probability Density is obtained
by taking a derivative,
570
00:44:48,440 --> 00:44:54,570
so it has dimensions that are
inverse of whatever this x is.
571
00:44:54,570 --> 00:44:59,270
So if I change my variable
from meters to centimeters,
572
00:44:59,270 --> 00:45:01,570
let's say, the value
of this function
573
00:45:01,570 --> 00:45:04,530
would change by a factor of 100.
574
00:45:04,530 --> 00:45:09,490
And secondly, while
the Probability Density
575
00:45:09,490 --> 00:45:14,680
is positive, its
value is not bounded.
576
00:45:14,680 --> 00:45:16,648
It can be anywhere
that you like.
577
00:45:23,490 --> 00:45:27,050
One other, again,
minor definition
578
00:45:27,050 --> 00:45:28,600
is expectation value.
579
00:45:34,810 --> 00:45:39,990
So I can pick some
function of x.
580
00:45:39,990 --> 00:45:42,000
This could be x itself.
581
00:45:42,000 --> 00:45:43,970
It could be x squared.
582
00:45:43,970 --> 00:45:49,810
It could be sine x, x
cubed minus x squared.
583
00:45:49,810 --> 00:45:52,880
The expectation value
of this is defined
584
00:45:52,880 --> 00:45:58,310
by integrating the
Probability Density
585
00:45:58,310 --> 00:46:00,906
against the value
of the function.
586
00:46:06,620 --> 00:46:15,260
So essentially,
what that says is
587
00:46:15,260 --> 00:46:30,440
that if I pick some
function of x-- function
588
00:46:30,440 --> 00:46:33,160
can be positive,
negative, et cetera.
589
00:46:33,160 --> 00:46:42,640
So maybe I have a function
such as this-- then the value
590
00:46:42,640 --> 00:46:44,430
of x is random.
591
00:46:44,430 --> 00:46:47,460
If x is in this
interval, this would
592
00:46:47,460 --> 00:46:50,870
be the corresponding
contribution to f of x.
593
00:46:50,870 --> 00:46:55,605
And I have to look at
all possible values of x.
594
00:47:00,100 --> 00:47:02,710
Question?
595
00:47:02,710 --> 00:47:06,935
Now, closely associated with this
is a change of variables.
596
00:47:14,660 --> 00:47:23,040
You would say that if x is
random, then f of x is random.
597
00:47:23,040 --> 00:47:26,420
So if I ask you what is
the value of x squared,
598
00:47:26,420 --> 00:47:29,780
and for one random
variable, I get this.
599
00:47:29,780 --> 00:47:32,220
The value of x
squared would be this.
600
00:47:32,220 --> 00:47:34,660
If I get this, the
value of x squared
601
00:47:34,660 --> 00:47:37,160
would be something else.
602
00:47:37,160 --> 00:47:41,035
So if x is random, f of x
is itself a random variable.
603
00:47:45,210 --> 00:47:52,510
So f of x is a random
variable, and you
604
00:47:52,510 --> 00:47:55,220
can ask what is the
probability, let's
605
00:47:55,220 --> 00:48:00,000
say, the Probability Density
Function that I would associate
606
00:48:00,000 --> 00:48:02,040
with the value of this.
607
00:48:02,040 --> 00:48:05,320
Let's say what's the
probability that I will find it
608
00:48:05,320 --> 00:48:10,710
in the interval between
small f and small f plus df.
609
00:48:10,710 --> 00:48:12,916
This will be Pf of f df.
610
00:48:17,052 --> 00:48:21,950
You would say that the
probability that I would find
611
00:48:21,950 --> 00:48:27,560
the value of the function
that is in this interval
612
00:48:27,560 --> 00:48:33,190
corresponds to finding a value
of x that is in this interval.
613
00:48:33,190 --> 00:48:36,020
So what I can do,
the probability
614
00:48:36,020 --> 00:48:38,990
that I find the value
of f in this interval,
615
00:48:38,990 --> 00:48:43,260
according to what I have here,
is the Probability Density
616
00:48:43,260 --> 00:48:44,912
multiplied by df.
617
00:48:44,912 --> 00:48:45,745
Is there a question?
618
00:48:48,250 --> 00:48:50,790
No.
619
00:48:50,790 --> 00:48:53,380
So the probability that
I'm in this interval
620
00:48:53,380 --> 00:48:56,520
translates to the probability
that I'm in this interval.
621
00:48:56,520 --> 00:49:01,965
So that's probability p of x dx.
622
00:49:04,650 --> 00:49:07,500
But that's boring.
623
00:49:07,500 --> 00:49:09,920
I want to look at the
situation maybe where
624
00:49:09,920 --> 00:49:11,565
the function is
something like this.
625
00:49:15,350 --> 00:49:19,730
Then, you say that f is
in this interval provided
626
00:49:19,730 --> 00:49:27,220
that x is either
here or here or here.
627
00:49:27,220 --> 00:49:33,964
So what I really need
to do is to solve
628
00:49:33,964 --> 00:49:40,200
f of x equals to f for x.
629
00:49:40,200 --> 00:49:43,010
And maybe there
will be solutions
630
00:49:43,010 --> 00:49:48,030
that will be x1,
x2, x3, et cetera.
631
00:49:48,030 --> 00:49:50,960
And what I need to
do is to sum over
632
00:49:50,960 --> 00:49:53,850
the contributions of
all of those solutions.
633
00:49:57,420 --> 00:49:58,830
So here, it's three solutions.
634
00:50:01,560 --> 00:50:08,010
Then, you would say
the Probability Density
635
00:50:08,010 --> 00:50:15,440
is the sum over i of p
of xi times dxi by df,
636
00:50:15,440 --> 00:50:17,460
which is really the slopes.
637
00:50:17,460 --> 00:50:21,050
The slopes translate the
size of this interval
638
00:50:21,050 --> 00:50:23,010
to the size of that interval.
639
00:50:23,010 --> 00:50:25,770
You can see that here,
the slope is very sharp.
640
00:50:25,770 --> 00:50:28,190
The size of this
interval is small.
641
00:50:28,190 --> 00:50:31,450
It could be wider
accordingly, so I
642
00:50:31,450 --> 00:50:34,937
need to multiply by dxi by df.
643
00:50:39,160 --> 00:50:43,940
So I have to multiply by
dx by df evaluated at xi.
644
00:50:43,940 --> 00:50:47,165
That's essentially the value
of the derivative of f.
645
00:50:50,430 --> 00:50:57,070
Now, sometimes, it is easy
to forget these things
646
00:50:57,070 --> 00:50:59,300
that I write over here.
647
00:50:59,300 --> 00:51:01,640
And you would say,
well obviously,
648
00:51:01,640 --> 00:51:06,000
the probability is
something that is positive.
649
00:51:06,000 --> 00:51:09,890
But without being careful,
it is easy to violate
650
00:51:09,890 --> 00:51:11,620
such a basic condition.
651
00:51:11,620 --> 00:51:13,020
And I violated it here.
652
00:51:16,740 --> 00:51:19,160
Anybody see where I violated it?
653
00:51:22,120 --> 00:51:24,710
Yeah, the slope
here is positive.
654
00:51:24,710 --> 00:51:26,060
The slope here is positive.
655
00:51:26,060 --> 00:51:28,240
The slope here is negative.
656
00:51:28,240 --> 00:51:33,130
So I am subtracting
a probability here.
657
00:51:33,130 --> 00:51:34,920
So what I really
should do-- it really
658
00:51:34,920 --> 00:51:37,490
doesn't matter whether the
slope is this way or that way.
659
00:51:37,490 --> 00:51:39,590
I will pick up
the same interval,
660
00:51:39,590 --> 00:51:44,144
so make sure you don't forget
the absolute values that
661
00:51:44,144 --> 00:51:45,596
go accordingly.
662
00:51:45,596 --> 00:51:48,210
So this is the
standard way that you
663
00:51:48,210 --> 00:51:50,480
would make change of variables.
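Written out, the change-of-variables rule described above might read as follows. The notation is an assumption made here to keep the two roles apart: F(x) is the function, f is one of its values, and the x_i are the solutions of F(x) = f.

```latex
p_f(f)\,df \;=\; \sum_i p(x_i)\,\lvert dx_i \rvert
\quad\Longrightarrow\quad
p_f(f) \;=\; \sum_i p(x_i)\,\left|\frac{dx}{df}\right|_{x = x_i}
       \;=\; \sum_i \frac{p(x_i)}{\lvert F'(x_i)\rvert},
\qquad F(x_i) = f .
```

The absolute value is what keeps every term positive regardless of the sign of the slope at each root.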
664
00:51:50,480 --> 00:51:52,162
Yes?
665
00:51:52,162 --> 00:51:52,828
AUDIENCE: Sorry.
666
00:51:52,828 --> 00:51:56,701
In the center of that board,
on the second line, it says Pf.
667
00:51:56,701 --> 00:51:58,084
Is that an x or a times?
668
00:52:00,715 --> 00:52:02,340
PROFESSOR: In the
center of this board?
669
00:52:02,340 --> 00:52:03,678
This one?
670
00:52:03,678 --> 00:52:06,140
AUDIENCE: Yeah.
671
00:52:06,140 --> 00:52:10,460
PROFESSOR: So the
value of the function
672
00:52:10,460 --> 00:52:12,930
is a random variable, right?
673
00:52:12,930 --> 00:52:14,540
It can come up to be here.
674
00:52:14,540 --> 00:52:16,390
It can come up to be here.
675
00:52:16,390 --> 00:52:21,720
And so there is, as any other
one parameter random variable,
676
00:52:21,720 --> 00:52:25,210
a Probability Density
associated with that.
677
00:52:25,210 --> 00:52:30,350
That Probability Density
I have called P of f
678
00:52:30,350 --> 00:52:32,440
to indicate that
it is the variable
679
00:52:32,440 --> 00:52:34,910
f that I'm
considering as opposed
680
00:52:34,910 --> 00:52:37,370
to what I wrote
originally that was
681
00:52:37,370 --> 00:52:39,810
associated with the value of x.
682
00:52:39,810 --> 00:52:42,282
AUDIENCE: But what you have
written on the left-hand side,
683
00:52:42,282 --> 00:52:44,750
it looks like your
x [? is random. ?]
684
00:52:44,750 --> 00:52:47,160
PROFESSOR: Oh, this was
supposed to be a multiplication
685
00:52:47,160 --> 00:52:48,110
sign, so sorry.
686
00:52:48,110 --> 00:52:49,635
AUDIENCE: Thank you.
687
00:52:49,635 --> 00:52:50,510
PROFESSOR: Thank you.
688
00:52:56,445 --> 00:52:56,945
Yes?
689
00:52:56,945 --> 00:53:01,900
AUDIENCE: CP-- that function,
is this [INAUDIBLE]?
690
00:53:01,900 --> 00:53:04,020
PROFESSOR: Yes.
691
00:53:04,020 --> 00:53:07,670
So you're asking whether this--
so I constructed something,
692
00:53:07,670 --> 00:53:11,660
and my statement is that the
integral from minus infinity
693
00:53:11,660 --> 00:53:21,100
to infinity df Pf of f better be
one which is the normalization.
694
00:53:21,100 --> 00:53:24,420
So if you're asking
about this, essentially,
695
00:53:24,420 --> 00:53:30,020
you would say the
integral dx p of x
696
00:53:30,020 --> 00:53:34,200
is the integral dx
dP by dx, right?
697
00:53:34,200 --> 00:53:36,710
That was the definition of p of x.
698
00:53:36,710 --> 00:53:38,930
And the integral
of the derivative
699
00:53:38,930 --> 00:53:43,830
is the value of the function
evaluated at its two extremes.
700
00:53:43,830 --> 00:53:47,360
And this is one minus 0.
701
00:53:47,360 --> 00:53:51,845
So by construction, it
is, of course, normalized
702
00:53:51,845 --> 00:53:54,100
in this fashion.
703
00:53:54,100 --> 00:53:55,310
Is that what you were asking?
704
00:53:55,310 --> 00:53:58,226
AUDIENCE: I was asking
about the first possibility
705
00:53:58,226 --> 00:54:00,970
of cumulative
probability function.
706
00:54:00,970 --> 00:54:04,220
PROFESSOR: So the
cumulative probability,
707
00:54:04,220 --> 00:54:09,540
its constraint is that
the limit as its variable
708
00:54:09,540 --> 00:54:14,030
goes to infinity,
it should go to 1.
709
00:54:14,030 --> 00:54:15,650
That's the normalization.
710
00:54:15,650 --> 00:54:19,660
The normalization here
is that the probability
711
00:54:19,660 --> 00:54:23,300
of the entire set is 1.
712
00:54:23,300 --> 00:54:26,000
Cumulative adds
the probabilities
713
00:54:26,000 --> 00:54:29,270
to be anywhere up to point x.
714
00:54:29,270 --> 00:54:32,160
So I have achieved being
anywhere on the line
715
00:54:32,160 --> 00:54:35,700
by going through this point.
716
00:54:35,700 --> 00:54:41,477
But certainly, the integral
of big P of x dx is not equal to 1
717
00:54:41,477 --> 00:54:42,685
if that's what you're asking.
718
00:54:45,370 --> 00:54:48,494
The integral of
small p of x is 1.
719
00:54:51,973 --> 00:54:53,464
Yes?
720
00:54:53,464 --> 00:54:56,446
AUDIENCE: Are we assuming
the function is invertible?
721
00:55:00,430 --> 00:55:03,380
PROFESSOR: Well, rigorously
speaking, this function
722
00:55:03,380 --> 00:55:08,090
is not invertible
because for a value of f,
723
00:55:08,090 --> 00:55:10,810
there are three
possible values of x.
724
00:55:10,810 --> 00:55:14,150
So the inverse is not a function,
but you can certainly
725
00:55:14,150 --> 00:55:18,070
solve for f of x equals to
f to find particular values.
726
00:55:27,630 --> 00:55:31,520
So again, maybe it
is useful to work
727
00:55:31,520 --> 00:55:33,230
through one example of this.
728
00:55:37,320 --> 00:55:44,220
So let's say that you
have a probability that
729
00:55:44,220 --> 00:55:50,890
is of the form e to the minus
lambda absolute value of x.
730
00:55:50,890 --> 00:55:58,550
So as a function of x,
the Probability Density
731
00:55:58,550 --> 00:56:04,430
falls off exponentially
on both sides.
732
00:56:04,430 --> 00:56:08,340
And again, I have to ensure that
when I integrate this from minus infinity
733
00:56:08,340 --> 00:56:11,280
to infinity, I will get one.
734
00:56:11,280 --> 00:56:14,090
The integral from
0 to infinity is
735
00:56:14,090 --> 00:56:16,580
1 over lambda,
from minus infinity
736
00:56:16,580 --> 00:56:19,420
to zero by symmetry
is 1 over lambda.
737
00:56:19,420 --> 00:56:26,940
So really I have to divide
by 2 over lambda-- multiply by lambda over 2.
738
00:56:26,940 --> 00:56:27,440
Sorry.
739
00:56:34,680 --> 00:56:39,650
Now, suppose I change variables
to F, which is x squared.
740
00:56:42,350 --> 00:56:46,840
So I want to know what
the probability is
741
00:56:46,840 --> 00:56:53,070
for a particular value of x
squared that I will call f.
742
00:56:53,070 --> 00:56:57,720
So then what I have to
do is to solve this.
743
00:56:57,720 --> 00:57:03,920
And this will give me x is minus
plus square root of small f.
744
00:57:03,920 --> 00:57:10,440
If I ask for what f of-- for
what value of x, x squared
745
00:57:10,440 --> 00:57:15,500
equals to f, then I have
these two solutions.
746
00:57:15,500 --> 00:57:18,670
So according to the
formula that I have,
747
00:57:18,670 --> 00:57:22,480
I have to, first of
all, evaluate this
748
00:57:22,480 --> 00:57:26,540
at these two possible roots.
749
00:57:26,540 --> 00:57:33,160
In both cases, I will get
minus lambda square root of f.
750
00:57:33,160 --> 00:57:35,610
Because of the absolute
value, both of them
751
00:57:35,610 --> 00:57:38,290
will give you the same thing.
752
00:57:38,290 --> 00:57:43,350
And then I have to look
at this derivative.
753
00:57:43,350 --> 00:57:52,394
So if I look at this, I can
see that df by dx equals to 2x.
754
00:57:52,394 --> 00:57:57,180
The locations that I have to
evaluate are at plus minus
755
00:57:57,180 --> 00:57:58,800
square root of f.
756
00:57:58,800 --> 00:58:04,470
So the value of the slope is
minus plus 2 square root of f.
757
00:58:04,470 --> 00:58:07,840
And according to that
formula, what I have to do
758
00:58:07,840 --> 00:58:09,600
is to put the inverse of that.
759
00:58:09,600 --> 00:58:13,680
So I have to put for one
solution, 1 over 2 square root
760
00:58:13,680 --> 00:58:15,460
of f.
761
00:58:15,460 --> 00:58:19,920
For the other one, I have to
put 1 over minus 2 square root
762
00:58:19,920 --> 00:58:24,560
of f, which would be a disaster
if I didn't convert this
763
00:58:24,560 --> 00:58:27,650
to an absolute value.
764
00:58:27,650 --> 00:58:30,900
And if I did convert that
to an absolute value,
765
00:58:30,900 --> 00:58:36,322
what I would get is lambda
over 2 square root of f
766
00:58:36,322 --> 00:58:41,000
e to the minus lambda root f.
767
00:58:41,000 --> 00:58:46,180
It is important to note
that this solution will
768
00:58:46,180 --> 00:58:51,340
exist only if f is positive.
769
00:58:51,340 --> 00:58:55,040
And there's no solution
if f is negative,
770
00:58:55,040 --> 00:59:00,200
which means that if I
wanted to plot a Probability
771
00:59:00,200 --> 00:59:03,770
Density for this
function f, which
772
00:59:03,770 --> 00:59:08,420
is x squared as a function
of f, it will only
773
00:59:08,420 --> 00:59:12,510
have values for positive
values of x squared.
774
00:59:12,510 --> 00:59:16,300
There's nothing for
negative values.
775
00:59:16,300 --> 00:59:19,300
For positive values,
I have this function
776
00:59:19,300 --> 00:59:22,200
that exponentially decays.
777
00:59:22,200 --> 00:59:27,380
Yet at f equals 0, it diverges.
778
00:59:27,380 --> 00:59:30,770
One reason I chose
that example is
779
00:59:30,770 --> 00:59:34,170
to emphasize that these
Probability Density
780
00:59:34,170 --> 00:59:39,206
functions can even go
all the way to infinity.
781
00:59:39,206 --> 00:59:42,880
The requirement,
however, is that you
782
00:59:42,880 --> 00:59:47,790
should be able to integrate
across the infinity because
783
00:59:47,790 --> 00:59:51,840
integrating across the infinity
should give you a finite number
784
00:59:51,840 --> 00:59:53,660
less than 1.
785
00:59:53,660 --> 00:59:58,490
And so the type of divergence
that you could have is limited.
786
00:59:58,490 --> 01:00:00,500
1 over square root of f is fine.
787
01:00:00,500 --> 01:00:02,000
1/f is not accepted.
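A rough numerical check of this worked example, as a sketch under stated assumptions. The function names, the sample size, and the seed are hypothetical choices made here, and the samples of x are drawn with the standard library's `random.expovariate` plus a random sign. The analytic cumulative probability of f = x squared is 1 minus e to the minus lambda root f, which is what integrating the density above from 0 to f gives.

```python
import math
import random

def density_of_square(f, lam):
    """p_f(f) for f = x**2 when p(x) = (lam / 2) * exp(-lam * |x|)."""
    if f <= 0:
        return 0.0  # no real root for negative f; f = 0 is the divergent point
    root = math.sqrt(f)
    p_at_root = 0.5 * lam * math.exp(-lam * root)  # same value at +root and -root
    jacobian = 1.0 / abs(2.0 * root)               # |dx/df| at either root
    return 2 * p_at_root * jacobian                # sum over the two roots

def cumulative_of_square(f, lam):
    """Integral of density_of_square from 0 to f, evaluated analytically."""
    return 1.0 - math.exp(-lam * math.sqrt(f)) if f > 0 else 0.0

def sampled_fraction_below(f, lam, num_samples=200_000, seed=0):
    """Fraction of sampled x**2 values that fall below f."""
    rng = random.Random(seed)
    count = 0
    for _ in range(num_samples):
        x = rng.expovariate(lam) * rng.choice((-1.0, 1.0))  # two-sided exponential
        if x * x < f:
            count += 1
    return count / num_samples

print(cumulative_of_square(0.25, 2.0), sampled_fraction_below(0.25, 2.0))
```

The two printed numbers should agree to within sampling noise, even though the density itself blows up like 1 over root f at the origin. That is the sense in which the divergence is integrable.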
788
01:00:06,380 --> 01:00:07,555
Yes?
789
01:00:07,555 --> 01:00:09,680
AUDIENCE: I have a doubt
about [? the postulate. ?]
790
01:00:09,680 --> 01:00:16,850
It says that if you raise
the value of f slowly,
791
01:00:16,850 --> 01:00:20,260
you will eventually get to--
yeah, that point right there.
792
01:00:20,260 --> 01:00:22,982
So if the prescription that
we have of summing over
793
01:00:22,982 --> 01:00:27,025
the different roots, at
some point, the roots,
794
01:00:27,025 --> 01:00:27,715
they converge.
795
01:00:27,715 --> 01:00:28,340
PROFESSOR: Yes.
796
01:00:28,340 --> 01:00:30,552
AUDIENCE: So at some point,
we stop summing over 2
797
01:00:30,552 --> 01:00:31,956
and we start summing over 1.
798
01:00:31,956 --> 01:00:35,240
It just seems a
little bit strange.
799
01:00:35,240 --> 01:00:36,070
PROFESSOR: Yeah.
800
01:00:36,070 --> 01:00:40,280
If you are up here, you have
only one term in the sum.
801
01:00:40,280 --> 01:00:43,200
If you are down here,
you have three terms.
802
01:00:43,200 --> 01:00:46,120
And that's really just
the property of the curve
803
01:00:46,120 --> 01:00:47,870
that I have drawn.
804
01:00:47,870 --> 01:00:51,530
And so over here, I
have only one root.
805
01:00:51,530 --> 01:00:53,040
Over here, I have three roots.
806
01:00:53,040 --> 01:00:54,960
And this is not surprising.
807
01:00:54,960 --> 01:00:58,190
There are many situations
in mathematics or physics
808
01:00:58,190 --> 01:01:01,140
where you encounter
situations where,
809
01:01:01,140 --> 01:01:04,670
as you change some parameters,
new solutions, new roots,
810
01:01:04,670 --> 01:01:05,650
appear.
811
01:01:05,650 --> 01:01:11,030
And so if this was really some
kind of a physical system,
812
01:01:11,030 --> 01:01:13,690
you would probably
encounter some kind
813
01:01:13,690 --> 01:01:16,712
of a singularity or phase
transition at this point.
814
01:01:20,960 --> 01:01:21,825
Yes?
815
01:01:21,825 --> 01:01:24,675
AUDIENCE: But how does the
equation deal with that when
816
01:01:24,675 --> 01:01:27,060
[INAUDIBLE]?
817
01:01:27,060 --> 01:01:28,570
PROFESSOR: Let's see.
818
01:01:28,570 --> 01:01:34,390
So if I am approaching
that point, what I find
819
01:01:34,390 --> 01:01:40,540
is that df by
dx goes to 0.
820
01:01:40,540 --> 01:01:44,500
So dx by df has some kind
of infinity or singularity,
821
01:01:44,500 --> 01:01:46,590
so we have to deal with that.
822
01:01:46,590 --> 01:01:49,670
If you want, we can
choose a particular form
823
01:01:49,670 --> 01:01:52,140
of that function and
see what happens.
824
01:01:52,140 --> 01:01:55,230
But actually, we have
that already over here
825
01:01:55,230 --> 01:01:58,440
because the function f
that I plotted for you
826
01:01:58,440 --> 01:02:05,740
as a function of x
has this behavior
827
01:02:05,740 --> 01:02:09,035
that, for some range of
f, you have two solutions.
828
01:02:14,280 --> 01:02:17,940
So for negative values
of f, I have no solution.
829
01:02:17,940 --> 01:02:21,000
So this curve, after
having rotated,
830
01:02:21,000 --> 01:02:24,390
is precisely an example
of what is happening here.
831
01:02:24,390 --> 01:02:26,950
And you see what the
consequence of that is.
832
01:02:26,950 --> 01:02:29,910
The consequence of that
is that as I approach here
833
01:02:29,910 --> 01:02:33,810
and the two solutions merge,
I have the singularity that
834
01:02:33,810 --> 01:02:36,679
is ultimately
manifested in here.
835
01:02:47,450 --> 01:02:49,060
So in principle, yes.
836
01:02:49,060 --> 01:02:51,230
When you make these
changes of variables
837
01:02:51,230 --> 01:02:55,630
and you have functions
that have multiple solution
838
01:02:55,630 --> 01:02:58,840
behavior like that, you
have to worry about this.
839
01:03:02,940 --> 01:03:04,065
Let me go down here.
840
01:03:07,690 --> 01:03:10,860
One other definition
that, again, you've
841
01:03:10,860 --> 01:03:13,350
probably seen, before
we go through something
842
01:03:13,350 --> 01:03:16,483
that I hope you
haven't seen, moment.
843
01:03:19,430 --> 01:03:22,110
A form of this expectation
value-- actually, here we
844
01:03:22,110 --> 01:03:24,950
did with x squared,
but in general, we
845
01:03:24,950 --> 01:03:29,480
can calculate the expectation
value of x to the m.
846
01:03:29,480 --> 01:03:37,120
And sometimes, that is called
the mth moment, the integral 0
847
01:03:37,120 --> 01:03:41,130
to infinity dx x
to the m p of x.
848
01:03:56,760 --> 01:04:00,300
Now, I expect that
after this point,
849
01:04:00,300 --> 01:04:01,940
you would have seen everything.
850
01:04:01,940 --> 01:04:06,820
But the next one maybe
half of you have seen.
851
01:04:06,820 --> 01:04:10,870
And the next item,
which we will use a lot,
852
01:04:10,870 --> 01:04:12,720
is the characteristic function.
853
01:04:24,860 --> 01:04:29,590
So given that I have some
probability distribution
854
01:04:29,590 --> 01:04:34,220
p of x, I can calculate
various expectation values.
855
01:04:34,220 --> 01:04:38,470
I calculate the expectation
value of e to the minus ikx.
856
01:04:43,200 --> 01:04:47,140
This is, by definition
that you have,
857
01:04:47,140 --> 01:04:50,750
I have to integrate
over the domain of x--
858
01:04:50,750 --> 01:04:55,140
let's say from minus infinity
to infinity-- p of x against e
859
01:04:55,140 --> 01:04:56,140
to the minus ikx.
860
01:05:01,550 --> 01:05:05,560
And you say, well, what's
special about that?
861
01:05:05,560 --> 01:05:11,470
I know that to be the
Fourier transform of p of x.
862
01:05:11,470 --> 01:05:13,220
And it is true.
863
01:05:13,220 --> 01:05:17,790
And you also know how to
invert the Fourier transform.
864
01:05:17,790 --> 01:05:20,460
That is if you know the
characteristic function, which
865
01:05:20,460 --> 01:05:23,170
is another name for the Fourier
transform of a probability
866
01:05:23,170 --> 01:05:26,770
distribution, you
would get the p of x
867
01:05:26,770 --> 01:05:33,490
back by the integral
over k divided by 2pi,
868
01:05:33,490 --> 01:05:39,460
the way that I chose the things,
e to the ikx p tilde of k.
869
01:05:39,460 --> 01:05:42,600
Basically, this is the
standard relationship
870
01:05:42,600 --> 01:05:45,520
between these objects.
871
01:05:45,520 --> 01:05:50,090
So this is just a
Fourier transform.
872
01:05:50,090 --> 01:05:56,320
Now, something that appears a
lot in statistical calculations
873
01:05:56,320 --> 01:05:58,530
and implicit in lots
of things that we
874
01:05:58,530 --> 01:06:03,185
do in statistical mechanics
is a generating function.
875
01:06:12,250 --> 01:06:17,880
I can take the characteristic
function p tilde of k.
876
01:06:17,880 --> 01:06:21,620
It's a function of this
Fourier variable, k.
877
01:06:21,620 --> 01:06:24,360
And I can do an
expansion in that.
878
01:06:24,360 --> 01:06:29,000
I can do the expansion
inside the expectation value
879
01:06:29,000 --> 01:06:34,460
because e to the minus ikx I can
write as a sum over n running
880
01:06:34,460 --> 01:06:39,810
from 0 to infinity minus ik
to the power of n divided by n
881
01:06:39,810 --> 01:06:42,540
factorial x to the nth.
882
01:06:45,330 --> 01:06:48,880
This is the expansion
of the exponential.
883
01:06:48,880 --> 01:06:55,170
The variable here is x, so I can
take everything else outside.
884
01:06:55,170 --> 01:07:01,040
And what I see is that
if I make an expansion
885
01:07:01,040 --> 01:07:06,370
of the characteristic
function, the coefficient
886
01:07:06,370 --> 01:07:11,240
of k to the n up to some
trivial factor of n factorial
887
01:07:11,240 --> 01:07:14,130
will give me the nth moment.
888
01:07:14,130 --> 01:07:16,910
That is once you have
calculated the Fourier
889
01:07:16,910 --> 01:07:20,610
transform, or the characteristic
function, you can expand it.
890
01:07:20,610 --> 01:07:23,430
And then, out
of that expansion, you
891
01:07:23,430 --> 01:07:26,690
can extract all the
moments essentially.
892
01:07:26,690 --> 01:07:30,550
So this expansion
generates for you the moments,
893
01:07:30,550 --> 01:07:34,250
hence the generating function.
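Written out, and assuming the sign convention e to the minus ikx chosen above, the definitions and the expansion just described would read roughly as follows. The bracket notation for expectation values is an assumption about the board notation, not something the recording confirms.

```latex
\begin{align}
\tilde{p}(k) &= \left\langle e^{-ikx} \right\rangle
              = \int_{-\infty}^{\infty} dx\, p(x)\, e^{-ikx},
&
p(x) &= \int_{-\infty}^{\infty} \frac{dk}{2\pi}\, e^{ikx}\, \tilde{p}(k), \\
\tilde{p}(k) &= \left\langle \sum_{n=0}^{\infty} \frac{(-ik)^n}{n!}\, x^n \right\rangle
              = \sum_{n=0}^{\infty} \frac{(-ik)^n}{n!}\, \langle x^n \rangle .
\end{align}
```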
894
01:07:34,250 --> 01:07:37,220
You could even do
something like this.
895
01:07:37,220 --> 01:07:47,880
You could multiply e to the
ikx0 for some x0 p tilde of k.
896
01:07:47,880 --> 01:07:51,510
And that would be the
expectation value of e
897
01:07:51,510 --> 01:07:56,780
to the minus ik times x minus x0.
898
01:07:56,780 --> 01:08:00,450
And you can expand
that, and you would
899
01:08:00,450 --> 01:08:06,600
generate all of the moments
not around the origin,
900
01:08:06,600 --> 01:08:08,350
but around the point x0.
901
01:08:13,960 --> 01:08:17,830
So simple manipulations of
the characteristic function
902
01:08:17,830 --> 01:08:23,840
can shift and give you
other sets of moments
903
01:08:23,840 --> 01:08:25,000
around different points.
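As a sketch of that shift, with the same convention that the characteristic function is the expectation value of e to the minus ikx, multiplying by the phase factor would give the moments around x0. The bracket notation here is an assumption.

```latex
\begin{equation}
e^{ikx_0}\, \tilde{p}(k)
  = \left\langle e^{-ik(x - x_0)} \right\rangle
  = \sum_{n=0}^{\infty} \frac{(-ik)^n}{n!}\, \left\langle (x - x_0)^n \right\rangle .
\end{equation}
```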
904
01:08:34,800 --> 01:08:38,560
So the Fourier transform,
or characteristic function,
905
01:08:38,560 --> 01:08:42,290
is the generator of moments.
906
01:08:42,290 --> 01:08:46,240
An even more
important property is
907
01:08:46,240 --> 01:08:48,986
possessed by the cumulant
generating function.
908
01:08:58,229 --> 01:09:07,260
So you have the
characteristic function,
909
01:09:07,260 --> 01:09:08,899
the Fourier transform.
910
01:09:08,899 --> 01:09:13,609
You take its log, so
another function of k.
911
01:09:13,609 --> 01:09:16,810
You start expanding this
function in powers of k.
912
01:09:21,189 --> 01:09:29,370
And the coefficients of
that, you call cumulants.
913
01:09:33,420 --> 01:09:38,060
So I essentially repeated the
definition that I had up there.
914
01:09:38,060 --> 01:09:44,497
I took a log, and all I did
is I put this subscript c
915
01:09:44,497 --> 01:09:47,729
to go from moments to cumulants.
916
01:09:47,729 --> 01:09:53,890
And also, I have to start the
series from 1 as opposed to 0.
917
01:09:53,890 --> 01:10:00,960
And essentially, I can find the
relationship between cumulants
918
01:10:00,960 --> 01:10:04,370
and moments by
writing this as a log
919
01:10:04,370 --> 01:10:08,980
of the characteristic
function, which
920
01:10:08,980 --> 01:10:14,150
is 1 plus sum n
equals 1 to infinity
921
01:10:14,150 --> 01:10:21,190
of minus ik to the n over n
factorial, the nth moments.
922
01:10:21,190 --> 01:10:26,120
So inside the log,
I have the moments.
923
01:10:26,120 --> 01:10:30,020
Outside the log, I
have the cumulants.
924
01:10:30,020 --> 01:10:35,910
And if I have a log
of 1 plus epsilon,
925
01:10:35,910 --> 01:10:41,180
I can use the expansion
of this as epsilon minus
926
01:10:41,180 --> 01:10:45,380
epsilon squared over 2 plus epsilon
cubed over 3 minus epsilon
927
01:10:45,380 --> 01:10:49,500
to the fourth over 4, et cetera.
928
01:10:49,500 --> 01:10:55,940
And this will enable me to
then match powers of minus ik
929
01:10:55,940 --> 01:11:00,710
on the left and powers
of minus ik on the right.
930
01:11:00,710 --> 01:11:03,530
You can see that the first
thing that I will find
931
01:11:03,530 --> 01:11:09,600
is that the expectation
value of x-- the first power,
932
01:11:09,600 --> 01:11:13,800
the first term that I have
here is minus ik times the mean.
933
01:11:13,800 --> 01:11:16,630
Take the log, I will get that.
934
01:11:16,630 --> 01:11:22,230
So essentially, what I get
is that the first cumulant
935
01:11:22,230 --> 01:11:26,760
on the left is the
first moment that I
936
01:11:26,760 --> 01:11:29,660
will get from the
expansion on the right.
937
01:11:29,660 --> 01:11:32,460
And this is, of course, called
the mean of the distribution.
938
01:11:34,970 --> 01:11:40,680
The second cumulant, I will
have two contributions,
939
01:11:40,680 --> 01:11:45,304
one from epsilon, the other from
minus epsilon squared over 2.
940
01:11:45,304 --> 01:11:48,220
And if you go through
that, you will
941
01:11:48,220 --> 01:11:52,680
get that it is expectation
value of x squared
942
01:11:52,680 --> 01:11:57,320
minus the average of x,
the mean squared, which
943
01:11:57,320 --> 01:12:01,450
is none other than the
expectation value of x
944
01:12:01,450 --> 01:12:06,910
around the mean squared, which
is clearly a positive quantity.
945
01:12:06,910 --> 01:12:08,410
And this is the variance.
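The matching of powers for the first two orders can be sketched as below. The subscript c marks cumulants as in the lecture; the symbol epsilon for the sum inside the log and the bracket notation are assumptions about how it was written on the board.

```latex
\begin{align}
\ln \tilde{p}(k)
  &= \sum_{n=1}^{\infty} \frac{(-ik)^n}{n!}\, \langle x^n \rangle_c
   = \ln\left(1 + \epsilon\right),
\qquad
\epsilon = \sum_{n=1}^{\infty} \frac{(-ik)^n}{n!}\, \langle x^n \rangle, \\
\ln(1 + \epsilon)
  &= \epsilon - \frac{\epsilon^2}{2} + \frac{\epsilon^3}{3} - \frac{\epsilon^4}{4} + \cdots, \\
(-ik)^1 &: \quad \langle x \rangle_c = \langle x \rangle, \\
(-ik)^2 &: \quad \frac{1}{2}\langle x^2 \rangle_c
          = \frac{1}{2}\langle x^2 \rangle - \frac{1}{2}\langle x \rangle^2
  \;\Rightarrow\;
  \langle x^2 \rangle_c = \langle x^2 \rangle - \langle x \rangle^2
  = \left\langle \left(x - \langle x \rangle\right)^2 \right\rangle .
\end{align}
```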
946
01:12:14,420 --> 01:12:16,300
And you can keep going.
947
01:12:16,300 --> 01:12:26,360
The third cumulant is x cubed
minus 3 average of x squared
948
01:12:26,360 --> 01:12:32,402
average of x plus 2
average of x itself cubed.
949
01:12:32,402 --> 01:12:33,915
It is called the skewness.
950
01:12:36,900 --> 01:12:40,340
I don't write the
formula for the next one
951
01:12:40,340 --> 01:12:42,050
which is called
the kurtosis.
952
01:12:42,050 --> 01:12:45,910
And you keep going and so forth.
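If you would rather not expand the log by hand, here is a minimal sketch that gets the cumulants from the moments numerically. It does not match powers the way the lecture does; it assumes the standard recursion between moments and cumulants, which is an equivalent route, and the function name is made up for illustration.

```python
from math import comb


def cumulants_from_moments(moments):
    """Convert raw moments to cumulants.

    moments[n-1] holds <x^n> for n = 1..N; the result holds <x^n>_c in the
    same order. Uses the standard recursion (an assumption of this sketch,
    not the power-matching done in the lecture):
        <x^n>_c = <x^n> - sum_{j=1}^{n-1} C(n-1, j-1) <x^j>_c <x^(n-j)>
    """
    m = [1.0] + list(moments)   # m[0] = <x^0> = 1
    kappa = [0.0] * len(m)      # kappa[0] is unused padding
    for n in range(1, len(m)):
        correction = sum(
            comb(n - 1, j - 1) * kappa[j] * m[n - j] for j in range(1, n)
        )
        kappa[n] = m[n] - correction
    return kappa[1:]


# Arbitrary illustrative numbers for <x>, <x^2>, <x^3>.
m1, m2, m3 = 2.0, 7.0, 30.0
k1, k2, k3 = cumulants_from_moments([m1, m2, m3])

# Compare with the explicit formulas for the mean, variance and third cumulant.
print(k1 == m1)
print(k2 == m2 - m1**2)
print(k3 == m3 - 3 * m2 * m1 + 2 * m1**3)
```

With these inputs the three comparisons agree with the formulas for the mean, the variance, and the third cumulant given above.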
953
01:12:53,390 --> 01:13:01,220
So it turns out that this
hierarchy of cumulants,
954
01:13:01,220 --> 01:13:04,710
essentially, is a hierarchy
of the most important things
955
01:13:04,710 --> 01:13:09,540
that you can know about
a random variable.
956
01:13:09,540 --> 01:13:17,140
So if I tell you that the
outcome of some experiment
957
01:13:17,140 --> 01:13:23,409
is some number x,
distributed somehow-- I
958
01:13:23,409 --> 01:13:25,450
guess the first thing that
you would like to know
959
01:13:25,450 --> 01:13:28,375
is whether the typical
values that you get
960
01:13:28,375 --> 01:13:33,100
are of the order of 1, are of
the order of million, whatever.
961
01:13:33,100 --> 01:13:36,380
So somehow, the
mean is something
962
01:13:36,380 --> 01:13:40,510
that tells you something that is
most important, the zeroth order
963
01:13:40,510 --> 01:13:45,030
thing that you want to
know about the variable.
964
01:13:45,030 --> 01:13:47,130
But the next thing that
you might want to know
965
01:13:47,130 --> 01:13:50,250
is, well, what's the spread?
966
01:13:50,250 --> 01:13:52,830
How far does this thing go?
967
01:13:52,830 --> 01:13:58,090
And then the variance will tell
you something about the spread.
968
01:13:58,090 --> 01:13:59,750
So the next thing
that you want to do
969
01:13:59,750 --> 01:14:02,770
is maybe if given
the spread, am I
970
01:14:02,770 --> 01:14:06,080
more likely to get things
that are on one side or things
971
01:14:06,080 --> 01:14:08,390
that are on the other side.
972
01:14:08,390 --> 01:14:12,780
So the measure of its
asymmetry, right versus left,
973
01:14:12,780 --> 01:14:16,329
is provided by the third
cumulant, which is the skewness
974
01:14:16,329 --> 01:14:16,870
and so forth.
975
01:14:20,180 --> 01:14:24,380
So typically, the
very first few members
976
01:14:24,380 --> 01:14:28,340
of this hierarchy of
cumulants tell you
977
01:14:28,340 --> 01:14:30,985
the most important
information that you
978
01:14:30,985 --> 01:14:32,110
need about the probability.
979
01:14:37,100 --> 01:14:38,860
Now, I will mention
to you, and I
980
01:14:38,860 --> 01:14:44,040
guess we probably will deal
with it more next time around,
981
01:14:44,040 --> 01:14:51,700
the result that is in some
sense the backbone or granddaddy
982
01:14:51,700 --> 01:14:57,080
of all graphical expansions
that are carrying [INAUDIBLE].
983
01:14:57,080 --> 01:15:00,800
And that's a relationship
between the moments
984
01:15:00,800 --> 01:15:04,770
and cumulants that I
will express graphically.
985
01:15:04,770 --> 01:15:14,970
So this is graphical
representation
986
01:15:14,970 --> 01:15:20,860
of moments in
terms of cumulants.
987
01:15:26,470 --> 01:15:29,440
Essentially, what I'm
saying is that you
988
01:15:29,440 --> 01:15:33,100
can go through the
procedure as I outlined.
989
01:15:33,100 --> 01:15:37,240
And if you want to calculate
minus ik to the fifth power
990
01:15:37,240 --> 01:15:41,740
so that you find the description
of the fifth cumulant in terms
991
01:15:41,740 --> 01:15:44,475
of the moments, you'll
have to do a lot of work
992
01:15:44,475 --> 01:15:49,460
in expanding the log and powers
of this object and making sure
993
01:15:49,460 --> 01:15:53,790
that you don't make any
mistakes in the coefficient.
994
01:15:53,790 --> 01:15:58,570
There is a way to
circumvent that graphically
995
01:15:58,570 --> 01:16:00,460
and get the relationship.
996
01:16:00,460 --> 01:16:03,710
So how do we do that?
997
01:16:03,710 --> 01:16:16,010
You'll represent the nth cumulant
as, let's say, a bag of n points.
998
01:16:20,100 --> 01:16:28,640
So let's say this entity will
represent the third cumulant.
999
01:16:28,640 --> 01:16:31,700
It's a bag with three points.
1000
01:16:31,700 --> 01:16:37,520
This-- one, two, three,
four, five, six--
1001
01:16:37,520 --> 01:16:39,391
will represent the
sixth cumulant.
1002
01:16:42,340 --> 01:16:54,645
Then, the nth moment
is the sum of all ways
1003
01:16:54,645 --> 01:17:04,680
of distributing n
points amongst bags.
1004
01:17:11,360 --> 01:17:13,664
So what do I mean?
1005
01:17:13,664 --> 01:17:19,140
So I want to calculate
the first moment x.
1006
01:17:19,140 --> 01:17:22,370
That would correspond
to one point.
1007
01:17:22,370 --> 01:17:25,260
And really, there's
only one diagram
1008
01:17:25,260 --> 01:17:28,470
I can put the bag
around it or not
1009
01:17:28,470 --> 01:17:31,280
that would correspond to this.
1010
01:17:31,280 --> 01:17:35,410
And that corresponds
to the first cumulant,
1011
01:17:35,410 --> 01:17:39,120
basically rewriting
what I had before.
1012
01:17:39,120 --> 01:17:43,010
If I want to look at the second
moment, the second moment
1013
01:17:43,010 --> 01:17:44,650
I need two points.
1014
01:17:44,650 --> 01:17:48,770
The two points I can either
put in the same bag or I
1015
01:17:48,770 --> 01:17:52,180
can put into two separate bags.
1016
01:17:52,180 --> 01:17:56,510
And the first one
corresponds to calculating
1017
01:17:56,510 --> 01:17:59,650
the second cumulant.
1018
01:17:59,650 --> 01:18:02,390
The second term
corresponds to two ways
1019
01:18:02,390 --> 01:18:05,460
in which the first
cumulant has appeared,
1020
01:18:05,460 --> 01:18:07,090
so I have the mean squared.
1021
01:18:10,010 --> 01:18:17,080
If I want to calculate the
third moment, I need three dots.
1022
01:18:17,080 --> 01:18:21,400
The three dots I can
either put in one bag
1023
01:18:21,400 --> 01:18:27,950
or I can take one of them out
and keep two of them in a bag.
1024
01:18:27,950 --> 01:18:30,000
And here I had the
choice of three things
1025
01:18:30,000 --> 01:18:32,770
that I could've pulled out.
1026
01:18:32,770 --> 01:18:38,370
Or, I could have all of them in
individual bags of their own.
1027
01:18:38,370 --> 01:18:43,290
And mathematically, the first
term corresponds to x cubed c.
1028
01:18:43,290 --> 01:18:46,680
The second term corresponds
to three versions
1029
01:18:46,680 --> 01:18:49,460
of the variance times the mean.
1030
01:18:49,460 --> 01:18:54,040
And the last term is
just the mean cubed.
1031
01:18:54,040 --> 01:18:57,740
And you can massage
this expression
1032
01:18:57,740 --> 01:19:03,200
to see that I get the expression
that I have for the skewness.
1033
01:19:03,200 --> 01:19:05,860
I didn't offhand
remember the relationship
1034
01:19:05,860 --> 01:19:09,790
that I have to write down
for the fourth cumulant.
1035
01:19:09,790 --> 01:19:12,210
But I can graphically,
immediately get
1036
01:19:12,210 --> 01:19:15,140
the relationship for
the fourth moment
1037
01:19:15,140 --> 01:19:20,020
in terms of the fourth
cumulant which is this entity.
1038
01:19:20,020 --> 01:19:23,860
Or, four ways that I
can take one out of the bag
1039
01:19:23,860 --> 01:19:29,440
and maintain three in the bag,
three ways in which I have
1040
01:19:29,440 --> 01:19:37,190
two bags of two, six ways in
which I can have a bag of two
1041
01:19:37,190 --> 01:19:42,165
and two things that are
individually apart, and one
1042
01:19:42,165 --> 01:19:44,970
way in which there
are four things that
1043
01:19:44,970 --> 01:19:47,350
are independent of each other.
1044
01:19:47,350 --> 01:19:53,700
And this becomes x to the fourth
cumulant, the fourth cumulant,
1045
01:19:53,700 --> 01:19:58,800
4 times the third
cumulant times the mean,
1046
01:19:58,800 --> 01:20:03,740
3 times the square
of the variance,
1047
01:20:03,740 --> 01:20:10,010
6 times the variance
multiplied by the mean squared,
1048
01:20:10,010 --> 01:20:15,630
and the mean raised
to the fourth power.
1049
01:20:15,630 --> 01:20:16,690
And you can keep going.
1050
01:20:24,642 --> 01:20:29,130
AUDIENCE: Is the variance not
squared in the third term?
1051
01:20:29,130 --> 01:20:31,030
PROFESSOR: Did I forget that?
1052
01:20:31,030 --> 01:20:32,030
Yes, thank you.
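Collected in one place, and with the variance squared as just corrected, the graphical relations for the first four moments would read as follows. The bracket notation with subscript c for cumulants is an assumption about the board notation.

```latex
\begin{align}
\langle x \rangle   &= \langle x \rangle_c, \\
\langle x^2 \rangle &= \langle x^2 \rangle_c + \langle x \rangle_c^2, \\
\langle x^3 \rangle &= \langle x^3 \rangle_c
                     + 3\, \langle x^2 \rangle_c \langle x \rangle_c
                     + \langle x \rangle_c^3, \\
\langle x^4 \rangle &= \langle x^4 \rangle_c
                     + 4\, \langle x^3 \rangle_c \langle x \rangle_c
                     + 3\, \langle x^2 \rangle_c^2
                     + 6\, \langle x^2 \rangle_c \langle x \rangle_c^2
                     + \langle x \rangle_c^4 .
\end{align}
```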
1053
01:20:42,340 --> 01:20:43,830
All right.
1054
01:20:43,830 --> 01:20:48,010
So the proof of this is really
just the two-line algebra
1055
01:20:48,010 --> 01:20:51,960
exponentiating these expressions
that we have over here.
1056
01:20:51,960 --> 01:20:55,710
But it's much nicer to
represent that graphically.
1057
01:20:55,710 --> 01:20:59,700
And so now you can go
between things very easily.
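The bag counting can also be checked by brute force. The following is only a sketch under the assumption that each diagram is one way of partitioning n labeled points into bags, with a bag of j points contributing the jth cumulant; the function names are hypothetical.

```python
from collections import Counter
from math import prod


def set_partitions(points):
    """Yield every way of splitting the list `points` into non-empty bags."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for partition in set_partitions(rest):
        # Put `first` into a new bag of its own ...
        yield [[first]] + partition
        # ... or drop it into each existing bag in turn.
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]


def moment_from_cumulants(n, cumulants):
    """n-th moment as a sum over bag diagrams; cumulants[j-1] holds <x^j>_c."""
    return sum(
        prod(cumulants[len(bag) - 1] for bag in partition)
        for partition in set_partitions(list(range(n)))
    )


def diagram_counts(n):
    """Count the diagrams of n points, grouped by their sorted bag sizes."""
    return Counter(
        tuple(sorted((len(bag) for bag in partition), reverse=True))
        for partition in set_partitions(list(range(n)))
    )


# Multiplicities of the four-point diagrams: 1, 4, 3, 6, 1 as in the lecture.
print(diagram_counts(4))

# Illustrative case: mean 0, variance 2, all higher cumulants 0.
print(moment_from_cumulants(4, [0.0, 2.0, 0.0, 0.0]))
```

The last call uses a distribution whose only nonzero cumulant is the variance, which is the situation for a Gaussian with zero mean. Only the three diagrams with two bags of two survive, so the fourth moment comes out as 3 times the variance squared.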
1058
01:20:59,700 --> 01:21:05,310
And what I will show next time
is how, using this machinery,
1059
01:21:05,310 --> 01:21:09,960
you can calculate any
moment of a Gaussian,
1060
01:21:09,960 --> 01:21:12,650
for example, in just a
matter of seconds as opposed
1061
01:21:12,650 --> 01:21:16,980
to having to do integrations
and things like that.
1062
01:21:16,980 --> 01:21:19,900
So what we
will do next time will
1063
01:21:19,900 --> 01:21:23,120
be to apply this machinery
to various probability
1064
01:21:23,120 --> 01:21:25,290
distribution, such
as a Gaussian,
1065
01:21:25,290 --> 01:21:28,590
that we are likely to
encounter again and again.