WEBVTT
00:00:00.060 --> 00:00:01.780
The following
content is provided
00:00:01.780 --> 00:00:04.019
under a Creative
Commons license.
00:00:04.019 --> 00:00:06.870
Your support will help MIT
OpenCourseWare continue
00:00:06.870 --> 00:00:10.730
to offer high quality
educational resources for free.
00:00:10.730 --> 00:00:13.330
To make a donation or
view additional materials
00:00:13.330 --> 00:00:17.217
from hundreds of MIT courses,
visit MIT OpenCourseWare
00:00:17.217 --> 00:00:17.842
at ocw.mit.edu.
00:00:20.870 --> 00:00:25.050
PROFESSOR: We established that,
essentially, what we want to do
00:00:25.050 --> 00:00:29.500
is to describe the
properties of a system that
00:00:29.500 --> 00:00:30.253
is in equilibrium.
00:00:33.340 --> 00:00:39.060
And a system in equilibrium
is characterized
00:00:39.060 --> 00:00:41.920
by a certain number
of parameters.
00:00:41.920 --> 00:00:48.090
We discussed
displacement and forces
00:00:48.090 --> 00:00:52.200
that are used for
mechanical properties.
00:00:52.200 --> 00:00:56.570
We described how when systems
are in thermal equilibrium,
00:00:56.570 --> 00:01:01.800
the exchange of heat requires
that there is a temperature
00:01:01.800 --> 00:01:04.590
that will be the
same between them.
00:01:04.590 --> 00:01:07.000
So that was where the
Zeroth Law came and told us
00:01:07.000 --> 00:01:11.410
that there is another
function of state.
00:01:11.410 --> 00:01:15.040
Then, we saw that,
from the First Law,
00:01:15.040 --> 00:01:18.020
there was energy, which is
another important function
00:01:18.020 --> 00:01:19.180
of state.
00:01:19.180 --> 00:01:26.445
And from the Second Law,
we arrived at entropy.
00:01:29.920 --> 00:01:33.000
And then by
manipulating these, we
00:01:33.000 --> 00:01:37.100
generated a whole set of
other functions, free energy,
00:01:37.100 --> 00:01:41.830
enthalpy, Gibbs free energy, the
grand potential, and the list
00:01:41.830 --> 00:01:43.850
goes on.
00:01:43.850 --> 00:01:46.670
And when the system
is in equilibrium,
00:01:46.670 --> 00:01:49.920
it has well-defined
values of these quantities.
00:01:49.920 --> 00:01:52.670
You go from one equilibrium
to another equilibrium,
00:01:52.670 --> 00:01:54.740
and these quantities change.
00:01:54.740 --> 00:01:57.790
But of course, we saw that the
number of degrees of freedom
00:01:57.790 --> 00:02:00.700
that you have to
describe the system
00:02:00.700 --> 00:02:08.030
is indicated through looking
at the changes in energy, which
00:02:08.030 --> 00:02:10.740
if you were only
doing mechanical work,
00:02:10.740 --> 00:02:15.620
you would write as sum over all
possible ways of introducing
00:02:15.620 --> 00:02:17.755
mechanical work into the system.
00:02:20.296 --> 00:02:21.670
Then, we saw that
it was actually
00:02:21.670 --> 00:02:24.330
useful to separate
out the chemical work.
00:02:24.330 --> 00:02:26.410
So we could also
write this as a sum
00:02:26.410 --> 00:02:31.040
over alpha of chemical
potential times change in number of particles.
00:02:31.040 --> 00:02:34.800
But there were also
ways of changing
00:02:34.800 --> 00:02:40.270
the energy of the system
through addition of heat.
00:02:40.270 --> 00:02:44.040
And so ultimately,
we saw that if there
00:02:44.040 --> 00:02:49.530
were n ways of doing
chemical and mechanical work,
00:02:49.530 --> 00:02:56.400
and one way of introducing heat
into the system, essentially
00:02:56.400 --> 00:03:00.440
n plus 1 variables are
sufficient to determine
00:03:00.440 --> 00:03:02.430
where you are in
this phase space.
00:03:02.430 --> 00:03:05.370
Once you have n
plus 1 of that list,
00:03:05.370 --> 00:03:08.990
you can, in
principle, determine others
00:03:08.990 --> 00:03:11.380
as long as you have
not chosen things
00:03:11.380 --> 00:03:13.470
that are really
dependent on each other.
00:03:13.470 --> 00:03:16.060
So you have to choose
independent ones,
00:03:16.060 --> 00:03:19.650
and we had some discussion
of how that comes into play.
00:03:22.190 --> 00:03:26.080
So I said that today
we will briefly
00:03:26.080 --> 00:03:29.140
conclude with the
last, or the Third Law.
00:03:33.580 --> 00:03:35.790
This is the statement
about trying
00:03:35.790 --> 00:03:39.440
to calculate the
behavior of entropy
00:03:39.440 --> 00:03:41.880
as a function of temperature.
00:03:41.880 --> 00:03:46.650
And in principle,
you can imagine
00:03:46.650 --> 00:03:51.545
as a function of some coordinate
of your system-- capital X
00:03:51.545 --> 00:03:55.520
could indicate pressure,
volume, anything.
00:03:55.520 --> 00:03:58.590
You calculate that at
some particular value
00:03:58.590 --> 00:04:03.620
of temperature, T, the
difference in entropy
00:04:03.620 --> 00:04:07.255
that you would have
between two points
00:04:07.255 --> 00:04:08.895
parametrized by X1 and X2.
00:04:11.590 --> 00:04:15.220
And in principle,
what you need to do
00:04:15.220 --> 00:04:21.149
is to find some kind of a
path for changing parameters
00:04:21.149 --> 00:04:28.340
from X1 to X2 and calculate,
in a reversible process, how
00:04:28.340 --> 00:04:31.370
much heat you have to
put into the system.
00:04:31.370 --> 00:04:35.955
Let's say at this fixed
temperature, T, divide by T. T
00:04:35.955 --> 00:04:43.350
is not changing along the
process from say X1 to X2.
00:04:43.350 --> 00:04:46.680
And this would be a
difference between the entropy
00:04:46.680 --> 00:04:51.490
that you would have
between these two
00:04:51.490 --> 00:04:56.540
quantities, between
these two points.
00:04:56.540 --> 00:04:59.690
You could, in principle,
then repeat this process
00:04:59.690 --> 00:05:06.540
at some lower temperature and
keep going all the way down
00:05:06.540 --> 00:05:09.530
to 0 temperature.
00:05:09.530 --> 00:05:20.120
What Nernst observed was that as
he went through this procedure
00:05:20.120 --> 00:05:28.250
to lower and lower temperatures,
this difference-- Let's
00:05:28.250 --> 00:05:35.840
call it delta S of T going
from X1 to X2 goes to 0.
00:05:42.320 --> 00:05:48.580
So it looks like, certainly
at this temperature,
00:05:48.580 --> 00:05:52.270
there is a change in entropy
going from one to another.
00:05:52.270 --> 00:05:53.680
There's also a change.
00:05:53.680 --> 00:05:56.530
This change gets
smaller and smaller
00:05:56.530 --> 00:05:59.330
as if, when you get
to 0 temperature,
00:05:59.330 --> 00:06:03.240
the value of your entropy is
independent of X. Whatever
00:06:03.240 --> 00:06:06.600
X you choose, you'll have
the same value of entropy.
00:06:09.390 --> 00:06:16.030
Now, that led, after a while,
to a more ambitious
00:06:16.030 --> 00:06:19.520
statement of the Third Law
that I will write down,
00:06:19.520 --> 00:06:31.515
which is that the entropy of
all substances at the zero
00:06:31.515 --> 00:06:44.680
of thermodynamic temperature is
the same and can be set to 0.
00:06:44.680 --> 00:06:52.500
Same universal
constant, set to 0.
00:06:56.200 --> 00:06:59.810
In principle,
through this integration
00:06:59.810 --> 00:07:01.940
from one point to another
point, the only thing
00:07:01.940 --> 00:07:07.520
that you can calculate is the
difference between entropies.
00:07:07.520 --> 00:07:10.920
And essentially, this
suggests that the difference
00:07:10.920 --> 00:07:14.610
between entropies goes to 0,
but let's be more ambitious
00:07:14.610 --> 00:07:18.000
and say that even if you
look at different substances
00:07:18.000 --> 00:07:26.240
and you go to 0 temperature,
all of them have a unique value.
00:07:26.240 --> 00:07:30.920
And so there's more
evidence for being
00:07:30.920 --> 00:07:35.830
able to do this for
different substances via what
00:07:35.830 --> 00:07:41.930
is called allotropic states.
00:07:44.830 --> 00:07:50.450
So for example, some materials
can exist potentially
00:07:50.450 --> 00:07:53.440
in two different
crystalline states that
00:07:53.440 --> 00:07:58.390
are called allotropes,
for example, sulfur
00:07:58.390 --> 00:08:01.200
as a function of temperature.
00:08:01.200 --> 00:08:10.130
If you lower its
temperature very slowly,
00:08:10.130 --> 00:08:19.740
it stays in some form all
the way down to 0 temperature.
00:08:19.740 --> 00:08:24.480
So if you change its temperature
rapidly, it stays in one form
00:08:24.480 --> 00:08:29.760
all the way to 0 temperature
in crystalline structure
00:08:29.760 --> 00:08:32.559
that is called monoclinic.
00:08:32.559 --> 00:08:35.500
If you cool it
very, very slowly,
00:08:35.500 --> 00:08:41.110
there is a temperature
around 40 degrees Celsius
00:08:41.110 --> 00:08:45.860
at which it makes a transition
to a different crystal
00:08:45.860 --> 00:08:48.140
structure.
00:08:48.140 --> 00:08:49.390
That is rhombohedral.
00:08:53.300 --> 00:08:57.116
And the thing that
I am plotting here,
00:08:57.116 --> 00:08:59.240
as a function of temperature,
is the heat capacity.
00:09:06.460 --> 00:09:11.320
And so if you are, let's
say, around room temperature,
00:09:11.320 --> 00:09:16.910
in principle you can say there's
two different forms of sulfur.
00:09:16.910 --> 00:09:20.670
One of them is truly stable,
and the other is metastable.
00:09:20.670 --> 00:09:25.950
That is, in principle, if
you wait sufficiently long,
00:09:25.950 --> 00:09:27.700
of the order of
[? centuries ?],
00:09:27.700 --> 00:09:31.900
you can get the transition from
this form to the stable form.
00:09:31.900 --> 00:09:34.770
But for our purposes,
at room temperature,
00:09:34.770 --> 00:09:37.470
you would say that
at the scale of times
00:09:37.470 --> 00:09:40.260
that I'm observing things, there
are these two possible states
00:09:40.260 --> 00:09:45.560
that are both equilibrium
states of the same substance.
00:09:45.560 --> 00:09:47.650
Now using these two
equilibrium states,
00:09:47.650 --> 00:09:52.780
I can start to test this
Nernst theorem generalized
00:09:52.780 --> 00:09:54.910
to different substances.
00:09:54.910 --> 00:09:57.500
If you, again, regard
these two different things
00:09:57.500 --> 00:09:59.440
as different substances.
00:09:59.440 --> 00:10:03.580
You could say that if I want
to calculate the entropy just
00:10:03.580 --> 00:10:08.270
slightly above the transition,
I can come from two paths.
00:10:08.270 --> 00:10:12.610
I can either come
from path number one.
00:10:12.610 --> 00:10:15.930
Along path number
one, I would say
00:10:15.930 --> 00:10:22.200
that the entropy at
this Tc plus is obtained
00:10:22.200 --> 00:10:27.520
by integrating
the heat capacity,
00:10:27.520 --> 00:10:39.020
so integral dT Cx
of T divided by T.
00:10:39.020 --> 00:10:42.760
This combination is
none other than dQ.
00:10:42.760 --> 00:10:47.180
Basically, the combination
of heat capacity dT
00:10:47.180 --> 00:10:49.240
is the amount of
heat that you have
00:10:49.240 --> 00:10:52.040
to put into the substance to
change its temperature.
00:10:52.040 --> 00:10:58.100
And you do this all the
way from 0 to Tc plus.
00:10:58.100 --> 00:11:00.130
Let's say we go
along this path that
00:11:00.130 --> 00:11:05.290
corresponds to this
monoclinic way.
00:11:05.290 --> 00:11:11.260
And I'm using this Cm that
corresponds to this as opposed
00:11:11.260 --> 00:11:15.400
to Cr that corresponds to this.
00:11:15.400 --> 00:11:19.850
Another thing that I can
do-- and I made a mistake
00:11:19.850 --> 00:11:24.050
because what I really need
to do is to, in principle,
00:11:24.050 --> 00:11:27.710
add to this some
entropy that I would
00:11:27.710 --> 00:11:32.140
assign to this green state at 0
because this is the difference.
00:11:32.140 --> 00:11:36.270
So this is the
entropy that I would
00:11:36.270 --> 00:11:41.080
assign to the monoclinic
state at T close to 0.
00:11:41.080 --> 00:11:44.740
Going along the
orange path, I would
00:11:44.740 --> 00:11:49.820
say that S evaluated
at Tc plus is
00:11:49.820 --> 00:11:54.300
obtained by integrating from 0.
00:11:54.300 --> 00:11:59.110
Let's say to Tc
minus dT, the heat
00:11:59.110 --> 00:12:01.730
capacity of this rhombic phase.
00:12:06.580 --> 00:12:10.160
But when I get to just
below the transition,
00:12:10.160 --> 00:12:13.650
I want to go to just
above the transition.
00:12:13.650 --> 00:12:18.080
I have to actually put in a
certain amount of latent heat.
00:12:18.080 --> 00:12:25.130
So here I have to add
latent heat L, always
00:12:25.130 --> 00:12:29.120
at the temperature Tc, to
gradually make the substance
00:12:29.120 --> 00:12:31.650
transition from
one to the other.
00:12:31.650 --> 00:12:35.470
So I have to add here L over Tc.
00:12:35.470 --> 00:12:40.650
This would be the
integration of dQ,
00:12:40.650 --> 00:12:46.160
but then I would have to add
the entropy that I would assign
00:12:46.160 --> 00:12:49.910
to the orange state
at 0 temperature.
00:12:49.910 --> 00:12:53.040
So this is something that
you can do experimentally.
00:12:53.040 --> 00:12:55.930
You can evaluate these
integrals, and what you'll find
00:12:55.930 --> 00:12:59.970
is that these two
things are the same.
00:12:59.970 --> 00:13:03.250
So this is yet
another justification
00:13:03.250 --> 00:13:09.210
of this entropy
being independent
00:13:09.210 --> 00:13:13.000
of where you start
at 0 temperature.
00:13:13.000 --> 00:13:16.030
Again at this
point, if you like,
00:13:16.030 --> 00:13:18.100
you can by [INAUDIBLE]
state that this
00:13:18.100 --> 00:13:21.215
is 0, so that everything
will start with 0.
00:13:27.730 --> 00:13:35.130
So this is a supposed new
law of thermodynamics.
00:13:35.130 --> 00:13:36.000
Is it useful?
00:13:36.000 --> 00:13:38.770
What can we deduce from that?
00:13:38.770 --> 00:13:40.830
So let's look at
the consequences.
00:13:46.280 --> 00:13:48.740
First thing is so what
I have established
00:13:48.740 --> 00:13:54.890
is that the limit
as T goes to 0 of S,
00:13:54.890 --> 00:13:57.440
irrespective of whatever
set of parameters
00:13:57.440 --> 00:14:02.920
I have-- so I pick T as one
of my n plus one coordinates,
00:14:02.920 --> 00:14:06.490
and I put some other
bunch of coordinates here.
00:14:06.490 --> 00:14:08.820
I take the limit
of this going to 0.
00:14:08.820 --> 00:14:09.870
This becomes 0.
00:14:13.410 --> 00:14:17.790
So that means, almost
by construction,
00:14:17.790 --> 00:14:24.540
that if I take the derivative of
S with respect to any of these
00:14:24.540 --> 00:14:30.650
coordinates-- if I take then
the limit as T goes to 0,
00:14:30.650 --> 00:14:33.790
this would be
fixed T. This is 0.
00:14:39.690 --> 00:14:40.190
Fine.
00:14:40.190 --> 00:14:44.220
So basically, this
is another way
00:14:44.220 --> 00:14:49.290
of stating that entropy
differences go to 0.
00:14:49.290 --> 00:14:52.890
But it does have a consequence
because one thing that you will
00:14:52.890 --> 00:14:55.797
frequently measure
are quantities,
00:14:55.797 --> 00:14:56.630
such as expansivity.
00:15:01.160 --> 00:15:03.700
What do I mean by that?
00:15:03.700 --> 00:15:05.050
Let's pick a displacement.
00:15:05.050 --> 00:15:07.650
Could be the length of a wire.
00:15:07.650 --> 00:15:11.210
Could be the volume of a gas.
00:15:11.210 --> 00:15:17.860
And we can ask if I were
to change temperature,
00:15:17.860 --> 00:15:21.030
how does that quantity change?
00:15:21.030 --> 00:15:26.230
So these are quantities
typically called alpha.
00:15:26.230 --> 00:15:29.020
Actually, usually
you would also divide
00:15:29.020 --> 00:15:35.115
by x to make them intensive
because otherwise x
00:15:35.115 --> 00:15:36.910
being extensive,
the whole quantity
00:15:36.910 --> 00:15:39.960
would have been extensive.
00:15:39.960 --> 00:15:44.840
Let's say we do this at fixed
corresponding force.
00:15:44.840 --> 00:15:48.430
So something that
is very relevant
00:15:48.430 --> 00:15:51.550
is you take a volume of
gas, change its temperature
00:15:51.550 --> 00:15:54.350
at fixed pressure, and
the volume of the gas
00:15:54.350 --> 00:15:59.070
will shrink or expand
according to this expansivity.
00:15:59.070 --> 00:16:04.970
Now, this can be related to this
through a Maxwell relation.
00:16:04.970 --> 00:16:07.620
So let's see what I have to do.
00:16:07.620 --> 00:16:13.520
I have that dE is
something like Jdx plus,
00:16:13.520 --> 00:16:17.990
according to what I
have over there, TdS.
00:16:17.990 --> 00:16:21.940
I want to be able to write
a Maxwell relation that
00:16:21.940 --> 00:16:25.420
relates a derivative of x.
00:16:25.420 --> 00:16:28.510
So I want to make x
into a first derivative.
00:16:28.510 --> 00:16:31.640
So I look at E minus Jx.
00:16:31.640 --> 00:16:36.240
And this Jdx becomes minus xdJ.
00:16:36.240 --> 00:16:40.070
But I want to take a derivative
of x with respect not to
00:16:40.070 --> 00:16:44.590
S, but with respect
to T. So I'll do that.
00:16:44.590 --> 00:16:48.130
This becomes a minus SdT.
00:16:48.130 --> 00:16:52.070
So now, I immediately see
that I will have a Maxwell
00:16:52.070 --> 00:16:58.110
relation that says dx
by dT at constant J
00:16:58.110 --> 00:17:02.850
is the same thing as
dS by dJ at constant T.
00:17:02.850 --> 00:17:09.321
So this is the same thing by
the Maxwell relation as dS
00:17:09.321 --> 00:17:23.000
by dJ at constant T. All right?
00:17:23.000 --> 00:17:27.599
This is one of these quantities,
therefore, as T goes to 0,
00:17:27.599 --> 00:17:28.590
this goes to 0.
00:17:28.590 --> 00:17:33.370
And therefore, the
expansivity should go to 0.
00:17:33.370 --> 00:17:38.700
So any quantity that measures
expansion, contraction,
00:17:38.700 --> 00:17:42.070
or some other change as a
function of temperature,
00:17:42.070 --> 00:17:45.230
according to this law, as
you go to 0 temperature,
00:17:45.230 --> 00:17:46.350
should go to 0.
00:17:49.780 --> 00:17:54.100
There's one other quantity
that also goes to 0,
00:17:54.100 --> 00:17:57.810
and that's the heat capacity.
00:17:57.810 --> 00:18:04.160
So if I want to calculate the
difference between entropy
00:18:04.160 --> 00:18:10.820
at some temperature T
and some temperature
00:18:10.820 --> 00:18:15.570
at 0 along some particular
path corresponding
00:18:15.570 --> 00:18:18.810
to some constant
x for example, you
00:18:18.810 --> 00:18:20.640
would say that what
I need to do is
00:18:20.640 --> 00:18:27.310
to integrate from 0 to
T the heat that I have
00:18:27.310 --> 00:18:31.370
to put into the
system at constant x.
00:18:31.370 --> 00:18:34.700
And so if I do
that slowly enough,
00:18:34.700 --> 00:18:38.720
this heat I can write as CxdT.
00:18:38.720 --> 00:18:41.370
Cx, potentially,
is a function of T.
00:18:41.370 --> 00:18:44.750
Actually, since I'm indicating
T as the other point
00:18:44.750 --> 00:18:48.710
of integration, let me call
the variable of integration T
00:18:48.710 --> 00:18:50.610
prime.
00:18:50.610 --> 00:18:56.620
So I take a path in which
I change temperature.
00:18:56.620 --> 00:19:00.320
I calculate the heat
capacity at constant x.
00:19:00.320 --> 00:19:02.560
Integrate it.
00:19:02.560 --> 00:19:06.434
Multiply by dT to convert
it to heat, divide by T, and get the result.
00:19:09.220 --> 00:19:13.380
So all of these results that
I have been formulating
00:19:13.380 --> 00:19:18.630
suggest that the result that you
would get as a function of T,
00:19:18.630 --> 00:19:27.600
for entropy, is something that
as T goes to 0, approaches 0.
00:19:27.600 --> 00:19:31.640
So it should be a perfectly
nice, well-defined value
00:19:31.640 --> 00:19:35.210
at any finite temperature.
00:19:35.210 --> 00:19:38.170
Now, if you integrate
a constant divided
00:19:38.170 --> 00:19:42.420
by T, dT, then
essentially the constant
00:19:42.420 --> 00:19:44.020
would give you a logarithm.
00:19:44.020 --> 00:19:48.100
And the logarithm would blow
up as we go to 0 temperature.
00:19:48.100 --> 00:19:52.240
So the only way that
this integral does not
00:19:52.240 --> 00:20:01.590
blow up on you-- so this is
finite only if the limit as T
00:20:01.590 --> 00:20:08.390
goes to 0 of the heat
capacity is also 0.
00:20:08.390 --> 00:20:12.550
So any heat capacity should
also essentially vanish
00:20:12.550 --> 00:20:14.890
as you go to lower
and lower temperature.
00:20:14.890 --> 00:20:17.940
This is something that you
will see many, many times when
00:20:17.940 --> 00:20:20.880
you look at different
heat capacities
00:20:20.880 --> 00:20:22.490
in the rest of the course.
00:20:25.030 --> 00:20:27.150
There is one other
aspect of this
00:20:27.150 --> 00:20:30.900
that I will not really explain,
but you can go and look
00:20:30.900 --> 00:20:36.410
at the notes or elsewhere, which
is that another consequence is
00:20:36.410 --> 00:20:47.222
unattainability of T equals
0 by any finite set
00:20:47.222 --> 00:20:47.805
of operations.
00:20:55.900 --> 00:21:00.130
Essentially, if you want
to get to 0 temperature,
00:21:00.130 --> 00:21:03.840
you'll have to do something
that cools you step by step.
00:21:03.840 --> 00:21:05.880
And the steps become
smaller and smaller,
00:21:05.880 --> 00:21:08.680
and you have to repeat
that many times.
00:21:08.680 --> 00:21:13.290
But that is another consequence.
00:21:13.290 --> 00:21:15.055
We'll leave that
for the time being.
00:21:18.780 --> 00:21:25.430
I would like to, however, end
by discussing some distinctions
00:21:25.430 --> 00:21:28.350
that are between
these different laws.
00:21:31.390 --> 00:21:36.140
So if you think
about whatever could
00:21:36.140 --> 00:21:42.210
be the microscopic
origin, after all, I
00:21:42.210 --> 00:21:46.240
have emphasized
that thermodynamics
00:21:46.240 --> 00:21:49.960
is a set of rules that you look
at substances as black boxes
00:21:49.960 --> 00:21:53.550
and you try to deduce a
certain number of things
00:21:53.550 --> 00:21:58.210
based on observations, such
as what Nernst did over here.
00:21:58.210 --> 00:22:00.730
But you say, these
black boxes, I
00:22:00.730 --> 00:22:03.750
know what is inside
them in principle.
00:22:03.750 --> 00:22:07.820
It's composed of atoms,
molecules, light, quarks,
00:22:07.820 --> 00:22:09.670
whatever the
microscopic theory is
00:22:09.670 --> 00:22:14.380
that you want to assign to
the components of that box.
00:22:14.380 --> 00:22:16.530
And I know the
dynamics that governs
00:22:16.530 --> 00:22:18.770
these microscopic
degrees of freedom.
00:22:18.770 --> 00:22:22.740
I should be able to get
the laws of thermodynamics
00:22:22.740 --> 00:22:27.000
starting from the
microscopic laws.
00:22:27.000 --> 00:22:30.170
Eventually, we will do
that, and as we do that,
00:22:30.170 --> 00:22:34.560
we will find the origin
of these different laws.
00:22:34.560 --> 00:22:39.670
Now, you won't be surprised
that the First Law is intimately
00:22:39.670 --> 00:22:44.390
connected to the fact that
any microscopic set of rules
00:22:44.390 --> 00:22:48.120
that you write down embodies
the conservation of energy.
00:22:52.570 --> 00:22:56.780
And all you have to make
sure is to understand
00:22:56.780 --> 00:23:01.750
precisely what heat is
as a form of energy.
00:23:01.750 --> 00:23:07.060
And then if we regard heat
as another form of energy,
00:23:07.060 --> 00:23:10.250
another component, it's
really the conservation law
00:23:10.250 --> 00:23:10.870
that we have.
00:23:14.630 --> 00:23:17.190
Then, you have the Zeroth
Law and the Second Law.
00:23:21.650 --> 00:23:25.530
The Zeroth Law and Second Law
have to do with equilibrium
00:23:25.530 --> 00:23:28.980
and being able to go in
some particular direction.
00:23:28.980 --> 00:23:35.170
And that always runs afoul of
the microscopic laws of motion
00:23:35.170 --> 00:23:39.370
that are typically things
that are time reversible,
00:23:39.370 --> 00:23:42.360
whereas the Zeroth Law and
Second Law are not.
00:23:42.360 --> 00:23:46.390
And what we will see later on,
through statistical mechanics,
00:23:46.390 --> 00:23:49.630
is that the origin
of these laws is
00:23:49.630 --> 00:24:00.180
that we are dealing with large
numbers of degrees of freedom.
00:24:03.760 --> 00:24:09.580
And once we adopt the
proper perspective
00:24:09.580 --> 00:24:12.960
to looking at properties
of large numbers of degrees
00:24:12.960 --> 00:24:15.330
of freedom, which
we will start
00:24:15.330 --> 00:24:19.000
to do the elements of that
[? prescription ?] today,
00:24:19.000 --> 00:24:20.870
the Zeroth Law and
Second Law emerge.
00:24:23.990 --> 00:24:32.790
Now the Third Law, you
all know that once we
00:24:32.790 --> 00:24:37.590
go through this process,
eventually for example,
00:24:37.590 --> 00:24:41.560
we get things for the
description of entropy, which
00:24:41.560 --> 00:24:44.970
is related to some
number of states
00:24:44.970 --> 00:24:48.740
that the system
has indicated by g.
00:24:48.740 --> 00:24:55.880
And if you then want to
have S going to 0,
00:24:55.880 --> 00:24:59.690
you would require that
g goes to something
00:24:59.690 --> 00:25:04.600
that is order of 1-- of 1 if
you like-- as T goes to 0.
00:25:08.120 --> 00:25:10.520
And typically, you
would say that systems
00:25:10.520 --> 00:25:16.890
adopt their ground state, lowest
energy state, at 0 temperature.
00:25:16.890 --> 00:25:19.340
And so this is
somewhat a statement
00:25:19.340 --> 00:25:23.470
about the uniqueness of the
state of all possible systems
00:25:23.470 --> 00:25:26.120
at low temperature.
00:25:26.120 --> 00:25:31.420
Now, if you think about
the gas in this room,
00:25:31.420 --> 00:25:36.520
and let's imagine that
the particles of this gas
00:25:36.520 --> 00:25:40.830
either don't interact, which is
maybe a little bit unrealistic,
00:25:40.830 --> 00:25:42.640
but maybe repel each other.
00:25:42.640 --> 00:25:44.810
So let's say you have
a bunch of particles
00:25:44.810 --> 00:25:46.910
that just repel each other.
00:25:46.910 --> 00:25:51.180
Then, there is
really no reason why,
00:25:51.180 --> 00:25:54.850
as I go to lower and
lower temperatures,
00:25:54.850 --> 00:25:59.930
the number of configurations of
the molecules should decrease.
00:25:59.930 --> 00:26:03.180
All configurations that I
draw in which they don't overlap
00:26:03.180 --> 00:26:05.900
have roughly the same energy.
00:26:05.900 --> 00:26:11.250
And indeed, if I look at say
any one of these properties,
00:26:11.250 --> 00:26:15.330
like the expansivity of a
gas at constant pressure
00:26:15.330 --> 00:26:19.230
which is given in fact
with a minus sign.
00:26:19.230 --> 00:26:23.850
dV by dT at constant pressure
would be the analog of one
00:26:23.850 --> 00:26:25.820
of these expansivities.
00:26:25.820 --> 00:26:32.290
If I use the Ideal Gas
Law-- So for ideal gas,
00:26:32.290 --> 00:26:35.530
we've seen that PV is
proportional to let's
00:26:35.530 --> 00:26:38.480
say some temperature.
00:26:38.480 --> 00:26:42.290
Then, dV by dT at
constant pressure
00:26:42.290 --> 00:26:46.390
is none other than V over
T. So this would give me
00:26:46.390 --> 00:26:59.270
1 over V, V over T. Probably
don't need it on this.
00:26:59.270 --> 00:27:03.000
This is going to
give me 1 over T.
00:27:03.000 --> 00:27:07.410
So not only doesn't it
go to 0 at 0 temperature,
00:27:07.410 --> 00:27:10.960
if the Ideal Gas
Law was satisfied,
00:27:10.960 --> 00:27:13.140
the expansivity would
actually diverge
00:27:13.140 --> 00:27:17.590
at 0 temperature as
different as you want.
00:27:17.590 --> 00:27:21.760
So clearly the Ideal Gas
Law, if it was applicable
00:27:21.760 --> 00:27:24.210
all the way down
to 0 temperature,
00:27:24.210 --> 00:27:27.330
would violate the Third
Law of thermodynamics.
00:27:27.330 --> 00:27:30.560
Again, not surprising
given that I have told you
00:27:30.560 --> 00:27:33.770
that a gas of classical
particles with repulsion
00:27:33.770 --> 00:27:36.880
has many states.
00:27:36.880 --> 00:27:39.850
Now, we will see
later on in the course
00:27:39.850 --> 00:27:48.220
that once we include quantum
mechanics, then as you
00:27:48.220 --> 00:27:50.910
go to 0 temperature,
these particles
00:27:50.910 --> 00:27:54.472
will have a unique state.
00:27:54.472 --> 00:27:59.177
If they are bosons, they will be
together in one wave function.
00:27:59.177 --> 00:28:01.260
If they are fermions, they
will arrange themselves
00:28:01.260 --> 00:28:05.190
appropriately so that,
because of quantum mechanics,
00:28:05.190 --> 00:28:08.530
all of these laws would
certainly break down at T equals
00:28:08.530 --> 00:28:09.750
to 0.
00:28:09.750 --> 00:28:13.080
You will get 0 entropy, and
you would get consistency
00:28:13.080 --> 00:28:16.010
with all of these things.
00:28:16.010 --> 00:28:18.650
So somehow, the nature
of the Third Law
00:28:18.650 --> 00:28:22.470
is different from the other
laws because its validity
00:28:22.470 --> 00:28:28.365
rests on our
living in a world
00:28:28.365 --> 00:28:31.010
where quantum mechanics applies.
00:28:31.010 --> 00:28:34.350
So in principle, you could have
imagined some other universe
00:28:34.350 --> 00:28:36.700
where h-bar equals
to 0, and then
00:28:36.700 --> 00:28:39.580
the Third Law of thermodynamics
would not hold there
00:28:39.580 --> 00:28:41.990
whereas the Zeroth Law
and Second Law would hold.
00:28:41.990 --> 00:28:42.839
Yes?
00:28:42.839 --> 00:28:45.616
AUDIENCE: Are there any known
exceptions to the Third Law?
00:28:45.616 --> 00:28:47.240
Are we going to
[? account for them? ?]
00:28:52.330 --> 00:28:55.160
PROFESSOR: For
equilibrium-- So this
00:28:55.160 --> 00:28:59.010
is actually an
interesting question.
00:28:59.010 --> 00:29:03.050
What do I know
about-- classically,
00:29:03.050 --> 00:29:07.550
I can certainly come up with
lots of examples that violate.
00:29:07.550 --> 00:29:11.320
So your question then amounts to: if
I say that quantum mechanics is
00:29:11.320 --> 00:29:15.130
necessary, do I
know that the ground
00:29:15.130 --> 00:29:18.790
state of a quantum
mechanical system is unique.
00:29:18.790 --> 00:29:22.830
And I don't know of a proof of
that for an interacting system.
00:29:22.830 --> 00:29:27.380
I don't know of a case where it's
violated, but as far as I know,
00:29:27.380 --> 00:29:32.240
there is no proof that if I give
you an interacting Hamiltonian
00:29:32.240 --> 00:29:36.550
for a quantum system, and
there's a unique ground state.
00:29:36.550 --> 00:29:39.110
And I should say
that there'd be no--
00:29:39.110 --> 00:29:42.350
and I'm sure you know of cases
where the ground state is not
00:29:42.350 --> 00:29:44.880
unique like a ferromagnet.
00:29:44.880 --> 00:29:50.040
But the point is not that
g should be exactly one,
00:29:50.040 --> 00:29:54.460
but that the limit
of log g divided
00:29:54.460 --> 00:29:56.460
by the number of
degrees of freedom
00:29:56.460 --> 00:30:02.280
that you have should go to
0 as n goes to infinity.
00:30:02.280 --> 00:30:07.620
So something like a ferromagnet
may have many ground states,
00:30:07.620 --> 00:30:09.590
but the number of
ground states is not
00:30:09.590 --> 00:30:13.440
proportional to the number of
sites, the number of spins,
00:30:13.440 --> 00:30:15.770
and this entity will go to 0.
00:30:15.770 --> 00:30:19.330
So in all the cases that we
know, the ground state
00:30:19.330 --> 00:30:24.100
is either unique
or is order of one.
00:30:24.100 --> 00:30:27.300
But I don't know a theorem that
says that should be the case.
00:30:34.600 --> 00:30:36.550
So this is the last
thing that I wanted
00:30:36.550 --> 00:30:39.100
to say about thermodynamics.
00:30:39.100 --> 00:30:40.910
Are there any
questions in general?
00:30:45.850 --> 00:30:50.170
So I laid out the
necessity of having
00:30:50.170 --> 00:30:56.240
some kind of a description of
microscopic degrees of freedom
00:30:56.240 --> 00:31:00.680
that ultimately will
allow us to prove
00:31:00.680 --> 00:31:03.140
the laws of thermodynamics.
00:31:03.140 --> 00:31:08.570
And that will come through
statistical mechanics, which
00:31:08.570 --> 00:31:12.410
as the name implies, has
to have a certain amount
00:31:12.410 --> 00:31:16.010
of statistical character to it.
00:31:16.010 --> 00:31:18.230
What does that mean?
00:31:18.230 --> 00:31:22.240
It means that you have to
abandon a description of motion
00:31:22.240 --> 00:31:26.360
that is fully
deterministic for one
00:31:26.360 --> 00:31:27.725
that is based on probability.
00:31:30.860 --> 00:31:34.760
Now, I could have told you
first the degrees of freedom
00:31:34.760 --> 00:31:37.090
and what is the
description that we
00:31:37.090 --> 00:31:40.610
need for them to
be probabilistic,
00:31:40.610 --> 00:31:43.740
but I find it more
useful to first lay out
00:31:43.740 --> 00:31:45.720
what the language
of probability is
00:31:45.720 --> 00:31:48.915
that we will be
using and then bring
00:31:48.915 --> 00:31:52.200
in the description of the
microscopic degrees of freedom
00:31:52.200 --> 00:31:55.690
within this language.
00:31:55.690 --> 00:32:06.590
So if we go first
with definitions--
00:32:06.590 --> 00:32:12.610
and you could, for example, go
to the branch of mathematics
00:32:12.610 --> 00:32:16.290
that deals with probability,
and you will encounter something
00:32:16.290 --> 00:32:21.970
like this that what probability
describes is a random variable.
00:32:27.050 --> 00:32:36.470
Let's call it X, which has a
number of possible outcomes,
00:32:36.470 --> 00:32:49.740
which we put together
into a set of outcomes, S.
00:32:49.740 --> 00:33:00.410
And this set can be discrete as
would be the case if you were
00:33:00.410 --> 00:33:05.030
tossing a coin, and
the outcomes would
00:33:05.030 --> 00:33:11.680
be either a head or a tail,
or we were throwing a dice,
00:33:11.680 --> 00:33:15.215
and the outcomes would
be the faces 1 through 6.
00:33:19.050 --> 00:33:23.180
And we will encounter
mostly actually cases
00:33:23.180 --> 00:33:25.150
where S is continuous.
00:33:28.610 --> 00:33:34.300
Like for example, if I want to
describe the velocity of a gas
00:33:34.300 --> 00:33:39.950
particle in this room, I need
to specify the three components
00:33:39.950 --> 00:33:44.530
of velocity that can
be anywhere, let's say,
00:33:44.530 --> 00:33:46.210
in the range of real numbers.
00:33:51.130 --> 00:34:10.929
And again, mathematicians
would say that to each event,
00:34:10.929 --> 00:34:20.080
which is a subset of
possible outcomes,
00:34:20.080 --> 00:34:30.480
is assigned a value which
must satisfy the following
00:34:30.480 --> 00:34:30.980
properties.
00:34:35.360 --> 00:34:41.210
First thing is the
probability of anything
00:34:41.210 --> 00:34:44.440
is a positive number.
00:34:44.440 --> 00:34:46.763
And so this is positivity.
00:34:55.100 --> 00:34:57.550
The second thing is additivity.
00:35:01.360 --> 00:35:08.180
That is the probability
of two events, A or B,
00:35:08.180 --> 00:35:13.360
is the sum total of
the probabilities
00:35:13.360 --> 00:35:17.395
if A and B are
disjoint or distinct.
00:35:23.389 --> 00:35:24.930
And finally, there's
a normalization.
00:35:31.110 --> 00:35:33.980
That if your event is
that something should happen
00:35:33.980 --> 00:35:38.888
the entire set, the probability
that you assign to that is 1.
00:35:42.310 --> 00:35:45.100
So these are formal statements.
00:35:45.100 --> 00:35:49.200
And if you are a mathematician,
you start from there,
00:35:49.200 --> 00:35:52.020
and you prove theorems.
00:35:52.020 --> 00:35:59.900
But from our perspective,
the first question to ask
00:35:59.900 --> 00:36:08.960
is how to determine this
quantity probability
00:36:08.960 --> 00:36:12.070
that something should happen.
00:36:12.070 --> 00:36:18.670
If it is useful and I want to do
something real world about it,
00:36:18.670 --> 00:36:23.340
I should be able to measure
it or assign values to it.
00:36:23.340 --> 00:36:28.290
And very roughly
again, in theory,
00:36:28.290 --> 00:36:34.240
we can assign probabilities
in two different ways.
00:36:34.240 --> 00:36:36.940
One way is called objective.
00:36:41.010 --> 00:36:45.300
And from the perspective
of us as physicists
00:36:45.300 --> 00:36:48.390
corresponds to what would be
an experimental procedure.
00:36:51.700 --> 00:37:06.020
And it is assigning p of E
as the frequency of outcomes
00:37:06.020 --> 00:37:16.400
in large number of
trials, i.e. you
00:37:16.400 --> 00:37:22.770
would say that the probability
that event A is obtained
00:37:22.770 --> 00:37:27.940
is the number of times you
would get outcome A divided
00:37:27.940 --> 00:37:34.040
by the total number of
trials as n goes to infinity.
00:37:34.040 --> 00:37:40.790
So for example, if you want to
assign a probability that when
00:37:40.790 --> 00:37:48.120
you throw a dice that face 1
comes up, what you could do
00:37:48.120 --> 00:37:52.590
is you could make a table
of the number of times
00:37:52.590 --> 00:37:57.690
1 shows up divided by the number
of times you throw the dice.
00:37:57.690 --> 00:38:04.050
Maybe you throw it 100
times, and you get 15.
00:38:04.050 --> 00:38:08.767
You throw it 200
times, and you get--
00:38:08.767 --> 00:38:09.850
that is probably too much.
00:38:09.850 --> 00:38:14.340
Let's say 15-- you get 35.
00:38:14.340 --> 00:38:19.765
And you do it 300 times, and
you get something close to 48.
00:38:19.765 --> 00:38:24.640
The ratio of these
things, as the number
00:38:24.640 --> 00:38:27.330
gets larger and
larger, hopefully
00:38:27.330 --> 00:38:30.130
will converge to something that
you would call the probability.
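The frequency definition just described can be tried out numerically. The following is only a minimal sketch, assuming a fair six-sided die simulated with Python's standard random number generator; the function name `frequency_of_face` and the seed are illustrative choices, not something from the lecture.

```python
import random

def frequency_of_face(face, n_throws, rng):
    """Fraction of n_throws in which a simulated fair die shows `face`."""
    hits = sum(1 for _ in range(n_throws) if rng.randint(1, 6) == face)
    return hits / n_throws

rng = random.Random(0)  # fixed seed (an arbitrary choice) so the run is repeatable
for n in (100, 1_000, 10_000, 100_000):
    # The ratio N_1 / N should drift toward 1/6 as N grows.
    print(n, frequency_of_face(1, n, rng))
```

For small N the ratio fluctuates noticeably, which is the point of taking the limit of a large number of trials.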
00:38:32.770 --> 00:38:38.380
Now, it turns out that
in statistical physics,
00:38:38.380 --> 00:38:42.660
we will assign things through
a totally different procedure
00:38:42.660 --> 00:38:46.330
which is subjective.
00:38:46.330 --> 00:38:51.140
If you like, it's
more theoretical,
00:38:51.140 --> 00:39:07.020
which is based on uncertainty
among all outcomes.
00:39:11.340 --> 00:39:16.224
Because if I were to
subjectively assign
00:39:16.224 --> 00:39:21.740
to throwing the dice and
coming up with value of 1,
00:39:21.740 --> 00:39:28.010
I would say, well, there's six
possible faces for the dice.
00:39:28.010 --> 00:39:30.900
I don't know anything about
this dice being loaded,
00:39:30.900 --> 00:39:34.750
so I will say they
are all equally likely.
00:39:34.750 --> 00:39:37.570
Now, that may or may not
be a correct assumption.
00:39:37.570 --> 00:39:38.520
You could test it.
00:39:38.520 --> 00:39:40.180
You could maybe
throw it many times.
00:39:40.180 --> 00:39:43.250
You will find that either the
dice is loaded or not loaded
00:39:43.250 --> 00:39:45.050
and this is correct or not.
00:39:45.050 --> 00:39:48.830
But you begin by
making this assumption.
00:39:48.830 --> 00:39:52.620
And this is actually, we
will see later on, exactly
00:39:52.620 --> 00:39:54.890
the type of assumption
that you would
00:39:54.890 --> 00:39:57.210
be making in
statistical physics.
00:40:05.979 --> 00:40:07.520
Any question about
these definitions?
00:40:10.980 --> 00:40:17.020
So let's again
proceed slowly to get
00:40:17.020 --> 00:40:25.430
some definitions established by
looking at one random variable.
00:40:25.430 --> 00:40:31.890
So this is the next section
on one random variable.
00:40:34.770 --> 00:40:38.770
And I will assume that
I'll look at the case
00:40:38.770 --> 00:40:40.670
of the continuous
random variable.
00:40:40.670 --> 00:40:49.270
So x can be any real number
minus infinity to infinity.
00:40:49.270 --> 00:40:52.170
Now, a number of definitions.
00:40:52.170 --> 00:40:59.260
I will use the term
Cumulative-- make
00:40:59.260 --> 00:41:25.420
sure I'll use the-- Cumulative
Probability Function, CPF,
00:41:25.420 --> 00:41:28.690
that for this one
random variable,
00:41:28.690 --> 00:41:32.460
I will indicate
by capital P of x.
00:41:35.070 --> 00:41:37.710
And the meaning of
this is that capital P
00:41:37.710 --> 00:41:46.700
of x is the probability
of outcome less than x.
00:41:54.130 --> 00:42:00.390
So generically,
we say that x can
00:42:00.390 --> 00:42:04.020
take all values
along the real line.
00:42:04.020 --> 00:42:05.820
And there is this
function that I
00:42:05.820 --> 00:42:12.710
want to plot that I will
call big P of x. Now big P
00:42:12.710 --> 00:42:16.290
of x is a probability,
therefore,
00:42:16.290 --> 00:42:20.030
it has to be positive according
to the first item that we
00:42:20.030 --> 00:42:21.840
have over there.
00:42:21.840 --> 00:42:27.020
And it will be less than 1
because the net probability
00:42:27.020 --> 00:42:30.440
for everything toward
here is equal to 1.
00:42:30.440 --> 00:42:33.320
So asymptotically,
where I go all the way
00:42:33.320 --> 00:42:35.450
to infinity, the
probability that I
00:42:35.450 --> 00:42:41.590
will get some number along the
line-- I have to get something,
00:42:41.590 --> 00:42:45.530
so it should automatically
go to 1 here.
00:42:45.530 --> 00:42:51.450
And every element of
probability is positive,
00:42:51.450 --> 00:42:55.590
so it's a function that
should gradually go down.
00:42:55.590 --> 00:42:58.454
And presumably, it
will behave something
00:42:58.454 --> 00:42:59.370
like this generically.
00:43:06.170 --> 00:43:10.110
Once we have the Cumulative
Probability Function,
00:43:10.110 --> 00:43:24.530
we can immediately construct the
Probability Density Function,
00:43:24.530 --> 00:43:33.350
PDF, which is the
derivative of the above.
00:43:33.350 --> 00:43:42.290
Small p of x is the derivative of big
P of x with respect to x.
00:43:42.290 --> 00:43:48.730
And so if I just take here
the curve that I have above
00:43:48.730 --> 00:43:51.850
and take its derivative,
the derivative
00:43:51.850 --> 00:43:54.360
will look something like this.
00:44:01.470 --> 00:44:07.340
Essentially, clearly by the
definition of the derivative,
00:44:07.340 --> 00:44:13.260
this quantity is
the probability
00:44:13.260 --> 00:44:23.120
of outcome in the
interval x to x plus dx
00:44:23.120 --> 00:44:25.274
divided by the size
of the interval dx.
00:44:29.050 --> 00:44:34.680
A couple of things to
remind you of, one of them
00:44:34.680 --> 00:44:39.690
is that the Cumulative
Probability is a probability.
00:44:39.690 --> 00:44:43.540
It's a dimensionless
number between 0 and 1.
00:44:43.540 --> 00:44:48.440
Probability Density is obtained
by taking a derivative,
00:44:48.440 --> 00:44:54.570
so it has dimensions that are
inverse of whatever this x is.
00:44:54.570 --> 00:44:59.270
So if I change my variable
from meters to centimeters,
00:44:59.270 --> 00:45:01.570
let's say, the value
of this function
00:45:01.570 --> 00:45:04.530
would change by a factor of 100.
00:45:04.530 --> 00:45:09.490
And secondly, while
the Probability Density
00:45:09.490 --> 00:45:14.680
is positive, its
value is not bounded.
00:45:14.680 --> 00:45:16.648
It can be anywhere
that you like.
00:45:23.490 --> 00:45:27.050
One other, again,
minor definition
00:45:27.050 --> 00:45:28.600
is expectation value.
00:45:34.810 --> 00:45:39.990
So I can pick some
function of x.
00:45:39.990 --> 00:45:42.000
This could be x itself.
00:45:42.000 --> 00:45:43.970
It could be x squared.
00:45:43.970 --> 00:45:49.810
It could be sine x, x
cubed minus x squared.
00:45:49.810 --> 00:45:52.880
The expectation value
of this is defined
00:45:52.880 --> 00:45:58.310
by integrating the
Probability Density
00:45:58.310 --> 00:46:00.906
against the value
of the function.
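Written out, the three definitions so far might be summarized as follows. The symbols P, p, and F are taken from the spoken description; the angle-bracket notation for the expectation value is an assumed convention.

```latex
P(x) = \mathrm{prob}(\text{outcome} < x), \qquad
p(x) = \frac{dP(x)}{dx}, \qquad
\langle F(x) \rangle = \int_{-\infty}^{\infty} dx \, p(x) \, F(x).
```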
00:46:06.620 --> 00:46:15.260
So essentially,
what that says is
00:46:15.260 --> 00:46:30.440
that if I pick some
function of x-- function
00:46:30.440 --> 00:46:33.160
can be positive,
negative, et cetera.
00:46:33.160 --> 00:46:42.640
So maybe I have a function
such as this-- then the value
00:46:42.640 --> 00:46:44.430
of x is random.
00:46:44.430 --> 00:46:47.460
If x is in this
interval, this would
00:46:47.460 --> 00:46:50.870
be the corresponding
contribution to f of x.
00:46:50.870 --> 00:46:55.605
And I have to look at
all possible values of x.
00:47:00.100 --> 00:47:02.710
Question?
00:47:02.710 --> 00:47:06.935
Now, closely associated with this
is a change of variables.
00:47:14.660 --> 00:47:23.040
You would say that if x is
random, then f of x is random.
00:47:23.040 --> 00:47:26.420
So if I ask you what is
the value of x squared,
00:47:26.420 --> 00:47:29.780
and for one random
variable, I get this.
00:47:29.780 --> 00:47:32.220
The value of x
squared would be this.
00:47:32.220 --> 00:47:34.660
If I get this, the
value of x squared
00:47:34.660 --> 00:47:37.160
would be something else.
00:47:37.160 --> 00:47:41.035
So if x is random, f of x
is itself a random variable.
00:47:45.210 --> 00:47:52.510
So f of x is a random
variable, and you
00:47:52.510 --> 00:47:55.220
can ask what is the
probability, let's
00:47:55.220 --> 00:48:00.000
say, the Probability Density
Function that I would associate
00:48:00.000 --> 00:48:02.040
with the value of this.
00:48:02.040 --> 00:48:05.320
Let's say what's the
probability that I will find it
00:48:05.320 --> 00:48:10.710
in the interval between
small f and small f plus df.
00:48:10.710 --> 00:48:12.916
This will be Pf of f.
00:48:17.052 --> 00:48:21.950
You would say that the
probability that I would find
00:48:21.950 --> 00:48:27.560
the value of the function
that is in this interval
00:48:27.560 --> 00:48:33.190
corresponds to finding a value
of x that is in this interval.
00:48:33.190 --> 00:48:36.020
So what I can do,
the probability
00:48:36.020 --> 00:48:38.990
that I find the value
of f in this interval,
00:48:38.990 --> 00:48:43.260
according to what I have here,
is the Probability Density
00:48:43.260 --> 00:48:44.912
multiplied by df.
00:48:44.912 --> 00:48:45.745
Is there a question?
00:48:48.250 --> 00:48:50.790
No.
00:48:50.790 --> 00:48:53.380
So the probability that
I'm in this interval
00:48:53.380 --> 00:48:56.520
translates to the probability
that I'm in this interval.
00:48:56.520 --> 00:49:01.965
So that's probability p of x dx.
00:49:04.650 --> 00:49:07.500
But that's boring.
00:49:07.500 --> 00:49:09.920
I want to look at the
situation maybe where
00:49:09.920 --> 00:49:11.565
the function is
something like this.
00:49:15.350 --> 00:49:19.730
Then, you say that f is
in this interval provided
00:49:19.730 --> 00:49:27.220
that x is either
here or here or here.
00:49:27.220 --> 00:49:33.964
So what I really need
to do is to solve
00:49:33.964 --> 00:49:40.200
f of x equals to f for x.
00:49:40.200 --> 00:49:43.010
And maybe there
will be solutions
00:49:43.010 --> 00:49:48.030
that will be x1,
x2, x3, et cetera.
00:49:48.030 --> 00:49:50.960
And what I need to
do is to sum over
00:49:50.960 --> 00:49:53.850
the contributions of
all of those solutions.
00:49:57.420 --> 00:49:58.830
So here, it's three solutions.
00:50:01.560 --> 00:50:08.010
Then, you would say
the Probability Density
00:50:08.010 --> 00:50:15.440
is the sum over i of p
of xi times dxi by df,
00:50:15.440 --> 00:50:17.460
which is really the slopes.
00:50:17.460 --> 00:50:21.050
The slopes translate the
size of this interval
00:50:21.050 --> 00:50:23.010
to the size of that interval.
00:50:23.010 --> 00:50:25.770
You can see that here,
the slope is very sharp.
00:50:25.770 --> 00:50:28.190
The size of this
interval is small.
00:50:28.190 --> 00:50:31.450
It could be wider
accordingly, so I
00:50:31.450 --> 00:50:34.937
need to multiply by dxi by df.
00:50:39.160 --> 00:50:43.940
So I have to multiply by
dx by df evaluated at xi.
00:50:43.940 --> 00:50:47.165
That's essentially the inverse
of the derivative of f.
00:50:50.430 --> 00:50:57.070
Now, sometimes, it is easy
to forget these things
00:50:57.070 --> 00:50:59.300
that I write over here.
00:50:59.300 --> 00:51:01.640
And you would say,
well obviously,
00:51:01.640 --> 00:51:06.000
the probability is
something that is positive.
00:51:06.000 --> 00:51:09.890
But without being careful,
it is easy to violate
00:51:09.890 --> 00:51:11.620
such a basic condition.
00:51:11.620 --> 00:51:13.020
And I violated it here.
00:51:16.740 --> 00:51:19.160
Anybody see where I violated it?
00:51:22.120 --> 00:51:24.710
Yeah, the slope
here is positive.
00:51:24.710 --> 00:51:26.060
The slope here is positive.
00:51:26.060 --> 00:51:28.240
The slope here is negative.
00:51:28.240 --> 00:51:33.130
So I am subtracting
a probability here.
00:51:33.130 --> 00:51:34.920
So what I really
should do-- it really
00:51:34.920 --> 00:51:37.490
doesn't matter whether the
slope is this way or that way.
00:51:37.490 --> 00:51:39.590
I will pick up
the same interval,
00:51:39.590 --> 00:51:44.144
so make sure you don't forget
the absolute values that
00:51:44.144 --> 00:51:45.596
go accordingly.
00:51:45.596 --> 00:51:48.210
So this is the
standard way that you
00:51:48.210 --> 00:51:50.480
would make change of variables.
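In symbols, the change-of-variables rule with the absolute values in place might be written as below. The subscript F on the new density and the labels x_i for the solutions of F(x) = f are notational assumptions.

```latex
p_F(f)\, df = \sum_i p(x_i)\, |dx_i|
\quad \Longrightarrow \quad
p_F(f) = \sum_i p(x_i) \left| \frac{dx}{dF} \right|_{x = x_i},
\qquad F(x_i) = f.
```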
00:51:50.480 --> 00:51:52.162
Yes?
00:51:52.162 --> 00:51:52.828
AUDIENCE: Sorry.
00:51:52.828 --> 00:51:56.701
In the center of that board,
on the second line, it says Pf.
00:51:56.701 --> 00:51:58.084
Is that an x or a times?
00:52:00.715 --> 00:52:02.340
PROFESSOR: In the
center of this board?
00:52:02.340 --> 00:52:03.678
This one?
00:52:03.678 --> 00:52:06.140
AUDIENCE: Yeah.
00:52:06.140 --> 00:52:10.460
PROFESSOR: So the
value of the function
00:52:10.460 --> 00:52:12.930
is a random variable, right?
00:52:12.930 --> 00:52:14.540
It can come up to be here.
00:52:14.540 --> 00:52:16.390
It can come up to be here.
00:52:16.390 --> 00:52:21.720
And so there is, as any other
one parameter random variable,
00:52:21.720 --> 00:52:25.210
a Probability Density
associated with that.
00:52:25.210 --> 00:52:30.350
That Probability Density
I have called P of f
00:52:30.350 --> 00:52:32.440
to indicate that
it is the variable
00:52:32.440 --> 00:52:34.910
f that I'm
considering as opposed
00:52:34.910 --> 00:52:37.370
to what I wrote
originally that was
00:52:37.370 --> 00:52:39.810
associated with the value of x.
00:52:39.810 --> 00:52:42.282
AUDIENCE: But what you have
written on the left-hand side,
00:52:42.282 --> 00:52:44.750
it looks like your
x [? is random. ?]
00:52:44.750 --> 00:52:47.160
PROFESSOR: Oh, this was
supposed to be a multiplication
00:52:47.160 --> 00:52:48.110
sign, so sorry.
00:52:48.110 --> 00:52:49.635
AUDIENCE: Thank you.
00:52:49.635 --> 00:52:50.510
PROFESSOR: Thank you.
00:52:56.445 --> 00:52:56.945
Yes?
00:52:56.945 --> 00:53:01.900
AUDIENCE: CP-- that function,
is this [INAUDIBLE]?
00:53:01.900 --> 00:53:04.020
PROFESSOR: Yes.
00:53:04.020 --> 00:53:07.670
So you're asking whether this--
so I constructed something,
00:53:07.670 --> 00:53:11.660
and my statement is that the
integral from minus infinity
00:53:11.660 --> 00:53:21.100
to infinity df Pf of f better be
one which is the normalization.
00:53:21.100 --> 00:53:24.420
So if you're asking
about this, essentially,
00:53:24.420 --> 00:53:30.020
you would say the
integral dx p of x
00:53:30.020 --> 00:53:34.200
is the integral dx
dP by dx, right?
00:53:34.200 --> 00:53:36.710
That was the definition p of x.
00:53:36.710 --> 00:53:38.930
And the integral
of the derivative
00:53:38.930 --> 00:53:43.830
is the value of the function
evaluated at its two extremes.
00:53:43.830 --> 00:53:47.360
And this is one minus 0.
00:53:47.360 --> 00:53:51.845
So by construction, it
is, of course, normalized
00:53:51.845 --> 00:53:54.100
in this fashion.
00:53:54.100 --> 00:53:55.310
Is that what you were asking?
00:53:55.310 --> 00:53:58.226
AUDIENCE: I was asking
about the first possibility
00:53:58.226 --> 00:54:00.970
of cumulative
probability function.
00:54:00.970 --> 00:54:04.220
PROFESSOR: So the
cumulative probability,
00:54:04.220 --> 00:54:09.540
its constraint is that
the limit as its variable
00:54:09.540 --> 00:54:14.030
goes to infinity,
it should go to 1.
00:54:14.030 --> 00:54:15.650
That's the normalization.
00:54:15.650 --> 00:54:19.660
The normalization here
is that the probability
00:54:19.660 --> 00:54:23.300
of the entire set is 1.
00:54:23.300 --> 00:54:26.000
Cumulative adds
the probabilities
00:54:26.000 --> 00:54:29.270
to be anywhere up to point x.
00:54:29.270 --> 00:54:32.160
So I have achieved being
anywhere on the line
00:54:32.160 --> 00:54:35.700
by going through this point.
00:54:35.700 --> 00:54:41.477
But certainly, the integral
of big P of x dx is not equal to 1
00:54:41.477 --> 00:54:42.685
if that's what you're asking.
00:54:45.370 --> 00:54:48.494
The integral of
small p of x is 1.
00:54:51.973 --> 00:54:53.464
Yes?
00:54:53.464 --> 00:54:56.446
AUDIENCE: Are we assuming
the function is invertible?
00:55:00.430 --> 00:55:03.380
PROFESSOR: Well, rigorously
speaking, this function
00:55:03.380 --> 00:55:08.090
is not invertible
because for a value of f,
00:55:08.090 --> 00:55:10.810
there are three
possible values of x.
00:55:10.810 --> 00:55:14.150
So the inverse is not a function,
but you can certainly
00:55:14.150 --> 00:55:18.070
solve for f of x equals to
f to find particular values.
00:55:27.630 --> 00:55:31.520
So again, maybe it
is useful to work
00:55:31.520 --> 00:55:33.230
through one example of this.
00:55:37.320 --> 00:55:44.220
So let's say that you
have a probability that
00:55:44.220 --> 00:55:50.890
is of the form e to the minus
lambda absolute value of x.
00:55:50.890 --> 00:55:58.550
So as a function of x,
the Probability Density
00:55:58.550 --> 00:56:04.430
falls off exponentially
on both sides.
00:56:04.430 --> 00:56:08.340
And again, I have to ensure that
when I integrate this from minus infinity
00:56:08.340 --> 00:56:11.280
to infinity, I will get one.
00:56:11.280 --> 00:56:14.090
The integral from
0 to infinity is
00:56:14.090 --> 00:56:16.580
1 over lambda,
from minus infinity
00:56:16.580 --> 00:56:19.420
to zero by symmetry
is 1 over lambda.
00:56:19.420 --> 00:56:26.940
So really I have to divide
by 2 over lambda-- multiply by lambda over 2.
00:56:26.940 --> 00:56:27.440
Sorry.
00:56:34.680 --> 00:56:39.650
Now, suppose I change variables
to F, which is x squared.
00:56:42.350 --> 00:56:46.840
So I want to know what
the probability is
00:56:46.840 --> 00:56:53.070
for a particular value of x
squared that I will call f.
00:56:53.070 --> 00:56:57.720
So then what I have to
do is to solve this.
00:56:57.720 --> 00:57:03.920
And this will give me x is minus
plus square root of small f.
00:57:03.920 --> 00:57:10.440
If I ask for what f of-- for
what value of x, x squared
00:57:10.440 --> 00:57:15.500
equals to f, then I have
these two solutions.
00:57:15.500 --> 00:57:18.670
So according to the
formula that I have,
00:57:18.670 --> 00:57:22.480
I have to, first of
all, evaluate this
00:57:22.480 --> 00:57:26.540
at these two possible roots.
00:57:26.540 --> 00:57:33.160
In both cases, I will get
minus lambda square root of f.
00:57:33.160 --> 00:57:35.610
Because of the absolute
value, both of them
00:57:35.610 --> 00:57:38.290
will give you the same thing.
00:57:38.290 --> 00:57:43.350
And then I have to look
at this derivative.
00:57:43.350 --> 00:57:52.394
So if I look at this, I can
see that df by dx equals to 2x.
00:57:52.394 --> 00:57:57.180
The locations that I have to
evaluate are at plus minus
00:57:57.180 --> 00:57:58.800
square root of f.
00:57:58.800 --> 00:58:04.470
So the value of the slope is
plus minus 2 square root of f.
00:58:04.470 --> 00:58:07.840
And according to that
formula, what I have to do
00:58:07.840 --> 00:58:09.600
is to put the inverse of that.
00:58:09.600 --> 00:58:13.680
So I have to put for one
solution, 1 over 2 square root
00:58:13.680 --> 00:58:15.460
of f.
00:58:15.460 --> 00:58:19.920
For the other one, I have to
put 1 over minus 2 square root
00:58:19.920 --> 00:58:24.560
of f, which would be a disaster
if I didn't convert this
00:58:24.560 --> 00:58:27.650
to an absolute value.
00:58:27.650 --> 00:58:30.900
And if I did convert that
to an absolute value,
00:58:30.900 --> 00:58:36.322
what I would get is lambda
over 2 square root of f
00:58:36.322 --> 00:58:41.000
e to the minus lambda root f.
00:58:41.000 --> 00:58:46.180
It is important to note
that this solution will
00:58:46.180 --> 00:58:51.340
exist only if f is positive.
00:58:51.340 --> 00:58:55.040
And there's no solution
if f is negative,
00:58:55.040 --> 00:59:00.200
which means that if I
wanted to plot a Probability
00:59:00.200 --> 00:59:03.770
Density for this
function f, which
00:59:03.770 --> 00:59:08.420
is x squared as a function
of f, it will only
00:59:08.420 --> 00:59:12.510
have values for positive
values of x squared.
00:59:12.510 --> 00:59:16.300
There's nothing for
negative values.
00:59:16.300 --> 00:59:19.300
For positive values,
I have this function
00:59:19.300 --> 00:59:22.200
that exponentially decays.
00:59:22.200 --> 00:59:27.380
Yet at f equals 0, it diverges.
00:59:27.380 --> 00:59:30.770
One reason I chose
that example is
00:59:30.770 --> 00:59:34.170
to emphasize that these
Probability Density
00:59:34.170 --> 00:59:39.206
functions can even go
all the way to infinity.
00:59:39.206 --> 00:59:42.880
The requirement,
however, is that you
00:59:42.880 --> 00:59:47.790
should be able to integrate
across the infinity because
00:59:47.790 --> 00:59:51.840
integrating across the infinity
should give you a finite number
00:59:51.840 --> 00:59:53.660
less than 1.
00:59:53.660 --> 00:59:58.490
And so the type of divergence
that you could have is limited.
00:59:58.490 --> 01:00:00.500
1 over square root of f is fine.
01:00:00.500 --> 01:00:02.000
1/f is not accepted.
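The worked example can be checked by sampling. This is a rough sketch, assuming an arbitrary value for lambda and hypothetical helper names. It draws x from the two-sided exponential density, squares it, and compares the empirical cumulative probability of f = x squared with the integral of lambda / (2 sqrt(f)) e^(-lambda sqrt(f)) from 0 to f, which works out to 1 - e^(-lambda sqrt(f)).

```python
import math
import random

LAM = 2.0  # hypothetical value of lambda, chosen only for illustration

def sample_x(rng):
    """Draw x with density (LAM/2) exp(-LAM |x|): exponential magnitude, random sign."""
    magnitude = rng.expovariate(LAM)
    return magnitude if rng.random() < 0.5 else -magnitude

def cdf_f(f):
    """Integral of LAM / (2 sqrt(f')) * exp(-LAM sqrt(f')) over f' from 0 to f."""
    return 1.0 - math.exp(-LAM * math.sqrt(f)) if f > 0 else 0.0

def empirical_cdf_f(f, n_samples, rng):
    """Fraction of samples whose square x**2 falls below f."""
    count = sum(1 for _ in range(n_samples) if sample_x(rng) ** 2 < f)
    return count / n_samples

rng = random.Random(0)
for f in (0.01, 0.25, 1.0):
    print(f, empirical_cdf_f(f, 50_000, rng), cdf_f(f))
```

The cumulative probability stays finite and tends to 1 even though the density itself diverges at f = 0, which is the sense in which a 1 over square root of f divergence is integrable.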
01:00:06.380 --> 01:00:07.555
Yes?
01:00:07.555 --> 01:00:09.680
AUDIENCE: I have a doubt
about [? the postulate. ?]
01:00:09.680 --> 01:00:16.850
It says that if you raise
the value of f slowly,
01:00:16.850 --> 01:00:20.260
you will eventually get to--
yeah, that point right there.
01:00:20.260 --> 01:00:22.982
So if the prescription that
we have of summing over
01:00:22.982 --> 01:00:27.025
the different roots, at
some point, the roots,
01:00:27.025 --> 01:00:27.715
they converge.
01:00:27.715 --> 01:00:28.340
PROFESSOR: Yes.
01:00:28.340 --> 01:00:30.552
AUDIENCE: So at some point,
we stop summing over 2
01:00:30.552 --> 01:00:31.956
and we start summing over 1.
01:00:31.956 --> 01:00:35.240
It just seems a
little bit strange.
01:00:35.240 --> 01:00:36.070
PROFESSOR: Yeah.
01:00:36.070 --> 01:00:40.280
If you are up here, you have
only one term in the sum.
01:00:40.280 --> 01:00:43.200
If you are down here,
you have three terms.
01:00:43.200 --> 01:00:46.120
And that's really just
the property of the curve
01:00:46.120 --> 01:00:47.870
that I have drawn.
01:00:47.870 --> 01:00:51.530
And so over here, I
have only one root.
01:00:51.530 --> 01:00:53.040
Over here, I have three roots.
01:00:53.040 --> 01:00:54.960
And this is not surprising.
01:00:54.960 --> 01:00:58.190
There are many situations
in mathematics or physics
01:00:58.190 --> 01:01:01.140
where you encounter
situations where,
01:01:01.140 --> 01:01:04.670
as you change some parameters,
new solutions, new roots,
01:01:04.670 --> 01:01:05.650
appear.
01:01:05.650 --> 01:01:11.030
And so if this was really some
kind of a physical system,
01:01:11.030 --> 01:01:13.690
you would probably
encounter some kind
01:01:13.690 --> 01:01:16.712
of a singularity or phase
transition at this point.
01:01:20.960 --> 01:01:21.825
Yes?
01:01:21.825 --> 01:01:24.675
AUDIENCE: But how does the
equation deal with that when
01:01:24.675 --> 01:01:27.060
[INAUDIBLE]?
01:01:27.060 --> 01:01:28.570
PROFESSOR: Let's see.
01:01:28.570 --> 01:01:34.390
So if I am approaching
that point, what I find
01:01:34.390 --> 01:01:40.540
is that df by
dx goes to 0.
01:01:40.540 --> 01:01:44.500
So dx by df has some kind
of infinity or singularity,
01:01:44.500 --> 01:01:46.590
so we have to deal with that.
01:01:46.590 --> 01:01:49.670
If you want, we can
choose a particular form
01:01:49.670 --> 01:01:52.140
of that function and
see what happens.
01:01:52.140 --> 01:01:55.230
But actually, we have
that already over here
01:01:55.230 --> 01:01:58.440
because the function f
that I plotted for you
01:01:58.440 --> 01:02:05.740
as a function of x
has this behavior
01:02:05.740 --> 01:02:09.035
that, for some range of
f, you have two solutions.
01:02:14.280 --> 01:02:17.940
So for negative values
of f, I have no solution.
01:02:17.940 --> 01:02:21.000
So this curve, after
having rotated,
01:02:21.000 --> 01:02:24.390
is precisely an example
of what is happening here.
01:02:24.390 --> 01:02:26.950
And you see what the
consequence of that is.
01:02:26.950 --> 01:02:29.910
The consequence of that
is that as I approach here
01:02:29.910 --> 01:02:33.810
and the two solutions merge,
I have the singularity that
01:02:33.810 --> 01:02:36.679
is ultimately
manifested in here.
01:02:47.450 --> 01:02:49.060
So in principle, yes.
01:02:49.060 --> 01:02:51.230
When you make these
changes of variables
01:02:51.230 --> 01:02:55.630
and you have functions
that have multiple solution
01:02:55.630 --> 01:02:58.840
behavior like that, you
have to worry about this.
01:03:02.940 --> 01:03:04.065
Let me go down here.
01:03:07.690 --> 01:03:10.860
One other definition
that, again, you've
01:03:10.860 --> 01:03:13.350
probably seen, before
we go through something
01:03:13.350 --> 01:03:16.483
that I hope you
haven't seen, moment.
01:03:19.430 --> 01:03:22.110
A form of this expectation
value-- actually, here we
01:03:22.110 --> 01:03:24.950
did with x squared,
but in general, we
01:03:24.950 --> 01:03:29.480
can calculate the expectation
value of x to the m.
01:03:29.480 --> 01:03:37.120
And sometimes that is called the
mth moment: the integral from minus infinity
01:03:37.120 --> 01:03:41.130
to infinity dx x
to the m p of x.
01:03:56.760 --> 01:04:00.300
Now, I expect that
after this point,
01:04:00.300 --> 01:04:01.940
you would have seen everything.
01:04:01.940 --> 01:04:06.820
But the next one maybe
half of you have seen.
01:04:06.820 --> 01:04:10.870
And the next item,
which we will use a lot,
01:04:10.870 --> 01:04:12.720
is the characteristic function.
01:04:24.860 --> 01:04:29.590
So given that I have some
probability distribution
01:04:29.590 --> 01:04:34.220
p of x, I can calculate
various expectation values.
01:04:34.220 --> 01:04:38.470
I calculate the expectation
value of e to the minus ikx.
01:04:43.200 --> 01:04:47.140
This is, by definition
that you have,
01:04:47.140 --> 01:04:50.750
I have to integrate
over the domain of x--
01:04:50.750 --> 01:04:55.140
let's say from minus infinity
to infinity-- p of x against e
01:04:55.140 --> 01:04:56.140
to the minus ikx.
01:05:01.550 --> 01:05:05.560
And you say, well, what's
special about that?
01:05:05.560 --> 01:05:11.470
I know that to be the
Fourier transform of p of x.
01:05:11.470 --> 01:05:13.220
And it is true.
01:05:13.220 --> 01:05:17.790
And you also know how to
invert the Fourier transform.
01:05:17.790 --> 01:05:20.460
That is if you know the
characteristic function, which
01:05:20.460 --> 01:05:23.170
is another name for the Fourier
transform of a probability
01:05:23.170 --> 01:05:26.770
distribution, you
would get the p of x
01:05:26.770 --> 01:05:33.490
back by the integral
over k divided by 2pi,
01:05:33.490 --> 01:05:39.460
the way that I chose the things,
e to the ikx p tilde of k.
01:05:39.460 --> 01:05:42.600
Basically, this is the
standard relationship
01:05:42.600 --> 01:05:45.520
between these objects.
01:05:45.520 --> 01:05:50.090
So this is just a
Fourier transform.
01:05:50.090 --> 01:05:56.320
Now, something that appears a
lot in statistical calculations
01:05:56.320 --> 01:05:58.530
and implicit in lots
of things that we
01:05:58.530 --> 01:06:03.185
do in statistical mechanics
is a generating function.
01:06:12.250 --> 01:06:17.880
I can take the characteristic
function p tilde of k.
01:06:17.880 --> 01:06:21.620
It's a function of this
Fourrier variable, k.
01:06:21.620 --> 01:06:24.360
And I can do an
expansion in that.
01:06:24.360 --> 01:06:29.000
I can do the expansion
inside the expectation value
01:06:29.000 --> 01:06:34.460
because e to the minus ikx I can
write as a sum over n running
01:06:34.460 --> 01:06:39.810
from 0 to infinity minus ik
to the power of n divided by n
01:06:39.810 --> 01:06:42.540
factorial x to the nth.
01:06:45.330 --> 01:06:48.880
This is the expansion
of the exponential.
01:06:48.880 --> 01:06:55.170
The variable here is x, so I can
take everything else outside.
01:06:55.170 --> 01:07:01.040
And what I see is that
if I make an expansion
01:07:01.040 --> 01:07:06.370
of the characteristic
function, the coefficient
01:07:06.370 --> 01:07:11.240
of k to the n up to some
trivial factor of n factorial
01:07:11.240 --> 01:07:14.130
will give me the nth moment.
01:07:14.130 --> 01:07:16.910
That is once you have
calculated the Fourier
01:07:16.910 --> 01:07:20.610
transform, or the characteristic
function, you can expand it.
01:07:20.610 --> 01:07:23.430
And from
that expansion, you
01:07:23.430 --> 01:07:26.690
can extract all the
moments essentially.
01:07:26.690 --> 01:07:30.550
So this expansion
generates for you the moments,
01:07:30.550 --> 01:07:34.250
hence the generating function.
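As a rough numerical illustration of that statement (a sketch, not part of the lecture), one can take a fair six-sided die as a hypothetical example distribution, build its characteristic function with the same e to the minus ikx convention, and read off the first two moments from finite-difference derivatives at k = 0. The names `char_fn`, `xs`, `ps` and the step `h` are assumptions made only for this example.

```python
import cmath

# Hypothetical example distribution: a fair six-sided die.
xs = [1, 2, 3, 4, 5, 6]
ps = [1 / 6] * 6


def char_fn(k):
    """Characteristic function <exp(-i k x)>, same sign convention as the lecture."""
    return sum(p * cmath.exp(-1j * k * x) for x, p in zip(xs, ps))


h = 1e-3  # finite-difference step (an arbitrary small choice)

# p~(k) = sum_n (-ik)^n / n! <x^n>, so p~'(0) = -i <x> and p~''(0) = -<x^2>.
d1 = (char_fn(h) - char_fn(-h)) / (2 * h)
d2 = (char_fn(h) - 2 * char_fn(0) + char_fn(-h)) / h**2

mean_from_cf = (1j * d1).real
second_moment_from_cf = (-d2).real

# Direct averages, for comparison.
mean_direct = sum(p * x for x, p in zip(xs, ps))
second_moment_direct = sum(p * x**2 for x, p in zip(xs, ps))

print(mean_from_cf, mean_direct)
print(second_moment_from_cf, second_moment_direct)
```

The two numbers on each printed line should agree up to the small finite-difference error, which is the sense in which the expansion in k generates the moments.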
01:07:34.250 --> 01:07:37.220
You could even do
something like this.
01:07:37.220 --> 01:07:47.880
You could multiply p tilde of k
by e to the ikx0 for some x0.
01:07:47.880 --> 01:07:51.510
And that would be the
expectation value of e
01:07:51.510 --> 01:07:56.780
to the minus ik times x minus x0.
01:07:56.780 --> 01:08:00.450
And you can expand
that, and you would
01:08:00.450 --> 01:08:06.600
generate all of the moments
not around the origin,
01:08:06.600 --> 01:08:08.350
but around the point x0.
01:08:13.960 --> 01:08:17.830
So simple manipulations of
the characteristic function
01:08:17.830 --> 01:08:23.840
can shift and give you
another set of moments
01:08:23.840 --> 01:08:25.000
around different points.
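A minimal sketch of this shift, again using a fair die as a hypothetical example and choosing x0 to be its mean: multiplying the characteristic function by e to the ikx0 and differentiating at k = 0 should give the moments of x minus x0. The function name `shifted_char_fn` and the step `h` are assumptions for illustration.

```python
import cmath

# Hypothetical example distribution: a fair six-sided die.
xs = [1, 2, 3, 4, 5, 6]
ps = [1 / 6] * 6
x0 = 3.5  # shift point; chosen here to be the mean of the die


def shifted_char_fn(k):
    """exp(i k x0) * p~(k), which equals <exp(-i k (x - x0))>."""
    p_tilde = sum(p * cmath.exp(-1j * k * x) for x, p in zip(xs, ps))
    return cmath.exp(1j * k * x0) * p_tilde


h = 1e-3  # finite-difference step (an arbitrary small choice)
d1 = (shifted_char_fn(h) - shifted_char_fn(-h)) / (2 * h)
d2 = (shifted_char_fn(h) - 2 * shifted_char_fn(0) + shifted_char_fn(-h)) / h**2

first_about_x0 = (1j * d1).real   # <x - x0>
second_about_x0 = (-d2).real      # <(x - x0)^2>

print(first_about_x0, second_about_x0)
```

With x0 equal to the mean, the first shifted moment comes out close to zero and the second is close to 35/12, the variance of the die.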
01:08:34.800 --> 01:08:38.560
So the Fourier transform,
or characteristic function,
01:08:38.560 --> 01:08:42.290
is the generator of moments.
01:08:42.290 --> 01:08:46.240
An even more
important property is
01:08:46.240 --> 01:08:48.986
possessed by the cumulant
generating function.
01:08:58.229 --> 01:09:07.260
So you have the
characteristic function,
01:09:07.260 --> 01:09:08.899
the Fourier transform.
01:09:08.899 --> 01:09:13.609
You take its log, so
another function of k.
01:09:13.609 --> 01:09:16.810
You start expanding this
function in powers of k.
01:09:21.189 --> 01:09:29.370
And the coefficients of
that, you call cumulants.
01:09:33.420 --> 01:09:38.060
So I essentially repeated the
definition that I had up there.
01:09:38.060 --> 01:09:44.497
I took a log, and all I did
is I put this subscript c
01:09:44.497 --> 01:09:47.729
to go from moments to cumulants.
01:09:47.729 --> 01:09:53.890
And also, I have to start the
series from 1 as opposed to 0.
01:09:53.890 --> 01:10:00.960
And essentially, I can find the
relationship between cumulants
01:10:00.960 --> 01:10:04.370
and moments by
writing this as a log
01:10:04.370 --> 01:10:08.980
of the characteristic
function, which
01:10:08.980 --> 01:10:14.150
is 1 plus the sum over n
from 1 to infinity
01:10:14.150 --> 01:10:21.190
of minus ik to the n over n
factorial, the nth moments.
01:10:21.190 --> 01:10:26.120
So inside the log,
I have the moments.
01:10:26.120 --> 01:10:30.020
Outside the log, I
have the cumulants.
01:10:30.020 --> 01:10:35.910
And if I have a log
of 1 plus epsilon,
01:10:35.910 --> 01:10:41.180
I can use the expansion
of this as epsilon minus
01:10:41.180 --> 01:10:45.380
epsilon squared over 2 plus epsilon
cubed over 3 minus epsilon
01:10:45.380 --> 01:10:49.500
to the fourth over 4, et cetera.
01:10:49.500 --> 01:10:55.940
And this will enable me to
then match powers of minus ik
01:10:55.940 --> 01:11:00.710
on the left and powers
of minus ik on the right.
01:11:00.710 --> 01:11:03.530
You can see that the first
thing that I will find
01:11:03.530 --> 01:11:09.600
is that the expectation
value of x-- the first power,
01:11:09.600 --> 01:11:13.800
the first term that I have
here is minus ik times the mean.
01:11:13.800 --> 01:11:16.630
Take the log, I will get that.
01:11:16.630 --> 01:11:22.230
So essentially, what I get
is that the first cumulant
01:11:22.230 --> 01:11:26.760
on the left is the
first moment that I
01:11:26.760 --> 01:11:29.660
will get from the
expansion on the right.
01:11:29.660 --> 01:11:32.460
And this is, of course, called
the mean of the distribution.
01:11:34.970 --> 01:11:40.680
The second cumulant, I will
have two contributions,
01:11:40.680 --> 01:11:45.304
one from epsilon, the other from
minus epsilon squared over 2.
01:11:45.304 --> 01:11:48.220
And if you go through
that, you will
01:11:48.220 --> 01:11:52.680
get that it is expectation
value of x squared
01:11:52.680 --> 01:11:57.320
minus the average of x,
the mean squared, which
01:11:57.320 --> 01:12:01.450
is none other than the
expectation value of x
01:12:01.450 --> 01:12:06.910
minus the mean, squared, which
is clearly a positive quantity.
01:12:06.910 --> 01:12:08.410
And this is the variance.
01:12:14.420 --> 01:12:16.300
And you can keep going.
01:12:16.300 --> 01:12:26.360
The third cumulant is x cubed
minus 3 average of x squared
01:12:26.360 --> 01:12:32.402
average of x plus 2
average of x itself cubed.
01:12:32.402 --> 01:12:33.915
It is called the skewness.
01:12:36.900 --> 01:12:40.340
I don't write the
formula for the next one
01:12:40.340 --> 01:12:42.050
which is called
the kurtosis.
01:12:42.050 --> 01:12:45.910
And you keep going and so forth.
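The three formulas just stated can be checked on a simple case. The sketch below assumes, as an example not used in the lecture, the exponential distribution p(x) = e to the minus x on x >= 0, whose nth moment is n factorial and whose nth cumulant is known to be (n - 1) factorial. The function name `first_three_cumulants` is hypothetical.

```python
from math import factorial


def first_three_cumulants(m1, m2, m3):
    """Mean, variance and third cumulant from the first three moments."""
    k1 = m1                              # <x>_c   = <x>
    k2 = m2 - m1**2                      # <x^2>_c = <x^2> - <x>^2
    k3 = m3 - 3 * m2 * m1 + 2 * m1**3    # <x^3>_c = <x^3> - 3<x^2><x> + 2<x>^3
    return k1, k2, k3


# Assumed example: p(x) = exp(-x) for x >= 0, so that <x^n> = n!.
moments = [factorial(n) for n in (1, 2, 3)]
cumulants = first_three_cumulants(*moments)

print(cumulants)  # (1, 1, 2), i.e. 0!, 1!, 2!
```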
01:12:53.390 --> 01:13:01.220
So it turns out that this
hierarchy of cumulants,
01:13:01.220 --> 01:13:04.710
essentially, is a hierarchy
of the most important things
01:13:04.710 --> 01:13:09.540
that you can know about
a random variable.
01:13:09.540 --> 01:13:17.140
So if I tell you that the
outcome of some experiment
01:13:17.140 --> 01:13:23.409
is some number x,
distributed somehow-- I
01:13:23.409 --> 01:13:25.450
guess the first thing that
you would like to know
01:13:25.450 --> 01:13:28.375
is whether the typical
values that you get
01:13:28.375 --> 01:13:33.100
are of the order of 1, are of
the order of million, whatever.
01:13:33.100 --> 01:13:36.380
So somehow, the
mean is something
01:13:36.380 --> 01:13:40.510
that tells you what is
most important, the zeroth order
01:13:40.510 --> 01:13:45.030
thing that you want to
know about the variable.
01:13:45.030 --> 01:13:47.130
But the next thing that
you might want to know
01:13:47.130 --> 01:13:50.250
is, well, what's the spread?
01:13:50.250 --> 01:13:52.830
How far does this thing go?
01:13:52.830 --> 01:13:58.090
And then the variance will tell
you something about the spread.
01:13:58.090 --> 01:13:59.750
So the next thing
that you want to do
01:13:59.750 --> 01:14:02.770
is maybe ask, given
the spread, am I
01:14:02.770 --> 01:14:06.080
more likely to get things
that are on one side or things
01:14:06.080 --> 01:14:08.390
that are on the other side.
01:14:08.390 --> 01:14:12.780
So the measure of its
asymmetry, right versus left,
01:14:12.780 --> 01:14:16.329
is provided by the third
cumulant, which is the skewness
01:14:16.329 --> 01:14:16.870
and so forth.
01:14:20.180 --> 01:14:24.380
So typically, the
very first few members
01:14:24.380 --> 01:14:28.340
of this hierarchy of
cumulants tell you
01:14:28.340 --> 01:14:30.985
the most important
information that you
01:14:30.985 --> 01:14:32.110
need about the probability.
01:14:37.100 --> 01:14:38.860
Now, I will mention
to you, and I
01:14:38.860 --> 01:14:44.040
guess we probably will deal
with it more next time around,
01:14:44.040 --> 01:14:51.700
the result that is in some
sense the backbone or granddaddy
01:14:51.700 --> 01:14:57.080
of all graphical expansions
that are carrying [INAUDIBLE].
01:14:57.080 --> 01:15:00.800
And that's a relationship
between the moments
01:15:00.800 --> 01:15:04.770
and cumulants that I
will express graphically.
01:15:04.770 --> 01:15:14.970
So this is graphical
representation
01:15:14.970 --> 01:15:20.860
of moments in
terms of cumulants.
01:15:26.470 --> 01:15:29.440
Essentially, what I'm
saying is that you
01:15:29.440 --> 01:15:33.100
can go through the
procedure as I outlined.
01:15:33.100 --> 01:15:37.240
And if you want to calculate
minus ik to the fifth power
01:15:37.240 --> 01:15:41.740
so that you find the description
of the fifth cumulant in terms
01:15:41.740 --> 01:15:44.475
of the moment, you'll
have to do a lot of work
01:15:44.475 --> 01:15:49.460
in expanding the log and powers
of this object and making sure
01:15:49.460 --> 01:15:53.790
that you don't make any
mistakes in the coefficient.
01:15:53.790 --> 01:15:58.570
There is a way to
circumvent that graphically
01:15:58.570 --> 01:16:00.460
and get the relationship.
01:16:00.460 --> 01:16:03.710
So how do we do that?
01:16:03.710 --> 01:16:16.010
You'll represent nth cumulant
as, let's say, a bag of n points.
01:16:20.100 --> 01:16:28.640
So let's say this entity will
represent the third cumulant.
01:16:28.640 --> 01:16:31.700
It's a bag with three points.
01:16:31.700 --> 01:16:37.520
This-- one, two, three,
four, five, six--
01:16:37.520 --> 01:16:39.391
will represent the
sixth cumulant.
01:16:42.340 --> 01:16:54.645
Then, the nth moment
is the sum of all ways
01:16:54.645 --> 01:17:04.680
of distributing n
points amongst bags.
01:17:11.360 --> 01:17:13.664
So what do I mean?
01:17:13.664 --> 01:17:19.140
So I want to calculate
the first moment x.
01:17:19.140 --> 01:17:22.370
That would correspond
to one point.
01:17:22.370 --> 01:17:25.260
And really, there's
only one diagram
01:17:25.260 --> 01:17:28.470
I can put the bag
around it or not
01:17:28.470 --> 01:17:31.280
that would correspond to this.
01:17:31.280 --> 01:17:35.410
And that corresponds
to the first cumulant,
01:17:35.410 --> 01:17:39.120
basically rewriting
what I had before.
01:17:39.120 --> 01:17:43.010
If I want to look at the second
moment, the second moment
01:17:43.010 --> 01:17:44.650
I need two points.
01:17:44.650 --> 01:17:48.770
The two points I can either
put in the same bag or I
01:17:48.770 --> 01:17:52.180
can put into two separate bags.
01:17:52.180 --> 01:17:56.510
And the first one
corresponds to calculating
01:17:56.510 --> 01:17:59.650
the second cumulant.
01:17:59.650 --> 01:18:02.390
The second term
corresponds to two ways
01:18:02.390 --> 01:18:05.460
in which the first
cumulant has appeared,
01:18:05.460 --> 01:18:07.090
so I have the mean squared.
01:18:10.010 --> 01:18:17.080
If I want to calculate the
third moment, I need three dots.
01:18:17.080 --> 01:18:21.400
The three dots I can
either put in one bag
01:18:21.400 --> 01:18:27.950
or I can take one of them out
and keep two of them in a bag.
01:18:27.950 --> 01:18:30.000
And here I had the
choice of three things
01:18:30.000 --> 01:18:32.770
that I could've pulled out.
01:18:32.770 --> 01:18:38.370
Or, I could have all of them in
individual bags of their own.
01:18:38.370 --> 01:18:43.290
And mathematically, the first
term corresponds to x cubed c.
01:18:43.290 --> 01:18:46.680
The second term corresponds
to three versions
01:18:46.680 --> 01:18:49.460
of the variance times the mean.
01:18:49.460 --> 01:18:54.040
And the last term is
just the mean cubed.
01:18:54.040 --> 01:18:57.740
And you can massage
this expression
01:18:57.740 --> 01:19:03.200
to see that I get the expression
that I have for the skewness.
01:19:03.200 --> 01:19:05.860
I didn't offhand
remember the relationship
01:19:05.860 --> 01:19:09.790
that I have to write down
for the fourth cumulant.
01:19:09.790 --> 01:19:12.210
But I can graphically,
immediately get
01:19:12.210 --> 01:19:15.140
the relationship for
the fourth moment
01:19:15.140 --> 01:19:20.020
in terms of the fourth
cumulant which is this entity.
01:19:20.020 --> 01:19:23.860
Or, four ways that I
can take one out of the bag
01:19:23.860 --> 01:19:29.440
and maintain three in the bag,
three ways in which I have
01:19:29.440 --> 01:19:37.190
two bags of two, six ways in
which I can have a bag of two
01:19:37.190 --> 01:19:42.165
and two things that are
individually apart, and one
01:19:42.165 --> 01:19:44.970
way in which there
are four things that
01:19:44.970 --> 01:19:47.350
are independent of each other.
01:19:47.350 --> 01:19:53.700
And this becomes: the fourth moment
is the fourth cumulant,
01:19:53.700 --> 01:19:58.800
4 times the third
cumulant times the mean,
01:19:58.800 --> 01:20:03.740
3 times the square
of the variance,
01:20:03.740 --> 01:20:10.010
6 times the variance
multiplied by the mean squared,
01:20:10.010 --> 01:20:15.630
and the mean raised
to the fourth power.
01:20:15.630 --> 01:20:16.690
And you can keep going.
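The counting in this graphical rule can be reproduced by brute force. The sketch below enumerates every way of putting n labeled points into bags, tallies the diagrams for n = 4 by bag sizes, and sums the products of cumulants to get a moment. The helper names `set_partitions` and `moment_from_cumulants` are hypothetical, and the check at the end assumes the exponential distribution p(x) = e to the minus x as an example, for which the rth cumulant is (r - 1) factorial and the nth moment is n factorial.

```python
from collections import Counter
from math import factorial, prod


def set_partitions(items):
    """Yield every way of splitting the list `items` into non-empty bags."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Put `first` into a bag of its own ...
        yield [[first]] + partition
        # ... or add it to one of the existing bags.
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]


def moment_from_cumulants(n, cumulant):
    """n-th moment as the sum over all ways of bagging n points.

    `cumulant(r)` returns the r-th cumulant; a bag of r points contributes it.
    """
    points = list(range(n))
    return sum(
        prod(cumulant(len(bag)) for bag in partition)
        for partition in set_partitions(points)
    )


# Tally the n = 4 diagrams by their bag sizes.
shape_counts = Counter(
    tuple(sorted((len(bag) for bag in partition), reverse=True))
    for partition in set_partitions(list(range(4)))
)
print(shape_counts)

# Assumed example: cumulants (r - 1)! should give back the moments n!.
fourth_moment = moment_from_cumulants(4, lambda r: factorial(r - 1))
print(fourth_moment)
```

The tally gives one bag of four, four diagrams with a bag of three and a single point, three with two bags of two, six with a bag of two and two single points, and one with four single points, matching the coefficients 1, 4, 3, 6, 1 quoted above.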
01:20:24.642 --> 01:20:29.130
AUDIENCE: Is the variance not
squared in the third term?
01:20:29.130 --> 01:20:31.030
PROFESSOR: Did I forget that?
01:20:31.030 --> 01:20:32.030
Yes, thank you.
01:20:42.340 --> 01:20:43.830
All right.
01:20:43.830 --> 01:20:48.010
So the proof of this is really
just the two-line algebra
01:20:48.010 --> 01:20:51.960
exponentiating these expressions
that we have over here.
01:20:51.960 --> 01:20:55.710
But it's much nicer to
represent that graphically.
01:20:55.710 --> 01:20:59.700
And so now you can go
between things very easily.
01:20:59.700 --> 01:21:05.310
And what I will show next time
is how, using this machinery,
01:21:05.310 --> 01:21:09.960
you can calculate any
moment of a Gaussian,
01:21:09.960 --> 01:21:12.650
for example, in just a
matter of seconds as opposed
01:21:12.650 --> 01:21:16.980
to having to do integrations
and things like that.
01:21:16.980 --> 01:21:19.900
So what we
will do next time will
01:21:19.900 --> 01:21:23.120
be to apply this machinery
to various probability
01:21:23.120 --> 01:21:25.290
distributions, such
as a Gaussian,
01:21:25.290 --> 01:21:28.590
that we are likely to
encounter again and again.