1 00:00:00,060 --> 00:00:01,780 The following content is provided 2 00:00:01,780 --> 00:00:04,019 under a Creative Commons license. 3 00:00:04,019 --> 00:00:06,870 Your support will help MIT OpenCourseWare continue 4 00:00:06,870 --> 00:00:10,730 to offer high quality educational resources for free. 5 00:00:10,730 --> 00:00:13,340 To make a donation or view additional materials 6 00:00:13,340 --> 00:00:17,217 from hundreds of MIT courses, visit MIT OpenCourseWare 7 00:00:17,217 --> 00:00:17,842 at ocw.mit.edu. 8 00:00:20,365 --> 00:00:25,220 PROFESSOR: We'll begin this time by looking at some probability 9 00:00:25,220 --> 00:00:27,310 distributions that you should be familiar 10 00:00:27,310 --> 00:00:30,510 with from this perspective, starting 11 00:00:30,510 --> 00:00:33,780 with a Gaussian distribution for one variable. 12 00:00:42,100 --> 00:00:44,660 We're focused on a variable that takes 13 00:00:44,660 --> 00:00:48,670 real values in the interval minus infinity to infinity 14 00:00:48,670 --> 00:00:53,255 and the Gaussian has the form exponential that 15 00:00:53,255 --> 00:00:57,450 is centered around some value, let's call it lambda, 16 00:00:57,450 --> 00:01:02,380 and has fluctuations around this value parameterized by sigma. 17 00:01:02,380 --> 00:01:07,940 And the integral of this p over the interval 18 00:01:07,940 --> 00:01:10,970 should be normalized to unity, giving you 19 00:01:10,970 --> 00:01:16,780 this hopefully very familiar form. 20 00:01:16,780 --> 00:01:23,540 Now, if you want to characterize the characteristic function, 21 00:01:23,540 --> 00:01:27,120 all we need to do is to Fourier transform this. 22 00:01:27,120 --> 00:01:36,580 So I have the integral dx e to the minus ikx. 23 00:01:36,580 --> 00:01:39,810 So this-- let me remind you-- alternatively 24 00:01:39,810 --> 00:01:42,560 was the expectation value of e to the minus ikx.
25 00:01:45,860 --> 00:01:51,310 minus x minus lambda squared over 2 sigma squared, 26 00:01:51,310 --> 00:01:54,160 which is the probability distribution. 27 00:01:54,160 --> 00:01:59,220 And you should know what the answer to that is, 28 00:01:59,220 --> 00:02:00,480 but I will remind you. 29 00:02:00,480 --> 00:02:05,730 You can change variables to x minus lambda [INAUDIBLE] y. 30 00:02:05,730 --> 00:02:08,210 So from here we will get the factor of e 31 00:02:08,210 --> 00:02:10,780 to the minus ik lambda. 32 00:02:10,780 --> 00:02:19,050 You have then the integral over y of e to the minus y 33 00:02:19,050 --> 00:02:22,810 squared over 2 sigma squared. 34 00:02:22,810 --> 00:02:29,530 And then what we need to do is to complete this square over 35 00:02:29,530 --> 00:02:31,990 here. 36 00:02:31,990 --> 00:02:34,890 And you can do that, essentially, 37 00:02:34,890 --> 00:02:40,300 by adding and subtracting a minus 38 00:02:40,300 --> 00:02:45,320 k squared sigma squared over 2. 39 00:02:45,320 --> 00:02:53,210 So that if I change variable to y plus ik sigma squared, 40 00:02:53,210 --> 00:03:00,910 let's call that z, then I have outside the integral e 41 00:03:00,910 --> 00:03:10,595 to the minus ik lambda minus k squared sigma squared over 2. 42 00:03:10,595 --> 00:03:14,470 And the remainder I can write as a full square. 43 00:03:21,800 --> 00:03:25,790 And this is just a normalized Gaussian integral 44 00:03:25,790 --> 00:03:29,500 that comes to 1. 45 00:03:29,500 --> 00:03:37,260 So as you well know, a Fourier transform of a Gaussian 46 00:03:37,260 --> 00:03:40,960 is itself a Gaussian, and that's what we've established. 47 00:03:40,960 --> 00:03:48,350 E to the minus ik lambda minus k squared sigma squared over 2. 48 00:03:48,350 --> 00:03:51,600 And if I haven't made a mistake, when I set k equals to 0, 49 00:03:51,600 --> 00:03:56,070 the answer should be 1 because at k equals to 0 the expectation 50 00:03:56,070 --> 00:03:58,376 value of 1 just amounts to normalization.
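[As an editorial aside: the closed form just derived, exp(-ik lambda - k^2 sigma^2 / 2), can be checked against a direct numerical evaluation of the expectation value of e to the minus ikx. A minimal sketch; the function names and the parameter values lambda = 1.3, sigma = 0.7 are illustrative choices, not from the lecture.]

```python
import numpy as np

def gaussian_char_fn_numeric(k, lam, sigma, n=200001, width=10.0):
    # Estimate <exp(-i k x)> by summing exp(-i k x) p(x) on a grid that is
    # wide enough (width*sigma on each side) that the Gaussian tails are
    # negligible, so a plain Riemann sum is extremely accurate.
    x = np.linspace(lam - width * sigma, lam + width * sigma, n)
    dx = x[1] - x[0]
    p = np.exp(-(x - lam) ** 2 / (2 * sigma ** 2)) / np.sqrt(2 * np.pi * sigma ** 2)
    return np.sum(np.exp(-1j * k * x) * p) * dx

def gaussian_char_fn_exact(k, lam, sigma):
    # The closed form from the board: exp(-i k lambda - k^2 sigma^2 / 2).
    return np.exp(-1j * k * lam - k ** 2 * sigma ** 2 / 2)

lam, sigma = 1.3, 0.7
for k in (0.0, 0.5, 2.0):
    assert abs(gaussian_char_fn_numeric(k, lam, sigma)
               - gaussian_char_fn_exact(k, lam, sigma)) < 1e-8
# At k = 0 both expressions give 1: the normalization check mentioned above.
```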
51 00:04:01,100 --> 00:04:06,850 Now what we said was that a more interesting function 52 00:04:06,850 --> 00:04:10,360 is obtained by taking the log of this. 53 00:04:10,360 --> 00:04:16,970 So from here we go to the log of p tilde of k. 54 00:04:16,970 --> 00:04:22,090 Log of p tilde of k is very simple for the Gaussian. 55 00:04:28,610 --> 00:04:35,580 And what we had said was that by definition 56 00:04:35,580 --> 00:04:38,490 this log of the characteristic function 57 00:04:38,490 --> 00:04:43,510 generates cumulants through the series 58 00:04:43,510 --> 00:04:49,990 minus ik to the power of n over n factorial, the nth cumulant. 59 00:04:53,110 --> 00:04:57,510 So looking at this, we can immediately 60 00:04:57,510 --> 00:05:01,690 see that the Gaussian is characterized 61 00:05:01,690 --> 00:05:07,030 by a first cumulant, which is the coefficient of minus ik. 62 00:05:07,030 --> 00:05:07,550 It's lambda. 63 00:05:12,440 --> 00:05:15,550 It is characterized by a second cumulant, which 64 00:05:15,550 --> 00:05:20,480 is the coefficient of minus ik squared. 65 00:05:20,480 --> 00:05:22,160 This, you know, is the variance. 66 00:05:22,160 --> 00:05:25,170 And we can explicitly see that the coefficient 67 00:05:25,170 --> 00:05:29,040 of minus ik squared over 2 factorial 68 00:05:29,040 --> 00:05:31,660 is simply sigma squared. 69 00:05:31,660 --> 00:05:34,920 So this is repetition. 70 00:05:34,920 --> 00:05:37,560 But one thing that is interesting 71 00:05:37,560 --> 00:05:40,880 is that our series now terminates, 72 00:05:40,880 --> 00:05:43,010 which means that if I were to look 73 00:05:43,010 --> 00:05:48,280 at the third cumulant, if I were to look at the fourth cumulant, 74 00:05:48,280 --> 00:05:52,470 and so forth, for the Gaussian, they're all 0.
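[As an editorial aside: the vanishing of all Gaussian cumulants beyond the second can be verified by computing moments numerically and combining them into cumulants via the standard moment-cumulant relations. A sketch; lambda = 0.8 and sigma = 1.5 are arbitrary test values, and the explicit kappa formulas below are the textbook inversions, not something written on the board.]

```python
import numpy as np

# Moments <x^n> of a Gaussian by quadrature on a wide grid (tails at 12
# sigma are negligible, so a Riemann sum suffices).
lam, sigma = 0.8, 1.5
x = np.linspace(lam - 12 * sigma, lam + 12 * sigma, 400001)
dx = x[1] - x[0]
p = np.exp(-(x - lam) ** 2 / (2 * sigma ** 2)) / np.sqrt(2 * np.pi * sigma ** 2)
m = [np.sum(x ** n * p) * dx for n in range(5)]  # m[n] = <x^n>

# Standard moment-to-cumulant relations for the first four cumulants.
k1 = m[1]
k2 = m[2] - m[1] ** 2
k3 = m[3] - 3 * m[2] * m[1] + 2 * m[1] ** 3
k4 = m[4] - 4 * m[3] * m[1] - 3 * m[2] ** 2 + 12 * m[2] * m[1] ** 2 - 6 * m[1] ** 4

assert abs(k1 - lam) < 1e-6          # first cumulant: lambda
assert abs(k2 - sigma ** 2) < 1e-6   # second cumulant: sigma squared
assert abs(k3) < 1e-6 and abs(k4) < 1e-6  # third and fourth cumulants vanish
```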
75 00:05:52,470 --> 00:05:54,730 So the Gaussian is the distribution 76 00:05:54,730 --> 00:05:57,850 that is completely characterized just 77 00:05:57,850 --> 00:06:00,940 by its first and second cumulants, all the rest 78 00:06:00,940 --> 00:06:01,763 being 0. 79 00:06:05,080 --> 00:06:09,670 So, now-- last time we developed 80 00:06:09,670 --> 00:06:12,500 some kind of a graphical method. 81 00:06:12,500 --> 00:06:17,840 We said that I can graphically describe the first cumulant 82 00:06:17,840 --> 00:06:22,110 as a bag with one point in it. 83 00:06:22,110 --> 00:06:26,122 Second cumulant with something that has two points in it. 84 00:06:26,122 --> 00:06:29,520 A third cumulant with three points, 85 00:06:29,520 --> 00:06:36,220 a fourth cumulant with four points, and so forth. 86 00:06:36,220 --> 00:06:39,190 This is just rewriting. 87 00:06:39,190 --> 00:06:40,930 Now, the interesting thing was that we 88 00:06:40,930 --> 00:06:47,490 said that the various moments we could express graphically. 89 00:06:47,490 --> 00:06:50,540 So that, for example, the second moment 90 00:06:50,540 --> 00:06:56,760 is either this or this, which then graphically 91 00:06:56,760 --> 00:07:01,120 is the same thing as lambda squared plus sigma squared 92 00:07:01,120 --> 00:07:06,440 because this is indicated by sigma squared. 93 00:07:06,440 --> 00:07:12,790 Now, x cubed you would say is either 94 00:07:12,790 --> 00:07:17,360 three things by themselves or put two of them 95 00:07:17,360 --> 00:07:19,610 together and then one separate. 96 00:07:19,610 --> 00:07:23,860 And this I could do in three different ways. 97 00:07:23,860 --> 00:07:26,090 And in general, for a general distribution, 98 00:07:26,090 --> 00:07:29,980 I would have had another term, which is a triangle. 99 00:07:29,980 --> 00:07:32,710 But the triangle is 0. 100 00:07:32,710 --> 00:07:34,940 So for the Gaussian, this terminates here.
101 00:07:34,940 --> 00:07:40,780 I have lambda cubed plus 3 lambda sigma squared. 102 00:07:40,780 --> 00:07:44,300 If I want to calculate x to the fourth, 103 00:07:44,300 --> 00:07:46,560 maybe the old way of doing it would 104 00:07:46,560 --> 00:07:50,710 have been to multiply the Gaussian distribution against x 105 00:07:50,710 --> 00:07:53,620 to the fourth and try to do the integration. 106 00:07:53,620 --> 00:07:59,060 And you would ultimately be able to do that, rearranging things 107 00:07:59,060 --> 00:08:02,830 and looking at the various powers of the Gaussian 108 00:08:02,830 --> 00:08:05,566 integrated from minus infinity to infinity. 109 00:08:05,566 --> 00:08:06,815 But you can do it graphically. 110 00:08:06,815 --> 00:08:08,950 You can say, OK. 111 00:08:08,950 --> 00:08:14,160 It's either this or I can have-- well, 112 00:08:14,160 --> 00:08:16,640 I cannot put one aside and three together, 113 00:08:16,640 --> 00:08:18,870 because that doesn't exist. 114 00:08:18,870 --> 00:08:25,330 I could have two together and two not together. 115 00:08:25,330 --> 00:08:28,176 And this I can do in six different ways. 116 00:08:28,176 --> 00:08:31,130 You can convince yourself of that. 117 00:08:31,130 --> 00:08:36,210 Or I could do two pairs, which I can do in three different ways 118 00:08:36,210 --> 00:08:39,440 because I can either do one, two; one, three; one, four 119 00:08:39,440 --> 00:08:42,360 and then the other is satisfied. 120 00:08:42,360 --> 00:08:47,280 So this is lambda to the fourth plus 6 lambda squared 121 00:08:47,280 --> 00:08:53,130 sigma squared plus 3 sigma to the fourth. 122 00:08:53,130 --> 00:08:55,678 And you can keep going and doing different things. 123 00:08:59,800 --> 00:09:00,856 OK. 124 00:09:00,856 --> 00:09:01,355 Question? 125 00:09:04,778 --> 00:09:05,278 Yeah? 126 00:09:05,278 --> 00:09:07,748 AUDIENCE: Is the second-- [INAUDIBLE]. 127 00:09:13,182 --> 00:09:13,970 PROFESSOR: There?
128 00:09:13,970 --> 00:09:17,180 AUDIENCE: Because-- so you said that the second cumulant-- 129 00:09:17,180 --> 00:09:18,144 PROFESSOR: Oh. 130 00:09:18,144 --> 00:09:19,090 AUDIENCE: --x squared. 131 00:09:19,090 --> 00:09:19,590 Yes. 132 00:09:19,590 --> 00:09:20,105 PROFESSOR: Yes. 133 00:09:20,105 --> 00:09:21,015 So that's the wrong-- 134 00:09:21,015 --> 00:09:21,931 AUDIENCE: [INAUDIBLE]. 135 00:09:21,931 --> 00:09:26,520 PROFESSOR: The coefficient of k squared is the second cumulant. 136 00:09:26,520 --> 00:09:28,210 The additional 2 was a mistake. 137 00:09:34,178 --> 00:09:35,120 OK. 138 00:09:35,120 --> 00:09:35,930 Anything else? 139 00:09:40,790 --> 00:09:42,450 All right. 140 00:09:42,450 --> 00:09:46,979 Let's take a look at a couple of other distributions, 141 00:09:46,979 --> 00:09:47,770 this time discrete. 142 00:09:52,760 --> 00:10:07,170 So the binomial distribution is: repeat a binary random 143 00:10:07,170 --> 00:10:09,845 variable. 144 00:10:09,845 --> 00:10:13,110 And what does this mean? 145 00:10:13,110 --> 00:10:21,500 It means two outcomes, that's binary, 146 00:10:21,500 --> 00:10:27,970 let's call them A and B. And if I 147 00:10:27,970 --> 00:10:31,660 have a coin that's head or tails, it's binary. 148 00:10:31,660 --> 00:10:33,390 Two possibilities. 149 00:10:33,390 --> 00:10:38,890 And I can assign probabilities to the two outcomes, 150 00:10:38,890 --> 00:10:44,854 PA and PB, which has to be 1 minus PA. 151 00:10:47,600 --> 00:10:52,160 And the question is if you repeat 152 00:10:52,160 --> 00:11:03,500 this binary random variable N times, 153 00:11:03,500 --> 00:11:17,360 what is the probability of NA outcomes of A? 154 00:11:23,640 --> 00:11:25,605 And I forgot to say something important, 155 00:11:25,605 --> 00:11:28,410 so I write it in red. 156 00:11:28,410 --> 00:11:30,160 These should be independent.
157 00:11:33,055 --> 00:11:41,140 That is, the outcome of a coin toss at, say, 158 00:11:41,140 --> 00:11:43,260 the fifth time should not influence 159 00:11:43,260 --> 00:11:46,131 the sixth time and future times. 160 00:11:46,131 --> 00:11:46,630 OK. 161 00:11:46,630 --> 00:11:48,650 So this is easy. 162 00:11:48,650 --> 00:11:56,010 The probability to have NA occurrences of A in N trials-- 163 00:11:56,010 --> 00:11:58,480 so it has to be indexed by N-- 164 00:11:58,480 --> 00:12:07,490 is that within the N times that I tossed, A came up NA times. 165 00:12:07,490 --> 00:12:11,590 So it has to be proportional to the probability 166 00:12:11,590 --> 00:12:18,870 of A independently multiplied by itself NA times. 167 00:12:18,870 --> 00:12:22,710 But if I have exactly NA occurrences of A, 168 00:12:22,710 --> 00:12:30,950 all the other times I had B occurring. 169 00:12:30,950 --> 00:12:34,110 So I have the probability of B for the remainder, 170 00:12:34,110 --> 00:12:37,460 which is N minus NA. 171 00:12:37,460 --> 00:12:40,600 Now this is the probability for a specific occurrence, 172 00:12:40,600 --> 00:12:45,510 like the first NA times that I threw the coin I will get A. 173 00:12:45,510 --> 00:12:48,290 The remaining times I would get B. 174 00:12:48,290 --> 00:12:51,550 But the order is not important and the number 175 00:12:51,550 --> 00:12:54,050 of ways that I can shuffle the order 176 00:12:54,050 --> 00:12:59,480 and have a total of NA out of N times is the binomial factor. 177 00:13:08,310 --> 00:13:09,990 Fine. 178 00:13:09,990 --> 00:13:10,830 Again, well known. 179 00:13:10,830 --> 00:13:12,785 Let's look at its characteristic function.
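[As an editorial aside: the counting argument above transcribes directly into code, with (N choose NA) pA^NA pB^(N-NA) summing to 1 and averaging to N pA. A sketch; the function name and the test values N = 10, pA = 0.3 are illustrative choices, not from the lecture.]

```python
from math import comb

def binomial_pmf(N, pA):
    # Probability of NA occurrences of A in N independent binary trials:
    # the binomial factor times pA^NA times pB^(N - NA), with pB = 1 - pA.
    return [comb(N, NA) * pA ** NA * (1 - pA) ** (N - NA) for NA in range(N + 1)]

N, pA = 10, 0.3
p = binomial_pmf(N, pA)
assert abs(sum(p) - 1) < 1e-12                 # normalization: probabilities sum to 1
mean = sum(NA * p[NA] for NA in range(N + 1))
assert abs(mean - N * pA) < 1e-12              # mean number of A outcomes is N pA
```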
180 00:13:15,470 --> 00:13:20,530 So p tilde, which is now a function of k, 181 00:13:20,530 --> 00:13:23,800 is the expectation value of e to the minus ik 182 00:13:23,800 --> 00:13:32,060 NA, which means that I have to weigh e to the minus ik 183 00:13:32,060 --> 00:13:39,660 NA against the probability of NA occurrences, 184 00:13:39,660 --> 00:13:53,005 which is this binomial factor PA to the power of NA, PB 185 00:13:53,005 --> 00:13:56,226 to the power of N minus NA. 186 00:13:56,226 --> 00:14:01,450 And of course, I have to sum over all possible values of NA 187 00:14:01,450 --> 00:14:09,500 that go all the way from 0 to N. 188 00:14:09,500 --> 00:14:13,300 So what we have here is something 189 00:14:13,300 --> 00:14:18,505 which is this combination, PA e to the minus ik, 190 00:14:18,505 --> 00:14:21,680 raised to the power of NA. 191 00:14:21,680 --> 00:14:26,400 PB raised to the complement N minus NA, multiplied 192 00:14:26,400 --> 00:14:29,890 by the binomial factor, summed over all possible values. 193 00:14:29,890 --> 00:14:36,610 So this is just the definition of the binomial expansion of PA 194 00:14:36,610 --> 00:14:46,870 e to the minus ik plus PB, raised to the power of N. 195 00:14:46,870 --> 00:14:47,860 And again, let's check. 196 00:14:47,860 --> 00:14:50,940 If I set k equals to 0, I have PA plus PB, 197 00:14:50,940 --> 00:14:54,580 which is 1, raised to the power of N. So things are OK. 198 00:14:54,580 --> 00:14:58,450 So this is the characteristic function. 199 00:14:58,450 --> 00:15:02,600 At this stage, the only thing that I will note about this 200 00:15:02,600 --> 00:15:09,070 is that if I look at the characteristic function 201 00:15:09,070 --> 00:15:13,650 I will get N times. 202 00:15:13,650 --> 00:15:15,420 So this is-- actually, let's make sure 203 00:15:15,420 --> 00:15:19,280 that we maintain the index N. So this is 204 00:15:19,280 --> 00:15:23,940 the characteristic function appropriate to N trials.
205 00:15:23,940 --> 00:15:26,840 And what I get is that up to a factor of N, 206 00:15:26,840 --> 00:15:29,320 I will get the characteristic function 207 00:15:29,320 --> 00:15:31,505 that would be appropriate to one trial. 208 00:15:35,600 --> 00:15:38,170 So what that means is, if I were to look 209 00:15:38,170 --> 00:15:46,650 at powers of k, the expectation value of some cumulant, 210 00:15:46,650 --> 00:15:52,120 if I go to repeat things N times-- so this carries 211 00:15:52,120 --> 00:15:58,820 an index N-- it is going to be simply N times what I would 212 00:15:58,820 --> 00:16:00,780 have had in a single trial. 213 00:16:05,470 --> 00:16:07,490 So for a single trial, you really 214 00:16:07,490 --> 00:16:13,330 have two outcomes-- 0 or 1 occurrences of this object. 215 00:16:13,330 --> 00:16:16,910 So for a binary variable, you can really easily compute 216 00:16:16,910 --> 00:16:19,140 these quantities and then you can 217 00:16:19,140 --> 00:16:24,640 calculate the corresponding ones for N trials simply by 218 00:16:24,640 --> 00:16:27,910 multiplying by N. And we will see 219 00:16:27,910 --> 00:16:30,950 that this is characteristic, essentially, 220 00:16:30,950 --> 00:16:37,090 of anything that is repeated N times, not just the binomial. 221 00:16:37,090 --> 00:16:40,510 So this form-- that if you have N independent objects, 222 00:16:40,510 --> 00:16:42,620 you would get N times what you would 223 00:16:42,620 --> 00:16:46,090 have for one object-- is generally valid 224 00:16:46,090 --> 00:16:48,020 and actually something that we will build 225 00:16:48,020 --> 00:16:51,270 a lot of statistical mechanics on, because we 226 00:16:51,270 --> 00:16:53,710 are interested in the [INAUDIBLE]. 227 00:16:53,710 --> 00:16:56,280 So we will see that shortly. 228 00:16:56,280 --> 00:16:59,070 But rather than following this, let's 229 00:16:59,070 --> 00:17:03,120 look at a third distribution that is closely related, 230 00:17:03,120 --> 00:17:04,089 which is the Poisson.
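[As an editorial aside: the factorization just described, that N independent repetitions raise the one-trial characteristic function PA e^{-ik} + PB to the power N so that cumulants simply add, can be checked directly for the binomial. A sketch; N = 12 and pA = 0.4 are arbitrary test values.]

```python
import cmath
from math import comb

N, pA = 12, 0.4
pB = 1 - pA

def char_fn_direct(k):
    # Sum e^{-ik NA} against the binomial probabilities, as in the lecture.
    return sum(cmath.exp(-1j * k * NA) * comb(N, NA) * pA ** NA * pB ** (N - NA)
               for NA in range(N + 1))

for k in (0.0, 0.7, 1.9):
    single = pA * cmath.exp(-1j * k) + pB     # one-trial characteristic function
    # N-trial characteristic function equals the one-trial one to the power N.
    assert abs(char_fn_direct(k) - single ** N) < 1e-9

# Second cumulant (the variance) is N times the one-trial variance pA*pB.
mean = N * pA
var = sum((NA - mean) ** 2 * comb(N, NA) * pA ** NA * pB ** (N - NA)
          for NA in range(N + 1))
assert abs(var - N * pA * pB) < 1e-9
```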
231 00:17:09,990 --> 00:17:13,359 And the question that we are asking 232 00:17:13,359 --> 00:17:17,819 is-- we have an interval. 233 00:17:17,819 --> 00:17:24,422 And the question is, what is the probability 234 00:17:24,422 --> 00:17:39,070 of m events in an interval from 0 to T? 235 00:17:39,070 --> 00:17:42,420 And I kind of expressed it this way 236 00:17:42,420 --> 00:17:47,680 because the prototypical Poisson distribution is, let's say, 237 00:17:47,680 --> 00:17:49,370 radioactivity. 238 00:17:49,370 --> 00:17:52,860 And you can be waiting for some time interval from 0 239 00:17:52,860 --> 00:17:56,890 to 1 minute and asking within that time interval, what's 240 00:17:56,890 --> 00:18:02,410 the probability that you will see m radioactive decay events? 241 00:18:02,410 --> 00:18:08,470 So what is the probability if two things happen? 242 00:18:08,470 --> 00:18:28,090 One is that the probability of 1 and only 1 event in interval dt 243 00:18:28,090 --> 00:18:32,402 is alpha dt as dt goes to 0. 244 00:18:34,831 --> 00:18:35,330 OK? 245 00:18:35,330 --> 00:18:40,760 So basically if you look at this over 1 minute, 246 00:18:40,760 --> 00:18:43,680 the chances are that you will see so many events, 247 00:18:43,680 --> 00:18:47,540 so many radioactive decays. 248 00:18:47,540 --> 00:18:50,010 If you shorten the interval, the chances 249 00:18:50,010 --> 00:18:53,330 that you would see events would become less and less. 250 00:18:53,330 --> 00:18:55,950 If you make your interval infinitesimal, 251 00:18:55,950 --> 00:18:58,650 most of the time nothing would happen, 252 00:18:58,650 --> 00:19:01,980 and with a very small probability that vanishes 253 00:19:01,980 --> 00:19:08,090 as the size of the interval goes to 0, you will see 1 event. 254 00:19:08,090 --> 00:19:10,130 So this is one condition. 255 00:19:10,130 --> 00:19:16,589 And the second condition is events in different intervals 256 00:19:16,589 --> 00:19:17,255 are independent.
257 00:19:23,430 --> 00:19:26,200 And since I wrote independent in red up there, 258 00:19:26,200 --> 00:19:31,084 let me write it in red here because it sort of harks back 259 00:19:31,084 --> 00:19:32,000 to the same condition. 260 00:19:34,970 --> 00:19:38,014 And so this is the question. 261 00:19:38,014 --> 00:19:39,055 What is this probability? 262 00:19:42,210 --> 00:19:47,630 And to get the answer, what we do 263 00:19:47,630 --> 00:20:02,240 is to subdivide our big interval into N, 264 00:20:02,240 --> 00:20:08,130 which is big T divided by the small dt, subintervals. 265 00:20:13,330 --> 00:20:19,515 So basically, originally let's say on the time axis, 266 00:20:19,515 --> 00:20:24,460 we were covering a distance that went from 0 to big T 267 00:20:24,460 --> 00:20:27,240 and we were asking what happens here. 268 00:20:27,240 --> 00:20:29,920 So what we are doing now is we are 269 00:20:29,920 --> 00:20:37,320 sort of dividing this interval into lots of subintervals, 270 00:20:37,320 --> 00:20:41,470 the size of each one of them being dt. 271 00:20:41,470 --> 00:20:46,950 And therefore, the total number is big T over dt. 272 00:20:46,950 --> 00:20:49,230 And ultimately, clearly, I want to set 273 00:20:49,230 --> 00:20:55,100 dt going to 0 so that this condition is satisfied. 274 00:20:55,100 --> 00:21:00,090 So also because of the second condition, each one of these 275 00:21:00,090 --> 00:21:05,220 will independently tell me whether or not I have an event. 276 00:21:05,220 --> 00:21:09,190 And so if I want to count the total number of events, 277 00:21:09,190 --> 00:21:11,405 I have to add things that are occurring 278 00:21:11,405 --> 00:21:12,760 in different intervals.
279 00:21:12,760 --> 00:21:14,870 And we can see that this problem now 280 00:21:14,870 --> 00:21:18,790 became identical to that problem because each one 281 00:21:18,790 --> 00:21:22,450 of these intervals has two possible outcomes-- nothing 282 00:21:22,450 --> 00:21:26,370 happens with probability 1 minus alpha dt, 283 00:21:26,370 --> 00:21:30,240 something happens with probability alpha dt. 284 00:21:30,240 --> 00:21:38,210 So no event means probability 1 minus alpha dt. 285 00:21:38,210 --> 00:21:42,700 One event means probability alpha dt. 286 00:21:42,700 --> 00:21:44,570 So this is a binomial process. 287 00:21:50,880 --> 00:21:55,930 So we can calculate, for example, 288 00:21:55,930 --> 00:21:59,070 the characteristic function. 289 00:21:59,070 --> 00:22:02,510 And I will indicate that we are looking 290 00:22:02,510 --> 00:22:07,440 at some interval of size T, parameterized 291 00:22:07,440 --> 00:22:09,680 by this rate alpha; we'll see 292 00:22:09,680 --> 00:22:11,220 that only the product will occur. 293 00:22:14,610 --> 00:22:18,230 So this is, as before, the Fourier variable. 294 00:22:18,230 --> 00:22:20,190 We said that it's a binomial, so it 295 00:22:20,190 --> 00:22:27,420 is one of the probabilities times e to the minus ik, 296 00:22:27,420 --> 00:22:31,180 plus the other probability, raised to the power of N. 297 00:22:31,180 --> 00:22:33,690 Now we just substitute the probabilities that we 298 00:22:33,690 --> 00:22:35,350 have over here. 299 00:22:35,350 --> 00:22:43,810 So the probability of not having an event is 1 minus alpha dt. 300 00:22:43,810 --> 00:22:47,640 The probability of having an event is alpha dt. 301 00:22:47,640 --> 00:22:53,500 So alpha dt is going to appear here multiplying e 302 00:22:53,500 --> 00:22:58,440 to the minus ik minus 1. 303 00:22:58,440 --> 00:23:02,620 So alpha dt e to the minus ik is from here. 304 00:23:02,620 --> 00:23:06,760 From PB I will get 1 minus alpha dt.
305 00:23:06,760 --> 00:23:09,470 And I bunched together the two terms 306 00:23:09,470 --> 00:23:13,000 that are proportional to alpha dt. 307 00:23:13,000 --> 00:23:15,480 And then I have to raise to the power of N, which 308 00:23:15,480 --> 00:23:17,660 is T divided by dt. 309 00:23:21,190 --> 00:23:26,230 And this whole prescription is valid in the limit 310 00:23:26,230 --> 00:23:30,290 where dt is going to 0. 311 00:23:30,290 --> 00:23:35,160 So what you have is 1 plus an infinitesimal 312 00:23:35,160 --> 00:23:38,590 raised to a huge power. 313 00:23:38,590 --> 00:23:42,040 And this limiting procedure is equivalent to taking 314 00:23:42,040 --> 00:23:43,530 the exponential. 315 00:23:43,530 --> 00:23:45,500 So basically this is the same thing 316 00:23:45,500 --> 00:23:50,770 as the exponential of what is here multiplied by what is here. 317 00:23:50,770 --> 00:23:53,880 The dt's cancel each other out and the answer 318 00:23:53,880 --> 00:23:58,670 is alpha T, e to the minus ik minus 1. 319 00:24:01,740 --> 00:24:05,490 So the characteristic function for this process 320 00:24:05,490 --> 00:24:08,506 that we described is simply given by this form. 321 00:24:20,170 --> 00:24:21,090 You say, wait. 322 00:24:21,090 --> 00:24:23,440 I didn't ask for the characteristic function. 323 00:24:23,440 --> 00:24:25,970 I wanted the probability. 324 00:24:25,970 --> 00:24:27,020 Well, I say, OK. 325 00:24:27,020 --> 00:24:30,780 The characteristic function is simply the Fourier transform. 326 00:24:30,780 --> 00:24:33,160 So let me Fourier transform back, 327 00:24:33,160 --> 00:24:39,850 and I would say that the probability along not 328 00:24:39,850 --> 00:24:43,210 the Fourier axis but the actual axis 329 00:24:43,210 --> 00:24:45,540 is obtained by the inverse Fourier process. 330 00:24:45,540 --> 00:24:50,120 So I have to do an integral dk over 2 pi 331 00:24:50,120 --> 00:24:54,040 e to the ikx times the characteristic function.
332 00:24:54,040 --> 00:24:59,510 And the characteristic function is e to the-- what was it? 333 00:24:59,510 --> 00:25:11,330 e to the alpha T, e to the minus ik, minus 1. 334 00:25:15,495 --> 00:25:17,770 Well, there is an e to the minus alpha 335 00:25:17,770 --> 00:25:22,280 T that I can simply take outside the integration. 336 00:25:22,280 --> 00:25:27,950 I have the integration over k, e to the ikx. 337 00:25:31,490 --> 00:25:34,050 And then what I will do is I have this factor of e 338 00:25:34,050 --> 00:25:38,783 to the something in the exponent-- e to the alpha T e to the minus ik. 339 00:25:38,783 --> 00:25:42,530 I will use the expansion of the exponential. 340 00:25:42,530 --> 00:25:44,420 So the expansion of the exponential 341 00:25:44,420 --> 00:25:48,340 is a sum over m running from 0 to infinity. 342 00:25:48,340 --> 00:25:50,840 The exponent raised to the m-th power. 343 00:25:50,840 --> 00:25:54,480 So I have alpha T raised to the m-th power, 344 00:25:54,480 --> 00:25:57,350 e to the minus ik raised to the m-th power, divided 345 00:25:57,350 --> 00:25:57,960 by m factorial. 346 00:26:04,150 --> 00:26:09,050 So now I reorder the sum and the integration. 347 00:26:09,050 --> 00:26:11,900 The sum is over m, the integration is over k. 348 00:26:11,900 --> 00:26:14,200 I can reorder them. 349 00:26:14,200 --> 00:26:16,750 So on the things that go outside I 350 00:26:16,750 --> 00:26:22,270 have a sum over m running from 0 to infinity, e to the minus 351 00:26:22,270 --> 00:26:28,800 alpha T, alpha T to the power of m divided by m factorial. 352 00:26:28,800 --> 00:26:35,790 Then I have the integral over k over 2 pi, e to the ik-- 353 00:26:35,790 --> 00:26:40,670 well, I had the x here and I have e to the minus ikm here. 354 00:26:40,670 --> 00:26:42,610 So I have x minus m. 355 00:26:47,120 --> 00:26:50,970 And then I say, OK, this is an integral that I recognize.
356 00:26:50,970 --> 00:26:55,420 The integral of e to the ik times something 357 00:26:55,420 --> 00:26:57,940 is simply a delta function. 358 00:26:57,940 --> 00:27:01,940 So this whole thing is a delta function 359 00:27:01,940 --> 00:27:07,320 that says, oh, x has to be an integer. 360 00:27:07,320 --> 00:27:12,430 Because I kind of did something that maybe, in retrospect, 361 00:27:12,430 --> 00:27:15,130 you would have said why are you doing this. 362 00:27:15,130 --> 00:27:20,600 Because along how many times things have occurred, 363 00:27:20,600 --> 00:27:24,070 they have either occurred 0 times, 1 time, 2 decays, 364 00:27:24,070 --> 00:27:24,800 3 decays. 365 00:27:24,800 --> 00:27:28,040 I don't have 2.5 decays. 366 00:27:28,040 --> 00:27:31,630 So I treated x as a continuous variable, 367 00:27:31,630 --> 00:27:35,800 but the mathematics was really clever enough to say that, no, 368 00:27:35,800 --> 00:27:40,010 the only places that you can have are really integer values. 369 00:27:40,010 --> 00:27:45,230 And the probability that you have some particular 370 00:27:45,230 --> 00:27:51,020 integer value m is simply what we have over here, e to the minus alpha 371 00:27:51,020 --> 00:27:56,954 T, alpha T to the power of m, divided by m factorial, which 372 00:27:56,954 --> 00:27:58,120 is the Poisson distribution. 373 00:28:07,810 --> 00:28:08,310 OK. 374 00:28:13,750 --> 00:28:16,780 But fine. 375 00:28:16,780 --> 00:28:18,460 So this is the Poisson distribution, 376 00:28:18,460 --> 00:28:21,450 but really we went through the route 377 00:28:21,450 --> 00:28:24,760 of the characteristic function in order 378 00:28:24,760 --> 00:28:28,430 to use this machinery that we developed earlier 379 00:28:28,430 --> 00:28:31,770 for cumulants, et cetera. 380 00:28:31,770 --> 00:28:34,620 So let's look at the cumulant generating function.
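[As an editorial aside: the Poisson form just obtained can be sanity-checked before moving on. It should be normalized, and summing e^{-ikm} against it should reproduce the characteristic function exp(alpha T (e^{-ik} - 1)) derived earlier. A sketch; aT = 2.5 and the cutoff at 80 terms are arbitrary choices.]

```python
import cmath
from math import exp, factorial

aT = 2.5  # the product alpha * T, the only combination that appears
pmf = [exp(-aT) * aT ** m / factorial(m) for m in range(80)]
assert abs(sum(pmf) - 1) < 1e-12  # normalization (the tail beyond 80 is negligible)

for k in (0.0, 0.6, 1.4):
    # Characteristic function computed directly from the probabilities...
    direct = sum(cmath.exp(-1j * k * m) * pmf[m] for m in range(80))
    # ...versus the closed form exp(aT (e^{-ik} - 1)) from the binomial limit.
    closed = cmath.exp(aT * (cmath.exp(-1j * k) - 1))
    assert abs(direct - closed) < 1e-10
```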
381 00:28:34,620 --> 00:28:40,490 So I have to take the log of the function 382 00:28:40,490 --> 00:28:43,360 that I had calculated there. 383 00:28:43,360 --> 00:28:46,850 It is nicely in the exponential, so I 384 00:28:46,850 --> 00:28:51,950 get alpha T, e to the minus ik minus 1. 385 00:28:55,690 --> 00:29:01,510 So now I can make an expansion of this in powers of k, 386 00:29:01,510 --> 00:29:03,950 so I can expand the exponential. 387 00:29:03,950 --> 00:29:07,940 The first term vanishes because this starts with 1. 388 00:29:07,940 --> 00:29:13,440 So really I have alpha T, sum, 389 00:29:13,440 --> 00:29:18,510 n running from 1 to infinity, of minus ik 390 00:29:18,510 --> 00:29:20,420 to the power of n over n factorial. 391 00:29:28,230 --> 00:29:35,680 So my task for identifying the cumulants 392 00:29:35,680 --> 00:29:39,210 is to look at the expansion of this log 393 00:29:39,210 --> 00:29:44,510 and read off powers of minus ik to the n over n factorial. 394 00:29:44,510 --> 00:29:46,620 So what do we see? 395 00:29:46,620 --> 00:29:53,810 We see that the first cumulant of the Poisson is alpha T, 396 00:29:53,810 --> 00:29:57,090 but all the coefficients are the same thing. 397 00:29:57,090 --> 00:29:59,260 The expectation value-- sorry. 398 00:29:59,260 --> 00:30:03,320 The second cumulant is alpha T. The third cumulant, 399 00:30:03,320 --> 00:30:06,310 the fourth cumulant, all the other cumulants 400 00:30:06,310 --> 00:30:15,090 are also alpha T. 401 00:30:15,090 --> 00:30:19,190 So the average number of decays that you see in the interval 402 00:30:19,190 --> 00:30:23,260 is simply alpha T. But there are fluctuations, 403 00:30:23,260 --> 00:30:25,100 and if somebody should, for example, 404 00:30:25,100 --> 00:30:30,660 ask you what's the average number cubed of events, 405 00:30:30,660 --> 00:30:32,730 you would say, OK.
406 00:30:32,730 --> 00:30:35,750 I'm going to use the relationship between moments 407 00:30:35,750 --> 00:30:37,130 and cumulants. 408 00:30:37,130 --> 00:30:42,830 I can either have three first objects 409 00:30:42,830 --> 00:30:47,640 or I can put one of them separate, in three 410 00:30:47,640 --> 00:30:50,010 different fashions. 411 00:30:50,010 --> 00:30:56,280 But this is a case where the triangle is allowed, 412 00:30:56,280 --> 00:30:59,220 so diagrammatically all three are possible. 413 00:30:59,220 --> 00:31:03,640 And so the answer for the first term is alpha T cubed. 414 00:31:06,260 --> 00:31:09,620 For the second term, it is a factor of 3. 415 00:31:09,620 --> 00:31:13,960 Both the variance and the mean give me a factor of alpha T, 416 00:31:13,960 --> 00:31:17,110 so I will get alpha T squared. 417 00:31:17,110 --> 00:31:20,830 And the third term, which is the third cumulant, 418 00:31:20,830 --> 00:31:26,163 is also alpha T. So the answer is simply of this form. 419 00:31:29,550 --> 00:31:31,620 Again, m is an integer. 420 00:31:31,620 --> 00:31:33,330 Alpha T is dimensionless. 421 00:31:33,330 --> 00:31:37,985 So there is no dimension problem with it having different powers. 422 00:31:42,050 --> 00:31:42,920 OK. 423 00:31:42,920 --> 00:31:43,550 Any questions? 424 00:31:50,050 --> 00:31:50,650 All right. 425 00:31:50,650 --> 00:31:58,010 So that's what I wanted to say about one variable. 426 00:31:58,010 --> 00:32:03,685 Now let's go and look at corresponding definitions 427 00:32:03,685 --> 00:32:05,060 when you have multiple variables. 428 00:32:13,140 --> 00:32:27,250 So for many random variables, the set of possible outcomes, 429 00:32:27,250 --> 00:32:32,650 let's say, has variables x1, x2. 430 00:32:32,650 --> 00:32:33,990 Let's be precise. 431 00:32:33,990 --> 00:32:36,440 Let's end it at xn.
432 00:32:36,440 --> 00:32:41,090 And if these are distributed, each one of them 433 00:32:41,090 --> 00:32:45,220 continuously over the interval, to each point 434 00:32:45,220 --> 00:32:50,380 we can characterize some kind of a probability density. 435 00:32:50,380 --> 00:32:57,940 So this entity is called the joint probability density 436 00:32:57,940 --> 00:33:00,630 function. 437 00:33:00,630 --> 00:33:14,950 And its definition would be to look at probability of outcome 438 00:33:14,950 --> 00:33:22,360 in some interval that is between, say, x1, x1 439 00:33:22,360 --> 00:33:30,770 plus dx1 in one, x2, x2 plus dx2 in the second variable. 440 00:33:30,770 --> 00:33:36,750 xn, xn plus dxn in the last variable. 441 00:33:36,750 --> 00:33:40,560 So you sort of look at the particular point 442 00:33:40,560 --> 00:33:44,420 in this multi-dimensional space that you are interested in. 443 00:33:44,420 --> 00:33:48,340 You build a little cube around it. 444 00:33:48,340 --> 00:33:51,580 You ask, what's the probability to be in that cube? 445 00:33:51,580 --> 00:33:55,420 And then you divide by the volume of that cube. 446 00:33:55,420 --> 00:34:01,960 So dx1, dx2, dxn, which is the same thing 447 00:34:01,960 --> 00:34:06,530 that you would be doing in constructing any density 448 00:34:06,530 --> 00:34:10,719 and by ultimately taking the limit that all of the dx's 449 00:34:10,719 --> 00:34:11,688 go to 0. 450 00:34:21,250 --> 00:34:22,060 All right. 451 00:34:22,060 --> 00:34:26,469 So this is the joint probability distribution. 452 00:34:26,469 --> 00:34:30,739 You can construct, now, the joint characteristic function. 453 00:34:44,319 --> 00:34:45,740 Now how do you do that? 454 00:34:45,740 --> 00:34:49,190 Well, again, just like you would do for a Fourier transform 455 00:34:49,190 --> 00:34:52,030 with multiple variables. 456 00:34:52,030 --> 00:34:56,510 So you would go for each variable 457 00:34:56,510 --> 00:34:58,310 to a conjugate variable.
458 00:34:58,310 --> 00:35:00,810 So x1 would go to k1. 459 00:35:00,810 --> 00:35:02,990 x2 would go to k2. 460 00:35:02,990 --> 00:35:05,330 xn would go to kn. 461 00:35:05,330 --> 00:35:08,600 And this would mathematically amount 462 00:35:08,600 --> 00:35:11,670 to calculating the expectation value of e 463 00:35:11,670 --> 00:35:26,470 to the minus i k1x1, k2x2 and so forth, which 464 00:35:26,470 --> 00:35:38,850 you would obtain by integrating over all of these variables, e 465 00:35:38,850 --> 00:35:46,280 to the minus ik alpha x alpha, against the probability of x1 466 00:35:46,280 --> 00:35:47,106 through xn. 467 00:35:53,140 --> 00:35:53,640 Question? 468 00:35:57,966 --> 00:35:59,775 AUDIENCE: It's a bit hard to read. 469 00:35:59,775 --> 00:36:00,858 It's getting really small. 470 00:36:04,475 --> 00:36:07,856 PROFESSOR: OK. 471 00:36:07,856 --> 00:36:10,271 [LAUGHTER] 472 00:36:11,720 --> 00:36:15,020 PROFESSOR: But it's just a multi-dimensional integral. 473 00:36:15,020 --> 00:36:17,080 OK? 474 00:36:17,080 --> 00:36:18,550 All right. 475 00:36:18,550 --> 00:36:25,270 So this is, as in the case of one, I 476 00:36:25,270 --> 00:36:29,540 think the problem is not the size but the angle I see. 477 00:36:29,540 --> 00:36:30,875 I can't do much for that. 478 00:36:30,875 --> 00:36:34,310 You have to move to the center. 479 00:36:34,310 --> 00:36:34,820 OK. 480 00:36:34,820 --> 00:36:40,600 So what we can look at now is joint moments. 481 00:36:46,310 --> 00:36:50,350 So you can-- when we had one variable, 482 00:36:50,350 --> 00:36:52,970 we could look at something like the expectation 483 00:36:52,970 --> 00:36:54,490 value of x to the m. 484 00:36:57,890 --> 00:37:00,990 That would be the m-th moment. 485 00:37:00,990 --> 00:37:04,270 But if you have two variables, we can raise x1 486 00:37:04,270 --> 00:37:08,270 to some power, x2 to another power, 487 00:37:08,270 --> 00:37:12,430 and actually xn to another power.
488 00:37:12,430 --> 00:37:13,980 So this is a joint moment. 489 00:37:18,310 --> 00:37:23,310 Now the thing is, that the same way that moments 490 00:37:23,310 --> 00:37:25,500 for one variable could be generated 491 00:37:25,500 --> 00:37:28,770 by expanding the characteristic function, 492 00:37:28,770 --> 00:37:32,380 if I were to expand this function in powers of k, 493 00:37:32,380 --> 00:37:35,820 you can see that in computing the expectation value, 494 00:37:35,820 --> 00:37:38,840 I will get various powers of x1 to some power, 495 00:37:38,840 --> 00:37:41,400 x2 to some power, et cetera. 496 00:37:41,400 --> 00:37:44,550 So by appropriate expansion of that function, 497 00:37:44,550 --> 00:37:49,200 I can generate all of-- read off all of these moments. 498 00:37:49,200 --> 00:37:53,650 Now, a more common way of generating the Taylor series 499 00:37:53,650 --> 00:37:56,620 expansion is through derivatives. 500 00:37:56,620 --> 00:38:02,810 So what I can do is I can take a derivative with respect 501 00:38:02,810 --> 00:38:03,860 to, say, ik1. 502 00:38:08,280 --> 00:38:11,340 If I take a derivative with respect 503 00:38:11,340 --> 00:38:14,580 to ik1 here, what happens is I will bring down 504 00:38:14,580 --> 00:38:16,490 a factor of minus x alpha. 505 00:38:16,490 --> 00:38:19,420 So actually let me put the minus so it 506 00:38:19,420 --> 00:38:22,340 becomes a factor of x alpha. 507 00:38:22,340 --> 00:38:29,060 And if I integrate x alpha against this, 508 00:38:29,060 --> 00:38:31,620 I will be generating the expectation value 509 00:38:31,620 --> 00:38:37,990 of x alpha provided that ultimately I set all of the k's 510 00:38:37,990 --> 00:38:38,990 to 0. 511 00:38:38,990 --> 00:38:41,080 So I will calculate the derivative 512 00:38:41,080 --> 00:38:45,160 of this function with respect to all of these arguments. 513 00:38:45,160 --> 00:38:48,170 At the end of the day, I will set k equals to 0.
514 00:38:48,170 --> 00:38:51,720 That will give me the expectation value of x1. 515 00:38:51,720 --> 00:38:57,550 But I don't want x1, I want x1 raised to the power of m1. 516 00:38:57,550 --> 00:38:59,130 So I do this. 517 00:38:59,130 --> 00:39:02,590 Each time I take a derivative with respect to minus ik, 518 00:39:02,590 --> 00:39:06,720 I will bring down the factor of the corresponding x. 519 00:39:06,720 --> 00:39:10,240 And I can do this with multiple different things. 520 00:39:10,240 --> 00:39:17,120 So d by the ik2 raised to the power of m2 521 00:39:17,120 --> 00:39:23,310 minus d by the ikn, the whole thing 522 00:39:23,310 --> 00:39:26,580 raised to the power of mn. 523 00:39:29,650 --> 00:39:32,560 So I can either take this function 524 00:39:32,560 --> 00:39:35,430 of multiple variables-- k1 through kn-- 525 00:39:35,430 --> 00:39:39,620 and expand it and read off the appropriate powers of k1k2. 526 00:39:39,620 --> 00:39:43,090 Or I can say that the terms in this expansion 527 00:39:43,090 --> 00:39:45,830 are generated through taking appropriate derivative. 528 00:39:45,830 --> 00:39:47,145 Yes? 529 00:39:47,145 --> 00:39:48,520 AUDIENCE: Is there any reason why 530 00:39:48,520 --> 00:39:52,810 you're choosing to take a derivative with respect 531 00:39:52,810 --> 00:39:57,406 to ikj instead of simply putting the i in the numerator? 532 00:39:57,406 --> 00:39:59,641 Or are there-- are there things that I'm not-- 533 00:39:59,641 --> 00:40:00,600 PROFESSOR: No. 534 00:40:00,600 --> 00:40:01,100 No. 535 00:40:01,100 --> 00:40:01,950 There is no reason. 536 00:40:01,950 --> 00:40:05,341 So you're saying why didn't I write this as this like this? 537 00:40:05,341 --> 00:40:05,840 i divided? 538 00:40:05,840 --> 00:40:07,585 AUDIENCE: Yeah. 539 00:40:07,585 --> 00:40:10,230 PROFESSOR: I think I just visually saw 540 00:40:10,230 --> 00:40:13,830 that it was kind of more that way. 541 00:40:13,830 --> 00:40:15,420 But it's exactly the same thing. 
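The derivative prescription just discussed can be illustrated for a single discrete variable: build the characteristic function exactly, then take the derivative with respect to minus ik at k = 0 by a finite difference, which brings down one factor of x. The distribution below is made up purely for the illustration:

```python
import cmath

# a made-up discrete distribution: outcomes and probabilities
xs = [0.0, 1.0, 2.0]
ps = [0.2, 0.5, 0.3]

def phi(k):
    """Exact characteristic function <exp(-i k x)>."""
    return sum(p * cmath.exp(-1j * k * x) for p, x in zip(ps, xs))

# d/d(-ik) at k = 0 gives <x>, i.e. <x> = i * phi'(0);
# approximate the derivative by a central finite difference
h = 1e-5
mean_from_phi = (1j * (phi(h) - phi(-h)) / (2 * h)).real
mean_direct = sum(p * x for p, x in zip(ps, xs))
```

Higher moments follow the same way from higher derivatives, and whether you write the i in the numerator or the denominator makes no difference, exactly as in the exchange above.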
542 00:40:15,420 --> 00:40:18,266 Yes. 543 00:40:18,266 --> 00:40:18,766 OK. 544 00:40:22,660 --> 00:40:25,460 All right. 545 00:40:25,460 --> 00:40:27,930 Now the interesting object, of course, to us 546 00:40:27,930 --> 00:40:29,820 is more the joint cumulants. 547 00:40:34,800 --> 00:40:39,270 So how do we generate joint cumulants? 548 00:40:39,270 --> 00:40:44,310 Well previously, essentially we had a bunch of objects 549 00:40:44,310 --> 00:40:48,830 for one variable that was some moment. 550 00:40:48,830 --> 00:40:50,770 And in order to make them cumulants, 551 00:40:50,770 --> 00:40:52,540 we just put a sub C here. 552 00:40:52,540 --> 00:40:55,390 So we do that and we are done. 553 00:40:55,390 --> 00:40:58,390 But what operationally happened 554 00:40:58,390 --> 00:41:01,990 was that we did the expansion rather 555 00:41:01,990 --> 00:41:04,400 than for the characteristic function 556 00:41:04,400 --> 00:41:08,430 for the log of the characteristic function. 557 00:41:08,430 --> 00:41:12,630 So all I need to do is to do precisely 558 00:41:12,630 --> 00:41:17,120 this set of derivatives applied rather 559 00:41:17,120 --> 00:41:21,290 than to the joint characteristic function 560 00:41:21,290 --> 00:41:24,260 to the log of the joint characteristic function. 561 00:41:24,260 --> 00:41:27,849 And at the end, set all of the k's to 0. 562 00:41:30,702 --> 00:41:31,202 OK? 563 00:41:37,920 --> 00:41:47,650 So by looking at these two definitions and the expansion 564 00:41:47,650 --> 00:41:51,580 of the log, for example, you can calculate various things. 565 00:41:51,580 --> 00:42:00,730 Like, for example, x1x2 with a C is the expectation value 566 00:42:00,730 --> 00:42:03,440 of x1x2. 567 00:42:03,440 --> 00:42:13,600 This joint moment minus the average of x1 times the average of x2, just as you would have thought, 568 00:42:13,600 --> 00:42:17,710 would be the appropriate generalization of the variance. 569 00:42:17,710 --> 00:42:18,835 And this is the covariance.
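As a concrete illustration of the covariance as a joint cumulant, take a two-variable distribution on a tiny grid (the numbers are invented for the example):

```python
# a made-up joint distribution of (x1, x2) on a 2 x 2 grid
joint = {
    (0, 0): 0.1, (0, 1): 0.2,
    (1, 0): 0.3, (1, 1): 0.4,
}

m1 = sum(p * x1 for (x1, x2), p in joint.items())        # <x1>
m2 = sum(p * x2 for (x1, x2), p in joint.items())        # <x2>
m12 = sum(p * x1 * x2 for (x1, x2), p in joint.items())  # joint moment <x1 x2>

covariance = m12 - m1 * m2  # the joint cumulant <x1 x2> with a sub C
```

Here the covariance comes out slightly negative even though all the moments are positive, which is exactly why the cumulant, not the moment, carries the essence of the correlation.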
570 00:42:23,360 --> 00:42:28,560 And you can construct appropriate extensions. 571 00:42:28,560 --> 00:42:29,060 OK. 572 00:42:43,410 --> 00:42:50,967 Now we made a lot of use of the relationship between moments 573 00:42:50,967 --> 00:42:51,550 and cumulants. 574 00:42:51,550 --> 00:42:54,560 We just-- so the idea, really, was 575 00:42:54,560 --> 00:42:58,110 that the essence of a probability distribution 576 00:42:58,110 --> 00:43:01,030 is characterized in the cumulants. 577 00:43:01,030 --> 00:43:03,910 Moments kind of depend on how you look at things. 578 00:43:03,910 --> 00:43:07,040 The essence is in the cumulants, but sometimes the moments 579 00:43:07,040 --> 00:43:10,020 are more usefully computed, and since there 580 00:43:10,020 --> 00:43:13,120 was a relationship between moments and cumulants, 581 00:43:13,120 --> 00:43:16,025 we can generalize that graphical relation 582 00:43:16,025 --> 00:43:19,640 to the case of joint moments and joint cumulants. 583 00:43:19,640 --> 00:43:29,120 So the graphical relation applies as long 584 00:43:29,120 --> 00:43:39,490 as points are labeled by the appropriate 585 00:43:39,490 --> 00:43:45,040 or corresponding variable. 586 00:43:49,790 --> 00:43:55,970 So suppose I wanted to calculate some kind of a moment that 587 00:43:55,970 --> 00:43:58,780 is x1 squared. 588 00:43:58,780 --> 00:44:05,490 Let's say x2, x3. 589 00:44:05,490 --> 00:44:07,480 This may generate for me many diagrams, 590 00:44:07,480 --> 00:44:11,180 so let's stop here. 591 00:44:11,180 --> 00:44:15,820 So what I can do is I can have points 592 00:44:15,820 --> 00:44:20,150 that I label 1, 1, and 2. 593 00:44:20,150 --> 00:44:24,100 And have them separate from each other. 594 00:44:24,100 --> 00:44:27,800 Or I can start pairing them together. 595 00:44:27,800 --> 00:44:33,000 So one possibility is that I put the 1's together and the 2 596 00:44:33,000 --> 00:44:36,380 sits separately.
597 00:44:36,380 --> 00:44:40,470 Another possibility is that I can group the 1 and the 2 598 00:44:40,470 --> 00:44:42,410 together. 599 00:44:42,410 --> 00:44:46,500 And then the other 1 sits separately. 600 00:44:46,500 --> 00:44:48,860 But I had a choice of two ways to do this, 601 00:44:48,860 --> 00:44:53,180 so this comes-- this diagram with an overall factor of 2. 602 00:44:53,180 --> 00:44:58,770 And then there's the possibility to put all of them 603 00:44:58,770 --> 00:45:01,022 in the same bag. 604 00:45:01,022 --> 00:45:03,420 And so mathematically, that means 605 00:45:03,420 --> 00:45:08,610 that the third-- this particular joint moment 606 00:45:08,610 --> 00:45:17,880 is obtained by taking the average of x1, squared, times the average of x2, which 607 00:45:17,880 --> 00:45:20,030 is the first term. 608 00:45:20,030 --> 00:45:26,450 The second term is the variance of x1. 609 00:45:26,450 --> 00:45:28,440 And then multiplied by the average of x2. 610 00:45:32,030 --> 00:45:40,060 The third term is twice the covariance of x1 and x2 times 611 00:45:40,060 --> 00:45:42,017 the mean of x1. 612 00:45:45,960 --> 00:45:49,610 And the final term is just the third cumulant. 613 00:45:53,034 --> 00:45:57,980 So again, you would need to compute these, presumably, 614 00:45:57,980 --> 00:46:00,200 from the log of the characteristic function 615 00:46:00,200 --> 00:46:01,934 and then you would be done. 616 00:46:27,790 --> 00:46:32,780 Couple of other definitions. 617 00:46:32,780 --> 00:46:42,150 One of them is an unconditional probability. 618 00:46:44,710 --> 00:46:49,960 So very soon we will be talking about, say, probabilities 619 00:46:49,960 --> 00:46:52,470 appropriate to the gas in this room. 620 00:46:52,470 --> 00:46:54,780 And the particles in the gas in this room 621 00:46:54,780 --> 00:46:58,020 will be characterized by where they are, 622 00:46:58,020 --> 00:47:02,700 some position vector q, and how fast they are moving, 623 00:47:02,700 --> 00:47:05,210 some momentum vector p.
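The four-term decomposition of the joint moment above can be checked by sampling. For a jointly Gaussian pair the final term, the third joint cumulant, vanishes, so the three diagrammatic terms alone should reproduce the moment. The correlated pair constructed below is my own illustrative choice, not something from the lecture:

```python
import random

random.seed(0)

# a correlated Gaussian pair: x1 = a + z1, x2 = b + z1 + z2,
# with z1, z2 independent standard normals (made-up construction)
a, b, N = 0.5, -0.3, 200_000
samples = []
for _ in range(N):
    z1, z2 = random.gauss(0, 1), random.gauss(0, 1)
    samples.append((a + z1, b + z1 + z2))

mean1 = sum(x1 for x1, _ in samples) / N
mean2 = sum(x2 for _, x2 in samples) / N
var1 = sum((x1 - mean1) ** 2 for x1, _ in samples) / N
cov = sum((x1 - mean1) * (x2 - mean2) for x1, x2 in samples) / N

lhs = sum(x1 * x1 * x2 for x1, x2 in samples) / N  # the joint moment <x1^2 x2>
# three diagrams: all separate, the pair of 1's, and the (1,2) pairing (twice)
rhs = mean1**2 * mean2 + var1 * mean2 + 2 * cov * mean1
```

Within sampling error the two sides agree, confirming that for this Gaussian case the third joint cumulant contributes nothing.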
624 00:47:05,210 --> 00:47:08,950 And there would be some kind of a probability density 625 00:47:08,950 --> 00:47:13,750 associated with finding a particle with some momentum 626 00:47:13,750 --> 00:47:17,120 at some location in space. 627 00:47:17,120 --> 00:47:20,650 But sometimes I say, well, I really 628 00:47:20,650 --> 00:47:23,370 don't care about where the particles are, 629 00:47:23,370 --> 00:47:27,140 I just want to know how fast they are moving. 630 00:47:27,140 --> 00:47:31,610 So what I really care about is the probability 631 00:47:31,610 --> 00:47:35,180 that I have a particle moving with some momentum p, 632 00:47:35,180 --> 00:47:38,160 irrespective of where it is. 633 00:47:38,160 --> 00:47:42,120 Then all I need to do is to integrate 634 00:47:42,120 --> 00:47:45,545 over the position the joint probability distribution. 635 00:47:48,230 --> 00:47:50,560 And the check that this is correct 636 00:47:50,560 --> 00:47:54,360 is that if I now also integrate this over p, 637 00:47:54,360 --> 00:47:57,570 this would be integrating over the entire space, 638 00:47:57,570 --> 00:47:59,930 and the joint probability is appropriately 639 00:47:59,930 --> 00:48:04,630 normalized so that the full integration will give me one. 640 00:48:04,630 --> 00:48:09,400 So this is a correctly normalized probability. 641 00:48:09,400 --> 00:48:14,930 And more generally, if I'm interested in, say, a bunch of 642 00:48:14,930 --> 00:48:21,540 coordinates x1 through xs, out of a larger list of coordinates 643 00:48:21,540 --> 00:48:29,510 that spans x1 through xs all the way to something else, 644 00:48:29,510 --> 00:48:34,350 all I need to do to get the unconditional probability is 645 00:48:34,350 --> 00:48:39,770 to integrate over the variables that I'm not interested in. 646 00:48:39,770 --> 00:48:43,800 Again, the check is that it's properly normalized.
647 00:48:46,660 --> 00:48:48,880 Now, this is to be contrasted with 648 00:48:48,880 --> 00:48:50,210 the conditional probability. 649 00:48:55,470 --> 00:48:58,440 The conditional probability, let's 650 00:48:58,440 --> 00:49:02,955 say we would be interested in calculating the pressure that 651 00:49:02,955 --> 00:49:05,110 is exerted on the board. 652 00:49:05,110 --> 00:49:07,680 The pressure is exerted by the particles that 653 00:49:07,680 --> 00:49:10,340 impinge on the board and then go away, 654 00:49:10,340 --> 00:49:13,620 so I'm interested in the momentum of particles 655 00:49:13,620 --> 00:49:17,290 right at the board, not anywhere else in space. 656 00:49:17,290 --> 00:49:23,760 So if I'm interested in the momentum of particles 657 00:49:23,760 --> 00:49:28,870 at the particular location, which could in principle depend 658 00:49:28,870 --> 00:49:36,440 on location-- so now q is a parameter, p is the variable, 659 00:49:36,440 --> 00:49:41,010 but the probability distribution could depend on q. 660 00:49:41,010 --> 00:49:43,020 How do we obtain this? 661 00:49:43,020 --> 00:49:43,870 This, again, 662 00:49:43,870 --> 00:49:48,330 is going to be proportional to the probability 663 00:49:48,330 --> 00:49:50,650 that I will find a particle both at 664 00:49:50,650 --> 00:49:53,430 this location and with momentum p. 665 00:49:53,430 --> 00:49:56,480 So I need to have that. 666 00:49:56,480 --> 00:49:58,180 But it's not exactly that; there's 667 00:49:58,180 --> 00:50:01,310 a normalization involved. 668 00:50:01,310 --> 00:50:04,210 And the way to get the normalization is 669 00:50:04,210 --> 00:50:13,590 to note that if I integrate this probability over its variable p 670 00:50:13,590 --> 00:50:24,050 but not over the parameter q, the answer should be 1.
671 00:50:24,050 --> 00:50:27,320 So this is going to be, if I apply it 672 00:50:27,320 --> 00:50:32,760 to the right-hand side, the integral over p 673 00:50:32,760 --> 00:50:39,850 of p of p and q, which we recognize 674 00:50:39,850 --> 00:50:44,470 as an example of an unconditional probability 675 00:50:44,470 --> 00:50:48,970 to find something at position q. 676 00:50:48,970 --> 00:50:55,480 So the normalization is going to be this so that the ratio is 1. 677 00:50:55,480 --> 00:51:02,070 So most generally, we find that the probability 678 00:51:02,070 --> 00:51:08,610 to have some subset of variables, given 679 00:51:08,610 --> 00:51:12,930 that the values of the other variables in the list 680 00:51:12,930 --> 00:51:19,110 are somehow fixed, is given by the joint probability of all 681 00:51:19,110 --> 00:51:24,970 of the variables x1 through xn divided 682 00:51:24,970 --> 00:51:29,290 by the unconditional probability that applies 683 00:51:29,290 --> 00:51:31,530 to the parameters that are fixed. 684 00:51:35,844 --> 00:51:37,260 And this is called Bayes' theorem. 685 00:51:57,510 --> 00:52:05,240 By the way, if variables are independent, 686 00:52:05,240 --> 00:52:08,340 which actually does apply to the case of the particles 687 00:52:08,340 --> 00:52:11,300 in this room as far as their momentum and position are 688 00:52:11,300 --> 00:52:15,900 concerned, then the joint probability 689 00:52:15,900 --> 00:52:18,190 is going to be the product of one that 690 00:52:18,190 --> 00:52:20,950 is appropriate to the position and one 691 00:52:20,950 --> 00:52:22,535 that is appropriate to the momentum. 692 00:52:25,990 --> 00:52:29,050 And if you have this independence, 693 00:52:29,050 --> 00:52:31,835 then what you'll find is that there 694 00:52:31,835 --> 00:52:34,730 is no difference between conditional and unconditional 695 00:52:34,730 --> 00:52:36,460 probabilities.
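Both constructions can be illustrated with a toy discrete joint distribution (the labels and numbers below are invented for the example): summing out the momentum-like variable gives the unconditional probability of position, and dividing the joint by it gives a conditional distribution that is normalized at each fixed q, which is the check discussed above.

```python
# a made-up joint probability of (position q, momentum label p)
joint = {
    ('q1', 'slow'): 0.15, ('q1', 'fast'): 0.25,
    ('q2', 'slow'): 0.35, ('q2', 'fast'): 0.25,
}

# unconditional probability of q: sum (the discrete analog of integrating) over p
p_q = {}
for (q, p), w in joint.items():
    p_q[q] = p_q.get(q, 0.0) + w

# conditional probability p(p | q) = joint / unconditional
cond = {(q, p): w / p_q[q] for (q, p), w in joint.items()}

# at each fixed q, summing the conditional over p must give 1
norm_q1 = cond[('q1', 'slow')] + cond[('q1', 'fast')]
norm_q2 = cond[('q2', 'slow')] + cond[('q2', 'fast')]
```

Note the conditional depends on q: here the momentum distribution at q1 differs from the one at q2, exactly as in the board example.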
696 00:52:36,460 --> 00:52:38,610 And when you go through this procedure, 697 00:52:38,610 --> 00:52:42,110 you will find that all the joint cumulants-- but not 698 00:52:42,110 --> 00:52:44,780 the joint moments, naturally-- all the joint cumulants 699 00:52:44,780 --> 00:52:45,748 will be 0. 700 00:52:52,050 --> 00:52:52,800 OK. 701 00:52:52,800 --> 00:52:53,985 Any questions? 702 00:52:53,985 --> 00:52:54,485 Yes? 703 00:52:54,485 --> 00:53:00,286 AUDIENCE: Could you explain how the condition of p-- 704 00:53:00,286 --> 00:53:03,200 PROFESSOR: How this was obtained? 705 00:53:03,200 --> 00:53:04,180 Or the one above? 706 00:53:04,180 --> 00:53:05,164 AUDIENCE: Yeah. 707 00:53:05,164 --> 00:53:07,624 The condition you applied that the integral is 1. 708 00:53:07,624 --> 00:53:08,960 PROFESSOR: OK. 709 00:53:08,960 --> 00:53:14,950 So first of all, what I want to look at 710 00:53:14,950 --> 00:53:18,580 is the probability that is appropriate to one 711 00:53:18,580 --> 00:53:22,240 random variable at the fixed value of all 712 00:53:22,240 --> 00:53:24,050 the other random variables. 713 00:53:24,050 --> 00:53:27,890 Like you say, in general I should specify the probability 714 00:53:27,890 --> 00:53:31,110 as a function of momentum and position throughout space. 715 00:53:31,110 --> 00:53:33,790 But I'm really interested only at this point. 716 00:53:33,790 --> 00:53:37,730 I don't really care about other points. 717 00:53:37,730 --> 00:53:40,820 However, the answer may depend whether I'm looking at here 718 00:53:40,820 --> 00:53:42,640 or I'm looking at here. 719 00:53:42,640 --> 00:53:45,410 So the answer for the probability of momentum 720 00:53:45,410 --> 00:53:47,710 is parametrized by q. 
721 00:53:47,710 --> 00:53:50,760 On the other hand, I say that I know 722 00:53:50,760 --> 00:53:53,370 the probability over the entire space 723 00:53:53,370 --> 00:53:56,270 to be at this position with the momentum p 724 00:53:56,270 --> 00:53:59,720 as given by this joint probability. 725 00:53:59,720 --> 00:54:02,580 But if I just set that equal to this, 726 00:54:02,580 --> 00:54:05,820 the answer is not correct because the way 727 00:54:05,820 --> 00:54:10,570 that this quantity is normalized is if I first integrate over 728 00:54:10,570 --> 00:54:15,170 all possible values of its variable, p. 729 00:54:15,170 --> 00:54:20,200 The answer should be 1, irrespective of what q is. 730 00:54:20,200 --> 00:54:24,870 So I can define a conditional probability for momentum 731 00:54:24,870 --> 00:54:28,630 here, a conditional probability for momentum there. 732 00:54:28,630 --> 00:54:32,600 In both cases the momentum would be the variable. 733 00:54:32,600 --> 00:54:36,230 And integrating over all possible values of momentum 734 00:54:36,230 --> 00:54:39,430 should give me one for a properly normalized probability 735 00:54:39,430 --> 00:54:40,120 distribution. 736 00:54:40,120 --> 00:54:42,575 AUDIENCE: [INAUDIBLE]. 737 00:54:42,575 --> 00:54:44,700 PROFESSOR: Given that q is something. 738 00:54:44,700 --> 00:54:48,510 So q could be some-- now here q can 739 00:54:48,510 --> 00:54:51,050 be regarded as some parameter. 740 00:54:51,050 --> 00:54:55,680 So the condition is that this integration should give me 1. 741 00:54:55,680 --> 00:54:57,630 I said that on physical grounds, I 742 00:54:57,630 --> 00:55:00,660 expect this conditional probability 743 00:55:00,660 --> 00:55:03,820 to be the joint probability up to some normalization 744 00:55:03,820 --> 00:55:06,270 that I don't know. 745 00:55:06,270 --> 00:55:06,770 OK. 746 00:55:06,770 --> 00:55:09,390 So what is that normalization? 747 00:55:09,390 --> 00:55:11,890 The whole answer should be 1.
748 00:55:11,890 --> 00:55:13,820 What I have to do is an integration 749 00:55:13,820 --> 00:55:17,380 over momentum of the joint probability. 750 00:55:17,380 --> 00:55:22,845 I have said that an integration over some set of variables 751 00:55:22,845 --> 00:55:25,070 of a joint probability will give me 752 00:55:25,070 --> 00:55:28,860 the unconditional probability for all the others. 753 00:55:28,860 --> 00:55:32,860 So integrating over all momentum of this joint probability 754 00:55:32,860 --> 00:55:37,310 will give me the unconditional probability for position. 755 00:55:37,310 --> 00:55:41,550 So the answer is the unconditional probability 756 00:55:41,550 --> 00:55:44,220 for position divided by N. 757 00:55:44,220 --> 00:55:48,220 So N-- this has to be this. 758 00:55:48,220 --> 00:55:53,200 And in general, it would have to be this in order 759 00:55:53,200 --> 00:55:55,370 to ensure that if I integrate over 760 00:55:55,370 --> 00:55:59,180 this first set of variables of the joint probability 761 00:55:59,180 --> 00:56:04,040 distribution, which would give me the unconditional, 762 00:56:04,040 --> 00:56:07,090 it cancels the unconditional in the denominator to give me 1. 763 00:56:15,324 --> 00:56:15,990 Other questions? 764 00:56:20,900 --> 00:56:22,290 OK. 765 00:56:22,290 --> 00:56:28,980 So I'm going to erase this last board 766 00:56:28,980 --> 00:56:35,530 to be underneath that top board in looking 767 00:56:35,530 --> 00:56:40,140 at the joint Gaussian distribution. 768 00:56:40,140 --> 00:56:41,950 So that was the Gaussian, and we want 769 00:56:41,950 --> 00:56:43,290 to look at the joint Gaussian. 770 00:56:53,950 --> 00:56:57,440 So we want to generalize the formula 771 00:56:57,440 --> 00:57:01,855 that we have over there for one variable to multiple variables. 772 00:57:05,160 --> 00:57:09,200 So what I have there initially is a factor, which 773 00:57:09,200 --> 00:57:23,530 is exponential of minus 1/2, x minus lambda squared.
774 00:57:23,530 --> 00:57:26,060 I can write this x minus lambda squared 775 00:57:26,060 --> 00:57:29,680 as x minus lambda x minus lambda. 776 00:57:29,680 --> 00:57:33,040 And then put the variance. 777 00:57:33,040 --> 00:57:39,850 Let's call it 1 over sigma rather than a small sigma 778 00:57:39,850 --> 00:57:42,141 squared or something like this. 779 00:57:42,141 --> 00:57:43,780 Actually, let me just write it as 1 780 00:57:43,780 --> 00:57:46,110 over sigma squared for the time being. 781 00:57:46,110 --> 00:57:53,480 And then the normalization was 1 over root 2 pi sigma squared. 782 00:57:53,480 --> 00:57:55,690 But you say, well, I have multiple variables, 783 00:57:55,690 --> 00:57:59,080 so maybe this is what I would write for each variable. 784 00:58:04,090 --> 00:58:10,410 And then I would sum over all n, running from 1 to N. 785 00:58:10,410 --> 00:58:12,470 So this is essentially the form that I 786 00:58:12,470 --> 00:58:18,240 would have for independent Gaussian variables. 787 00:58:18,240 --> 00:58:21,570 And then I would have to multiply here 788 00:58:21,570 --> 00:58:25,990 factors of 2 pi sigma squared, so I would have 2 pi to the N 789 00:58:25,990 --> 00:58:27,190 over 2. 790 00:58:27,190 --> 00:58:30,710 And I would have product of-- actually, 791 00:58:30,710 --> 00:58:34,370 let's write it as 2 pi to the N square root. 792 00:58:34,370 --> 00:58:36,400 I would have the product of sigma i squared. 793 00:58:40,890 --> 00:58:44,770 But that's just too limiting a form. 794 00:58:44,770 --> 00:58:50,700 The most general form that this quadratic will allow me to have 795 00:58:50,700 --> 00:58:55,980 also has cross terms where it is not only the diagonal terms 796 00:58:55,980 --> 00:58:59,950 x1 and x1 that are multiplying each other, but x2 and x3, 797 00:58:59,950 --> 00:59:01,140 et cetera. 798 00:59:01,140 --> 00:59:07,070 So I would have a sum over both m and n running from 1 to N.
799 00:59:07,070 --> 00:59:09,040 And then the coefficient here, rather 800 00:59:09,040 --> 00:59:14,890 than just being a number, would be something 801 00:59:14,890 --> 00:59:18,520 that would be like a matrix. 802 00:59:18,520 --> 00:59:25,070 Because for each pair m and n, I would have some number. 803 00:59:25,070 --> 00:59:32,420 And I will call them the inverse of some matrix C. 804 00:59:32,420 --> 00:59:37,710 And if you, again, think of the problem as a matrix, 805 00:59:37,710 --> 00:59:41,440 if I have a diagonal matrix, then the product 806 00:59:41,440 --> 00:59:43,990 of elements along the diagonal is the same thing 807 00:59:43,990 --> 00:59:45,740 as the determinant. 808 00:59:45,740 --> 00:59:49,780 If I were to rotate the matrix to have off-diagonal elements, 809 00:59:49,780 --> 00:59:52,752 the determinant will always be there. 810 00:59:52,752 --> 00:59:56,365 So this is really the determinant of C 811 00:59:56,365 --> 00:59:57,610 that will appear here. 812 01:00:00,420 --> 01:00:01,220 Yes? 813 01:00:01,220 --> 01:00:07,045 AUDIENCE: So are you inverting the individual elements of C 814 01:00:07,045 --> 01:00:11,180 or are you inverting the matrix C and taking its elements? 815 01:00:11,180 --> 01:00:14,940 PROFESSOR: Actually a very good point. 816 01:00:14,940 --> 01:00:19,780 I really wanted to write it as the inverse of the matrix 817 01:00:19,780 --> 01:00:22,350 and then pick the mn [INAUDIBLE]. 818 01:00:22,350 --> 01:00:25,890 So we imagine that we have the matrix. 819 01:00:25,890 --> 01:00:30,960 And these are the elements of some-- 820 01:00:30,960 --> 01:00:33,730 so I could have called this whatever I want. 821 01:00:33,730 --> 01:00:37,150 So I could have called the coefficients of x m and x n. 822 01:00:37,150 --> 01:00:44,610 I have chosen to regard them as the inverse 823 01:00:44,610 --> 01:00:48,260 of some other matrix C.
And the reason for that 824 01:00:48,260 --> 01:00:52,350 becomes shortly clear, because the covariances will 825 01:00:52,350 --> 01:00:55,240 be related to the inverse of this matrix. 826 01:00:55,240 --> 01:00:58,731 And hence, that's the appropriate way to look at it. 827 01:00:58,731 --> 01:01:00,730 AUDIENCE: Can [INAUDIBLE] what C means up there? 828 01:01:00,730 --> 01:01:02,030 PROFESSOR: OK. 829 01:01:02,030 --> 01:01:05,845 So let's forget about these lambdas. 830 01:01:05,845 --> 01:01:09,080 So I would have in general for two variables 831 01:01:09,080 --> 01:01:12,640 some coefficient for x1 squared, some coefficient 832 01:01:12,640 --> 01:01:17,730 for x2 squared, and some coefficient for x1, x2. 833 01:01:17,730 --> 01:01:21,070 So I could call this a11. 834 01:01:21,070 --> 01:01:23,580 I could call this a22. 835 01:01:23,580 --> 01:01:26,430 I could call this 2a12. 836 01:01:26,430 --> 01:01:28,550 Or actually I could, if I wanted, 837 01:01:28,550 --> 01:01:33,250 just write it as a12 x1x2 plus a21 x2x1, 838 01:01:33,250 --> 01:01:36,930 where a12 and a21 would be the same. 839 01:01:36,930 --> 01:01:42,350 So what I could then regard this as is x1, x2 840 01:01:42,350 --> 01:01:50,440 times the matrix a11, a12, a21, a22, times x1, x2. 841 01:01:50,440 --> 01:01:54,304 So this is exactly the same as that. 842 01:01:54,304 --> 01:01:55,570 All right? 843 01:01:55,570 --> 01:02:01,490 So these objects here are the elements 844 01:02:01,490 --> 01:02:04,330 of this matrix C inverse. 845 01:02:04,330 --> 01:02:12,790 So I could call this x1, x2 times some matrix A times x1, x2. 846 01:02:12,790 --> 01:02:14,990 That A is a 2 by 2 matrix. 847 01:02:14,990 --> 01:02:19,080 The name I have given to that 2 by 2 matrix is C inverse. 848 01:02:23,160 --> 01:02:23,660 Yes? 849 01:02:23,660 --> 01:02:25,404 AUDIENCE: The matrix is required to be 850 01:02:25,404 --> 01:02:25,840 symmetric though, isn't it?
851 01:02:25,840 --> 01:02:27,360 PROFESSOR: The matrix is required 852 01:02:27,360 --> 01:02:29,450 to be symmetric for any quadratic form. 853 01:02:29,450 --> 01:02:30,270 Yes. 854 01:02:30,270 --> 01:02:33,920 So when I wrote it initially, I wrote as 2 a12. 855 01:02:33,920 --> 01:02:36,410 And then I said, well, I can also write it 856 01:02:36,410 --> 01:02:38,730 this fashion provided the two of them are the same. 857 01:02:38,730 --> 01:02:39,230 Yes? 858 01:02:39,230 --> 01:02:42,150 AUDIENCE: How did you know the determinant of C belonged 859 01:02:42,150 --> 01:02:42,650 there? 860 01:02:42,650 --> 01:02:43,380 PROFESSOR: Pardon? 861 01:02:43,380 --> 01:02:45,020 AUDIENCE: How did you know that the determinant 862 01:02:45,020 --> 01:02:45,545 of C [INAUDIBLE]? 863 01:02:45,545 --> 01:02:46,128 PROFESSOR: OK. 864 01:02:46,128 --> 01:02:48,190 How do I know the determinant of C? 865 01:02:48,190 --> 01:02:51,330 Let's say I give you this form. 866 01:02:51,330 --> 01:02:55,270 And then I don't know what the normalization is. 867 01:02:55,270 --> 01:03:00,390 What I can do is I can do a change of variables from x1 x2 868 01:03:00,390 --> 01:03:06,690 to something like y1 y2 such that when I look at y1 and y2, 869 01:03:06,690 --> 01:03:10,220 the matrix becomes diagonal. 870 01:03:10,220 --> 01:03:13,130 So I can rotate the matrix. 871 01:03:13,130 --> 01:03:18,730 So for any matrix, I can imagine that I will find some U such 872 01:03:18,730 --> 01:03:24,870 that U A U dagger is this diagonal matrix lambda. 873 01:03:24,870 --> 01:03:29,370 Now under these procedures, one thing that does not change 874 01:03:29,370 --> 01:03:30,970 is the determinant. 875 01:03:30,970 --> 01:03:34,290 It's always the product of the eigenvalues.
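The role of the determinant in the normalization can be checked directly in two dimensions. The sketch below picks an arbitrary symmetric, positive-definite matrix C (my own numbers), inverts it by hand, and integrates the resulting density on a grid; the total should come out close to 1 only because the square root of (2 pi) squared times det C sits in the denominator:

```python
import math

# an arbitrary symmetric, positive-definite covariance matrix C
c11, c12, c22 = 1.0, 0.6, 2.0
det_C = c11 * c22 - c12 * c12  # determinant of C (positive here)

# inverse of a 2x2 matrix, written out by hand
i11, i12, i22 = c22 / det_C, -c12 / det_C, c11 / det_C

norm = 1.0 / math.sqrt((2 * math.pi) ** 2 * det_C)

def density(x1, x2):
    """Zero-mean joint Gaussian with the quadratic form x C^{-1} x."""
    quad = i11 * x1 * x1 + 2 * i12 * x1 * x2 + i22 * x2 * x2
    return norm * math.exp(-0.5 * quad)

# crude midpoint-rule integration over a box holding essentially all the weight
h, L = 0.05, 8.0
n = int(2 * L / h)
total = sum(
    density(-L + (i + 0.5) * h, -L + (j + 0.5) * h)
    for i in range(n) for j in range(n)
) * h * h
```

Replacing det_C in the normalization by, say, the product of the diagonal entries of C would make the integral come out wrong whenever c12 is nonzero, which is the point of the question above.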
The way that I set up the problem, I said that if I hadn't made the problem have cross terms, I knew the answer to be the product of the eigenvalues. So if you like, I can start from there and then do a rotation and get the more general form. The answer would stay as the determinant. Yes?

AUDIENCE: The matrix should be positive as well, or no?

PROFESSOR: The matrix should be positive definite in order for the probability to be well-defined and exist, yes. OK. So if you like, by stating that this is a probability, I have imposed a number of conditions, such as symmetry, as well as positivity. Yes. OK. But this is just linear algebra. I will assume that you know linear algebra. OK.

So this is the properly normalized Gaussian joint probability. We are interested in the characteristic function. So what we are interested in is the joint Gaussian characteristic function. And so again, the procedure we saw was that I have to do the Fourier transform.
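The linear-algebra facts just invoked can be checked numerically: the determinant is unchanged by the diagonalizing rotation and equals the product of the eigenvalues, and positivity of those eigenvalues is what makes the Gaussian normalizable. A minimal sketch, using a hypothetical 2 by 2 matrix in the role of C inverse (the numbers are illustrative, not from the lecture):

```python
import numpy as np

# Hypothetical symmetric, positive definite matrix playing the role of
# A = C^{-1} in the quadratic form x^T A x (illustrative values only).
A = np.array([[2.0, 0.5],
              [0.5, 1.0]])

# Diagonalize: for symmetric A, eigh gives an orthogonal U with
# A = U diag(eigs) U^T -- the rotation discussed in the lecture.
eigs, U = np.linalg.eigh(A)
assert np.allclose(U @ np.diag(eigs) @ U.T, A)

# The determinant does not change under the rotation: it is always
# the product of the eigenvalues.
assert np.isclose(np.linalg.det(A), np.prod(eigs))

# Positive definiteness (all eigenvalues > 0) is what makes the
# Gaussian weight exp(-x^T A x / 2) normalizable.
assert np.all(eigs > 0)

# The covariances are then the entries of C = A^{-1}.
C = np.linalg.inv(A)
print(C)
```

Any symmetric positive definite matrix would do here; the assertions fail, as they should, if A is given a non-positive eigenvalue.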
So I have to take this probability that I have over there and do an integration: a product, say, over alpha running from 1 to N of dx alpha, of e to the minus i k alpha x alpha. This product runs over all the variables. Then I have to multiply by this probability that I have up there, which would appear here. OK.

Now, again, maybe an easy way to imagine this is what I was saying previously. Let's imagine that I have rotated into a basis where everything is diagonal. Then in the rotated basis, all you need to do is essentially take a product of characteristic functions such as what we have over here. So the product corresponding to this first term would be the exponential of minus i, sum over n running from 1 to N, of kn lambda n. I guess I'm using n as the variable here. And as long as things are diagonal, the next order term would be a sum over n of kn squared times the corresponding eigenvalue inverted. So remember that in the diagonal form, each one of these sigmas would appear on the diagonal.
If I do my rotation, essentially this first term would not be affected. The next term would give me minus 1/2, sum over m and n: rather than just having k1 squared, k2 squared, et cetera, just like here I would have km kn. What happened previously was that each eigenvalue got inverted. And if you think about rotating a matrix whose eigenvalues have all been inverted, you are really rotating the inverse matrix. So this here would be the inverse of whatever matrix I have here. So this would be Cmn.

So I will leave you to do the corresponding linear algebra here, but the answer is correct. So the answer is that the generator of cumulants for a joint Gaussian distribution has a form which has a bunch of linear terms, minus i sum over n of kn lambda n, and a bunch of second order terms, minus 1/2 sum over m and n of km kn times some coefficient Cmn.

And the series terminates here. So for the joint Gaussian, you have first cumulants: the first cumulant of xm, its expectation value, is the same thing as lambda m. You have covariances, or second cumulants: the xm, xn cumulant is Cmn.
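The claim that the cumulant series terminates is easy to check by sampling: the log of the estimated characteristic function should match minus i k dot lambda minus one half k transpose C k exactly, with no higher terms. A Monte Carlo sketch with hypothetical means and covariances (all values illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameters for a two-variable joint Gaussian.
lam = np.array([1.0, -0.5])          # means lambda_m
C = np.array([[1.0, 0.3],
              [0.3, 0.5]])           # covariance matrix C_mn

x = rng.multivariate_normal(lam, C, size=400_000)

k = np.array([0.3, 0.4])             # a single Fourier point to test at

# Monte Carlo estimate of the characteristic function <exp(-i k . x)>.
estimate = np.mean(np.exp(-1j * x @ k))

# Closed form from the lecture: exp(-i k.lambda - (1/2) k^T C k);
# the cumulant expansion stops at second order.
exact = np.exp(-1j * k @ lam - 0.5 * k @ C @ k)

assert abs(estimate - exact) < 0.01
print(estimate, exact)
```

The agreement holds at any k; repeating the comparison over a grid of k values probes the absence of third and higher cumulants.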
And in particular, the diagonal elements would correspond to the variances. And all the higher orders are 0 because there's no further term in the expansion.

So for example, if I were to calculate this thing that I have on the board here for the case of a Gaussian, I would not have this third term. So the answer that I would write down for this case would be something that didn't have it. And in the way that we have written things, the answer for x1 squared x2 would be just lambda 1 squared lambda 2, plus sigma 1 squared (or let's call it C11) times lambda 2, plus 2 lambda 1 C12. And that's it.

So there is something that follows from this that is used a lot in field theory. And it's called Wick's theorem. So that's just a particular case of this, but let's state it anyway.
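That moment formula for the Gaussian can be verified directly by sampling. A sketch, again with hypothetical values of lambda and C (any symmetric positive definite C works):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical means and covariances for two jointly Gaussian variables.
lam = np.array([1.0, -0.5])
C = np.array([[1.0, 0.3],
              [0.3, 0.5]])

x = rng.multivariate_normal(lam, C, size=1_000_000)
x1, x2 = x[:, 0], x[:, 1]

# <x1^2 x2> for a Gaussian: the third-cumulant term is absent, leaving
# lambda1^2 lambda2 + C11 lambda2 + 2 lambda1 C12.
exact = lam[0]**2 * lam[1] + C[0, 0] * lam[1] + 2 * lam[0] * C[0, 1]
estimate = np.mean(x1**2 * x2)

assert abs(estimate - exact) < 0.05
print(exact, estimate)
```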
So for Gaussian distributed variables of 0 mean, the following condition applies. I can take the first variable raised to power n1, the second variable to n2, the last variable to some other nN, and look at a joint expectation value such as this. And this is 0 if the sum over alpha of n alpha is odd, and is equal to the sum over all pairwise contractions if the sum over alpha of n alpha is even.

So actually, I have right here an example of this. If I have jointly distributed Gaussian variables where the means are all 0, so if I say that lambda 1 and lambda 2 are 0, then this is an odd power, x1 squared x2. Because of the symmetry it has to be 0, but you explicitly see that every term that I have will be multiplying some power of [INAUDIBLE]. Whereas if, rather than this, I was looking at something like x1 squared x2 x3, where the net power is even, then I could sort of imagine putting them into these kinds of diagrams. Or alternatively, I can imagine pairing these things in all possible ways. So one pairing would be this with this, this with this, which would have given me the x1 x1 contraction times the x2 x3 contraction. Another pairing would have been x1 with x2, and then, naturally, x1 with x3.
So I would have gotten the x1, x2 covariance times the x1, x3 covariance. But I could have connected either the first x1 or the second x1 to x2. So this comes with multiplicity 2. And so the answer here would be C11 C23 plus 2 C12 C13. Yes?

AUDIENCE: In your writing of x1 to the n1 [INAUDIBLE], it should be the cumulant, right? Or is it the moment?

PROFESSOR: This is the moment.

AUDIENCE: OK.

PROFESSOR: The contractions are the covariances.

AUDIENCE: OK.

PROFESSOR: So the point is that the Gaussian distribution is completely characterized in terms of its covariances. Once you know the covariances, essentially you know everything. And in particular, you may be interested in some particular combination of x's. And then you express that in terms of all possible pairwise contractions, which are the covariances. And essentially, in all of field theory, you expand around some kind of a Gaussian background, a Gaussian zeroth-order result.
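Both halves of Wick's theorem, the vanishing of odd moments and the pairwise-contraction formula C11 C23 + 2 C12 C13, can be checked by sampling zero-mean Gaussians. A sketch with a hypothetical three-variable covariance matrix:

```python
import numpy as np

rng = np.random.default_rng(0)

# Three zero-mean jointly Gaussian variables with a hypothetical
# symmetric, positive definite covariance matrix C (illustrative values).
C = np.array([[1.0, 0.3, 0.2],
              [0.3, 1.0, 0.4],
              [0.2, 0.4, 1.0]])

x = rng.multivariate_normal(np.zeros(3), C, size=1_000_000)
x1, x2, x3 = x[:, 0], x[:, 1], x[:, 2]

# Odd total power: <x1^2 x2> must vanish for zero-mean Gaussians.
assert abs(np.mean(x1**2 * x2)) < 0.05

# Even total power: sum over all pairwise contractions of x1 x1 x2 x3.
# Pairings: (x1,x1)(x2,x3) once, and (x1,x2)(x1,x3) with multiplicity 2.
wick = C[0, 0] * C[1, 2] + 2 * C[0, 1] * C[0, 2]
estimate = np.mean(x1**2 * x2 * x3)

assert abs(estimate - wick) < 0.05
print(wick, estimate)
```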
And then in your perturbation theory you need various powers of your field, or some combination of powers, and you express them through these kinds of relationships.

Any questions? OK. This is fine. Let's get rid of this. OK.

Now there is one result that all of statistical mechanics hangs on. So I expect that as I get old and infirm and my memory vanishes, the last thing that I will remember before I die will be the central limit theorem.

And why is this important? Because in statistical physics you end up adding lots of things. So really, the question that you have, or should be asking, is this: thermodynamics is a very precise thing. It says that heat goes from the higher temperature to the lower temperature. It doesn't say it does that 50% of the time or 95% of the time. It's a definite statement. If I am telling you that ultimately I'm going to express everything in terms of probabilities, how does that jibe? The reason that it jibes is because of this theorem.
It's because in going from the probabilistic description, you will be dealing with such a large number of variables that probabilistic statements actually become precise deterministic statements. And that's captured by this theorem, which says: let's look at the sum of N random variables. And I will indicate the sum by big X and my random variables as small x's. And let's say that, for the individual things that I'm adding up together, there is some kind of a joint probability distribution out of which I take these random variables. So each instance of this sum is selected from this joint PDF, so X itself is a random variable because of the possible choices of the different xi from this probability distribution.

So what I'm interested in is: what is the probability for the sum? So what is the p that governs this sum? I will go by the route of these characteristic functions. I will say, OK, what's the expectation value of... well, what's the Fourier transform of this probability distribution?
The Fourier transform, by definition, is the expectation of e to the minus ik times this big X, which is the sum over all of the small x's. Do I have that definition somewhere? I erased it. Basically, what is this? If this k was, in fact, different k's (if I had a k1 multiplying x1, a k2 multiplying x2), that would be the definition of the joint characteristic function for this joint probability distribution. So what this is: you take the joint characteristic function, which depends on k1, k2, all the way to kN, and you set all of them to be the same.

So take the joint characteristic function, which depends on N Fourier variables, put all of them equal to the same k, and you have the characteristic function for the sum. So I can certainly do that after adding a log here. Nothing has changed. I know that the log is the generator of the cumulants. So this is a sum over, let's say, n running from 1 to infinity, of minus ik to the power n over n factorial, times the n-th cumulant of the sum. So what is the expansion that I would have for the log of the joint characteristic function?
Well, typically I would have, at the lowest order, k1 times the mean of the first variable, k2 times the mean of the second variable. But all of them are the same. So at first order, I would get minus i times the same k, times the sum over n of the first cumulant of the n-th variable.

Typically, in this second order term, I would have all kinds of products. I would have k1 k3, k2 k4, as well as k1 squared. But now all of them become the same, and so what I will have is minus ik squared, but then I have all possible pairings m, n of the xm, xn cumulants.

AUDIENCE: Question.

PROFESSOR: Yes?

AUDIENCE: [INAUDIBLE] expression, you probably should use different indices when you're summing over elements of the Taylor series and when you're summing over your [INAUDIBLE] random variables. It just gets confusing when both indices are n.

PROFESSOR: This here, you want me to write here, say, i?

AUDIENCE: Yeah.

PROFESSOR: OK. And here I can write i and j. So I think there's still a 2 factorial. And then there are higher orders.
Essentially, then, matching the coefficients of minus ik from the left and minus ik from the right will enable me to calculate relationships between cumulants of the sum and cumulants of the individual variables. The first one of these is not particularly surprising. You would say that the mean of the sum is the sum of the means of the individual variables.

The second statement is that the variance of the sum really involves a pair of indices, i and j, running from 1 to N. So if these variables were independent, you would just be adding the variances. Since they are potentially dependent, you have to also keep track of the covariances. And this kind of summation extends to higher and higher cumulants, essentially including more and more combinations of cumulants that you would put on that side.

And what we do with that, I guess we'll start next time around.
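The two relations just derived, that the first cumulant of the sum is the sum of the means and that its variance is the double sum over all covariances, can be checked numerically. A sketch for N = 3 correlated Gaussian variables with hypothetical parameters:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical joint PDF for N = 3 possibly dependent variables.
lam = np.array([1.0, 2.0, -0.5])
C = np.array([[1.0, 0.3, 0.2],
              [0.3, 1.0, 0.4],
              [0.2, 0.4, 1.0]])

samples = rng.multivariate_normal(lam, C, size=500_000)
X = samples.sum(axis=1)              # each row gives one instance of the sum

# First cumulant of the sum = sum of the individual means.
assert abs(X.mean() - lam.sum()) < 0.05

# Second cumulant of the sum = sum over ALL pairs (i, j) of covariances,
# not just the diagonal variances, since the variables are dependent.
assert abs(X.var() - C.sum()) < 0.2
print(X.mean(), X.var())
```

Setting the off-diagonal entries of C to zero reduces the second check to the familiar "variances add" rule for independent variables.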