1 00:00:00,060 --> 00:00:01,780 The following content is provided 2 00:00:01,780 --> 00:00:04,019 under a Creative Commons license. 3 00:00:04,019 --> 00:00:06,870 Your support will help MIT OpenCourseWare continue 4 00:00:06,870 --> 00:00:10,730 to offer high quality educational resources for free. 5 00:00:10,730 --> 00:00:13,340 To make a donation or view additional materials 6 00:00:13,340 --> 00:00:17,217 from hundreds of MIT courses, visit MIT OpenCourseWare 7 00:00:17,217 --> 00:00:17,842 at ocw.mit.edu. 8 00:00:20,365 --> 00:00:25,220 PROFESSOR: We'll begin this time by looking at some probability 9 00:00:25,220 --> 00:00:27,310 distributions that you should be familiar 10 00:00:27,310 --> 00:00:30,510 with from this perspective, starting 11 00:00:30,510 --> 00:00:33,780 with a Gaussian distribution for one variable. 12 00:00:42,100 --> 00:00:44,660 We're focused on a variable that takes 13 00:00:44,660 --> 00:00:48,670 real values in the interval minus infinity to infinity 14 00:00:48,670 --> 00:00:53,255 and the Gaussian has the form exponential that 15 00:00:53,255 --> 00:00:57,450 is centered around some value, let's call it lambda, 16 00:00:57,450 --> 00:01:02,380 and has fluctuations around this value parameterized by sigma. 17 00:01:02,380 --> 00:01:07,940 And the integral of this p over the interval 18 00:01:07,940 --> 00:01:10,970 should be normalized to unity, giving you 19 00:01:10,970 --> 00:01:16,780 this hopefully very familiar form. 20 00:01:16,780 --> 00:01:23,540 Now, if you want to characterize the characteristic function, 21 00:01:23,540 --> 00:01:27,120 all we need to do is to Fourier transform this. 22 00:01:27,120 --> 00:01:36,580 So I have the integral dx e to the minus ikx. 23 00:01:36,580 --> 00:01:39,810 So this-- let me remind you-- alternatively 24 00:01:39,810 --> 00:01:42,560 was the expectation value of e to the minus ikx.
25 00:01:45,860 --> 00:01:51,310 minus x minus lambda squared over 2 sigma squared, 26 00:01:51,310 --> 00:01:54,160 which is the probability distribution. 27 00:01:54,160 --> 00:01:59,220 And you should know what the answer to that is, 28 00:01:59,220 --> 00:02:00,480 but I will remind you. 29 00:02:00,480 --> 00:02:05,730 You can change variables to x minus lambda [INAUDIBLE] y. 30 00:02:05,730 --> 00:02:08,210 So from here we will get the factor of e 31 00:02:08,210 --> 00:02:10,780 to the minus ik lambda. 32 00:02:10,780 --> 00:02:19,050 You have then the integral over y of e to the minus y 33 00:02:19,050 --> 00:02:22,810 squared over 2 sigma squared. 34 00:02:22,810 --> 00:02:29,530 And then what we need to do is to complete this square over 35 00:02:29,530 --> 00:02:31,990 here. 36 00:02:31,990 --> 00:02:34,890 And you can do that, essentially, 37 00:02:34,890 --> 00:02:40,300 by adding and subtracting a minus 38 00:02:40,300 --> 00:02:45,320 k squared sigma squared over 2. 39 00:02:45,320 --> 00:02:53,210 So that if I change variable to y plus ik sigma squared, 40 00:02:53,210 --> 00:03:00,910 let's call that z, then I have outside the integral e 41 00:03:00,910 --> 00:03:10,595 to the minus ik lambda minus k squared sigma squared over 2. 42 00:03:10,595 --> 00:03:14,470 And the remainder I can write as a full square. 43 00:03:21,800 --> 00:03:25,790 And this is just a normalized Gaussian integral 44 00:03:25,790 --> 00:03:29,500 that comes to 1. 45 00:03:29,500 --> 00:03:37,260 So as you well know, a Fourier transform of a Gaussian 46 00:03:37,260 --> 00:03:40,960 is itself a Gaussian, and that's what we've established. 47 00:03:40,960 --> 00:03:48,350 E to the minus ik lambda minus k squared sigma squared over 2. 48 00:03:48,350 --> 00:03:51,600 And if I haven't made a mistake, when I set k equals to 0, 49 00:03:51,600 --> 00:03:56,070 the answer should be 1 because at k equals to 0 the expectation 50 00:03:56,070 --> 00:03:58,376 value of 1 just amounts to normalization.
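[As an editorial aside: the closed form just derived, exp(-ik lambda - k^2 sigma^2 / 2), can be checked against a direct numerical evaluation of the expectation value of e to the minus ikx. A minimal sketch; the function names and the parameter values lambda = 1.3, sigma = 0.7 are illustrative choices, not from the lecture.]

```python
import numpy as np

def gaussian_char_fn_numeric(k, lam, sigma, n=200001, width=10.0):
    # Estimate <exp(-i k x)> by summing exp(-i k x) p(x) on a grid that is
    # wide enough (width*sigma on each side) that the Gaussian tails are
    # negligible, so a plain Riemann sum is extremely accurate.
    x = np.linspace(lam - width * sigma, lam + width * sigma, n)
    dx = x[1] - x[0]
    p = np.exp(-(x - lam) ** 2 / (2 * sigma ** 2)) / np.sqrt(2 * np.pi * sigma ** 2)
    return np.sum(np.exp(-1j * k * x) * p) * dx

def gaussian_char_fn_exact(k, lam, sigma):
    # The closed form from the board: exp(-i k lambda - k^2 sigma^2 / 2).
    return np.exp(-1j * k * lam - k ** 2 * sigma ** 2 / 2)

lam, sigma = 1.3, 0.7
for k in (0.0, 0.5, 2.0):
    assert abs(gaussian_char_fn_numeric(k, lam, sigma)
               - gaussian_char_fn_exact(k, lam, sigma)) < 1e-8
# At k = 0 both expressions give 1: the normalization check mentioned above.
```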
51 00:04:01,100 --> 00:04:06,850 Now what we said was that a more interesting function 52 00:04:06,850 --> 00:04:10,360 is obtained by taking the log of this. 53 00:04:10,360 --> 00:04:16,970 So from here we go to the log of p tilde of k. 54 00:04:16,970 --> 00:04:22,090 Log of p tilde of k is very simple for the Gaussian. 55 00:04:28,610 --> 00:04:35,580 And what we had said was that by definition 56 00:04:35,580 --> 00:04:38,490 this log of the characteristic function 57 00:04:38,490 --> 00:04:43,510 generates cumulants through the series 58 00:04:43,510 --> 00:04:49,990 minus ik to the power of n over n factorial, the nth cumulant. 59 00:04:53,110 --> 00:04:57,510 So looking at this, we can immediately 60 00:04:57,510 --> 00:05:01,690 see that the Gaussian is characterized 61 00:05:01,690 --> 00:05:07,030 by a first cumulant, which is the coefficient of minus ik. 62 00:05:07,030 --> 00:05:07,550 It's lambda. 63 00:05:12,440 --> 00:05:15,550 It is characterized by a second cumulant, which 64 00:05:15,550 --> 00:05:20,480 is the coefficient of minus ik squared. 65 00:05:20,480 --> 00:05:22,160 This, you know, is the variance. 66 00:05:22,160 --> 00:05:25,170 And we can explicitly see that the coefficient 67 00:05:25,170 --> 00:05:29,040 of minus ik squared over 2 factorial 68 00:05:29,040 --> 00:05:31,660 is simply sigma squared. 69 00:05:31,660 --> 00:05:34,920 So this is repetition. 70 00:05:34,920 --> 00:05:37,560 But one thing that is interesting 71 00:05:37,560 --> 00:05:40,880 is that our series now terminates, 72 00:05:40,880 --> 00:05:43,010 which means that if I were to look 73 00:05:43,010 --> 00:05:48,280 at the third cumulant, if I were to look at the fourth cumulant, 74 00:05:48,280 --> 00:05:52,470 and so forth, for the Gaussian, they're all 0.
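[As an editorial aside: the vanishing of all Gaussian cumulants beyond the second can be verified by computing moments numerically and combining them into cumulants via the standard moment-cumulant relations. A sketch; lambda = 0.8 and sigma = 1.5 are arbitrary test values, and the explicit kappa formulas below are the textbook inversions, not something written on the board.]

```python
import numpy as np

# Moments <x^n> of a Gaussian by quadrature on a wide grid (tails at 12
# sigma are negligible, so a Riemann sum suffices).
lam, sigma = 0.8, 1.5
x = np.linspace(lam - 12 * sigma, lam + 12 * sigma, 400001)
dx = x[1] - x[0]
p = np.exp(-(x - lam) ** 2 / (2 * sigma ** 2)) / np.sqrt(2 * np.pi * sigma ** 2)
m = [np.sum(x ** n * p) * dx for n in range(5)]  # m[n] = <x^n>

# Standard moment-to-cumulant relations for the first four cumulants.
k1 = m[1]
k2 = m[2] - m[1] ** 2
k3 = m[3] - 3 * m[2] * m[1] + 2 * m[1] ** 3
k4 = m[4] - 4 * m[3] * m[1] - 3 * m[2] ** 2 + 12 * m[2] * m[1] ** 2 - 6 * m[1] ** 4

assert abs(k1 - lam) < 1e-6          # first cumulant: lambda
assert abs(k2 - sigma ** 2) < 1e-6   # second cumulant: sigma squared
assert abs(k3) < 1e-6 and abs(k4) < 1e-6  # third and fourth cumulants vanish
```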
75 00:05:52,470 --> 00:05:54,730 So the Gaussian is the distribution 76 00:05:54,730 --> 00:05:57,850 that is completely characterized just 77 00:05:57,850 --> 00:06:00,940 by its first and second cumulants, all the rest 78 00:06:00,940 --> 00:06:01,763 being 0. 79 00:06:05,080 --> 00:06:09,670 So, now-- last time we developed 80 00:06:09,670 --> 00:06:12,500 some kind of a graphical method. 81 00:06:12,500 --> 00:06:17,840 We said that I can graphically describe the first cumulant 82 00:06:17,840 --> 00:06:22,110 as a bag with one point in it. 83 00:06:22,110 --> 00:06:26,122 Second cumulant with something that has two points in it. 84 00:06:26,122 --> 00:06:29,520 A third cumulant with three points, 85 00:06:29,520 --> 00:06:36,220 a fourth cumulant with four points, and so forth. 86 00:06:36,220 --> 00:06:39,190 This is just rewriting. 87 00:06:39,190 --> 00:06:40,930 Now, the interesting thing was that we 88 00:06:40,930 --> 00:06:47,490 said that the various moments we could express graphically. 89 00:06:47,490 --> 00:06:50,540 So that, for example, the second moment 90 00:06:50,540 --> 00:06:56,760 is either this or this, which then graphically 91 00:06:56,760 --> 00:07:01,120 is the same thing as lambda squared plus sigma squared 92 00:07:01,120 --> 00:07:06,440 because this is indicated by sigma squared. 93 00:07:06,440 --> 00:07:12,790 Now, x cubed you would say is either 94 00:07:12,790 --> 00:07:17,360 three things by themselves or put two of them 95 00:07:17,360 --> 00:07:19,610 together and then one separate. 96 00:07:19,610 --> 00:07:23,860 And this I could do in three different ways. 97 00:07:23,860 --> 00:07:26,090 And in general, for a general distribution, 98 00:07:26,090 --> 00:07:29,980 I would have had another term, which is a triangle. 99 00:07:29,980 --> 00:07:32,710 But the triangle is 0. 100 00:07:32,710 --> 00:07:34,940 So for the Gaussian, this terminates here.
101 00:07:34,940 --> 00:07:40,780 I have lambda cubed plus 3 lambda sigma squared. 102 00:07:40,780 --> 00:07:44,300 If I want to calculate x to the fourth, 103 00:07:44,300 --> 00:07:46,560 maybe the old way of doing it would 104 00:07:46,560 --> 00:07:50,710 have been to multiply the Gaussian distribution against x 105 00:07:50,710 --> 00:07:53,620 to the fourth and try to do the integration. 106 00:07:53,620 --> 00:07:59,060 And you would ultimately be able to do that, rearranging things 107 00:07:59,060 --> 00:08:02,830 and looking at the various powers of the Gaussian 108 00:08:02,830 --> 00:08:05,566 integrated from minus infinity to infinity. 109 00:08:05,566 --> 00:08:06,815 But you can do it graphically. 110 00:08:06,815 --> 00:08:08,950 You can say, OK. 111 00:08:08,950 --> 00:08:14,160 It's either this or I can have-- well, 112 00:08:14,160 --> 00:08:16,640 I cannot put one aside and three together, 113 00:08:16,640 --> 00:08:18,870 because that doesn't exist. 114 00:08:18,870 --> 00:08:25,330 I could have two together and two not together. 115 00:08:25,330 --> 00:08:28,176 And this I can do in six different ways. 116 00:08:28,176 --> 00:08:31,130 You can convince yourself of that. 117 00:08:31,130 --> 00:08:36,210 Or I could do two pairs, which I can do in three different ways 118 00:08:36,210 --> 00:08:39,440 because I can either do one, two; one, three; one, four 119 00:08:39,440 --> 00:08:42,360 and then the other is satisfied. 120 00:08:42,360 --> 00:08:47,280 So this is lambda to the fourth plus 6 lambda squared 121 00:08:47,280 --> 00:08:53,130 sigma squared plus 3 sigma to the fourth. 122 00:08:53,130 --> 00:08:55,678 And you can keep going and doing different things. 123 00:08:59,800 --> 00:09:00,856 OK. 124 00:09:00,856 --> 00:09:01,355 Question? 125 00:09:04,778 --> 00:09:05,278 Yeah? 126 00:09:05,278 --> 00:09:07,748 AUDIENCE: Is the second-- [INAUDIBLE]. 127 00:09:13,182 --> 00:09:13,970 PROFESSOR: There?
128 00:09:13,970 --> 00:09:17,180 AUDIENCE: Because-- so you said that the second cumulant-- 129 00:09:17,180 --> 00:09:18,144 PROFESSOR: Oh. 130 00:09:18,144 --> 00:09:19,090 AUDIENCE: --x squared. 131 00:09:19,090 --> 00:09:19,590 Yes. 132 00:09:19,590 --> 00:09:20,105 PROFESSOR: Yes. 133 00:09:20,105 --> 00:09:21,015 So that's the wrong-- 134 00:09:21,015 --> 00:09:21,931 AUDIENCE: [INAUDIBLE]. 135 00:09:21,931 --> 00:09:26,520 PROFESSOR: The coefficient of k squared is the second cumulant. 136 00:09:26,520 --> 00:09:28,210 The additional 2 was a mistake. 137 00:09:34,178 --> 00:09:35,120 OK. 138 00:09:35,120 --> 00:09:35,930 Anything else? 139 00:09:40,790 --> 00:09:42,450 All right. 140 00:09:42,450 --> 00:09:46,979 Let's take a look at a couple of other distributions, 141 00:09:46,979 --> 00:09:47,770 this time discrete. 142 00:09:52,760 --> 00:10:07,170 So the binomial distribution is: repeat a binary random 143 00:10:07,170 --> 00:10:09,845 variable. 144 00:10:09,845 --> 00:10:13,110 And what does this mean? 145 00:10:13,110 --> 00:10:21,500 It means two outcomes, that's binary, 146 00:10:21,500 --> 00:10:27,970 let's call them A and B. And if I 147 00:10:27,970 --> 00:10:31,660 have a coin that's head or tails, it's binary. 148 00:10:31,660 --> 00:10:33,390 Two possibilities. 149 00:10:33,390 --> 00:10:38,890 And I can assign probabilities to the two outcomes, 150 00:10:38,890 --> 00:10:44,854 PA and PB, which has to be 1 minus PA. 151 00:10:47,600 --> 00:10:52,160 And the question is if you repeat 152 00:10:52,160 --> 00:11:03,500 this binary random variable N times, 153 00:11:03,500 --> 00:11:17,360 what is the probability of NA outcomes of A? 154 00:11:23,640 --> 00:11:25,605 And I forgot to say something important, 155 00:11:25,605 --> 00:11:28,410 so I write it in red. 156 00:11:28,410 --> 00:11:30,160 These should be independent.
157 00:11:33,055 --> 00:11:41,140 That is, the outcome of a coin toss at, say, 158 00:11:41,140 --> 00:11:43,260 the fifth time should not influence 159 00:11:43,260 --> 00:11:46,131 the sixth time and future times. 160 00:11:46,131 --> 00:11:46,630 OK. 161 00:11:46,630 --> 00:11:48,650 So this is easy. 162 00:11:48,650 --> 00:11:56,010 The probability to have NA occurrences of A in N trials-- 163 00:11:56,010 --> 00:11:58,480 so it has to be indexed by N-- 164 00:11:58,480 --> 00:12:07,490 is that within the N times that I tossed, A came up NA times. 165 00:12:07,490 --> 00:12:11,590 So it has to be proportional to the probability 166 00:12:11,590 --> 00:12:18,870 of A independently multiplied by itself NA times. 167 00:12:18,870 --> 00:12:22,710 But if I have exactly NA occurrences of A, 168 00:12:22,710 --> 00:12:30,950 all the other times I had B occurring. 169 00:12:30,950 --> 00:12:34,110 So I have the probability of B for the remainder, 170 00:12:34,110 --> 00:12:37,460 which is N minus NA. 171 00:12:37,460 --> 00:12:40,600 Now this is the probability for a specific occurrence, 172 00:12:40,600 --> 00:12:45,510 like the first NA times that I threw the coin I will get A. 173 00:12:45,510 --> 00:12:48,290 The remaining times I would get B. 174 00:12:48,290 --> 00:12:51,550 But the order is not important and the number 175 00:12:51,550 --> 00:12:54,050 of ways that I can shuffle the order 176 00:12:54,050 --> 00:12:59,480 and have a total of NA out of N times is the binomial factor. 177 00:13:08,310 --> 00:13:09,990 Fine. 178 00:13:09,990 --> 00:13:10,830 Again, well known. 179 00:13:10,830 --> 00:13:12,785 Let's look at its characteristic function.
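[As an editorial aside: the counting argument above transcribes directly into code, with (N choose NA) pA^NA pB^(N-NA) summing to 1 and averaging to N pA. A sketch; the function name and the test values N = 10, pA = 0.3 are illustrative choices, not from the lecture.]

```python
from math import comb

def binomial_pmf(N, pA):
    # Probability of NA occurrences of A in N independent binary trials:
    # the binomial factor times pA^NA times pB^(N - NA), with pB = 1 - pA.
    return [comb(N, NA) * pA ** NA * (1 - pA) ** (N - NA) for NA in range(N + 1)]

N, pA = 10, 0.3
p = binomial_pmf(N, pA)
assert abs(sum(p) - 1) < 1e-12                 # normalization: probabilities sum to 1
mean = sum(NA * p[NA] for NA in range(N + 1))
assert abs(mean - N * pA) < 1e-12              # mean number of A outcomes is N pA
```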
180 00:13:15,470 --> 00:13:20,530 So p tilde, which is now a function of k, 181 00:13:20,530 --> 00:13:23,800 is the expectation value of e to the minus ik 182 00:13:23,800 --> 00:13:32,060 NA, which means that I have to weigh e to the minus ik 183 00:13:32,060 --> 00:13:39,660 NA against the probability of NA occurrences, 184 00:13:39,660 --> 00:13:53,005 which is this binomial factor PA to the power of NA, PB 185 00:13:53,005 --> 00:13:56,226 to the power of N minus NA. 186 00:13:56,226 --> 00:14:01,450 And of course, I have to sum over all possible values of NA 187 00:14:01,450 --> 00:14:09,500 that go all the way from 0 to N. 188 00:14:09,500 --> 00:14:13,300 So what we have here is something 189 00:14:13,300 --> 00:14:18,505 which is this combination, PA e to the minus ik, 190 00:14:18,505 --> 00:14:21,680 raised to the power of NA. 191 00:14:21,680 --> 00:14:26,400 PB raised to the complement N minus NA, multiplied 192 00:14:26,400 --> 00:14:29,890 by the binomial factor, summed over all possible values. 193 00:14:29,890 --> 00:14:36,610 So this is just the definition of the binomial expansion of PA 194 00:14:36,610 --> 00:14:46,870 e to the minus ik plus PB, raised to the power of N. 195 00:14:46,870 --> 00:14:47,860 And again, let's check. 196 00:14:47,860 --> 00:14:50,940 If I set k equals to 0, I have PA plus PB, 197 00:14:50,940 --> 00:14:54,580 which is 1, raised to the power of N. So things are OK. 198 00:14:54,580 --> 00:14:58,450 So this is the characteristic function. 199 00:14:58,450 --> 00:15:02,600 At this stage, the only thing that I will note about this 200 00:15:02,600 --> 00:15:09,070 is that if I look at the characteristic function 201 00:15:09,070 --> 00:15:13,650 I will get N times. 202 00:15:13,650 --> 00:15:15,420 So this is-- actually, let's make sure 203 00:15:15,420 --> 00:15:19,280 that we maintain the index N. So this is 204 00:15:19,280 --> 00:15:23,940 the characteristic function appropriate to N trials.
205 00:15:23,940 --> 00:15:26,840 And what I get is that up to a factor of N, 206 00:15:26,840 --> 00:15:29,320 I will get the characteristic function 207 00:15:29,320 --> 00:15:31,505 that would be appropriate to one trial. 208 00:15:35,600 --> 00:15:38,170 So what that means is, if I were to look 209 00:15:38,170 --> 00:15:46,650 at powers of k, the expectation value of some cumulant, 210 00:15:46,650 --> 00:15:52,120 if I go to repeat things N times-- so this carries 211 00:15:52,120 --> 00:15:58,820 an index N-- it is going to be simply N times what I would 212 00:15:58,820 --> 00:16:00,780 have had in a single trial. 213 00:16:05,470 --> 00:16:07,490 So for a single trial, you really 214 00:16:07,490 --> 00:16:13,330 have two outcomes-- 0 or 1 occurrences of this object. 215 00:16:13,330 --> 00:16:16,910 So for a binary variable, you can really easily compute 216 00:16:16,910 --> 00:16:19,140 these quantities and then you can 217 00:16:19,140 --> 00:16:24,640 calculate the corresponding ones for N trials simply by 218 00:16:24,640 --> 00:16:27,910 multiplying by N. And we will see 219 00:16:27,910 --> 00:16:30,950 that this is characteristic, essentially, 220 00:16:30,950 --> 00:16:37,090 of anything that is repeated N times, not just the binomial. 221 00:16:37,090 --> 00:16:40,510 So this form-- that if you have N independent objects, 222 00:16:40,510 --> 00:16:42,620 you would get N times what you would 223 00:16:42,620 --> 00:16:46,090 have for one object-- is generally valid 224 00:16:46,090 --> 00:16:48,020 and actually something that we will build 225 00:16:48,020 --> 00:16:51,270 a lot of statistical mechanics on, because we 226 00:16:51,270 --> 00:16:53,710 are interested in the [INAUDIBLE]. 227 00:16:53,710 --> 00:16:56,280 So we will see that shortly. 228 00:16:56,280 --> 00:16:59,070 But rather than following this, let's 229 00:16:59,070 --> 00:17:03,120 look at a third distribution that is closely related, 230 00:17:03,120 --> 00:17:04,089 which is the Poisson.
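[As an editorial aside: the factorization just described, that N independent repetitions raise the one-trial characteristic function PA e^{-ik} + PB to the power N so that cumulants simply add, can be checked directly for the binomial. A sketch; N = 12 and pA = 0.4 are arbitrary test values.]

```python
import cmath
from math import comb

N, pA = 12, 0.4
pB = 1 - pA

def char_fn_direct(k):
    # Sum e^{-ik NA} against the binomial probabilities, as in the lecture.
    return sum(cmath.exp(-1j * k * NA) * comb(N, NA) * pA ** NA * pB ** (N - NA)
               for NA in range(N + 1))

for k in (0.0, 0.7, 1.9):
    single = pA * cmath.exp(-1j * k) + pB     # one-trial characteristic function
    # N-trial characteristic function equals the one-trial one to the power N.
    assert abs(char_fn_direct(k) - single ** N) < 1e-9

# Second cumulant (the variance) is N times the one-trial variance pA*pB.
mean = N * pA
var = sum((NA - mean) ** 2 * comb(N, NA) * pA ** NA * pB ** (N - NA)
          for NA in range(N + 1))
assert abs(var - N * pA * pB) < 1e-9
```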
231 00:17:09,990 --> 00:17:13,359 And the question that we are asking 232 00:17:13,359 --> 00:17:17,819 is-- we have an interval. 233 00:17:17,819 --> 00:17:24,422 And the question is, what is the probability 234 00:17:24,422 --> 00:17:39,070 of m events in an interval from 0 to T? 235 00:17:39,070 --> 00:17:42,420 And I kind of expressed it this way 236 00:17:42,420 --> 00:17:47,680 because the prototypical Poisson distribution is, let's say, 237 00:17:47,680 --> 00:17:49,370 radioactivity. 238 00:17:49,370 --> 00:17:52,860 And you can be waiting for some time interval from 0 239 00:17:52,860 --> 00:17:56,890 to 1 minute and asking within that time interval, what's 240 00:17:56,890 --> 00:18:02,410 the probability that you will see m radioactive decay events? 241 00:18:02,410 --> 00:18:08,470 So what is the probability if two things happen? 242 00:18:08,470 --> 00:18:28,090 One is that the probability of 1 and only 1 event in interval dt 243 00:18:28,090 --> 00:18:32,402 is alpha dt as dt goes to 0. 244 00:18:34,831 --> 00:18:35,330 OK? 245 00:18:35,330 --> 00:18:40,760 So basically if you look at this over 1 minute, 246 00:18:40,760 --> 00:18:43,680 the chances are that you will see so many events, 247 00:18:43,680 --> 00:18:47,540 so many radioactive decays. 248 00:18:47,540 --> 00:18:50,010 If you shorten the interval, the chances 249 00:18:50,010 --> 00:18:53,330 that you would see events would become less and less. 250 00:18:53,330 --> 00:18:55,950 If you make your interval infinitesimal, 251 00:18:55,950 --> 00:18:58,650 most of the time nothing would happen, 252 00:18:58,650 --> 00:19:01,980 and with a very small probability that vanishes 253 00:19:01,980 --> 00:19:08,090 as the size of the interval goes to 0, you will see 1 event. 254 00:19:08,090 --> 00:19:10,130 So this is one condition. 255 00:19:10,130 --> 00:19:16,589 And the second condition is events in different intervals 256 00:19:16,589 --> 00:19:17,255 are independent.
257 00:19:23,430 --> 00:19:26,200 And since I wrote independent in red up there, 258 00:19:26,200 --> 00:19:31,084 let me write it in red here because it sort of harks back 259 00:19:31,084 --> 00:19:32,000 to the same condition. 260 00:19:34,970 --> 00:19:38,014 And so this is the question. 261 00:19:38,014 --> 00:19:39,055 What is this probability? 262 00:19:42,210 --> 00:19:47,630 And to get the answer, what we do 263 00:19:47,630 --> 00:20:02,240 is to subdivide our big interval into N, 264 00:20:02,240 --> 00:20:08,130 which is big T divided by the small dt, subintervals. 265 00:20:13,330 --> 00:20:19,515 So basically, originally let's say on the time axis, 266 00:20:19,515 --> 00:20:24,460 we were covering a distance that went from 0 to big T 267 00:20:24,460 --> 00:20:27,240 and we were asking what happens here. 268 00:20:27,240 --> 00:20:29,920 So what we are doing now is we are 269 00:20:29,920 --> 00:20:37,320 sort of dividing this interval into lots of subintervals, 270 00:20:37,320 --> 00:20:41,470 the size of each one of them being dt. 271 00:20:41,470 --> 00:20:46,950 And therefore, the total number is big T over dt. 272 00:20:46,950 --> 00:20:49,230 And ultimately, clearly, I want to set 273 00:20:49,230 --> 00:20:55,100 dt going to 0 so that this condition is satisfied. 274 00:20:55,100 --> 00:21:00,090 So also because of the second condition, each one of these 275 00:21:00,090 --> 00:21:05,220 will independently tell me whether or not I have an event. 276 00:21:05,220 --> 00:21:09,190 And so if I want to count the total number of events, 277 00:21:09,190 --> 00:21:11,405 I have to add things that are occurring 278 00:21:11,405 --> 00:21:12,760 in different intervals.
279 00:21:12,760 --> 00:21:14,870 And we can see that this problem now 280 00:21:14,870 --> 00:21:18,790 became identical to that problem because each one 281 00:21:18,790 --> 00:21:22,450 of these intervals has two possible outcomes-- nothing 282 00:21:22,450 --> 00:21:26,370 happens with probability 1 minus alpha dt, 283 00:21:26,370 --> 00:21:30,240 something happens with probability alpha dt. 284 00:21:30,240 --> 00:21:38,210 So no event means probability 1 minus alpha dt. 285 00:21:38,210 --> 00:21:42,700 One event means probability alpha dt. 286 00:21:42,700 --> 00:21:44,570 So this is a binomial process. 287 00:21:50,880 --> 00:21:55,930 So we can calculate, for example, 288 00:21:55,930 --> 00:21:59,070 the characteristic function. 289 00:21:59,070 --> 00:22:02,510 And I will indicate that we are looking 290 00:22:02,510 --> 00:22:07,440 at some interval of size T, parameterized 291 00:22:07,440 --> 00:22:09,680 by this rate alpha; we'll see 292 00:22:09,680 --> 00:22:11,220 that only the product will occur. 293 00:22:14,610 --> 00:22:18,230 So this is, as before, the Fourier variable. 294 00:22:18,230 --> 00:22:20,190 We said that it's a binomial, so it 295 00:22:20,190 --> 00:22:27,420 is one of the probabilities times e to the minus ik, 296 00:22:27,420 --> 00:22:31,180 plus the other probability, raised to the power of N. 297 00:22:31,180 --> 00:22:33,690 Now we just substitute the probabilities that we 298 00:22:33,690 --> 00:22:35,350 have over here. 299 00:22:35,350 --> 00:22:43,810 So the probability of not having an event is 1 minus alpha dt. 300 00:22:43,810 --> 00:22:47,640 The probability of having an event is alpha dt. 301 00:22:47,640 --> 00:22:53,500 So alpha dt is going to appear here multiplying e 302 00:22:53,500 --> 00:22:58,440 to the minus ik minus 1. 303 00:22:58,440 --> 00:23:02,620 So alpha dt e to the minus ik is from here. 304 00:23:02,620 --> 00:23:06,760 From PB I will get 1 minus alpha dt.
305 00:23:06,760 --> 00:23:09,470 And I bunched together the two terms 306 00:23:09,470 --> 00:23:13,000 that are proportional to alpha dt. 307 00:23:13,000 --> 00:23:15,480 And then I have to raise to the power of N, which 308 00:23:15,480 --> 00:23:17,660 is T divided by dt. 309 00:23:21,190 --> 00:23:26,230 And this whole prescription is valid in the limit 310 00:23:26,230 --> 00:23:30,290 where dt is going to 0. 311 00:23:30,290 --> 00:23:35,160 So what you have is 1 plus an infinitesimal 312 00:23:35,160 --> 00:23:38,590 raised to a huge power. 313 00:23:38,590 --> 00:23:42,040 And this limiting procedure is equivalent to taking 314 00:23:42,040 --> 00:23:43,530 the exponential. 315 00:23:43,530 --> 00:23:45,500 So basically this is the same thing 316 00:23:45,500 --> 00:23:50,770 as the exponential of what is here multiplied by what is here. 317 00:23:50,770 --> 00:23:53,880 The dt's cancel each other out and the answer 318 00:23:53,880 --> 00:23:58,670 is alpha T, e to the minus ik minus 1. 319 00:24:01,740 --> 00:24:05,490 So the characteristic function for this process 320 00:24:05,490 --> 00:24:08,506 that we described is simply given by this form. 321 00:24:20,170 --> 00:24:21,090 You say, wait. 322 00:24:21,090 --> 00:24:23,440 I didn't ask for the characteristic function. 323 00:24:23,440 --> 00:24:25,970 I wanted the probability. 324 00:24:25,970 --> 00:24:27,020 Well, I say, OK. 325 00:24:27,020 --> 00:24:30,780 The characteristic function is simply the Fourier transform. 326 00:24:30,780 --> 00:24:33,160 So let me Fourier transform back, 327 00:24:33,160 --> 00:24:39,850 and I would say that the probability along not 328 00:24:39,850 --> 00:24:43,210 the Fourier axis but the actual axis 329 00:24:43,210 --> 00:24:45,540 is obtained by the inverse Fourier process. 330 00:24:45,540 --> 00:24:50,120 So I have to do an integral dk over 2 pi 331 00:24:50,120 --> 00:24:54,040 e to the ikx times the characteristic function.
332 00:24:54,040 --> 00:24:59,510 And the characteristic function is e to the-- what was it? 333 00:24:59,510 --> 00:25:11,330 e to the alpha T, e to the minus ik, minus 1. 334 00:25:15,495 --> 00:25:17,770 Well, there is an e to the minus alpha 335 00:25:17,770 --> 00:25:22,280 T that I can simply take outside the integration. 336 00:25:22,280 --> 00:25:27,950 I have the integration over k, e to the ikx. 337 00:25:31,490 --> 00:25:34,050 And then what I will do is I have this factor of e 338 00:25:34,050 --> 00:25:38,783 to the something in the exponent-- e to the alpha T e to the minus ik. 339 00:25:38,783 --> 00:25:42,530 I will use the expansion of the exponential. 340 00:25:42,530 --> 00:25:44,420 So the expansion of the exponential 341 00:25:44,420 --> 00:25:48,340 is a sum over m running from 0 to infinity. 342 00:25:48,340 --> 00:25:50,840 The exponent raised to the m-th power. 343 00:25:50,840 --> 00:25:54,480 So I have alpha T raised to the m-th power, 344 00:25:54,480 --> 00:25:57,350 e to the minus ik raised to the m-th power, divided 345 00:25:57,350 --> 00:25:57,960 by m factorial. 346 00:26:04,150 --> 00:26:09,050 So now I reorder the sum and the integration. 347 00:26:09,050 --> 00:26:11,900 The sum is over m, the integration is over k. 348 00:26:11,900 --> 00:26:14,200 I can reorder them. 349 00:26:14,200 --> 00:26:16,750 So on the things that go outside I 350 00:26:16,750 --> 00:26:22,270 have a sum over m running from 0 to infinity, e to the minus 351 00:26:22,270 --> 00:26:28,800 alpha T, alpha T to the power of m divided by m factorial. 352 00:26:28,800 --> 00:26:35,790 Then I have the integral over k over 2 pi, e to the ik-- 353 00:26:35,790 --> 00:26:40,670 well, I had the x here and I have e to the minus ikm here. 354 00:26:40,670 --> 00:26:42,610 So I have x minus m. 355 00:26:47,120 --> 00:26:50,970 And then I say, OK, this is an integral that I recognize.
356 00:26:50,970 --> 00:26:55,420 The integral of e to the ik times something 357 00:26:55,420 --> 00:26:57,940 is simply a delta function. 358 00:26:57,940 --> 00:27:01,940 So this whole thing is a delta function 359 00:27:01,940 --> 00:27:07,320 that says, oh, x has to be an integer. 360 00:27:07,320 --> 00:27:12,430 Because I kind of did something that maybe, in retrospect, 361 00:27:12,430 --> 00:27:15,130 you would have said why are you doing this. 362 00:27:15,130 --> 00:27:20,600 Because along how many times things have occurred, 363 00:27:20,600 --> 00:27:24,070 they have either occurred 0 times, 1 time, 2 decays, 364 00:27:24,070 --> 00:27:24,800 3 decays. 365 00:27:24,800 --> 00:27:28,040 I don't have 2.5 decays. 366 00:27:28,040 --> 00:27:31,630 So I treated x as a continuous variable, 367 00:27:31,630 --> 00:27:35,800 but the mathematics was really clever enough to say that, no, 368 00:27:35,800 --> 00:27:40,010 the only places that you can have are really integer values. 369 00:27:40,010 --> 00:27:45,230 And the probability that you have some particular 370 00:27:45,230 --> 00:27:51,020 integer value m is simply what we have over here, e to the minus alpha 371 00:27:51,020 --> 00:27:56,954 T, alpha T to the power of m, divided by m factorial, which 372 00:27:56,954 --> 00:27:58,120 is the Poisson distribution. 373 00:28:07,810 --> 00:28:08,310 OK. 374 00:28:13,750 --> 00:28:16,780 But fine. 375 00:28:16,780 --> 00:28:18,460 So this is the Poisson distribution, 376 00:28:18,460 --> 00:28:21,450 but really we went through the route 377 00:28:21,450 --> 00:28:24,760 of the characteristic function in order 378 00:28:24,760 --> 00:28:28,430 to use this machinery that we developed earlier 379 00:28:28,430 --> 00:28:31,770 for cumulants, et cetera. 380 00:28:31,770 --> 00:28:34,620 So let's look at the cumulant generating function.
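[As an editorial aside: the Poisson form just obtained can be sanity-checked before moving on. It should be normalized, and summing e^{-ikm} against it should reproduce the characteristic function exp(alpha T (e^{-ik} - 1)) derived earlier. A sketch; aT = 2.5 and the cutoff at 80 terms are arbitrary choices.]

```python
import cmath
from math import exp, factorial

aT = 2.5  # the product alpha * T, the only combination that appears
pmf = [exp(-aT) * aT ** m / factorial(m) for m in range(80)]
assert abs(sum(pmf) - 1) < 1e-12  # normalization (the tail beyond 80 is negligible)

for k in (0.0, 0.6, 1.4):
    # Characteristic function computed directly from the probabilities...
    direct = sum(cmath.exp(-1j * k * m) * pmf[m] for m in range(80))
    # ...versus the closed form exp(aT (e^{-ik} - 1)) from the binomial limit.
    closed = cmath.exp(aT * (cmath.exp(-1j * k) - 1))
    assert abs(direct - closed) < 1e-10
```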
381 00:28:34,620 --> 00:28:40,490 So I have to take the log of the function 382 00:28:40,490 --> 00:28:43,360 that I had calculated there. 383 00:28:43,360 --> 00:28:46,850 It is nicely in the exponential, so I 384 00:28:46,850 --> 00:28:51,950 get alpha T, e to the minus ik minus 1. 385 00:28:55,690 --> 00:29:01,510 So now I can make an expansion of this in powers of k, 386 00:29:01,510 --> 00:29:03,950 so I can expand the exponential. 387 00:29:03,950 --> 00:29:07,940 The first term vanishes because this starts with 1. 388 00:29:07,940 --> 00:29:13,440 So really I have alpha T, sum, 389 00:29:13,440 --> 00:29:18,510 n running from 1 to infinity, of minus ik 390 00:29:18,510 --> 00:29:20,420 to the power of n over n factorial. 391 00:29:28,230 --> 00:29:35,680 So my task for identifying the cumulants 392 00:29:35,680 --> 00:29:39,210 is to look at the expansion of this log 393 00:29:39,210 --> 00:29:44,510 and read off powers of minus ik to the n over n factorial. 394 00:29:44,510 --> 00:29:46,620 So what do we see? 395 00:29:46,620 --> 00:29:53,810 We see that the first cumulant of the Poisson is alpha T, 396 00:29:53,810 --> 00:29:57,090 but all the coefficients are the same thing. 397 00:29:57,090 --> 00:29:59,260 The expectation value-- sorry. 398 00:29:59,260 --> 00:30:03,320 The second cumulant is alpha T. The third cumulant, 399 00:30:03,320 --> 00:30:06,310 the fourth cumulant, all the other cumulants 400 00:30:06,310 --> 00:30:15,090 are also alpha T. 401 00:30:15,090 --> 00:30:19,190 So the average number of decays that you see in the interval 402 00:30:19,190 --> 00:30:23,260 is simply alpha T. But there are fluctuations, 403 00:30:23,260 --> 00:30:25,100 and if somebody should, for example, 404 00:30:25,100 --> 00:30:30,660 ask you what's the average number cubed of events, 405 00:30:30,660 --> 00:30:32,730 you would say, OK.
406 00:30:32,730 --> 00:30:35,750 I'm going to use the relationship between moments 407 00:30:35,750 --> 00:30:37,130 and cumulants. 408 00:30:37,130 --> 00:30:42,830 I can either have three first objects 409 00:30:42,830 --> 00:30:47,640 or I can put one of them separate, in three 410 00:30:47,640 --> 00:30:50,010 different fashions. 411 00:30:50,010 --> 00:30:56,280 But this is a case where the triangle is allowed, 412 00:30:56,280 --> 00:30:59,220 so diagrammatically all three are possible. 413 00:30:59,220 --> 00:31:03,640 And so the answer for the first term is alpha T cubed. 414 00:31:06,260 --> 00:31:09,620 For the second term, it is a factor of 3. 415 00:31:09,620 --> 00:31:13,960 Both the variance and the mean give me a factor of alpha T, 416 00:31:13,960 --> 00:31:17,110 so I will get alpha T squared. 417 00:31:17,110 --> 00:31:20,830 And the third term, which is the third cumulant, 418 00:31:20,830 --> 00:31:26,163 is also alpha T. So the answer is simply of this form. 419 00:31:29,550 --> 00:31:31,620 Again, m is an integer. 420 00:31:31,620 --> 00:31:33,330 Alpha T is dimensionless. 421 00:31:33,330 --> 00:31:37,985 So there is no dimension problem with it having different powers. 422 00:31:42,050 --> 00:31:42,920 OK. 423 00:31:42,920 --> 00:31:43,550 Any questions? 424 00:31:50,050 --> 00:31:50,650 All right. 425 00:31:50,650 --> 00:31:58,010 So that's what I wanted to say about one variable. 426 00:31:58,010 --> 00:32:03,685 Now let's go and look at corresponding definitions 427 00:32:03,685 --> 00:32:05,060 when you have multiple variables. 428 00:32:13,140 --> 00:32:27,250 So for many random variables, the set of possible outcomes, 429 00:32:27,250 --> 00:32:32,650 let's say, has variables x1, x2. 430 00:32:32,650 --> 00:32:33,990 Let's be precise. 431 00:32:33,990 --> 00:32:36,440 Let's end it at xn.
432 00:32:36,440 --> 00:32:41,090 And if these are distributed, each one of them 433 00:32:41,090 --> 00:32:45,220 continuously over the interval, to each point 434 00:32:45,220 --> 00:32:50,380 we can characterize some kind of a probability density. 435 00:32:50,380 --> 00:32:57,940 So this entity is called the joint probability density 436 00:32:57,940 --> 00:33:00,630 function. 437 00:33:00,630 --> 00:33:14,950 And its definition would be to look at probability of outcome 438 00:33:14,950 --> 00:33:22,360 in some interval that is between, say, x1, x1 439 00:33:22,360 --> 00:33:30,770 plus dx1 in one, x2, x2 plus dx2 in the second variable. 440 00:33:30,770 --> 00:33:36,750 xn, xn plus dxn in the last variable. 441 00:33:36,750 --> 00:33:40,560 So you sort of look at the particular point 442 00:33:40,560 --> 00:33:44,420 in this multi-dimensional space that you are interested in. 443 00:33:44,420 --> 00:33:48,340 You build a little cube around it. 444 00:33:48,340 --> 00:33:51,580 You ask, what's the probability to be in that cube? 445 00:33:51,580 --> 00:33:55,420 And then you divide by the volume of that cube. 446 00:33:55,420 --> 00:34:01,960 So dx1, dx2, dxn, which is the same thing 447 00:34:01,960 --> 00:34:06,530 that you would be doing in constructing any density 448 00:34:06,530 --> 00:34:10,719 and by ultimately taking the limit that all of the dx's 449 00:34:10,719 --> 00:34:11,688 go to 0. 450 00:34:21,250 --> 00:34:22,060 All right. 451 00:34:22,060 --> 00:34:26,469 So this is the joint probability distribution. 452 00:34:26,469 --> 00:34:30,739 You can construct, now, the joint characteristic function. 453 00:34:44,319 --> 00:34:45,740 Now how do you do that? 454 00:34:45,740 --> 00:34:49,190 Well, again, just like you would do for a Fourier transform 455 00:34:49,190 --> 00:34:52,030 with multiple variables. 456 00:34:52,030 --> 00:34:56,510 So you would go for each variable 457 00:34:56,510 --> 00:34:58,310 to a conjugate variable.
458 00:34:58,310 --> 00:35:00,810 So x1 would go to k1. 459 00:35:00,810 --> 00:35:02,990 x2 would go to k2. 460 00:35:02,990 --> 00:35:05,330 xn would go to kn. 461 00:35:05,330 --> 00:35:08,600 And this would mathematically amount 462 00:35:08,600 --> 00:35:11,670 to calculating the expectation value of e 463 00:35:11,670 --> 00:35:26,470 to the minus i k1x1, k2x2 and so forth, which 464 00:35:26,470 --> 00:35:38,850 you would obtain by integrating over all of these variables, e 465 00:35:38,850 --> 00:35:46,280 to the minus ik alpha x alpha, against the probability of x1 466 00:35:46,280 --> 00:35:47,106 through xn. 467 00:35:53,140 --> 00:35:53,640 Question? 468 00:35:57,966 --> 00:35:59,775 AUDIENCE: It's a bit hard to read. 469 00:35:59,775 --> 00:36:00,858 It's getting really small. 470 00:36:04,475 --> 00:36:07,856 PROFESSOR: OK. 471 00:36:07,856 --> 00:36:10,271 [LAUGHTER] 472 00:36:11,720 --> 00:36:15,020 PROFESSOR: But it's just a multi-dimensional integral. 473 00:36:15,020 --> 00:36:17,080 OK? 474 00:36:17,080 --> 00:36:18,550 All right. 475 00:36:18,550 --> 00:36:25,270 So this is, as in the case of one, I 476 00:36:25,270 --> 00:36:29,540 think the problem is not the size but the angle I see. 477 00:36:29,540 --> 00:36:30,875 I can't do much for that. 478 00:36:30,875 --> 00:36:34,310 You have to move to the center. 479 00:36:34,310 --> 00:36:34,820 OK. 480 00:36:34,820 --> 00:36:40,600 So what we can look at now is joint moments. 481 00:36:46,310 --> 00:36:50,350 So you can-- when we had one variable, 482 00:36:50,350 --> 00:36:52,970 we could look at something like the expectation 483 00:36:52,970 --> 00:36:54,490 value of x to the m. 484 00:36:57,890 --> 00:37:00,990 That would be the m-th moment. 485 00:37:00,990 --> 00:37:04,270 But if you have two variables, we can raise x1 486 00:37:04,270 --> 00:37:08,270 to some power, x2 to another power, 487 00:37:08,270 --> 00:37:12,430 and actually xn to another power.
488 00:37:12,430 --> 00:37:13,980 So this is a joint moment. 489 00:37:18,310 --> 00:37:23,310 Now the thing is, that the same way that moments 490 00:37:23,310 --> 00:37:25,500 for one variable could be generated 491 00:37:25,500 --> 00:37:28,770 by expanding the characteristic function, 492 00:37:28,770 --> 00:37:32,380 if I were to expand this function in powers of k, 493 00:37:32,380 --> 00:37:35,820 you can see that in computing the expectation value, 494 00:37:35,820 --> 00:37:38,840 I will get various powers of x1 to some power, 495 00:37:38,840 --> 00:37:41,400 x2 to some power, et cetera. 496 00:37:41,400 --> 00:37:44,550 So by appropriate expansion of that function, 497 00:37:44,550 --> 00:37:49,200 I can generate all of-- read off all of these moments. 498 00:37:49,200 --> 00:37:53,650 Now, a more common way of generating the Taylor series 499 00:37:53,650 --> 00:37:56,620 expansion is through derivatives. 500 00:37:56,620 --> 00:38:02,810 So what I can do is I can take a derivative with respect 501 00:38:02,810 --> 00:38:03,860 to, say, ik1. 502 00:38:08,280 --> 00:38:11,340 If I take a derivative with respect 503 00:38:11,340 --> 00:38:14,580 to ik1 here, what happens is I will bring down 504 00:38:14,580 --> 00:38:16,490 a factor of minus x alpha. 505 00:38:16,490 --> 00:38:19,420 So actually let me put the minus so it 506 00:38:19,420 --> 00:38:22,340 becomes a factor of x alpha. 507 00:38:22,340 --> 00:38:29,060 And if I integrate x alpha against this, 508 00:38:29,060 --> 00:38:31,620 I will be generating the expectation value 509 00:38:31,620 --> 00:38:37,990 of x alpha provided that ultimately I set all of the k's 510 00:38:37,990 --> 00:38:38,990 to 0. 511 00:38:38,990 --> 00:38:41,080 So I will calculate the derivative 512 00:38:41,080 --> 00:38:45,160 of this function with respect to all of these arguments. 513 00:38:45,160 --> 00:38:48,170 At the end of the day, I will set k equals to 0.
514 00:38:48,170 --> 00:38:51,720 That will give me the expectation value of x1. 515 00:38:51,720 --> 00:38:57,550 But I don't want x1, I want x1 raised to the power of m1. 516 00:38:57,550 --> 00:38:59,130 So I do this. 517 00:38:59,130 --> 00:39:02,590 Each time I take a derivative with respect to minus ik, 518 00:39:02,590 --> 00:39:06,720 I will bring down the factor of the corresponding x. 519 00:39:06,720 --> 00:39:10,240 And I can do this with multiple different things. 520 00:39:10,240 --> 00:39:17,120 So d by the ik2 raised to the power of m2 521 00:39:17,120 --> 00:39:23,310 minus d by the ikn, the whole thing 522 00:39:23,310 --> 00:39:26,580 raised to the power of mn. 523 00:39:29,650 --> 00:39:32,560 So I can either take this function 524 00:39:32,560 --> 00:39:35,430 of multiple variables-- k1 through kn-- 525 00:39:35,430 --> 00:39:39,620 and expand it and read off the appropriate powers of k1k2. 526 00:39:39,620 --> 00:39:43,090 Or I can say that the terms in this expansion 527 00:39:43,090 --> 00:39:45,830 are generated through taking appropriate derivative. 528 00:39:45,830 --> 00:39:47,145 Yes? 529 00:39:47,145 --> 00:39:48,520 AUDIENCE: Is there any reason why 530 00:39:48,520 --> 00:39:52,810 you're choosing to take a derivative with respect 531 00:39:52,810 --> 00:39:57,406 to ikj instead of simply putting the i in the numerator? 532 00:39:57,406 --> 00:39:59,641 Or are there-- are there things that I'm not-- 533 00:39:59,641 --> 00:40:00,600 PROFESSOR: No. 534 00:40:00,600 --> 00:40:01,100 No. 535 00:40:01,100 --> 00:40:01,950 There is no reason. 536 00:40:01,950 --> 00:40:05,341 So you're saying why didn't I write this as this like this? 537 00:40:05,341 --> 00:40:05,840 i divided? 538 00:40:05,840 --> 00:40:07,585 AUDIENCE: Yeah. 539 00:40:07,585 --> 00:40:10,230 PROFESSOR: I think I just visually saw 540 00:40:10,230 --> 00:40:13,830 that it was kind of more that way. 541 00:40:13,830 --> 00:40:15,420 But it's exactly the same thing. 
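The derivative prescription just discussed can be illustrated for a single discrete variable: build the characteristic function exactly, then take the derivative with respect to minus ik at k = 0 by a finite difference, which brings down one factor of x. The distribution below is made up purely for the illustration:

```python
import cmath

# a made-up discrete distribution: outcomes and probabilities
xs = [0.0, 1.0, 2.0]
ps = [0.2, 0.5, 0.3]

def phi(k):
    """Exact characteristic function <exp(-i k x)>."""
    return sum(p * cmath.exp(-1j * k * x) for p, x in zip(ps, xs))

# d/d(-ik) at k = 0 gives <x>, i.e. <x> = i * phi'(0);
# approximate the derivative by a central finite difference
h = 1e-5
mean_from_phi = (1j * (phi(h) - phi(-h)) / (2 * h)).real
mean_direct = sum(p * x for p, x in zip(ps, xs))
```

Higher moments follow the same way from higher derivatives, and whether you write the i in the numerator or the denominator makes no difference, exactly as in the exchange above.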
542 00:40:15,420 --> 00:40:18,266 Yes. 543 00:40:18,266 --> 00:40:18,766 OK. 544 00:40:22,660 --> 00:40:25,460 All right. 545 00:40:25,460 --> 00:40:27,930 Now the interesting object, of course, to us 546 00:40:27,930 --> 00:40:29,820 is more the joint cumulants. 547 00:40:34,800 --> 00:40:39,270 So how do we generate joint cumulants? 548 00:40:39,270 --> 00:40:44,310 Well previously, essentially we had a bunch of objects 549 00:40:44,310 --> 00:40:48,830 for one variable that was some moment. 550 00:40:48,830 --> 00:40:50,770 And in order to make them cumulants, 551 00:40:50,770 --> 00:40:52,540 we just put a sub C here. 552 00:40:52,540 --> 00:40:55,390 So we do that and we are done. 553 00:40:55,390 --> 00:40:58,390 But what operationally happened 554 00:40:58,390 --> 00:41:01,990 was that we did the expansion rather 555 00:41:01,990 --> 00:41:04,400 than for the characteristic function 556 00:41:04,400 --> 00:41:08,430 for the log of the characteristic function. 557 00:41:08,430 --> 00:41:12,630 So all I need to do is to do precisely 558 00:41:12,630 --> 00:41:17,120 this set of derivatives applied rather 559 00:41:17,120 --> 00:41:21,290 than to the joint characteristic function 560 00:41:21,290 --> 00:41:24,260 to the log of the joint characteristic function. 561 00:41:24,260 --> 00:41:27,849 And at the end, set all of the k's to 0. 562 00:41:30,702 --> 00:41:31,202 OK? 563 00:41:37,920 --> 00:41:47,650 So by looking at these two definitions and the expansion 564 00:41:47,650 --> 00:41:51,580 of the log, for example, you can calculate various things. 565 00:41:51,580 --> 00:42:00,730 Like, for example, x1x2 with a C is the expectation value 566 00:42:00,730 --> 00:42:03,440 of x1x2. 567 00:42:03,440 --> 00:42:13,600 This joint moment minus the average of x1 times the average of x2, just as you would have thought, 568 00:42:13,600 --> 00:42:17,710 would be the appropriate generalization of the variance. 569 00:42:17,710 --> 00:42:18,835 And this is the covariance.
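As a concrete illustration of the covariance as a joint cumulant, take a two-variable distribution on a tiny grid (the numbers are invented for the example):

```python
# a made-up joint distribution of (x1, x2) on a 2 x 2 grid
joint = {
    (0, 0): 0.1, (0, 1): 0.2,
    (1, 0): 0.3, (1, 1): 0.4,
}

m1 = sum(p * x1 for (x1, x2), p in joint.items())        # <x1>
m2 = sum(p * x2 for (x1, x2), p in joint.items())        # <x2>
m12 = sum(p * x1 * x2 for (x1, x2), p in joint.items())  # joint moment <x1 x2>

covariance = m12 - m1 * m2  # the joint cumulant <x1 x2> with a sub C
```

Here the covariance comes out slightly negative even though all the moments are positive, which is exactly why the cumulant, not the moment, carries the essence of the correlation.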
570 00:42:23,360 --> 00:42:28,560 And you can construct appropriate extensions. 571 00:42:28,560 --> 00:42:29,060 OK. 572 00:42:43,410 --> 00:42:50,967 Now we made a lot of use of the relationship between moments 573 00:42:50,967 --> 00:42:51,550 and cumulants. 574 00:42:51,550 --> 00:42:54,560 We just-- so the idea, really, was 575 00:42:54,560 --> 00:42:58,110 that the essence of a probability distribution 576 00:42:58,110 --> 00:43:01,030 is characterized in the cumulants. 577 00:43:01,030 --> 00:43:03,910 Moments kind of depend on how you look at things. 578 00:43:03,910 --> 00:43:07,040 The essence is in the cumulants, but sometimes the moments 579 00:43:07,040 --> 00:43:10,020 are more usefully computed, and since there 580 00:43:10,020 --> 00:43:13,120 was a relationship between moments and cumulants, 581 00:43:13,120 --> 00:43:16,025 we can generalize that graphical relation 582 00:43:16,025 --> 00:43:19,640 to the case of joint moments and joint cumulants. 583 00:43:19,640 --> 00:43:29,120 So the graphical relation applies as long 584 00:43:29,120 --> 00:43:39,490 as points are labeled by the appropriate 585 00:43:39,490 --> 00:43:45,040 or corresponding variable. 586 00:43:49,790 --> 00:43:55,970 So suppose I wanted to calculate some kind of a moment that 587 00:43:55,970 --> 00:43:58,780 is x1 squared. 588 00:43:58,780 --> 00:44:05,490 Let's say x2, x3. 589 00:44:05,490 --> 00:44:07,480 This may generate for me many diagrams, 590 00:44:07,480 --> 00:44:11,180 so let's stop here. 591 00:44:11,180 --> 00:44:15,820 So what I can do is I can have points 592 00:44:15,820 --> 00:44:20,150 that I label 1, 1, and 2. 593 00:44:20,150 --> 00:44:24,100 And have them separate from each other. 594 00:44:24,100 --> 00:44:27,800 Or I can start pairing them together. 595 00:44:27,800 --> 00:44:33,000 So one possibility is that I put the 1's together and the 2 596 00:44:33,000 --> 00:44:36,380 sits separately.
597 00:44:36,380 --> 00:44:40,470 Another possibility is that I can group the 1 and the 2 598 00:44:40,470 --> 00:44:42,410 together. 599 00:44:42,410 --> 00:44:46,500 And then the other 1 sits separately. 600 00:44:46,500 --> 00:44:48,860 But I had a choice of two ways to do this, 601 00:44:48,860 --> 00:44:53,180 so this comes-- this diagram with an overall factor of 2. 602 00:44:53,180 --> 00:44:58,770 And then there's the possibility to put all of them 603 00:44:58,770 --> 00:45:01,022 in the same bag. 604 00:45:01,022 --> 00:45:03,420 And so mathematically, that means 605 00:45:03,420 --> 00:45:08,610 that the third-- this particular joint moment 606 00:45:08,610 --> 00:45:17,880 is obtained by taking the average of x1, squared, times the average of x2, which 607 00:45:17,880 --> 00:45:20,030 is the first term. 608 00:45:20,030 --> 00:45:26,450 The second term is the variance of x1. 609 00:45:26,450 --> 00:45:28,440 And then multiplied by the average of x2. 610 00:45:32,030 --> 00:45:40,060 The third term is twice the covariance of x1 and x2 times 611 00:45:40,060 --> 00:45:42,017 the mean of x1. 612 00:45:45,960 --> 00:45:49,610 And the final term is just the third cumulant. 613 00:45:53,034 --> 00:45:57,980 So again, you would need to compute these, presumably, 614 00:45:57,980 --> 00:46:00,200 from the log of the characteristic function 615 00:46:00,200 --> 00:46:01,934 and then you would be done. 616 00:46:27,790 --> 00:46:32,780 Couple of other definitions. 617 00:46:32,780 --> 00:46:42,150 One of them is an unconditional probability. 618 00:46:44,710 --> 00:46:49,960 So very soon we will be talking about, say, probabilities 619 00:46:49,960 --> 00:46:52,470 appropriate to the gas in this room. 620 00:46:52,470 --> 00:46:54,780 And the particles in the gas in this room 621 00:46:54,780 --> 00:46:58,020 will be characterized by where they are, 622 00:46:58,020 --> 00:47:02,700 some position vector q, and how fast they are moving, 623 00:47:02,700 --> 00:47:05,210 some momentum vector p.
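The four-term decomposition of the joint moment above can be checked by sampling. For a jointly Gaussian pair the final term, the third joint cumulant, vanishes, so the three diagrammatic terms alone should reproduce the moment. The correlated pair constructed below is my own illustrative choice, not something from the lecture:

```python
import random

random.seed(0)

# a correlated Gaussian pair: x1 = a + z1, x2 = b + z1 + z2,
# with z1, z2 independent standard normals (made-up construction)
a, b, N = 0.5, -0.3, 200_000
samples = []
for _ in range(N):
    z1, z2 = random.gauss(0, 1), random.gauss(0, 1)
    samples.append((a + z1, b + z1 + z2))

mean1 = sum(x1 for x1, _ in samples) / N
mean2 = sum(x2 for _, x2 in samples) / N
var1 = sum((x1 - mean1) ** 2 for x1, _ in samples) / N
cov = sum((x1 - mean1) * (x2 - mean2) for x1, x2 in samples) / N

lhs = sum(x1 * x1 * x2 for x1, x2 in samples) / N  # the joint moment <x1^2 x2>
# three diagrams: all separate, the pair of 1's, and the (1,2) pairing (twice)
rhs = mean1**2 * mean2 + var1 * mean2 + 2 * cov * mean1
```

Within sampling error the two sides agree, confirming that for this Gaussian case the third joint cumulant contributes nothing.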
624 00:47:05,210 --> 00:47:08,950 And there would be some kind of a probability density 625 00:47:08,950 --> 00:47:13,750 associated with finding a particle with some momentum 626 00:47:13,750 --> 00:47:17,120 at some location in space. 627 00:47:17,120 --> 00:47:20,650 But sometimes I say, well, I really 628 00:47:20,650 --> 00:47:23,370 don't care about where the particles are, 629 00:47:23,370 --> 00:47:27,140 I just want to know how fast they are moving. 630 00:47:27,140 --> 00:47:31,610 So what I really care about is the probability 631 00:47:31,610 --> 00:47:35,180 that I have a particle moving with some momentum p, 632 00:47:35,180 --> 00:47:38,160 irrespective of where it is. 633 00:47:38,160 --> 00:47:42,120 Then all I need to do is to integrate 634 00:47:42,120 --> 00:47:45,545 over the position the joint probability distribution. 635 00:47:48,230 --> 00:47:50,560 And the check that this is correct 636 00:47:50,560 --> 00:47:54,360 is that if I now also integrate this over p, 637 00:47:54,360 --> 00:47:57,570 this would be integrating over the entire space, 638 00:47:57,570 --> 00:47:59,930 and the joint probability is appropriately 639 00:47:59,930 --> 00:48:04,630 normalized so that the full integration will give me one. 640 00:48:04,630 --> 00:48:09,400 So this is a correctly normalized probability. 641 00:48:09,400 --> 00:48:14,930 And more generally, if I'm interested in, say, a bunch of 642 00:48:14,930 --> 00:48:21,540 coordinates x1 through xs, out of a larger list of coordinates 643 00:48:21,540 --> 00:48:29,510 that spans x1 through xs all the way to something else, 644 00:48:29,510 --> 00:48:34,350 all I need to do to get the unconditional probability is 645 00:48:34,350 --> 00:48:39,770 to integrate over the variables that I'm not interested in. 646 00:48:39,770 --> 00:48:43,800 Again, the check is that it's properly normalized.
647 00:48:46,660 --> 00:48:48,880 Now, this is to be contrasted with 648 00:48:48,880 --> 00:48:50,210 the conditional probability. 649 00:48:55,470 --> 00:48:58,440 The conditional probability, let's 650 00:48:58,440 --> 00:49:02,955 say we would be interested in calculating the pressure that 651 00:49:02,955 --> 00:49:05,110 is exerted on the board. 652 00:49:05,110 --> 00:49:07,680 The pressure is exerted by the particles that 653 00:49:07,680 --> 00:49:10,340 impinge on the board and then go away, 654 00:49:10,340 --> 00:49:13,620 so I'm interested in the momentum of particles 655 00:49:13,620 --> 00:49:17,290 right at the board, not anywhere else in space. 656 00:49:17,290 --> 00:49:23,760 So if I'm interested in the momentum of particles 657 00:49:23,760 --> 00:49:28,870 at the particular location, which could in principle depend 658 00:49:28,870 --> 00:49:36,440 on location-- so now q is a parameter, p is the variable, 659 00:49:36,440 --> 00:49:41,010 but the probability distribution could depend on q. 660 00:49:41,010 --> 00:49:43,020 How do we obtain this? 661 00:49:43,020 --> 00:49:43,870 This, again, 662 00:49:43,870 --> 00:49:48,330 is going to be proportional to the probability 663 00:49:48,330 --> 00:49:50,650 that I will find a particle both at 664 00:49:50,650 --> 00:49:53,430 this location and with momentum p. 665 00:49:53,430 --> 00:49:56,480 So I need to have that. 666 00:49:56,480 --> 00:49:58,180 But it's not exactly that; there's 667 00:49:58,180 --> 00:50:01,310 a normalization involved. 668 00:50:01,310 --> 00:50:04,210 And the way to get the normalization is 669 00:50:04,210 --> 00:50:13,590 to note that if I integrate this probability over its variable p 670 00:50:13,590 --> 00:50:24,050 but not over the parameter q, the answer should be 1.
671 00:50:24,050 --> 00:50:27,320 So this is going to be, if I apply it 672 00:50:27,320 --> 00:50:32,760 to the right-hand side, the integral over p 673 00:50:32,760 --> 00:50:39,850 of p of p and q, which we recognize 674 00:50:39,850 --> 00:50:44,470 as an example of an unconditional probability 675 00:50:44,470 --> 00:50:48,970 to find something at position q. 676 00:50:48,970 --> 00:50:55,480 So the normalization is going to be this so that the ratio is 1. 677 00:50:55,480 --> 00:51:02,070 So most generally, we find that the probability 678 00:51:02,070 --> 00:51:08,610 to have some subset of variables, given 679 00:51:08,610 --> 00:51:12,930 that the values of the other variables in the list 680 00:51:12,930 --> 00:51:19,110 are somehow fixed, is given by the joint probability of all 681 00:51:19,110 --> 00:51:24,970 of the variables x1 through xn divided 682 00:51:24,970 --> 00:51:29,290 by the unconditional probability that applies 683 00:51:29,290 --> 00:51:31,530 to the parameters that are fixed. 684 00:51:35,844 --> 00:51:37,260 And this is called Bayes' theorem. 685 00:51:57,510 --> 00:52:05,240 By the way, if variables are independent, 686 00:52:05,240 --> 00:52:08,340 which actually does apply to the case of the particles 687 00:52:08,340 --> 00:52:11,300 in this room as far as their momentum and position are 688 00:52:11,300 --> 00:52:15,900 concerned, then the joint probability 689 00:52:15,900 --> 00:52:18,190 is going to be the product of one that 690 00:52:18,190 --> 00:52:20,950 is appropriate to the position and one 691 00:52:20,950 --> 00:52:22,535 that is appropriate to the momentum. 692 00:52:25,990 --> 00:52:29,050 And if you have this independence, 693 00:52:29,050 --> 00:52:31,835 then what you'll find is that there 694 00:52:31,835 --> 00:52:34,730 is no difference between conditional and unconditional 695 00:52:34,730 --> 00:52:36,460 probabilities.
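Both constructions can be illustrated with a toy discrete joint distribution (the labels and numbers below are invented for the example): summing out the momentum-like variable gives the unconditional probability of position, and dividing the joint by it gives a conditional distribution that is normalized at each fixed q, which is the check discussed above.

```python
# a made-up joint probability of (position q, momentum label p)
joint = {
    ('q1', 'slow'): 0.15, ('q1', 'fast'): 0.25,
    ('q2', 'slow'): 0.35, ('q2', 'fast'): 0.25,
}

# unconditional probability of q: sum (the discrete analog of integrating) over p
p_q = {}
for (q, p), w in joint.items():
    p_q[q] = p_q.get(q, 0.0) + w

# conditional probability p(p | q) = joint / unconditional
cond = {(q, p): w / p_q[q] for (q, p), w in joint.items()}

# at each fixed q, summing the conditional over p must give 1
norm_q1 = cond[('q1', 'slow')] + cond[('q1', 'fast')]
norm_q2 = cond[('q2', 'slow')] + cond[('q2', 'fast')]
```

Note the conditional depends on q: here the momentum distribution at q1 differs from the one at q2, exactly as in the board example.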
696 00:52:36,460 --> 00:52:38,610 And when you go through this procedure, 697 00:52:38,610 --> 00:52:42,110 you will find that all the joint cumulants-- but not 698 00:52:42,110 --> 00:52:44,780 the joint moments, naturally-- all the joint cumulants 699 00:52:44,780 --> 00:52:45,748 will be 0. 700 00:52:52,050 --> 00:52:52,800 OK. 701 00:52:52,800 --> 00:52:53,985 Any questions? 702 00:52:53,985 --> 00:52:54,485 Yes? 703 00:52:54,485 --> 00:53:00,286 AUDIENCE: Could you explain how the condition of p-- 704 00:53:00,286 --> 00:53:03,200 PROFESSOR: How this was obtained? 705 00:53:03,200 --> 00:53:04,180 Or the one above? 706 00:53:04,180 --> 00:53:05,164 AUDIENCE: Yeah. 707 00:53:05,164 --> 00:53:07,624 The condition you applied that the integral is 1. 708 00:53:07,624 --> 00:53:08,960 PROFESSOR: OK. 709 00:53:08,960 --> 00:53:14,950 So first of all, what I want to look at 710 00:53:14,950 --> 00:53:18,580 is the probability that is appropriate to one 711 00:53:18,580 --> 00:53:22,240 random variable at the fixed value of all 712 00:53:22,240 --> 00:53:24,050 the other random variables. 713 00:53:24,050 --> 00:53:27,890 Like you say, in general I should specify the probability 714 00:53:27,890 --> 00:53:31,110 as a function of momentum and position throughout space. 715 00:53:31,110 --> 00:53:33,790 But I'm really interested only at this point. 716 00:53:33,790 --> 00:53:37,730 I don't really care about other points. 717 00:53:37,730 --> 00:53:40,820 However, the answer may depend whether I'm looking at here 718 00:53:40,820 --> 00:53:42,640 or I'm looking at here. 719 00:53:42,640 --> 00:53:45,410 So the answer for the probability of momentum 720 00:53:45,410 --> 00:53:47,710 is parametrized by q. 
721 00:53:47,710 --> 00:53:50,760 On the other hand, I say that I know 722 00:53:50,760 --> 00:53:53,370 the probability over the entire space 723 00:53:53,370 --> 00:53:56,270 to be at this position with the momentum p 724 00:53:56,270 --> 00:53:59,720 as given by this joint probability. 725 00:53:59,720 --> 00:54:02,580 But if I just set that equal to this, 726 00:54:02,580 --> 00:54:05,820 the answer is not correct because the way 727 00:54:05,820 --> 00:54:10,570 that this quantity is normalized is if I first integrate over 728 00:54:10,570 --> 00:54:15,170 all possible values of its variable, p. 729 00:54:15,170 --> 00:54:20,200 The answer should be 1, irrespective of what q is. 730 00:54:20,200 --> 00:54:24,870 So I can define a conditional probability for momentum 731 00:54:24,870 --> 00:54:28,630 here, a conditional probability for momentum there. 732 00:54:28,630 --> 00:54:32,600 In both cases the momentum would be the variable. 733 00:54:32,600 --> 00:54:36,230 And integrating over all possible values of momentum 734 00:54:36,230 --> 00:54:39,430 should give me one for a properly normalized probability 735 00:54:39,430 --> 00:54:40,120 distribution. 736 00:54:40,120 --> 00:54:42,575 AUDIENCE: [INAUDIBLE]. 737 00:54:42,575 --> 00:54:44,700 PROFESSOR: Given that q is something. 738 00:54:44,700 --> 00:54:48,510 So q could be some-- now here q can 739 00:54:48,510 --> 00:54:51,050 be regarded as some parameter. 740 00:54:51,050 --> 00:54:55,680 So the condition is that this integration should give me 1. 741 00:54:55,680 --> 00:54:57,630 I said that on physical grounds, I 742 00:54:57,630 --> 00:55:00,660 expect this conditional probability 743 00:55:00,660 --> 00:55:03,820 to be the joint probability up to some normalization 744 00:55:03,820 --> 00:55:06,270 that I don't know. 745 00:55:06,270 --> 00:55:06,770 OK. 746 00:55:06,770 --> 00:55:09,390 So what is that normalization? 747 00:55:09,390 --> 00:55:11,890 The whole answer should be 1.
748 00:55:11,890 --> 00:55:13,820 What I have to do is an integration 749 00:55:13,820 --> 00:55:17,380 over momentum of the joint probability. 750 00:55:17,380 --> 00:55:22,845 I have said that an integration over some set of variables 751 00:55:22,845 --> 00:55:25,070 of a joint probability will give me 752 00:55:25,070 --> 00:55:28,860 the unconditional probability for all the others. 753 00:55:28,860 --> 00:55:32,860 So integrating over all momentum of this joint probability 754 00:55:32,860 --> 00:55:37,310 will give me the unconditional probability for position. 755 00:55:37,310 --> 00:55:41,550 So the answer is the unconditional probability 756 00:55:41,550 --> 00:55:44,220 for position divided by N. 757 00:55:44,220 --> 00:55:48,220 So N-- this has to be this. 758 00:55:48,220 --> 00:55:53,200 And in general, it would have to be this in order 759 00:55:53,200 --> 00:55:55,370 to ensure that if I integrate over 760 00:55:55,370 --> 00:55:59,180 this first set of variables of the joint probability 761 00:55:59,180 --> 00:56:04,040 distribution, which would give me the unconditional, 762 00:56:04,040 --> 00:56:07,090 it cancels the unconditional in the denominator to give me 1. 763 00:56:15,324 --> 00:56:15,990 Other questions? 764 00:56:20,900 --> 00:56:22,290 OK. 765 00:56:22,290 --> 00:56:28,980 So I'm going to erase this last board 766 00:56:28,980 --> 00:56:35,530 to be underneath that top board in looking 767 00:56:35,530 --> 00:56:40,140 at the joint Gaussian distribution. 768 00:56:40,140 --> 00:56:41,950 So that was the Gaussian, and we want 769 00:56:41,950 --> 00:56:43,290 to look at the joint Gaussian. 770 00:56:53,950 --> 00:56:57,440 So we want to generalize the formula 771 00:56:57,440 --> 00:57:01,855 that we have over there for one variable to multiple variables. 772 00:57:05,160 --> 00:57:09,200 So what I have there initially is a factor, which 773 00:57:09,200 --> 00:57:23,530 is exponential of minus 1/2, x minus lambda squared.
774 00:57:23,530 --> 00:57:26,060 I can write this x minus lambda squared 775 00:57:26,060 --> 00:57:29,680 as x minus lambda x minus lambda. 776 00:57:29,680 --> 00:57:33,040 And then put the variance. 777 00:57:33,040 --> 00:57:39,850 Let's call it 1 over sigma rather than a small sigma 778 00:57:39,850 --> 00:57:42,141 squared or something like this. 779 00:57:42,141 --> 00:57:43,780 Actually, let me just write it as 1 780 00:57:43,780 --> 00:57:46,110 over sigma squared for the time being. 781 00:57:46,110 --> 00:57:53,480 And then the normalization was 1 over root 2 pi sigma squared. 782 00:57:53,480 --> 00:57:55,690 But you say, well, I have multiple variables, 783 00:57:55,690 --> 00:57:59,080 so maybe this is what I would write for each variable. 784 00:58:04,090 --> 00:58:10,410 And then I would sum over all n, running from 1 to N. 785 00:58:10,410 --> 00:58:12,470 So this is essentially the form that I 786 00:58:12,470 --> 00:58:18,240 would have for independent Gaussian variables. 787 00:58:18,240 --> 00:58:21,570 And then I would have to multiply here 788 00:58:21,570 --> 00:58:25,990 factors of 2 pi sigma squared, so I would have 2 pi to the N 789 00:58:25,990 --> 00:58:27,190 over 2. 790 00:58:27,190 --> 00:58:30,710 And I would have product of-- actually, 791 00:58:30,710 --> 00:58:34,370 let's write it as 2 pi to the N square root. 792 00:58:34,370 --> 00:58:36,400 I would have the product of sigma i squared. 793 00:58:40,890 --> 00:58:44,770 But that's just too limiting a form. 794 00:58:44,770 --> 00:58:50,700 The most general form that this quadratic will allow me to have 795 00:58:50,700 --> 00:58:55,980 also has cross terms where it is not only the diagonal terms 796 00:58:55,980 --> 00:58:59,950 x1 and x1 that are multiplying each other, but x2 and x3, 797 00:58:59,950 --> 00:59:01,140 et cetera. 798 00:59:01,140 --> 00:59:07,070 So I would have a sum over both m and n running from 1 to N.
799 00:59:07,070 --> 00:59:09,040 And then the coefficient here, rather 800 00:59:09,040 --> 00:59:14,890 than just being a number, would be something 801 00:59:14,890 --> 00:59:18,520 that would be like a matrix. 802 00:59:18,520 --> 00:59:25,070 Because for each pair m and n, I would have some number. 803 00:59:25,070 --> 00:59:32,420 And I will call them the inverse of some matrix C. 804 00:59:32,420 --> 00:59:37,710 And if you, again, think of the problem as a matrix, 805 00:59:37,710 --> 00:59:41,440 if I have a diagonal matrix, then the product 806 00:59:41,440 --> 00:59:43,990 of elements along the diagonal is the same thing 807 00:59:43,990 --> 00:59:45,740 as the determinant. 808 00:59:45,740 --> 00:59:49,780 If I were to rotate the matrix to have off-diagonal elements, 809 00:59:49,780 --> 00:59:52,752 the determinant will always be there. 810 00:59:52,752 --> 00:59:56,365 So this is really the determinant of C 811 00:59:56,365 --> 00:59:57,610 that will appear here. 812 01:00:00,420 --> 01:00:01,220 Yes? 813 01:00:01,220 --> 01:00:07,045 AUDIENCE: So are you inverting the individual elements of C 814 01:00:07,045 --> 01:00:11,180 or are you inverting the matrix C and taking its elements? 815 01:00:11,180 --> 01:00:14,940 PROFESSOR: Actually a very good point. 816 01:00:14,940 --> 01:00:19,780 I really wanted to write it as the inverse of the matrix 817 01:00:19,780 --> 01:00:22,350 and then pick the mn [INAUDIBLE]. 818 01:00:22,350 --> 01:00:25,890 So we imagine that we have the matrix. 819 01:00:25,890 --> 01:00:30,960 And these are the elements of some-- 820 01:00:30,960 --> 01:00:33,730 so I could have called this whatever I want. 821 01:00:33,730 --> 01:00:37,150 So I could have called the coefficients of x m and x n. 822 01:00:37,150 --> 01:00:44,610 I have chosen to regard them as the inverse 823 01:00:44,610 --> 01:00:48,260 of some other matrix C.
And the reason for that 824 01:00:48,260 --> 01:00:52,350 becomes shortly clear, because the covariances will 825 01:00:52,350 --> 01:00:55,240 be related to the inverse of this matrix. 826 01:00:55,240 --> 01:00:58,731 And hence, that's the appropriate way to look at it. 827 01:00:58,731 --> 01:01:00,730 AUDIENCE: Can [INAUDIBLE] what C means up there? 828 01:01:00,730 --> 01:01:02,030 PROFESSOR: OK. 829 01:01:02,030 --> 01:01:05,845 So let's forget about these lambdas. 830 01:01:05,845 --> 01:01:09,080 So I would have in general for two variables 831 01:01:09,080 --> 01:01:12,640 some coefficient for x1 squared, some coefficient 832 01:01:12,640 --> 01:01:17,730 for x2 squared, and some coefficient for x1, x2. 833 01:01:17,730 --> 01:01:21,070 So I could call this a11. 834 01:01:21,070 --> 01:01:23,580 I could call this a22. 835 01:01:23,580 --> 01:01:26,430 I could call this 2a12. 836 01:01:26,430 --> 01:01:28,550 Or actually I could, if I wanted, 837 01:01:28,550 --> 01:01:33,250 just write it as a12 x1x2 plus a21 x2x1, 838 01:01:33,250 --> 01:01:36,930 where a12 and a21 would be the same. 839 01:01:36,930 --> 01:01:42,350 So what I could then regard this as is x1, x2 840 01:01:42,350 --> 01:01:50,440 times the matrix a11, a12, a21, a22, times x1, x2. 841 01:01:50,440 --> 01:01:54,304 So this is exactly the same as that. 842 01:01:54,304 --> 01:01:55,570 All right? 843 01:01:55,570 --> 01:02:01,490 So these objects here are the elements 844 01:02:01,490 --> 01:02:04,330 of this matrix C inverse. 845 01:02:04,330 --> 01:02:12,790 So I could call this x1, x2 times some matrix A times x1, x2. 846 01:02:12,790 --> 01:02:14,990 That A is a 2 by 2 matrix. 847 01:02:14,990 --> 01:02:19,080 The name I have given to that 2 by 2 matrix is C inverse. 848 01:02:23,160 --> 01:02:23,660 Yes? 849 01:02:23,660 --> 01:02:25,404 AUDIENCE: The matrix is required to be 850 01:02:25,404 --> 01:02:25,840 symmetric though, isn't it?
851 01:02:25,840 --> 01:02:27,360 PROFESSOR: The matrix is required 852 01:02:27,360 --> 01:02:29,450 to be symmetric for any quadratic form. 853 01:02:29,450 --> 01:02:30,270 Yes. 854 01:02:30,270 --> 01:02:33,920 So when I wrote it initially, I wrote as 2 a12. 855 01:02:33,920 --> 01:02:36,410 And then I said, well, I can also write it 856 01:02:36,410 --> 01:02:38,730 this fashion provided the two of them are the same. 857 01:02:38,730 --> 01:02:39,230 Yes? 858 01:02:39,230 --> 01:02:42,150 AUDIENCE: How did you know the determinant of C belonged 859 01:02:42,150 --> 01:02:42,650 there? 860 01:02:42,650 --> 01:02:43,380 PROFESSOR: Pardon? 861 01:02:43,380 --> 01:02:45,020 AUDIENCE: How did you know that the determinant 862 01:02:45,020 --> 01:02:45,545 of C [INAUDIBLE]? 863 01:02:45,545 --> 01:02:46,128 PROFESSOR: OK. 864 01:02:46,128 --> 01:02:48,190 How do I know the determinant of C? 865 01:02:48,190 --> 01:02:51,330 Let's say I give you this form. 866 01:02:51,330 --> 01:02:55,270 And then I don't know what the normalization is. 867 01:02:55,270 --> 01:03:00,390 What I can do is I can do a change of variables from x1 x2 868 01:03:00,390 --> 01:03:06,690 to something like y1 y2 such that when I look at y1 and y2, 869 01:03:06,690 --> 01:03:10,220 the matrix becomes diagonal. 870 01:03:10,220 --> 01:03:13,130 So I can rotate the matrix. 871 01:03:13,130 --> 01:03:18,730 So for any matrix, I can imagine that I will find some U such 872 01:03:18,730 --> 01:03:24,870 that U A U dagger is this diagonal matrix lambda. 873 01:03:24,870 --> 01:03:29,370 Now under these procedures, one thing that does not change 874 01:03:29,370 --> 01:03:30,970 is the determinant. 875 01:03:30,970 --> 01:03:34,290 It's always the product of the eigenvalues.
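The role of the determinant in the normalization can be checked directly in two dimensions. The sketch below picks an arbitrary symmetric, positive-definite matrix C (my own numbers), inverts it by hand, and integrates the resulting density on a grid; the total should come out close to 1 only because the square root of (2 pi) squared times det C sits in the denominator:

```python
import math

# an arbitrary symmetric, positive-definite covariance matrix C
c11, c12, c22 = 1.0, 0.6, 2.0
det_C = c11 * c22 - c12 * c12  # determinant of C (positive here)

# inverse of a 2x2 matrix, written out by hand
i11, i12, i22 = c22 / det_C, -c12 / det_C, c11 / det_C

norm = 1.0 / math.sqrt((2 * math.pi) ** 2 * det_C)

def density(x1, x2):
    """Zero-mean joint Gaussian with the quadratic form x C^{-1} x."""
    quad = i11 * x1 * x1 + 2 * i12 * x1 * x2 + i22 * x2 * x2
    return norm * math.exp(-0.5 * quad)

# crude midpoint-rule integration over a box holding essentially all the weight
h, L = 0.05, 8.0
n = int(2 * L / h)
total = sum(
    density(-L + (i + 0.5) * h, -L + (j + 0.5) * h)
    for i in range(n) for j in range(n)
) * h * h
```

Replacing det_C in the normalization by, say, the product of the diagonal entries of C would make the integral come out wrong whenever c12 is nonzero, which is the point of the question above.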
The way that I set up the problem, I said that if I hadn't made the problem have cross terms, I knew the answer to be the product of the eigenvalues. So if you like, I can start from there and then do a rotation and get the more general form. The answer would stay as the determinant. Yes?

AUDIENCE: The matrix should be positive as well, or no?

PROFESSOR: The matrix should be positive definite in order for the probability to be well-defined and exist, yes. OK. So if you like, by stating that this is a probability, I have imposed a number of conditions, such as symmetry, as well as positivity. Yes. OK. But this is just linear algebra. I will assume that you know linear algebra. OK.

So this is the properly normalized Gaussian joint probability. We are interested in the characteristic function. So what we are interested in is the joint Gaussian characteristic function. And so again, the procedure we saw was that I have to do the Fourier transform.
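The linear-algebra facts just invoked can be checked numerically: the determinant is unchanged by the diagonalizing rotation and equals the product of the eigenvalues, and positivity of those eigenvalues is what makes the Gaussian normalizable. A minimal sketch, using a hypothetical 2 by 2 matrix in the role of C inverse (the numbers are illustrative, not from the lecture):

```python
import numpy as np

# Hypothetical symmetric, positive definite matrix playing the role of
# A = C^{-1} in the quadratic form x^T A x (illustrative values only).
A = np.array([[2.0, 0.5],
              [0.5, 1.0]])

# Diagonalize: for symmetric A, eigh gives an orthogonal U with
# A = U diag(eigs) U^T -- the rotation discussed in the lecture.
eigs, U = np.linalg.eigh(A)
assert np.allclose(U @ np.diag(eigs) @ U.T, A)

# The determinant does not change under the rotation: it is always
# the product of the eigenvalues.
assert np.isclose(np.linalg.det(A), np.prod(eigs))

# Positive definiteness (all eigenvalues > 0) is what makes the
# Gaussian weight exp(-x^T A x / 2) normalizable.
assert np.all(eigs > 0)

# The covariances are then the entries of C = A^{-1}.
C = np.linalg.inv(A)
print(C)
```

Any symmetric positive definite matrix would do here; the assertions fail, as they should, if A is given a non-positive eigenvalue.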
So I have to take this probability that I have over there and do an integration: a product, say, over alpha running from 1 to N of dx alpha, of e to the minus i k alpha x alpha. This product runs over all the variables. Then I have to multiply by this probability that I have up there, which would appear here. OK.

Now, again, maybe an easy way to imagine this is what I was saying previously. Let's imagine that I have rotated into a basis where everything is diagonal. Then in the rotated basis, all you need to do is essentially take a product of characteristic functions such as what we have over here. So the product corresponding to this first term would be the exponential of minus i, sum over n running from 1 to N, of kn lambda n. I guess I'm using n as the variable here. And as long as things are diagonal, the next order term would be a sum over n of kn squared times the corresponding eigenvalue inverted. So remember that in the diagonal form, each one of these sigmas would appear on the diagonal.
If I do my rotation, essentially this first term would not be affected. The next term would give me minus 1/2, sum over m and n: rather than just having k1 squared, k2 squared, et cetera, just like here I would have km kn. What happened previously was that each eigenvalue got inverted. And if you think about rotating a matrix whose eigenvalues have all been inverted, you are really rotating the inverse matrix. So this here would be the inverse of whatever matrix I have here. So this would be Cmn.

So I will leave you to do the corresponding linear algebra here, but the answer is correct. So the answer is that the generator of cumulants for a joint Gaussian distribution has a form which has a bunch of linear terms, minus i sum over n of kn lambda n, and a bunch of second order terms, minus 1/2 sum over m and n of km kn times some coefficient Cmn.

And the series terminates here. So for the joint Gaussian, you have first cumulants: the first cumulant of xm, its expectation value, is the same thing as lambda m. You have covariances, or second cumulants: the xm, xn cumulant is Cmn.
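The claim that the cumulant series terminates is easy to check by sampling: the log of the estimated characteristic function should match minus i k dot lambda minus one half k transpose C k exactly, with no higher terms. A Monte Carlo sketch with hypothetical means and covariances (all values illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameters for a two-variable joint Gaussian.
lam = np.array([1.0, -0.5])          # means lambda_m
C = np.array([[1.0, 0.3],
              [0.3, 0.5]])           # covariance matrix C_mn

x = rng.multivariate_normal(lam, C, size=400_000)

k = np.array([0.3, 0.4])             # a single Fourier point to test at

# Monte Carlo estimate of the characteristic function <exp(-i k . x)>.
estimate = np.mean(np.exp(-1j * x @ k))

# Closed form from the lecture: exp(-i k.lambda - (1/2) k^T C k);
# the cumulant expansion stops at second order.
exact = np.exp(-1j * k @ lam - 0.5 * k @ C @ k)

assert abs(estimate - exact) < 0.01
print(estimate, exact)
```

The agreement holds at any k; repeating the comparison over a grid of k values probes the absence of third and higher cumulants.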
And in particular, the diagonal elements would correspond to the variances. And all the higher orders are 0 because there's no further term in the expansion.

So for example, if I were to calculate this thing that I have on the board here for the case of a Gaussian, I would not have this third term. So the answer that I would write down for this case would be something that didn't have it. And in the way that we have written things, the answer for x1 squared x2 would be just lambda 1 squared lambda 2, plus sigma 1 squared (or let's call it C11) times lambda 2, plus 2 lambda 1 C12. And that's it.

So there is something that follows from this that is used a lot in field theory. And it's called Wick's theorem. So that's just a particular case of this, but let's state it anyway.
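That moment formula for the Gaussian can be verified directly by sampling. A sketch, again with hypothetical values of lambda and C (any symmetric positive definite C works):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical means and covariances for two jointly Gaussian variables.
lam = np.array([1.0, -0.5])
C = np.array([[1.0, 0.3],
              [0.3, 0.5]])

x = rng.multivariate_normal(lam, C, size=1_000_000)
x1, x2 = x[:, 0], x[:, 1]

# <x1^2 x2> for a Gaussian: the third-cumulant term is absent, leaving
# lambda1^2 lambda2 + C11 lambda2 + 2 lambda1 C12.
exact = lam[0]**2 * lam[1] + C[0, 0] * lam[1] + 2 * lam[0] * C[0, 1]
estimate = np.mean(x1**2 * x2)

assert abs(estimate - exact) < 0.05
print(exact, estimate)
```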
So for Gaussian distributed variables of 0 mean, the following condition applies. I can take the first variable raised to power n1, the second variable to n2, the last variable to some other nN, and look at a joint expectation value such as this. And this is 0 if the sum over alpha of n alpha is odd, and is equal to the sum over all pairwise contractions if the sum over alpha of n alpha is even.

So actually, I have right here an example of this. If I have jointly distributed Gaussian variables where the means are all 0, so if I say that lambda 1 and lambda 2 are 0, then this is an odd power, x1 squared x2. Because of the symmetry it has to be 0, but you explicitly see that every term that I have will be multiplying some power of [INAUDIBLE]. Whereas if, rather than this, I was looking at something like x1 squared x2 x3, where the net power is even, then I could sort of imagine putting them into these kinds of diagrams. Or alternatively, I can imagine pairing these things in all possible ways. So one pairing would be this with this, this with this, which would have given me the x1 x1 contraction times the x2 x3 contraction. Another pairing would have been x1 with x2, and then, naturally, x1 with x3.
So I would have gotten the x1, x2 covariance times the x1, x3 covariance. But I could have connected either the first x1 or the second x1 to x2. So this comes with multiplicity 2. And so the answer here would be C11 C23 plus 2 C12 C13. Yes?

AUDIENCE: In your writing of x1 to the n1 [INAUDIBLE], it should be the cumulant, right? Or is it the moment?

PROFESSOR: This is the moment.

AUDIENCE: OK.

PROFESSOR: The contractions are the covariances.

AUDIENCE: OK.

PROFESSOR: So the point is that the Gaussian distribution is completely characterized in terms of its covariances. Once you know the covariances, essentially you know everything. And in particular, you may be interested in some particular combination of x's. And then you express that in terms of all possible pairwise contractions, which are the covariances. And essentially, in all of field theory, you expand around some kind of a Gaussian background, a Gaussian zeroth-order result.
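Both halves of Wick's theorem, the vanishing of odd moments and the pairwise-contraction formula C11 C23 + 2 C12 C13, can be checked by sampling zero-mean Gaussians. A sketch with a hypothetical three-variable covariance matrix:

```python
import numpy as np

rng = np.random.default_rng(0)

# Three zero-mean jointly Gaussian variables with a hypothetical
# symmetric, positive definite covariance matrix C (illustrative values).
C = np.array([[1.0, 0.3, 0.2],
              [0.3, 1.0, 0.4],
              [0.2, 0.4, 1.0]])

x = rng.multivariate_normal(np.zeros(3), C, size=1_000_000)
x1, x2, x3 = x[:, 0], x[:, 1], x[:, 2]

# Odd total power: <x1^2 x2> must vanish for zero-mean Gaussians.
assert abs(np.mean(x1**2 * x2)) < 0.05

# Even total power: sum over all pairwise contractions of x1 x1 x2 x3.
# Pairings: (x1,x1)(x2,x3) once, and (x1,x2)(x1,x3) with multiplicity 2.
wick = C[0, 0] * C[1, 2] + 2 * C[0, 1] * C[0, 2]
estimate = np.mean(x1**2 * x2 * x3)

assert abs(estimate - wick) < 0.05
print(wick, estimate)
```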
And then in your perturbation theory you need various powers of your field, or some combination of powers, and you express them through these kinds of relationships.

Any questions? OK. This is fine. Let's get rid of this. OK.

Now there is one result that all of statistical mechanics hangs on. So I expect that as I get old and infirm and my memory vanishes, the last thing that I will remember before I die will be the central limit theorem.

And why is this important? Because in statistical physics you end up adding lots of things. So really, the question that you have, or should be asking, is this: thermodynamics is a very precise thing. It says that heat goes from the higher temperature to the lower temperature. It doesn't say it does that 50% of the time or 95% of the time. It's a definite statement. If I am telling you that ultimately I'm going to express everything in terms of probabilities, how does that jibe? The reason that it jibes is because of this theorem.
It's because in going from the probabilistic description, you will be dealing with such a large number of variables that probabilistic statements actually become precise deterministic statements. And that's captured by this theorem, which says: let's look at the sum of N random variables. And I will indicate the sum by big X and my random variables as small x's. And let's say that, for the individual things that I'm adding up together, there is some kind of a joint probability distribution out of which I take these random variables. So each instance of this sum is selected from this joint PDF, so X itself is a random variable because of the possible choices of the different xi from this probability distribution.

So what I'm interested in is: what is the probability for the sum? So what is the p that governs this sum? I will go by the route of these characteristic functions. I will say, OK, what's the expectation value of... well, what's the Fourier transform of this probability distribution?
The Fourier transform, by definition, is the expectation of e to the minus ik times this big X, which is the sum over all of the small x's. Do I have that definition somewhere? I erased it. Basically, what is this? If this k was, in fact, different k's (if I had a k1 multiplying x1, a k2 multiplying x2), that would be the definition of the joint characteristic function for this joint probability distribution. So what this is: you take the joint characteristic function, which depends on k1, k2, all the way to kN, and you set all of them to be the same.

So take the joint characteristic function, which depends on N Fourier variables, put all of them equal to the same k, and you have the characteristic function for the sum. So I can certainly do that after adding a log here. Nothing has changed. I know that the log is the generator of the cumulants. So this is a sum over, let's say, n running from 1 to infinity, of minus ik to the power n over n factorial, times the n-th cumulant of the sum. So what is the expansion that I would have for the log of the joint characteristic function?
Well, typically I would have, at the lowest order, k1 times the mean of the first variable, k2 times the mean of the second variable. But all of them are the same. So at first order, I would get minus i times the same k, times the sum over n of the first cumulant of the n-th variable.

Typically, in this second order term, I would have all kinds of products. I would have k1 k3, k2 k4, as well as k1 squared. But now all of them become the same, and so what I will have is minus ik squared, but then I have all possible pairings m, n of the xm, xn cumulants.

AUDIENCE: Question.

PROFESSOR: Yes?

AUDIENCE: [INAUDIBLE] expression, you probably should use different indices when you're summing over elements of the Taylor series and when you're summing over your [INAUDIBLE] random variables. It just gets confusing when both indices are n.

PROFESSOR: This here, you want me to write here, say, i?

AUDIENCE: Yeah.

PROFESSOR: OK. And here I can write i and j. So I think there's still a 2 factorial. And then there are higher orders.
Essentially, then, matching the coefficients of minus ik from the left and minus ik from the right will enable me to calculate relationships between cumulants of the sum and cumulants of the individual variables. The first one of these is not particularly surprising. You would say that the mean of the sum is the sum of the means of the individual variables.

The second statement is that the variance of the sum really involves a pair of indices, i and j, running from 1 to N. So if these variables were independent, you would just be adding the variances. Since they are potentially dependent, you have to also keep track of the covariances. And this kind of summation extends to higher and higher cumulants, essentially including more and more combinations of cumulants that you would put on that side.

And what we do with that, I guess we'll start next time around.
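The two relations just derived, that the first cumulant of the sum is the sum of the means and that its variance is the double sum over all covariances, can be checked numerically. A sketch for N = 3 correlated Gaussian variables with hypothetical parameters:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical joint PDF for N = 3 possibly dependent variables.
lam = np.array([1.0, 2.0, -0.5])
C = np.array([[1.0, 0.3, 0.2],
              [0.3, 1.0, 0.4],
              [0.2, 0.4, 1.0]])

samples = rng.multivariate_normal(lam, C, size=500_000)
X = samples.sum(axis=1)              # each row gives one instance of the sum

# First cumulant of the sum = sum of the individual means.
assert abs(X.mean() - lam.sum()) < 0.05

# Second cumulant of the sum = sum over ALL pairs (i, j) of covariances,
# not just the diagonal variances, since the variables are dependent.
assert abs(X.var() - C.sum()) < 0.2
print(X.mean(), X.var())
```

Setting the off-diagonal entries of C to zero reduces the second check to the familiar "variances add" rule for independent variables.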