1 00:00:00,060 --> 00:00:01,780 The following content is provided 2 00:00:01,780 --> 00:00:04,019 under a Creative Commons license. 3 00:00:04,019 --> 00:00:06,870 Your support will help MIT OpenCourseWare continue 4 00:00:06,870 --> 00:00:10,730 to offer high quality educational resources for free. 5 00:00:10,730 --> 00:00:13,330 To make a donation or view additional materials 6 00:00:13,330 --> 00:00:17,217 from hundreds of MIT courses, visit MIT OpenCourseWare 7 00:00:17,217 --> 00:00:17,842 at ocw.mit.edu. 8 00:00:20,870 --> 00:00:25,050 PROFESSOR: We established that, essentially, what we want to do 9 00:00:25,050 --> 00:00:29,500 is to describe the properties of a system that 10 00:00:29,500 --> 00:00:30,253 is in equilibrium. 11 00:00:33,340 --> 00:00:39,060 And a system in equilibrium is characterized 12 00:00:39,060 --> 00:00:41,920 by a certain number of parameters. 13 00:00:41,920 --> 00:00:48,090 We discussed displacement and forces 14 00:00:48,090 --> 00:00:52,200 that are used for mechanical properties. 15 00:00:52,200 --> 00:00:56,570 We described how when systems are in thermal equilibrium, 16 00:00:56,570 --> 00:01:01,800 the exchange of heat requires that there is temperature 17 00:01:01,800 --> 00:01:04,590 that will be the same between them. 18 00:01:04,590 --> 00:01:07,000 So that was where the Zeroth Law came and told us 19 00:01:07,000 --> 00:01:11,410 that there is another function of state. 20 00:01:11,410 --> 00:01:15,040 Then, we saw that, from the First Law, 21 00:01:15,040 --> 00:01:18,020 there was energy, which is another important function 22 00:01:18,020 --> 00:01:19,180 of state. 23 00:01:19,180 --> 00:01:26,445 And from the Second Law, we arrived at entropy. 
24 00:01:29,920 --> 00:01:33,000 And then by manipulating these, we 25 00:01:33,000 --> 00:01:37,100 generated a whole set of other functions, free energy, 26 00:01:37,100 --> 00:01:41,830 enthalpy, Gibbs free energy, the grand potential, and the list 27 00:01:41,830 --> 00:01:43,850 goes on. 28 00:01:43,850 --> 00:01:46,670 And when the system is in equilibrium, 29 00:01:46,670 --> 00:01:49,920 it has well-defined values of these quantities. 30 00:01:49,920 --> 00:01:52,670 You go from one equilibrium to another equilibrium, 31 00:01:52,670 --> 00:01:54,740 and these quantities change. 32 00:01:54,740 --> 00:01:57,790 But of course, we saw that the number of degrees of freedom 33 00:01:57,790 --> 00:02:00,700 that you have to describe the system 34 00:02:00,700 --> 00:02:08,030 is indicated through looking at the changes in energy, which 35 00:02:08,030 --> 00:02:10,740 if you were only doing mechanical work, 36 00:02:10,740 --> 00:02:15,620 you would write as a sum over all possible ways of introducing 37 00:02:15,620 --> 00:02:17,755 mechanical work into the system. 38 00:02:20,296 --> 00:02:21,670 Then, we saw that it was actually 39 00:02:21,670 --> 00:02:24,330 useful to separate out the chemical work. 40 00:02:24,330 --> 00:02:26,410 So we could also write this as a sum 41 00:02:26,410 --> 00:02:31,040 over alpha of chemical potential times number of particles. 42 00:02:31,040 --> 00:02:34,800 But there were also ways of changing 43 00:02:34,800 --> 00:02:40,270 the energy of the system through addition of heat. 44 00:02:40,270 --> 00:02:44,040 And so ultimately, we saw that if there 45 00:02:44,040 --> 00:02:49,530 were n ways of doing chemical and mechanical work, 46 00:02:49,530 --> 00:02:56,400 and one way of introducing heat into the system, essentially 47 00:02:56,400 --> 00:03:00,440 n plus 1 variables are sufficient to determine 48 00:03:00,440 --> 00:03:02,430 where you are in this phase space.
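As a compact restatement of the energy bookkeeping just described, the differential can be sketched as below. The transcript does not show the board, so the symbols are notational assumptions: J_i and x_i for the generalized forces and displacements, mu_alpha and N_alpha for the chemical potentials and particle numbers.

```latex
dE \;=\; \sum_i J_i \, dx_i \;+\; \sum_\alpha \mu_\alpha \, dN_\alpha \;+\; T \, dS
```

If the first two sums contain n terms in total, the single heat term T dS brings the count of independent variables to n + 1.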
49 00:03:02,430 --> 00:03:05,370 Once you have n plus 1 of that list, 50 00:03:05,370 --> 00:03:08,990 you can, in principle, determine the others 51 00:03:08,990 --> 00:03:11,380 as long as you have not chosen things 52 00:03:11,380 --> 00:03:13,470 that are really dependent on each other. 53 00:03:13,470 --> 00:03:16,060 So you have to choose independent ones, 54 00:03:16,060 --> 00:03:19,650 and we had some discussion of how that comes into play. 55 00:03:22,190 --> 00:03:26,080 So I said that today we will briefly 56 00:03:26,080 --> 00:03:29,140 conclude with the last, or the Third Law. 57 00:03:33,580 --> 00:03:35,790 This is the statement about trying 58 00:03:35,790 --> 00:03:39,440 to calculate the behavior of entropy 59 00:03:39,440 --> 00:03:41,880 as a function of temperature. 60 00:03:41,880 --> 00:03:46,650 And in principle, you can imagine 61 00:03:46,650 --> 00:03:51,545 as a function of some coordinate of your system-- capital X 62 00:03:51,545 --> 00:03:55,520 could indicate pressure, volume, anything. 63 00:03:55,520 --> 00:03:58,590 You calculate that at some particular value 64 00:03:58,590 --> 00:04:03,620 of temperature, T, the difference in entropy 65 00:04:03,620 --> 00:04:07,255 that you would have between two points 66 00:04:07,255 --> 00:04:08,895 parametrized by X1 and X2. 67 00:04:11,590 --> 00:04:15,220 And in principle, what you need to do 68 00:04:15,220 --> 00:04:21,149 is to find some kind of a path for changing parameters 69 00:04:21,149 --> 00:04:28,340 from X1 to X2 and calculate, in a reversible process, how 70 00:04:28,340 --> 00:04:31,370 much heat you have to put into the system. 71 00:04:31,370 --> 00:04:35,955 Let's say at this fixed temperature, T, divide by T. T 72 00:04:35,955 --> 00:04:43,350 is not changing along the process from say X1 to X2.
73 00:04:43,350 --> 00:04:46,680 And this would be a difference between the entropy 74 00:04:46,680 --> 00:04:51,490 that you would have between these two 75 00:04:51,490 --> 00:04:56,540 quantities, between these two points. 76 00:04:56,540 --> 00:04:59,690 You could, in principle, then repeat this process 77 00:04:59,690 --> 00:05:06,540 at some lower temperature and keep going all the way down 78 00:05:06,540 --> 00:05:09,530 to 0 temperature. 79 00:05:09,530 --> 00:05:20,120 What Nernst observed was that as he went through this procedure 80 00:05:20,120 --> 00:05:28,250 to lower and lower temperatures, this difference-- let's 81 00:05:28,250 --> 00:05:35,840 call it delta S of T going from X1 to X2-- goes to 0. 82 00:05:42,320 --> 00:05:48,580 So it looks like, certainly at this temperature, 83 00:05:48,580 --> 00:05:52,270 there is a change in entropy going from one to another. 84 00:05:52,270 --> 00:05:53,680 There's also a change. 85 00:05:53,680 --> 00:05:56,530 This change gets smaller and smaller 86 00:05:56,530 --> 00:05:59,330 as if, when you get to 0 temperature, 87 00:05:59,330 --> 00:06:03,240 the value of your entropy is independent of X. Whatever 88 00:06:03,240 --> 00:06:06,600 X you choose, you'll have the same value of entropy. 89 00:06:09,390 --> 00:06:16,030 Now, that led, after a while, to a more ambitious version 90 00:06:16,030 --> 00:06:19,520 of the statement of the Third Law that I will write down, 91 00:06:19,520 --> 00:06:31,515 which is that the entropy of all substances at the zero 92 00:06:31,515 --> 00:06:44,680 of thermodynamic temperature is the same and can be set to 0. 93 00:06:44,680 --> 00:06:52,500 Same universal constant, set to 0. 94 00:06:56,200 --> 00:06:59,810 In principle, through these integrations 95 00:06:59,810 --> 00:07:01,940 from one point to another point, the only thing 96 00:07:01,940 --> 00:07:07,520 that you can calculate is the difference between entropies.
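The isothermal entropy difference and the two versions of the Third Law described above can be written out as follows. This is a sketch in assumed notation (the board is not part of the transcript); the symbol for the reversible heat increment is a choice made here.

```latex
\Delta S(T;\, X_1 \to X_2) \;=\; \int_{X_1}^{X_2} \frac{dQ_{\mathrm{rev}}}{T}
\qquad \text{(fixed } T\text{)}

% Nernst's observation:
\lim_{T \to 0} \Delta S(T;\, X_1 \to X_2) \;=\; 0

% Stronger statement, for all substances and all X:
\lim_{T \to 0} S(T, X) \;=\; 0
```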
97 00:07:07,520 --> 00:07:10,920 And essentially, this suggests that the difference 98 00:07:10,920 --> 00:07:14,610 between entropies goes to 0, but let's be more ambitious 99 00:07:14,610 --> 00:07:18,000 and say that even if you look at different substances 100 00:07:18,000 --> 00:07:26,240 and you go to 0 temperature, all of them have a unique value. 101 00:07:26,240 --> 00:07:30,920 And so there's more evidence for being 102 00:07:30,920 --> 00:07:35,830 able to do this for different substances via what 103 00:07:35,830 --> 00:07:41,930 are called allotropic states. 104 00:07:44,830 --> 00:07:50,450 So for example, some materials can exist potentially 105 00:07:50,450 --> 00:07:53,440 in two different crystalline states that 106 00:07:53,440 --> 00:07:58,390 are called allotropes, for example, sulfur 107 00:07:58,390 --> 00:08:01,200 as a function of temperature. 108 00:08:01,200 --> 00:08:10,130 If you lower its temperature very slowly, 109 00:08:10,130 --> 00:08:19,740 it stays in some form all the way down to 0 temperature. 110 00:08:19,740 --> 00:08:24,480 So if you change its temperature rapidly, it stays in one form 111 00:08:24,480 --> 00:08:29,760 all the way to 0 temperature in crystalline structure 112 00:08:29,760 --> 00:08:32,559 that is called monoclinic. 113 00:08:32,559 --> 00:08:35,500 If you cool it very, very slowly, 114 00:08:35,500 --> 00:08:41,110 there is a temperature around 40 degrees Celsius 115 00:08:41,110 --> 00:08:45,860 at which it makes a transition to a different crystal 116 00:08:45,860 --> 00:08:48,140 structure. 117 00:08:48,140 --> 00:08:49,390 That is rhombohedral. 118 00:08:53,300 --> 00:08:57,116 And the thing that I am plotting here, 119 00:08:57,116 --> 00:08:59,240 as a function of temperature, is the heat capacity. 120 00:09:06,460 --> 00:09:11,320 And so if you are, let's say, around room temperature, 121 00:09:11,320 --> 00:09:16,910 in principle you can say there are two different forms of sulfur.
122 00:09:16,910 --> 00:09:20,670 One of them is truly stable, and the other is metastable. 123 00:09:20,670 --> 00:09:25,950 That is, in principle, if you wait sufficiently long, 124 00:09:25,950 --> 00:09:27,700 of the order of [? centuries ?], 125 00:09:27,700 --> 00:09:31,900 you can get the transition from this form to the stable form. 126 00:09:31,900 --> 00:09:34,770 But for our purposes, at room temperature, 127 00:09:34,770 --> 00:09:37,470 you would say that at the scale of times 128 00:09:37,470 --> 00:09:40,260 that I'm observing things, there are these two possible states 129 00:09:40,260 --> 00:09:45,560 that are both equilibrium states of the same substance. 130 00:09:45,560 --> 00:09:47,650 Now using these two equilibrium states, 131 00:09:47,650 --> 00:09:52,780 I can start to test this Nernst theorem generalized 132 00:09:52,780 --> 00:09:54,910 to different substances. 133 00:09:54,910 --> 00:09:57,500 If you, again, regard these two different things 134 00:09:57,500 --> 00:09:59,440 as different substances, 135 00:09:59,440 --> 00:10:03,580 you could say that if I want to calculate the entropy just 136 00:10:03,580 --> 00:10:08,270 slightly above the transition, I can come from two paths. 137 00:10:08,270 --> 00:10:12,610 I can either come from path number one. 138 00:10:12,610 --> 00:10:15,930 Along path number one, I would say 139 00:10:15,930 --> 00:10:22,200 that the entropy at this Tc plus is obtained 140 00:10:22,200 --> 00:10:27,520 by integrating the heat capacity, 141 00:10:27,520 --> 00:10:39,020 so integral dT Cx of T divided by T. 142 00:10:39,020 --> 00:10:42,760 This combination is none other than dQ. 143 00:10:42,760 --> 00:10:47,180 Basically, the combination of heat capacity dT 144 00:10:47,180 --> 00:10:49,240 is the amount of heat that you have 145 00:10:49,240 --> 00:10:52,040 to put into the substance to change its temperature. 146 00:10:52,040 --> 00:10:58,100 And you do this all the way from 0 to Tc plus.
147 00:10:58,100 --> 00:11:00,130 Let's say we go along this path that 148 00:11:00,130 --> 00:11:05,290 corresponds to this monoclinic way. 149 00:11:05,290 --> 00:11:11,260 And I'm using this Cm that corresponds to this as opposed 150 00:11:11,260 --> 00:11:15,400 to [? Cr ?] that corresponds to this. 151 00:11:15,400 --> 00:11:19,850 Another thing that I can do-- and I made a mistake 152 00:11:19,850 --> 00:11:24,050 because what I really need to do is to, in principle, 153 00:11:24,050 --> 00:11:27,710 add to this some entropy that I would 154 00:11:27,710 --> 00:11:32,140 assign to this green state at 0 because this is the difference. 155 00:11:32,140 --> 00:11:36,270 So this is the entropy that I would 156 00:11:36,270 --> 00:11:41,080 assign to the monoclinic state at T close to 0. 157 00:11:41,080 --> 00:11:44,740 Going along the orange path, I would 158 00:11:44,740 --> 00:11:49,820 say that S evaluated at Tc plus is 159 00:11:49,820 --> 00:11:54,300 obtained by integrating from 0. 160 00:11:54,300 --> 00:11:59,110 Let's say to Tc minus dT, the heat 161 00:11:59,110 --> 00:12:01,730 capacity of this rhombic phase. 162 00:12:06,580 --> 00:12:10,160 But when I get to just below the transition, 163 00:12:10,160 --> 00:12:13,650 I want to go to just above the transition. 164 00:12:13,650 --> 00:12:18,080 I have to actually put in a certain amount of latent heat. 165 00:12:18,080 --> 00:12:25,130 So here I have to add latent heat L, always 166 00:12:25,130 --> 00:12:29,120 at the temperature Tc, to gradually make the substance 167 00:12:29,120 --> 00:12:31,650 transition from one to the other. 168 00:12:31,650 --> 00:12:35,470 So I have to add here L of Tc. 169 00:12:35,470 --> 00:12:40,650 This would be the integration of dQ, 170 00:12:40,650 --> 00:12:46,160 but then I would have to add the entropy that I would assign 171 00:12:46,160 --> 00:12:49,910 to the orange state at 0 temperature.
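The two routes to the entropy just above the transition can be sketched as one equation. The subscripts m (monoclinic) and r (rhombic) are assumed labels for the two heat capacities. The latent heat L is absorbed at the fixed temperature T_c, so on the reading that entropy is heat divided by temperature, it contributes L / T_c.

```latex
S(T_c^{+})
\;=\; S_m(0) + \int_0^{T_c} dT \, \frac{C_m(T)}{T}
\;=\; S_r(0) + \int_0^{T_c} dT \, \frac{C_r(T)}{T} + \frac{L}{T_c}
```

If measurements show that the two integral expressions agree without the constants, then S_m(0) = S_r(0), which is the statement being tested.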
172 00:12:49,910 --> 00:12:53,040 So this is something that you can do experimentally. 173 00:12:53,040 --> 00:12:55,930 You can evaluate these integrals, and what you'll find 174 00:12:55,930 --> 00:12:59,970 is that these two things are the same. 175 00:12:59,970 --> 00:13:03,250 So this is yet another justification 176 00:13:03,250 --> 00:13:09,210 of this entropy being independent 177 00:13:09,210 --> 00:13:13,000 of where you start at 0 temperature. 178 00:13:13,000 --> 00:13:16,030 Again at this point, if you like, 179 00:13:16,030 --> 00:13:18,100 you can by [INAUDIBLE] state that this 180 00:13:18,100 --> 00:13:21,215 is 0 for everything will start with 0. 181 00:13:27,730 --> 00:13:35,130 So this is a supposed new law of thermodynamics. 182 00:13:35,130 --> 00:13:36,000 Is it useful? 183 00:13:36,000 --> 00:13:38,770 What can we deduce from that? 184 00:13:38,770 --> 00:13:40,830 So let's look at the consequences. 185 00:13:46,280 --> 00:13:48,740 First thing is so what I have established 186 00:13:48,740 --> 00:13:54,890 is that the limit as T goes to 0 of S, 187 00:13:54,890 --> 00:13:57,440 irrespective of whatever set of parameters 188 00:13:57,440 --> 00:14:02,920 I have-- so I pick T as one of my n plus one coordinates, 189 00:14:02,920 --> 00:14:06,490 and I put some other bunch of coordinates here. 190 00:14:06,490 --> 00:14:08,820 I take the limit of this going to 0. 191 00:14:08,820 --> 00:14:09,870 This becomes 0. 192 00:14:13,410 --> 00:14:17,790 So that means, almost by construction, 193 00:14:17,790 --> 00:14:24,540 that if I take the derivative of S with respect to any of these 194 00:14:24,540 --> 00:14:30,650 coordinates-- if I take then the limit as T goes to 0, 195 00:14:30,650 --> 00:14:33,790 this would be fixed T. This is 0. 196 00:14:39,690 --> 00:14:40,190 Fine. 197 00:14:40,190 --> 00:14:44,220 So basically, this is another way 198 00:14:44,220 --> 00:14:49,290 of stating that entropy differences go to 0.
199 00:14:49,290 --> 00:14:52,890 But it does have a consequence because one thing that you will 200 00:14:52,890 --> 00:14:55,797 frequently measure are quantities, 201 00:14:55,797 --> 00:14:56,630 such as expansivity. 202 00:15:01,160 --> 00:15:03,700 What do I mean by that? 203 00:15:03,700 --> 00:15:05,050 Let's pick a displacement. 204 00:15:05,050 --> 00:15:07,650 Could be the length of a wire. 205 00:15:07,650 --> 00:15:11,210 Could be the volume of a gas. 206 00:15:11,210 --> 00:15:17,860 And we can ask if I were to change temperature, 207 00:15:17,860 --> 00:15:21,030 how does that quantity change? 208 00:15:21,030 --> 00:15:26,230 So these are quantities typically called alpha. 209 00:15:26,230 --> 00:15:29,020 Actually, usually you would also divide 210 00:15:29,020 --> 00:15:35,115 by x to make them intensive because otherwise x 211 00:15:35,115 --> 00:15:36,910 being extensive, the whole quantity 212 00:15:36,910 --> 00:15:39,960 would have been extensive. 213 00:15:39,960 --> 00:15:44,840 Let's say we do this at fixed corresponding force. 214 00:15:44,840 --> 00:15:48,430 So something that is very relevant 215 00:15:48,430 --> 00:15:51,550 is you take a volume of gas, change its temperature 216 00:15:51,550 --> 00:15:54,350 at fixed pressure, and the volume of the gas 217 00:15:54,350 --> 00:15:59,070 will shrink or expand according to this expansivity. 218 00:15:59,070 --> 00:16:04,970 Now, this can be related to this through a Maxwell relation. 219 00:16:04,970 --> 00:16:07,620 So let's see what I have to do. 220 00:16:07,620 --> 00:16:13,520 I have that dE is something like Jdx plus, 221 00:16:13,520 --> 00:16:17,990 according to what I have over there, TdS. 222 00:16:17,990 --> 00:16:21,940 I want to be able to write a Maxwell relation that 223 00:16:21,940 --> 00:16:25,420 relates a derivative of x. 224 00:16:25,420 --> 00:16:28,510 So I want to make x into a first derivative. 225 00:16:28,510 --> 00:16:31,640 So I look at E minus Jx.
226 00:16:31,640 --> 00:16:36,240 And this Jdx becomes minus xdJ. 227 00:16:36,240 --> 00:16:40,070 But I want to take a derivative of x with respect not 228 00:16:40,070 --> 00:16:44,590 to S, but with respect to T. So I'll do that. 229 00:16:44,590 --> 00:16:48,130 This becomes a minus SdT. 230 00:16:48,130 --> 00:16:52,070 So now, I immediately see that I will have a Maxwell 231 00:16:52,070 --> 00:16:58,110 relation that says dx by dT at constant J 232 00:16:58,110 --> 00:17:02,850 is the same thing as dS by dJ at constant T. 233 00:17:02,850 --> 00:17:09,321 So this is the same thing by the Maxwell relation as dS 234 00:17:09,321 --> 00:17:23,000 by dJ at constant T. All right? 235 00:17:23,000 --> 00:17:27,599 This is one of these quantities, therefore, as T goes to 0, 236 00:17:27,599 --> 00:17:28,590 this goes to 0. 237 00:17:28,590 --> 00:17:33,370 And therefore, the expansivity should go to 0. 238 00:17:33,370 --> 00:17:38,700 So any quantity that measures expansion, contraction, 239 00:17:38,700 --> 00:17:42,070 or some other change as a function of temperature, 240 00:17:42,070 --> 00:17:45,230 according to this law, as you go to 0 temperature, 241 00:17:45,230 --> 00:17:46,350 should go to 0. 242 00:17:49,780 --> 00:17:54,100 There's one other quantity that also goes to 0, 243 00:17:54,100 --> 00:17:57,810 and that's the heat capacity. 244 00:17:57,810 --> 00:18:04,160 So if I want to calculate the difference between entropy 245 00:18:04,160 --> 00:18:10,820 at some temperature T and the entropy 246 00:18:10,820 --> 00:18:15,570 at 0 along some particular path corresponding 247 00:18:15,570 --> 00:18:18,810 to some constant x for example, you 248 00:18:18,810 --> 00:18:20,640 would say that what I need to do is 249 00:18:20,640 --> 00:18:27,310 to integrate from 0 to T the heat that I have 250 00:18:27,310 --> 00:18:31,370 to put into the system at constant x.
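Before the heat-capacity argument continues, the Maxwell-relation step for the expansivity can be written out compactly. This is a sketch for a single force-displacement pair (J, x); the subscript on alpha is an assumed notation.

```latex
dE = J\,dx + T\,dS
\;\;\Longrightarrow\;\;
d\,(E - Jx - TS) = -x\,dJ - S\,dT

% Equality of mixed second derivatives of E - Jx - TS:
\left.\frac{\partial x}{\partial T}\right|_{J}
\;=\;
\left.\frac{\partial S}{\partial J}\right|_{T}

% Hence the expansivity vanishes with the entropy derivative:
\alpha_J \;=\; \frac{1}{x}\left.\frac{\partial x}{\partial T}\right|_{J}
\;=\; \frac{1}{x}\left.\frac{\partial S}{\partial J}\right|_{T}
\;\xrightarrow[\;T \to 0\;]{}\; 0
```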
251 00:18:31,370 --> 00:18:34,700 And so if I do that slowly enough, 252 00:18:34,700 --> 00:18:38,720 this heat I can write as CxdT. 253 00:18:38,720 --> 00:18:41,370 Cx, potentially, is a function of T. 254 00:18:41,370 --> 00:18:44,750 Actually, since I'm indicating T as the other point 255 00:18:44,750 --> 00:18:48,710 of integration, let me call the variable of integration T 256 00:18:48,710 --> 00:18:50,610 prime. 257 00:18:50,610 --> 00:18:56,620 So I take a path in which I change temperature. 258 00:18:56,620 --> 00:19:00,320 I calculate the heat capacity at constant x. 259 00:19:00,320 --> 00:19:02,560 Integrate it. 260 00:19:02,560 --> 00:19:06,434 Multiply by dT to convert it to T, and get the result. 261 00:19:09,220 --> 00:19:13,380 So all of these results that we have been formulating 262 00:19:13,380 --> 00:19:18,630 suggest that the result that you would get as a function of T, 263 00:19:18,630 --> 00:19:27,600 for entropy, is something that as T goes to 0, approaches 0. 264 00:19:27,600 --> 00:19:31,640 So it should be a perfectly nice, well-defined value 265 00:19:31,640 --> 00:19:35,210 at any finite temperature. 266 00:19:35,210 --> 00:19:38,170 Now, if you integrate a constant divided 267 00:19:38,170 --> 00:19:42,420 by T, times dT, then essentially the constant 268 00:19:42,420 --> 00:19:44,020 would give you a logarithm. 269 00:19:44,020 --> 00:19:48,100 And the logarithm would blow up as we go to 0 temperature. 270 00:19:48,100 --> 00:19:52,240 So the only way that this integral does not 271 00:19:52,240 --> 00:20:01,590 blow up on you-- so this is finite only if the limit as T 272 00:20:01,590 --> 00:20:08,390 goes to 0 of the heat capacity also goes to 0. 273 00:20:08,390 --> 00:20:12,550 So any heat capacity should also essentially vanish 274 00:20:12,550 --> 00:20:14,890 as you go to lower and lower temperature.
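The divergence argument above can be checked numerically. The sketch below is only an illustration: the two model heat capacities (a constant, and one proportional to T cubed), the units, and the function names are all assumptions, not taken from the lecture. It estimates the integral of C(T')/T' dT' from a small lower cutoff up to T = 1 and shows how each result behaves as the cutoff shrinks.

```python
import math


def entropy_change(heat_capacity, t_low, t_high, steps=100_000):
    """Midpoint-rule estimate of the integral of C(T')/T' dT' from t_low to t_high.

    The substitution u = ln T' turns dT'/T' into du, so a uniform grid in u
    resolves the region near T' = 0 well.
    """
    u_low, u_high = math.log(t_low), math.log(t_high)
    du = (u_high - u_low) / steps
    total = 0.0
    for i in range(steps):
        t_mid = math.exp(u_low + (i + 0.5) * du)
        total += heat_capacity(t_mid) * du
    return total


def constant_c(t):
    """Hypothetical heat capacity that stays constant down to T = 0."""
    return 1.0


def cubic_c(t):
    """Hypothetical heat capacity that vanishes as T**3."""
    return t ** 3


if __name__ == "__main__":
    for t_low in (1e-2, 1e-4, 1e-8):
        s_const = entropy_change(constant_c, t_low, 1.0)
        s_cubic = entropy_change(cubic_c, t_low, 1.0)
        print(f"cutoff={t_low:g}  constant C: {s_const:.4f}  cubic C: {s_cubic:.6f}")
```

With the constant heat capacity the estimate grows like ln(1/cutoff) without bound, while the vanishing heat capacity gives a value that settles near 1/3, which is the finite behavior the Third Law requires.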
275 00:20:14,890 --> 00:20:17,940 This is something that you will see many, many times when 276 00:20:17,940 --> 00:20:20,880 you look at different heat capacities 277 00:20:20,880 --> 00:20:22,490 in the rest of the course. 278 00:20:25,030 --> 00:20:27,150 There is one other aspect of this 279 00:20:27,150 --> 00:20:30,900 that I will not really explain, but you can go and look 280 00:20:30,900 --> 00:20:36,410 at the notes or elsewhere, which is that another consequence is 281 00:20:36,410 --> 00:20:47,222 unattainability of T equals to 0 by any finite set 282 00:20:47,222 --> 00:20:47,805 of operations. 283 00:20:55,900 --> 00:21:00,130 Essentially, if you want to get to 0 temperature, 284 00:21:00,130 --> 00:21:03,840 you'll have to do something that cools you step by step. 285 00:21:03,840 --> 00:21:05,880 And the steps become smaller and smaller, 286 00:21:05,880 --> 00:21:08,680 and you have to repeat that many times. 287 00:21:08,680 --> 00:21:13,290 But that is another consequence. 288 00:21:13,290 --> 00:21:15,055 We'll leave that for the time being. 289 00:21:18,780 --> 00:21:25,430 I would like to, however, end by discussing some distinctions 290 00:21:25,430 --> 00:21:28,350 that are between these different laws. 291 00:21:31,390 --> 00:21:36,140 So if you think about whatever could 292 00:21:36,140 --> 00:21:42,210 be the microscopic origin, after all, I 293 00:21:42,210 --> 00:21:46,240 have emphasized that thermodynamics 294 00:21:46,240 --> 00:21:49,960 is a set of rules that you look at substances as black boxes 295 00:21:49,960 --> 00:21:53,550 and you try to deduce a certain number of things 296 00:21:53,550 --> 00:21:58,210 based on observations, such as what Nernst did over here. 297 00:21:58,210 --> 00:22:00,730 But you say, these black boxes, I 298 00:22:00,730 --> 00:22:03,750 know what is inside them in principle. 
299 00:22:03,750 --> 00:22:07,820 It's composed of atoms, molecules, light, quarks, 300 00:22:07,820 --> 00:22:09,670 whatever the microscopic theory is 301 00:22:09,670 --> 00:22:14,380 that you want to assign to the components of that box. 302 00:22:14,380 --> 00:22:16,530 And I know the dynamics that governs 303 00:22:16,530 --> 00:22:18,770 these microscopic degrees of freedom. 304 00:22:18,770 --> 00:22:22,740 I should be able to get the laws of thermodynamics 305 00:22:22,740 --> 00:22:27,000 starting from the microscopic laws. 306 00:22:27,000 --> 00:22:30,170 Eventually, we will do that, and as we do that, 307 00:22:30,170 --> 00:22:34,560 we will find the origin of these different laws. 308 00:22:34,560 --> 00:22:39,670 Now, you won't be surprised that the First Law is intimately 309 00:22:39,670 --> 00:22:44,390 connected to the fact that any microscopic set of rules 310 00:22:44,390 --> 00:22:48,120 that you write down embodies the conservation of energy. 311 00:22:52,570 --> 00:22:56,780 And all you have to make sure is to understand 312 00:22:56,780 --> 00:23:01,750 precisely what heat is as a form of energy. 313 00:23:01,750 --> 00:23:07,060 And then if we regard heat as another form of energy, 314 00:23:07,060 --> 00:23:10,250 another component, it's really the conservation law 315 00:23:10,250 --> 00:23:10,870 that we have. 316 00:23:14,630 --> 00:23:17,190 Then, you have the Zeroth Law and the Second Law. 317 00:23:21,650 --> 00:23:25,530 The Zeroth Law and Second Law have to do with equilibrium 318 00:23:25,530 --> 00:23:28,980 and being able to go in some particular direction. 319 00:23:28,980 --> 00:23:35,170 And that always runs afoul of the microscopic laws of motion 320 00:23:35,170 --> 00:23:39,370 that are typically things that are time reversible whereas 321 00:23:39,370 --> 00:23:42,360 the Zeroth Law and Second Law are not.
322 00:23:42,360 --> 00:23:46,390 And what we will see later on, through statistical mechanics, 323 00:23:46,390 --> 00:23:49,630 is that the origin of these laws is 324 00:23:49,630 --> 00:24:00,180 that we are dealing with large numbers of degrees of freedom. 325 00:24:03,760 --> 00:24:09,580 And once we adopt the proper perspective 326 00:24:09,580 --> 00:24:12,960 to looking at properties of large numbers of degrees 327 00:24:12,960 --> 00:24:15,330 of freedom, which we will start 328 00:24:15,330 --> 00:24:19,000 to do the elements of that [? prescription ?] today, 329 00:24:19,000 --> 00:24:20,870 the Zeroth Law and Second Law emerge. 330 00:24:23,990 --> 00:24:32,790 Now the Third Law, you all know that once we 331 00:24:32,790 --> 00:24:37,590 go through this process, eventually for example, 332 00:24:37,590 --> 00:24:41,560 we get things for the description of entropy, which 333 00:24:41,560 --> 00:24:44,970 is related to some number of states 334 00:24:44,970 --> 00:24:48,740 that the system has indicated by g. 335 00:24:48,740 --> 00:24:55,880 And if you then want to have S going to 0, 336 00:24:55,880 --> 00:24:59,690 you would require that g goes to something 337 00:24:59,690 --> 00:25:04,600 that is order of 1-- of 1 if you like-- as T goes to 0. 338 00:25:08,120 --> 00:25:10,520 And typically, you would say that systems 339 00:25:10,520 --> 00:25:16,890 adopt their ground state, lowest energy state, at 0 temperature. 340 00:25:16,890 --> 00:25:19,340 And so this is somewhat a statement 341 00:25:19,340 --> 00:25:23,470 about the uniqueness of the state of all possible systems 342 00:25:23,470 --> 00:25:26,120 at low temperature.
343 00:25:26,120 --> 00:25:31,420 Now, if you think about the gas in this room, 344 00:25:31,420 --> 00:25:36,520 and let's imagine that the particles of this gas 345 00:25:36,520 --> 00:25:40,830 either don't interact, which is maybe a little bit unrealistic, 346 00:25:40,830 --> 00:25:42,640 but maybe repel each other. 347 00:25:42,640 --> 00:25:44,810 So let's say you have a bunch of particles 348 00:25:44,810 --> 00:25:46,910 that just repel each other. 349 00:25:46,910 --> 00:25:51,180 Then, there is really no reason why, 350 00:25:51,180 --> 00:25:54,850 as I go to lower and lower temperatures, 351 00:25:54,850 --> 00:25:59,930 the number of configurations of the molecules should decrease. 352 00:25:59,930 --> 00:26:03,180 All configurations that I draw in which they don't overlap 353 00:26:03,180 --> 00:26:05,900 have roughly the same energy. 354 00:26:05,900 --> 00:26:11,250 And indeed, if I look at say any one of these properties, 355 00:26:11,250 --> 00:26:15,330 like the expansivity of a gas at constant pressure 356 00:26:15,330 --> 00:26:19,230 which is given in fact with a minus sign. 357 00:26:19,230 --> 00:26:23,850 dV by dT at constant pressure would be the analog of one 358 00:26:23,850 --> 00:26:25,820 of these expansivities. 359 00:26:25,820 --> 00:26:32,290 If I use the Ideal Gas Law-- So for ideal gas, 360 00:26:32,290 --> 00:26:35,530 we've seen that PV is proportional to let's 361 00:26:35,530 --> 00:26:38,480 say some temperature. 362 00:26:38,480 --> 00:26:42,290 Then, dV by dT at constant pressure 363 00:26:42,290 --> 00:26:46,390 is none other than V over T. So this would give me 364 00:26:46,390 --> 00:26:59,270 1 over V, V over T. Probably don't need it on this. 365 00:26:59,270 --> 00:27:03,000 This is going to give me 1 over T.
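As a quick numerical check of the 1 over T result, the sketch below estimates the expansivity (1/V) dV/dT at constant pressure by a central finite difference. The function names, the choice of units, and the proportionality constant `n_k` are illustrative assumptions.

```python
def ideal_gas_volume(temperature, pressure=1.0, n_k=1.0):
    """Volume from PV = n_k * T; n_k stands in for N times Boltzmann's constant (assumed units)."""
    return n_k * temperature / pressure


def expansivity(volume_fn, temperature, rel_step=1e-6):
    """Central-difference estimate of (1/V) * dV/dT at fixed pressure."""
    h = rel_step * temperature
    dv_dt = (volume_fn(temperature + h) - volume_fn(temperature - h)) / (2.0 * h)
    return dv_dt / volume_fn(temperature)


if __name__ == "__main__":
    for t in (100.0, 1.0, 0.01):
        print(f"T={t:g}  alpha={expansivity(ideal_gas_volume, t):.6g}  1/T={1.0 / t:.6g}")
```

The estimate tracks 1/T, so it grows without bound as T decreases instead of vanishing.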
366 00:27:03,000 --> 00:27:07,410 So not only doesn't it go to 0 at 0 temperature, 367 00:27:07,410 --> 00:27:10,960 if the Ideal Gas Law was satisfied, 368 00:27:10,960 --> 00:27:13,140 the expansivity would actually diverge 369 00:27:13,140 --> 00:27:17,590 at 0 temperature as different as you want. 370 00:27:17,590 --> 00:27:21,760 So clearly the Ideal Gas Law, if it was applicable 371 00:27:21,760 --> 00:27:24,210 all the way down to 0 temperature, 372 00:27:24,210 --> 00:27:27,330 would violate the Third Law of thermodynamics. 373 00:27:27,330 --> 00:27:30,560 Again, not surprising given that I have told you 374 00:27:30,560 --> 00:27:33,770 that a gas of classical particles with repulsion 375 00:27:33,770 --> 00:27:36,880 has many states. 376 00:27:36,880 --> 00:27:39,850 Now, we will see later on in the course 377 00:27:39,850 --> 00:27:48,220 that once we include quantum mechanics, then as you 378 00:27:48,220 --> 00:27:50,910 go to 0 temperature, these particles 379 00:27:50,910 --> 00:27:54,472 will have a unique state. 380 00:27:54,472 --> 00:27:59,177 If they are bosons, they will be together in one wave function. 381 00:27:59,177 --> 00:28:01,260 If they are fermions, they will arrange themselves 382 00:28:01,260 --> 00:28:05,190 appropriately so that, because of quantum mechanics, 383 00:28:05,190 --> 00:28:08,530 all of these laws would certainly break down at T equals 384 00:28:08,530 --> 00:28:09,750 to 0. 385 00:28:09,750 --> 00:28:13,080 You will get 0 entropy, and you would get consistency 386 00:28:13,080 --> 00:28:16,010 with all of these things. 387 00:28:16,010 --> 00:28:18,650 So somehow, the nature of the Third Law 388 00:28:18,650 --> 00:28:22,470 is different from the other laws because its validity 389 00:28:22,470 --> 00:28:28,365 rests on being able to be living in a world 390 00:28:28,365 --> 00:28:31,010 where quantum mechanics applies.
391 00:28:31,010 --> 00:28:34,350 So in principle, you could have imagined some other universe 392 00:28:34,350 --> 00:28:36,700 where h-bar equals to 0, and then 393 00:28:36,700 --> 00:28:39,580 the Third Law of thermodynamics would not hold there 394 00:28:39,580 --> 00:28:41,990 whereas the Zeroth Law and Second Law would hold. 395 00:28:41,990 --> 00:28:42,839 Yes? 396 00:28:42,839 --> 00:28:45,616 AUDIENCE: Are there any known exceptions to the Third Law? 397 00:28:45,616 --> 00:28:47,240 Are we going to [? account for them? ?] 398 00:28:52,330 --> 00:28:55,160 PROFESSOR: For equilibrium-- So this 399 00:28:55,160 --> 00:28:59,010 is actually an interesting question. 400 00:28:59,010 --> 00:29:03,050 What do I know about-- classically, 401 00:29:03,050 --> 00:29:07,550 I can certainly come up with lots of examples that violate. 402 00:29:07,550 --> 00:29:11,320 So your question then amounts to, if I say that quantum mechanics is 403 00:29:11,320 --> 00:29:15,130 necessary, do I know that the ground 404 00:29:15,130 --> 00:29:18,790 state of a quantum mechanical system is unique. 405 00:29:18,790 --> 00:29:22,830 And I don't know of a proof of that for an interacting system. 406 00:29:22,830 --> 00:29:27,380 I don't know of a case where it's violated, but as far as I know, 407 00:29:27,380 --> 00:29:32,240 there is no proof that if I give you an interacting Hamiltonian 408 00:29:32,240 --> 00:29:36,550 for a quantum system, there's a unique ground state. 409 00:29:36,550 --> 00:29:39,110 And I should say that there'd be no-- 410 00:29:39,110 --> 00:29:42,350 and I'm sure you know of cases where the ground state is not 411 00:29:42,350 --> 00:29:44,880 unique like a ferromagnet.
412 00:29:44,880 --> 00:29:50,040 But the point is not that g should be exactly one, 413 00:29:50,040 --> 00:29:54,460 but that the limit of log g divided 414 00:29:54,460 --> 00:29:56,460 by the number of degrees of freedom 415 00:29:56,460 --> 00:30:02,280 that you have should go to 0 as n goes to infinity. 416 00:30:02,280 --> 00:30:07,620 So something like a ferromagnet may have many ground states, 417 00:30:07,620 --> 00:30:09,590 but the number of ground states is not 418 00:30:09,590 --> 00:30:13,440 proportional to the number of sites, the number of spins, 419 00:30:13,440 --> 00:30:15,770 and this entity will go to 0. 420 00:30:15,770 --> 00:30:19,330 So in all the cases that we know, the ground state 421 00:30:19,330 --> 00:30:24,100 is either unique or is order of one. 422 00:30:24,100 --> 00:30:27,300 But I don't know a theorem that says that should be the case. 423 00:30:34,600 --> 00:30:36,550 So this is the last thing that I wanted 424 00:30:36,550 --> 00:30:39,100 to say about thermodynamics. 425 00:30:39,100 --> 00:30:40,910 Are there any questions in general? 426 00:30:45,850 --> 00:30:50,170 So I laid out the necessity of having 427 00:30:50,170 --> 00:30:56,240 some kind of a description of microscopic degrees of freedom 428 00:30:56,240 --> 00:31:00,680 that ultimately will allow us to prove 429 00:31:00,680 --> 00:31:03,140 the laws of thermodynamics. 430 00:31:03,140 --> 00:31:08,570 And that will come through statistical mechanics, which 431 00:31:08,570 --> 00:31:12,410 as the name implies, has to have a certain amount 432 00:31:12,410 --> 00:31:16,010 of statistical character to it. 433 00:31:16,010 --> 00:31:18,230 What does that mean? 434 00:31:18,230 --> 00:31:22,240 It means that you have to abandon a description of motion 435 00:31:22,240 --> 00:31:26,360 that is fully deterministic for one 436 00:31:26,360 --> 00:31:27,725 that is based on probability.
437 00:31:30,860 --> 00:31:34,760 Now, I could have told you first the degrees of freedom 438 00:31:34,760 --> 00:31:37,090 and what is the description that we 439 00:31:37,090 --> 00:31:40,610 need for them to be probabilistic, 440 00:31:40,610 --> 00:31:43,740 but I find it more useful to first lay out 441 00:31:43,740 --> 00:31:45,720 what the language of probability is 442 00:31:45,720 --> 00:31:48,915 that we will be using and then bring 443 00:31:48,915 --> 00:31:52,200 in the description of the microscopic degrees of freedom 444 00:31:52,200 --> 00:31:55,690 within this language. 445 00:31:55,690 --> 00:32:06,590 So if we go first with definitions-- 446 00:32:06,590 --> 00:32:12,610 and you could, for example, go to the branch of mathematics 447 00:32:12,610 --> 00:32:16,290 that deals with probability, and you will encounter something 448 00:32:16,290 --> 00:32:21,970 like this that what probability describes is a random variable. 449 00:32:27,050 --> 00:32:36,470 Let's call it X, which has a number of possible outcomes, 450 00:32:36,470 --> 00:32:49,740 which we put together into a set of outcomes, S. 451 00:32:49,740 --> 00:33:00,410 And this set can be discrete as would be the case if you were 452 00:33:00,410 --> 00:33:05,030 tossing a coin, and the outcomes would 453 00:33:05,030 --> 00:33:11,680 be either a head or a tail, or we were throwing a dice, 454 00:33:11,680 --> 00:33:15,215 and the outcomes would be the faces 1 through 6. 455 00:33:19,050 --> 00:33:23,180 And we will encounter mostly actually cases 456 00:33:23,180 --> 00:33:25,150 where S is continuous. 457 00:33:28,610 --> 00:33:34,300 Like for example, if I want to describe the velocity of a gas 458 00:33:34,300 --> 00:33:39,950 particle in this room, I need to specify the three components 459 00:33:39,950 --> 00:33:44,530 of velocity that can be anywhere, let's say, 460 00:33:44,530 --> 00:33:46,210 in the range of real numbers. 
461 00:33:51,130 --> 00:34:10,929 And again, mathematicians would say that to each event, 462 00:34:10,929 --> 00:34:20,080 which is a subset of possible outcomes, 463 00:34:20,080 --> 00:34:30,480 is assigned a value which must satisfy the following 464 00:34:30,480 --> 00:34:30,980 properties. 465 00:34:35,360 --> 00:34:41,210 First thing is the probability of anything 466 00:34:41,210 --> 00:34:44,440 is a positive number. 467 00:34:44,440 --> 00:34:46,763 And so this is positivity. 468 00:34:55,100 --> 00:34:57,550 The second thing is additivity. 469 00:35:01,360 --> 00:35:08,180 That is the probability of two events, A or B, 470 00:35:08,180 --> 00:35:13,360 is the sum total of the probabilities 471 00:35:13,360 --> 00:35:17,395 if A and B are disjoint or distinct. 472 00:35:23,389 --> 00:35:24,930 And finally, there's a normalization. 473 00:35:31,110 --> 00:35:33,980 That if your event is that something should happen, 474 00:35:33,980 --> 00:35:38,888 the entire set, the probability that you assign to that is 1. 475 00:35:42,310 --> 00:35:45,100 So these are formal statements. 476 00:35:45,100 --> 00:35:49,200 And if you are a mathematician, you start from there, 477 00:35:49,200 --> 00:35:52,020 and you prove theorems. 478 00:35:52,020 --> 00:35:59,900 But from our perspective, the first question to ask 479 00:35:59,900 --> 00:36:08,960 is how to determine this quantity, the probability 480 00:36:08,960 --> 00:36:12,070 that something should happen. 481 00:36:12,070 --> 00:36:18,670 If it is useful and I want to do something real world about it, 482 00:36:18,670 --> 00:36:23,340 I should be able to measure it or assign values to it. 483 00:36:23,340 --> 00:36:28,290 And very roughly again, in theory, 484 00:36:28,290 --> 00:36:34,240 we can assign probabilities in two different ways. 485 00:36:34,240 --> 00:36:36,940 One way is called objective.
486 00:36:41,010 --> 00:36:45,300 And from the perspective of us as physicists 487 00:36:45,300 --> 00:36:48,390 corresponds to what would be an experimental procedure. 488 00:36:51,700 --> 00:37:06,020 And if it is assigning p of e as the frequency of outcomes 489 00:37:06,020 --> 00:37:16,400 in large number of trials, i.e. you 490 00:37:16,400 --> 00:37:22,770 would say that the probability that event A is obtained 491 00:37:22,770 --> 00:37:27,940 is the number of times you would get outcome A divided 492 00:37:27,940 --> 00:37:34,040 by the total number of trials as n goes to infinity. 493 00:37:34,040 --> 00:37:40,790 So for example, if you want to assign a probability that when 494 00:37:40,790 --> 00:37:48,120 you throw a dice that face 1 comes up, what you could do 495 00:37:48,120 --> 00:37:52,590 is you could make a table of the number of times 496 00:37:52,590 --> 00:37:57,690 1 shows up divided by the number of times you throw the dice. 497 00:37:57,690 --> 00:38:04,050 Maybe you throw it 100 times, and you get 15. 498 00:38:04,050 --> 00:38:08,767 You throw it 200 times, and you get-- 499 00:38:08,767 --> 00:38:09,850 that is probably too much. 500 00:38:09,850 --> 00:38:14,340 Let's say 15-- you get 35. 501 00:38:14,340 --> 00:38:19,765 And you do it 300 times, and you get something close to 48. 502 00:38:19,765 --> 00:38:24,640 The ratio of these things, as the number 503 00:38:24,640 --> 00:38:27,330 gets larger and larger, hopefully 504 00:38:27,330 --> 00:38:30,130 will converge to something that you would call the probability. 505 00:38:32,770 --> 00:38:38,380 Now, it turns out that in statistical physics, 506 00:38:38,380 --> 00:38:42,660 we will assign things through a totally different procedure 507 00:38:42,660 --> 00:38:46,330 which is subjective. 508 00:38:46,330 --> 00:38:51,140 If you like, it's more theoretical, 509 00:38:51,140 --> 00:39:07,020 which is based on uncertainty among all outcomes. 
510 00:39:11,340 --> 00:39:16,224 Because if I were to subjectively assign a probability 511 00:39:16,224 --> 00:39:21,740 to throwing the dice and coming up with a value of 1, 512 00:39:21,740 --> 00:39:28,010 I would say, well, there's six possible faces for the dice. 513 00:39:28,010 --> 00:39:30,900 I don't know anything about this dice being loaded, 514 00:39:30,900 --> 00:39:34,750 so I will say they are all equally likely. 515 00:39:34,750 --> 00:39:37,570 Now, that may or may not be a correct assumption. 516 00:39:37,570 --> 00:39:38,520 You could test it. 517 00:39:38,520 --> 00:39:40,180 You could maybe throw it many times. 518 00:39:40,180 --> 00:39:43,250 You will find that either the dice is loaded or not loaded 519 00:39:43,250 --> 00:39:45,050 and this is correct or not. 520 00:39:45,050 --> 00:39:48,830 But you begin by making this assumption. 521 00:39:48,830 --> 00:39:52,620 And this is actually, we will see later on, exactly 522 00:39:52,620 --> 00:39:54,890 the type of assumption that you would 523 00:39:54,890 --> 00:39:57,210 be making in statistical physics. 524 00:40:05,979 --> 00:40:07,520 Any question about these definitions? 525 00:40:10,980 --> 00:40:17,020 So let's again proceed slowly to get 526 00:40:17,020 --> 00:40:25,430 some definitions established by looking at one random variable. 527 00:40:25,430 --> 00:40:31,890 So this is the next section on one random variable. 528 00:40:34,770 --> 00:40:38,770 And I will assume that I'll look at the case 529 00:40:38,770 --> 00:40:40,670 of the continuous random variable. 530 00:40:40,670 --> 00:40:49,270 So x can be any real number minus infinity to infinity. 531 00:40:49,270 --> 00:40:52,170 Now, a number of definitions.
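The two ways of assigning a probability to the dice example can be put side by side in a short simulation. This is only a sketch: the function name `estimate_face_probability`, the fixed seed, and the use of a simulated fair dice are illustrative assumptions, not part of the lecture.

```python
import random


def estimate_face_probability(face, n_throws, seed=0):
    """Objective (frequency) estimate N_face / N for a simulated fair six-sided dice."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = sum(1 for _ in range(n_throws) if rng.randint(1, 6) == face)
    return hits / n_throws


# Subjective assignment: six faces and no reason to prefer any of them.
SUBJECTIVE_PROBABILITY = 1 / 6

if __name__ == "__main__":
    # The frequency estimate should drift toward 1/6 as the number of throws grows.
    for n in (100, 200, 300, 100_000):
        print(n, estimate_face_probability(1, n), SUBJECTIVE_PROBABILITY)
```

For a small number of throws the ratio fluctuates noticeably, which is why the objective definition is stated as a limit of many trials.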
532 00:40:52,170 --> 00:40:59,260 I will use the term Cumulative-- make 533 00:40:59,260 --> 00:41:25,420 sure I'll use the-- Cumulative Probability Function, CPF, 534 00:41:25,420 --> 00:41:28,690 that for this one random variable, 535 00:41:28,690 --> 00:41:32,460 I will indicate by capital P of x. 536 00:41:35,070 --> 00:41:37,710 And the meaning of this is that capital P 537 00:41:37,710 --> 00:41:46,700 of x is the probability of outcome less than x. 538 00:41:54,130 --> 00:42:00,390 So generically, we say that x can 539 00:42:00,390 --> 00:42:04,020 take all values along the real line. 540 00:42:04,020 --> 00:42:05,820 And there is this function that I 541 00:42:05,820 --> 00:42:12,710 want to plot that I will call big P of x Now big P 542 00:42:12,710 --> 00:42:16,290 of x is a probability, therefore, 543 00:42:16,290 --> 00:42:20,030 it has to be positive according to the first item that we 544 00:42:20,030 --> 00:42:21,840 have over there. 545 00:42:21,840 --> 00:42:27,020 And it will be less than 1 because the net probability 546 00:42:27,020 --> 00:42:30,440 for everything toward here is equal to 1. 547 00:42:30,440 --> 00:42:33,320 So asymptotically, where I go all the way 548 00:42:33,320 --> 00:42:35,450 to infinity, the probability that I 549 00:42:35,450 --> 00:42:41,590 will get some number along the line-- I have to get something, 550 00:42:41,590 --> 00:42:45,530 so it should automatically go to 1 here. 551 00:42:45,530 --> 00:42:51,450 And every element of probability is positive, 552 00:42:51,450 --> 00:42:55,590 so it's a function that should gradually go down. 553 00:42:55,590 --> 00:42:58,454 And presumably, it will behave something 554 00:42:58,454 --> 00:42:59,370 like this generically. 555 00:43:06,170 --> 00:43:10,110 Once we have the Cumulative Probability Function, 556 00:43:10,110 --> 00:43:24,530 we can immediately construct the Probability Density Function, 557 00:43:24,530 --> 00:43:33,350 PDF, which is the derivative of the above. 
558 00:43:33,350 --> 00:43:42,290 Small p of x is the derivative of big P of x with respect to x. 559 00:43:42,290 --> 00:43:48,730 And so if I just take here the curve that I have above 560 00:43:48,730 --> 00:43:51,850 and take its derivative, the derivative 561 00:43:51,850 --> 00:43:54,360 will look something like this. 562 00:44:01,470 --> 00:44:07,340 Essentially, clearly by the definition of the derivative, 563 00:44:07,340 --> 00:44:13,260 this quantity is the probability 564 00:44:13,260 --> 00:44:23,120 of outcome in the interval x to x plus dx 565 00:44:23,120 --> 00:44:25,274 divided by the size of the interval dx. 566 00:44:29,050 --> 00:44:34,680 A couple of things to remind you of. One of them 567 00:44:34,680 --> 00:44:39,690 is that the Cumulative Probability is a probability. 568 00:44:39,690 --> 00:44:43,540 It's a dimensionless number between 0 and 1. 569 00:44:43,540 --> 00:44:48,440 Probability Density is obtained by taking a derivative, 570 00:44:48,440 --> 00:44:54,570 so it has dimensions that are inverse of whatever this x is. 571 00:44:54,570 --> 00:44:59,270 So if I change my variable from meters to centimeters, 572 00:44:59,270 --> 00:45:01,570 let's say, the value of this function 573 00:45:01,570 --> 00:45:04,530 would change by a factor of 100. 574 00:45:04,530 --> 00:45:09,490 And secondly, while the Probability Density 575 00:45:09,490 --> 00:45:14,680 is positive, its value is not bounded. 576 00:45:14,680 --> 00:45:16,648 It can be anywhere that you like. 577 00:45:23,490 --> 00:45:27,050 One other, again, minor definition 578 00:45:27,050 --> 00:45:28,600 is expectation value. 579 00:45:34,810 --> 00:45:39,990 So I can pick some function of x. 580 00:45:39,990 --> 00:45:42,000 This could be x itself. 581 00:45:42,000 --> 00:45:43,970 It could be x squared. 582 00:45:43,970 --> 00:45:49,810 It could be sine x, x cubed minus x squared.
583 00:45:49,810 --> 00:45:52,880 The expectation value of this is defined 584 00:45:52,880 --> 00:45:58,310 by integrating the Probability Density 585 00:45:58,310 --> 00:46:00,906 against the value of the function. 586 00:46:06,620 --> 00:46:15,260 So essentially, what that says is 587 00:46:15,260 --> 00:46:30,440 that if I pick some function of x-- function 588 00:46:30,440 --> 00:46:33,160 can be positive, negative, et cetera. 589 00:46:33,160 --> 00:46:42,640 So maybe I have a function such as this-- then the value 590 00:46:42,640 --> 00:46:44,430 of x is random. 591 00:46:44,430 --> 00:46:47,460 If x is in this interval, this would 592 00:46:47,460 --> 00:46:50,870 be the corresponding contribution to f of x. 593 00:46:50,870 --> 00:46:55,605 And I have to look at all possible values of x. 594 00:47:00,100 --> 00:47:02,710 Question? 595 00:47:02,710 --> 00:47:06,935 Now, very associated to this is a change of variables. 596 00:47:14,660 --> 00:47:23,040 You would say that if x is random, then f of x is random. 597 00:47:23,040 --> 00:47:26,420 So if I ask you what is the value of x squared, 598 00:47:26,420 --> 00:47:29,780 and for one random variable, I get this. 599 00:47:29,780 --> 00:47:32,220 The value of x squared would be this. 600 00:47:32,220 --> 00:47:34,660 If I get this, the value of x squared 601 00:47:34,660 --> 00:47:37,160 would be something else. 602 00:47:37,160 --> 00:47:41,035 So if x is random, f of x is itself a random variable. 603 00:47:45,210 --> 00:47:52,510 So f of x is a random variable, and you 604 00:47:52,510 --> 00:47:55,220 can ask what is the probability, let's 605 00:47:55,220 --> 00:48:00,000 say, the Probability Density Function that I would associate 606 00:48:00,000 --> 00:48:02,040 with the value of this. 607 00:48:02,040 --> 00:48:05,320 Let's say what's the probability that I will find it 608 00:48:05,320 --> 00:48:10,710 in the interval between small f and small f plus df. 
609 00:48:10,710 --> 00:48:12,916 This will be p sub f of f times df. 610 00:48:17,052 --> 00:48:21,950 You would say that the probability that I would find 611 00:48:21,950 --> 00:48:27,560 the value of the function that is in this interval 612 00:48:27,560 --> 00:48:33,190 corresponds to finding a value of x that is in this interval. 613 00:48:33,190 --> 00:48:36,020 So what I can do, the probability 614 00:48:36,020 --> 00:48:38,990 that I find the value of f in this interval, 615 00:48:38,990 --> 00:48:43,260 according to what I have here, is the Probability Density 616 00:48:43,260 --> 00:48:44,912 multiplied by df. 617 00:48:44,912 --> 00:48:45,745 Is there a question? 618 00:48:48,250 --> 00:48:50,790 No. 619 00:48:50,790 --> 00:48:53,380 So the probability that I'm in this interval 620 00:48:53,380 --> 00:48:56,520 translates to the probability that I'm in this interval. 621 00:48:56,520 --> 00:49:01,965 So that's probability p of x dx. 622 00:49:04,650 --> 00:49:07,500 But that's boring. 623 00:49:07,500 --> 00:49:09,920 I want to look at the situation maybe where 624 00:49:09,920 --> 00:49:11,565 the function is something like this. 625 00:49:15,350 --> 00:49:19,730 Then, you say that f is in this interval provided 626 00:49:19,730 --> 00:49:27,220 that x is either here or here or here. 627 00:49:27,220 --> 00:49:33,964 So what I really need to do is to solve 628 00:49:33,964 --> 00:49:40,200 F of x equals f for x. 629 00:49:40,200 --> 00:49:43,010 And maybe there will be solutions 630 00:49:43,010 --> 00:49:48,030 that will be x1, x2, x3, et cetera. 631 00:49:48,030 --> 00:49:50,960 And what I need to do is to sum over 632 00:49:50,960 --> 00:49:53,850 the contributions of all of those solutions. 633 00:49:57,420 --> 00:49:58,830 So here, it's three solutions.
634 00:50:01,560 --> 00:50:08,010 Then, you would say the Probability Density 635 00:50:08,010 --> 00:50:15,440 is the sum over i of p of xi times dxi by df, 636 00:50:15,440 --> 00:50:17,460 which is really the inverse of the slopes. 637 00:50:17,460 --> 00:50:21,050 The slopes translate the size of this interval 638 00:50:21,050 --> 00:50:23,010 to the size of that interval. 639 00:50:23,010 --> 00:50:25,770 You can see that here, the slope is very sharp. 640 00:50:25,770 --> 00:50:28,190 The size of this interval is small. 641 00:50:28,190 --> 00:50:31,450 It could be wider accordingly, so I 642 00:50:31,450 --> 00:50:34,937 need to multiply by dxi by df. 643 00:50:39,160 --> 00:50:43,940 So I have to multiply by dx by df evaluated at xi. 644 00:50:43,940 --> 00:50:47,165 That's essentially the inverse of the derivative of f. 645 00:50:50,430 --> 00:50:57,070 Now, sometimes, it is easy to forget these things 646 00:50:57,070 --> 00:50:59,300 that I write over here. 647 00:50:59,300 --> 00:51:01,640 And you would say, well obviously, 648 00:51:01,640 --> 00:51:06,000 the probability is something that is positive. 649 00:51:06,000 --> 00:51:09,890 But without being careful, it is easy to violate 650 00:51:09,890 --> 00:51:11,620 such a basic condition. 651 00:51:11,620 --> 00:51:13,020 And I violated it here. 652 00:51:16,740 --> 00:51:19,160 Anybody see where I violated it? 653 00:51:22,120 --> 00:51:24,710 Yeah, the slope here is positive. 654 00:51:24,710 --> 00:51:26,060 The slope here is positive. 655 00:51:26,060 --> 00:51:28,240 The slope here is negative. 656 00:51:28,240 --> 00:51:33,130 So I am subtracting a probability here. 657 00:51:33,130 --> 00:51:34,920 So what I really should do-- it really 658 00:51:34,920 --> 00:51:37,490 doesn't matter whether the slope is this way or that way.
659 00:51:37,490 --> 00:51:39,590 I will pick up the same interval, 660 00:51:39,590 --> 00:51:44,144 so make sure you don't forget the absolute values that 661 00:51:44,144 --> 00:51:45,596 go accordingly. 662 00:51:45,596 --> 00:51:48,210 So this is the standard way that you 663 00:51:48,210 --> 00:51:50,480 would make change of variables. 664 00:51:50,480 --> 00:51:52,162 Yes? 665 00:51:52,162 --> 00:51:52,828 AUDIENCE: Sorry. 666 00:51:52,828 --> 00:51:56,701 In the center of that board, on the second line, it says Pf. 667 00:51:56,701 --> 00:51:58,084 Is that an x or a times? 668 00:52:00,715 --> 00:52:02,340 PROFESSOR: In the center of this board? 669 00:52:02,340 --> 00:52:03,678 This one? 670 00:52:03,678 --> 00:52:06,140 AUDIENCE: Yeah. 671 00:52:06,140 --> 00:52:10,460 PROFESSOR: So the value of the function 672 00:52:10,460 --> 00:52:12,930 is a random variable, right? 673 00:52:12,930 --> 00:52:14,540 It can come up to be here. 674 00:52:14,540 --> 00:52:16,390 It can come up to be here. 675 00:52:16,390 --> 00:52:21,720 And so there is, as any other one parameter random variable, 676 00:52:21,720 --> 00:52:25,210 a Probability Density associated with that. 677 00:52:25,210 --> 00:52:30,350 That Probability Density I have called P of f 678 00:52:30,350 --> 00:52:32,440 to indicate that it is the variable 679 00:52:32,440 --> 00:52:34,910 f that I'm considering as opposed 680 00:52:34,910 --> 00:52:37,370 to what I wrote originally that was 681 00:52:37,370 --> 00:52:39,810 associated with the value of x. 682 00:52:39,810 --> 00:52:42,282 AUDIENCE: But what you have written on the left-hand side, 683 00:52:42,282 --> 00:52:44,750 it looks like your x [? is random. ?] 684 00:52:44,750 --> 00:52:47,160 PROFESSOR: Oh, this was supposed to be a multiplication 685 00:52:47,160 --> 00:52:48,110 sign, so sorry. 686 00:52:48,110 --> 00:52:49,635 AUDIENCE: Thank you. 687 00:52:49,635 --> 00:52:50,510 PROFESSOR: Thank you. 688 00:52:56,445 --> 00:52:56,945 Yes? 
689 00:52:56,945 --> 00:53:01,900 AUDIENCE: CP-- that function, is this [INAUDIBLE]? 690 00:53:01,900 --> 00:53:04,020 PROFESSOR: Yes. 691 00:53:04,020 --> 00:53:07,670 So you're asking whether this-- so I constructed something, 692 00:53:07,670 --> 00:53:11,660 and my statement is that the integral from minus infinity 693 00:53:11,660 --> 00:53:21,100 to infinity df Pf of f better be one which is the normalization. 694 00:53:21,100 --> 00:53:24,420 So if you're asking about this, essentially, 695 00:53:24,420 --> 00:53:30,020 you would say the integral dx p of x 696 00:53:30,020 --> 00:53:34,200 is the integral dx dP by dx, right? 697 00:53:34,200 --> 00:53:36,710 That was the definition p of x. 698 00:53:36,710 --> 00:53:38,930 And the integral of the derivative 699 00:53:38,930 --> 00:53:43,830 is the value of the function evaluated at its two extremes. 700 00:53:43,830 --> 00:53:47,360 And this is one minus 0. 701 00:53:47,360 --> 00:53:51,845 So by construction, it is, of course, normalized 702 00:53:51,845 --> 00:53:54,100 in this fashion. 703 00:53:54,100 --> 00:53:55,310 Is that what you were asking? 704 00:53:55,310 --> 00:53:58,226 AUDIENCE: I was asking about the first possibility 705 00:53:58,226 --> 00:54:00,970 of cumulative probability function. 706 00:54:00,970 --> 00:54:04,220 PROFESSOR: So the cumulative probability, 707 00:54:04,220 --> 00:54:09,540 its constraint is that the limit as its variable 708 00:54:09,540 --> 00:54:14,030 goes to infinity, it should go to 1. 709 00:54:14,030 --> 00:54:15,650 That's the normalization. 710 00:54:15,650 --> 00:54:19,660 The normalization here is that the probability 711 00:54:19,660 --> 00:54:23,300 of the entire set is 1. 712 00:54:23,300 --> 00:54:26,000 Cumulative adds the probabilities 713 00:54:26,000 --> 00:54:29,270 to be anywhere up to point x. 714 00:54:29,270 --> 00:54:32,160 So I have achieved being anywhere on the line 715 00:54:32,160 --> 00:54:35,700 by going through this point. 
716 00:54:35,700 --> 00:54:41,477 But certainly, the integral of big P of x dx is not equal to 1 717 00:54:41,477 --> 00:54:42,685 if that's what you're asking. 718 00:54:45,370 --> 00:54:48,494 The integral of small p of x is 1. 719 00:54:51,973 --> 00:54:53,464 Yes? 720 00:54:53,464 --> 00:54:56,446 AUDIENCE: Are we assuming the function is invertible? 721 00:55:00,430 --> 00:55:03,380 PROFESSOR: Well, rigorously speaking, this function 722 00:55:03,380 --> 00:55:08,090 is not invertible because for a value of f, 723 00:55:08,090 --> 00:55:10,810 there are three possible values of x. 724 00:55:10,810 --> 00:55:14,150 So the inverse is not a function, but you can certainly 725 00:55:14,150 --> 00:55:18,070 solve F of x equals f to find particular values. 726 00:55:27,630 --> 00:55:31,520 So again, maybe it is useful to work 727 00:55:31,520 --> 00:55:33,230 through one example of this. 728 00:55:37,320 --> 00:55:44,220 So let's say that you have a probability density that 729 00:55:44,220 --> 00:55:50,890 is of the form e to the minus lambda absolute value of x. 730 00:55:50,890 --> 00:55:58,550 So as a function of x, the Probability Density 731 00:55:58,550 --> 00:56:04,430 falls off exponentially on both sides. 732 00:56:04,430 --> 00:56:08,340 And again, I have to ensure that when I integrate this from minus infinity 733 00:56:08,340 --> 00:56:11,280 to infinity, I will get one. 734 00:56:11,280 --> 00:56:14,090 The integral from 0 to infinity is 735 00:56:14,090 --> 00:56:16,580 1 over lambda, from minus infinity 736 00:56:16,580 --> 00:56:19,420 to zero by symmetry is 1 over lambda. 737 00:56:19,420 --> 00:56:26,940 So it's really I have to divide by 2 lambda-- no, multiply by lambda over 2. 738 00:56:26,940 --> 00:56:27,440 Sorry. 739 00:56:34,680 --> 00:56:39,650 Now, suppose I change variables to F, which is x squared. 740 00:56:42,350 --> 00:56:46,840 So I want to know what the probability is 741 00:56:46,840 --> 00:56:53,070 for a particular value of x squared that I will call f.
742 00:56:53,070 --> 00:56:57,720 So then what I have to do is to solve this. 743 00:56:57,720 --> 00:57:03,920 And this will give me x is plus or minus square root of small f. 744 00:57:03,920 --> 00:57:10,440 If I ask for what f of-- for what value of x, x squared 745 00:57:10,440 --> 00:57:15,500 equals f, then I have these two solutions. 746 00:57:15,500 --> 00:57:18,670 So according to the formula that I have, 747 00:57:18,670 --> 00:57:22,480 I have to, first of all, evaluate this 748 00:57:22,480 --> 00:57:26,540 at these two possible roots. 749 00:57:26,540 --> 00:57:33,160 In both cases, I will get minus lambda square root of f in the exponent. 750 00:57:33,160 --> 00:57:35,610 Because of the absolute value, both of them 751 00:57:35,610 --> 00:57:38,290 will give you the same thing. 752 00:57:38,290 --> 00:57:43,350 And then I have to look at this derivative. 753 00:57:43,350 --> 00:57:52,394 So if I look at this, I can see that df by dx equals 2x. 754 00:57:52,394 --> 00:57:57,180 The locations that I have to evaluate are at plus minus 755 00:57:57,180 --> 00:57:58,800 square root of f. 756 00:57:58,800 --> 00:58:04,470 So the value of the slope is plus or minus 2 square root of f. 757 00:58:04,470 --> 00:58:07,840 And according to that formula, what I have to do 758 00:58:07,840 --> 00:58:09,600 is to put the inverse of that. 759 00:58:09,600 --> 00:58:13,680 So I have to put for one solution, 1 over 2 square root 760 00:58:13,680 --> 00:58:15,460 of f. 761 00:58:15,460 --> 00:58:19,920 For the other one, I have to put 1 over minus 2 square root 762 00:58:19,920 --> 00:58:24,560 of f, which would be a disaster if I didn't convert this 763 00:58:24,560 --> 00:58:27,650 to an absolute value. 764 00:58:27,650 --> 00:58:30,900 And if I did convert that to an absolute value, 765 00:58:30,900 --> 00:58:36,322 what I would get is lambda over 2 square root of f 766 00:58:36,322 --> 00:58:41,000 e to the minus lambda root f.
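The result of this worked example can be checked numerically. The sketch below samples x from the normalized density (lambda over 2) times e to the minus lambda absolute value of x, squares it, and compares the fraction of samples with x squared below a threshold against the integral of the transformed density from 0 to that threshold, which evaluates in closed form to 1 minus e to the minus lambda root a. The function names, sample size, and seed are illustrative assumptions.

```python
import math
import random


def sample_x(lam, rng):
    """Draw x from p(x) = (lam / 2) * exp(-lam * |x|)."""
    magnitude = rng.expovariate(lam)  # |x| is exponential with rate lam
    return magnitude if rng.random() < 0.5 else -magnitude


def density_f(f, lam):
    """Transformed density p_f(f) = lam / (2 sqrt(f)) * exp(-lam sqrt(f)) for f > 0."""
    if f <= 0:
        return 0.0  # x squared is never negative; exactly at f = 0 the density diverges
    root = math.sqrt(f)
    return lam / (2 * root) * math.exp(-lam * root)


def cumulative_f(a, lam):
    """Integral of density_f from 0 to a, in closed form: 1 - exp(-lam sqrt(a))."""
    return 1.0 - math.exp(-lam * math.sqrt(a)) if a > 0 else 0.0


def empirical_cumulative_f(a, lam, n_samples=200_000, seed=1):
    """Fraction of sampled x with x squared below a."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_samples) if sample_x(lam, rng) ** 2 < a)
    return hits / n_samples


if __name__ == "__main__":
    lam, a = 2.0, 1.0
    print(empirical_cumulative_f(a, lam), cumulative_f(a, lam))
```

Dropping the absolute value on one of the two roots would make the two contributions cancel instead of add, and the comparison above would fail.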
767 00:58:41,000 --> 00:58:46,180 It is important to note that this solution will 768 00:58:46,180 --> 00:58:51,340 exist only if f is positive. 769 00:58:51,340 --> 00:58:55,040 And there's no solution if f is negative, 770 00:58:55,040 --> 00:59:00,200 which means that if I wanted to plot a Probability 771 00:59:00,200 --> 00:59:03,770 Density for this function f, which 772 00:59:03,770 --> 00:59:08,420 is x squared as a function of f, it will only 773 00:59:08,420 --> 00:59:12,510 have values for positive values of x squared. 774 00:59:12,510 --> 00:59:16,300 There's nothing for negative values. 775 00:59:16,300 --> 00:59:19,300 For positive values, I have this function 776 00:59:19,300 --> 00:59:22,200 that exponentially decays. 777 00:59:22,200 --> 00:59:27,380 Yet at f equals 0 it diverges. 778 00:59:27,380 --> 00:59:30,770 One reason I chose that example is 779 00:59:30,770 --> 00:59:34,170 to emphasize that these Probability Density 780 00:59:34,170 --> 00:59:39,206 functions can even go all the way to infinity. 781 00:59:39,206 --> 00:59:42,880 The requirement, however, is that you 782 00:59:42,880 --> 00:59:47,790 should be able to integrate across the infinity because 783 00:59:47,790 --> 00:59:51,840 integrating across the infinity should give you a finite number 784 00:59:51,840 --> 00:59:53,660 less than 1. 785 00:59:53,660 --> 00:59:58,490 And so the type of divergence that you could have is limited. 786 00:59:58,490 --> 01:00:00,500 1 over square root of f is fine. 787 01:00:00,500 --> 01:00:02,000 1/f is not accepted. 788 01:00:06,380 --> 01:00:07,555 Yes? 789 01:00:07,555 --> 01:00:09,680 AUDIENCE: I have a doubt about [? the postulate. ?] 790 01:00:09,680 --> 01:00:16,850 It says that if you raise the value of f slowly, 791 01:00:16,850 --> 01:00:20,260 you will eventually get to-- yeah, that point right there.
792 01:00:20,260 --> 01:00:22,982 So if the prescription that we have of summing over 793 01:00:22,982 --> 01:00:27,025 the different roots, at some point, the roots, 794 01:00:27,025 --> 01:00:27,715 they converge. 795 01:00:27,715 --> 01:00:28,340 PROFESSOR: Yes. 796 01:00:28,340 --> 01:00:30,552 AUDIENCE: So at some point, we stop summing over 2 797 01:00:30,552 --> 01:00:31,956 and we start summing over 1. 798 01:00:31,956 --> 01:00:35,240 It just seems a little bit strange. 799 01:00:35,240 --> 01:00:36,070 PROFESSOR: Yeah. 800 01:00:36,070 --> 01:00:40,280 If you are up here, you have only one term in the sum. 801 01:00:40,280 --> 01:00:43,200 If you are down here, you have three terms. 802 01:00:43,200 --> 01:00:46,120 And that's really just the property of the curve 803 01:00:46,120 --> 01:00:47,870 that I have drawn. 804 01:00:47,870 --> 01:00:51,530 And so over here, I have only one root. 805 01:00:51,530 --> 01:00:53,040 Over here, I have three roots. 806 01:00:53,040 --> 01:00:54,960 And this is not surprising. 807 01:00:54,960 --> 01:00:58,190 There are many situations in mathematics or physics 808 01:00:58,190 --> 01:01:01,140 where you encounter situations where, 809 01:01:01,140 --> 01:01:04,670 as you change some parameters, new solutions, new roots, 810 01:01:04,670 --> 01:01:05,650 appear. 811 01:01:05,650 --> 01:01:11,030 And so if this was really some kind of a physical system, 812 01:01:11,030 --> 01:01:13,690 you would probably encounter some kind 813 01:01:13,690 --> 01:01:16,712 of a singularity or phase transition at this point. 814 01:01:20,960 --> 01:01:21,825 Yes? 815 01:01:21,825 --> 01:01:24,675 AUDIENCE: But how does the equation deal with that when 816 01:01:24,675 --> 01:01:27,060 [INAUDIBLE]? 817 01:01:27,060 --> 01:01:28,570 PROFESSOR: Let's see. 818 01:01:28,570 --> 01:01:34,390 So if I am approaching that point, what I find 819 01:01:34,390 --> 01:01:40,540 is that df by dx goes to 0.
820 01:01:40,540 --> 01:01:44,500 So dx by df has some kind of infinity or singularity, 821 01:01:44,500 --> 01:01:46,590 so we have to deal with that. 822 01:01:46,590 --> 01:01:49,670 If you want, we can choose a particular form 823 01:01:49,670 --> 01:01:52,140 of that function and see what happens. 824 01:01:52,140 --> 01:01:55,230 But actually, we have that already over here 825 01:01:55,230 --> 01:01:58,440 because the function f that I plotted for you 826 01:01:58,440 --> 01:02:05,740 as a function of x has this behavior 827 01:02:05,740 --> 01:02:09,035 that, for some range of f, you have two solutions. 828 01:02:14,280 --> 01:02:17,940 So for negative values of f, I have no solution. 829 01:02:17,940 --> 01:02:21,000 So this curve, after having rotated, 830 01:02:21,000 --> 01:02:24,390 is precisely an example of what is happening here. 831 01:02:24,390 --> 01:02:26,950 And you see what the consequence of that is. 832 01:02:26,950 --> 01:02:29,910 The consequence of that is that as I approach here 833 01:02:29,910 --> 01:02:33,810 and the two solutions merge, I have the singularity that 834 01:02:33,810 --> 01:02:36,679 is ultimately manifested in here. 835 01:02:47,450 --> 01:02:49,060 So in principle, yes. 836 01:02:49,060 --> 01:02:51,230 When you make these changes of variables 837 01:02:51,230 --> 01:02:55,630 and you have functions that have multiple solution 838 01:02:55,630 --> 01:02:58,840 behavior like that, you have to worry about this. 839 01:03:02,940 --> 01:03:04,065 Let me go down here. 840 01:03:07,690 --> 01:03:10,860 One other definition that, again, you've 841 01:03:10,860 --> 01:03:13,350 probably seen, before we go through something 842 01:03:13,350 --> 01:03:16,483 that I hope you haven't seen, is the moment.
843 01:03:19,430 --> 01:03:22,110 A form of this expectation value-- actually, here we 844 01:03:22,110 --> 01:03:24,950 did with x squared, but in general, we 845 01:03:24,950 --> 01:03:29,480 can calculate the expectation value of x to the m. 846 01:03:29,480 --> 01:03:37,120 And that, which is called the mth moment, is the integral from minus infinity 847 01:03:37,120 --> 01:03:41,130 to infinity dx x to the m p of x. 848 01:03:56,760 --> 01:04:00,300 Now, I expect that up to this point, 849 01:04:00,300 --> 01:04:01,940 you would have seen everything. 850 01:04:01,940 --> 01:04:06,820 But the next one maybe half of you have seen. 851 01:04:06,820 --> 01:04:10,870 And the next item, which we will use a lot, 852 01:04:10,870 --> 01:04:12,720 is the characteristic function. 853 01:04:24,860 --> 01:04:29,590 So given that I have some probability distribution 854 01:04:29,590 --> 01:04:34,220 p of x, I can calculate various expectation values. 855 01:04:34,220 --> 01:04:38,470 I calculate the expectation value of e to the minus ikx. 856 01:04:43,200 --> 01:04:47,140 This is, by the definition that you have, 857 01:04:47,140 --> 01:04:50,750 I have to integrate over the domain of x-- 858 01:04:50,750 --> 01:04:55,140 let's say from minus infinity to infinity-- p of x against e 859 01:04:55,140 --> 01:04:56,140 to the minus ikx. 860 01:05:01,550 --> 01:05:05,560 And you say, well, what's special about that? 861 01:05:05,560 --> 01:05:11,470 I know that to be the Fourier transform of p of x. 862 01:05:11,470 --> 01:05:13,220 And it is true. 863 01:05:13,220 --> 01:05:17,790 And you also know how to invert the Fourier transform.
864 01:05:17,790 --> 01:05:20,460 That is, if you know the characteristic function, which 865 01:05:20,460 --> 01:05:23,170 is another name for the Fourier transform of a probability 866 01:05:23,170 --> 01:05:26,770 distribution, you would get the p of x 867 01:05:26,770 --> 01:05:33,490 back by the integral over k divided by 2pi, 868 01:05:33,490 --> 01:05:39,460 the way that I chose the things, e to the ikx p tilde of k. 869 01:05:39,460 --> 01:05:42,600 Basically, this is the standard relationship 870 01:05:42,600 --> 01:05:45,520 between these objects. 871 01:05:45,520 --> 01:05:50,090 So this is just a Fourier transform. 872 01:05:50,090 --> 01:05:56,320 Now, something that appears a lot in statistical calculations 873 01:05:56,320 --> 01:05:58,530 and implicit in lots of things that we 874 01:05:58,530 --> 01:06:03,185 do in statistical mechanics is a generating function. 875 01:06:12,250 --> 01:06:17,880 I can take the characteristic function p tilde of k. 876 01:06:17,880 --> 01:06:21,620 It's a function of this Fourier variable, k. 877 01:06:21,620 --> 01:06:24,360 And I can do an expansion in that. 878 01:06:24,360 --> 01:06:29,000 I can do the expansion inside the expectation value 879 01:06:29,000 --> 01:06:34,460 because e to the minus ikx I can write as a sum over n running 880 01:06:34,460 --> 01:06:39,810 from 0 to infinity minus ik to the power of n divided by n 881 01:06:39,810 --> 01:06:42,540 factorial x to the nth. 882 01:06:45,330 --> 01:06:48,880 This is the expansion of the exponential. 883 01:06:48,880 --> 01:06:55,170 The variable here is x, so I can take everything else outside. 884 01:06:55,170 --> 01:07:01,040 And what I see is that if I make an expansion 885 01:07:01,040 --> 01:07:06,370 of the characteristic function, the coefficient 886 01:07:06,370 --> 01:07:11,240 of k to the n up to some trivial factor of n factorial 887 01:07:11,240 --> 01:07:14,130 will give me the nth moment.
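For reference, the spoken formulas above can be written out as follows. This is a transcription sketch using the sign and 2pi conventions stated in the lecture (e to the minus ikx in the forward transform); the angle-bracket notation for expectation values is an assumption about what was on the board.

```latex
\tilde{p}(k) = \left\langle e^{-ikx} \right\rangle
             = \int_{-\infty}^{\infty} dx \, p(x) \, e^{-ikx},
\qquad
p(x) = \int_{-\infty}^{\infty} \frac{dk}{2\pi} \, e^{ikx} \, \tilde{p}(k),

\tilde{p}(k) = \left\langle \sum_{n=0}^{\infty} \frac{(-ik)^n}{n!} x^n \right\rangle
             = \sum_{n=0}^{\infty} \frac{(-ik)^n}{n!} \left\langle x^n \right\rangle .
```

Reading off the coefficient of (-ik)^n and multiplying by n factorial gives the nth moment.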
888 01:07:14,130 --> 01:07:16,910 That is, once you have calculated the Fourier 889 01:07:16,910 --> 01:07:20,610 transform, or the characteristic function, you can expand it. 890 01:07:20,610 --> 01:07:23,430 And from that expansion, you 891 01:07:23,430 --> 01:07:26,690 can extract all the moments essentially. 892 01:07:26,690 --> 01:07:30,550 So this expansion generates for you the moments, 893 01:07:30,550 --> 01:07:34,250 hence the generating function. 894 01:07:34,250 --> 01:07:37,220 You could even do something like this. 895 01:07:37,220 --> 01:07:47,880 You could multiply p tilde of k by e to the ikx0 for some x0. 896 01:07:47,880 --> 01:07:51,510 And that would be the expectation value of e 897 01:07:51,510 --> 01:07:56,780 to the minus ik times x minus x0. 898 01:07:56,780 --> 01:08:00,450 And you can expand that, and you would 899 01:08:00,450 --> 01:08:06,600 generate all of the moments not around the origin, 900 01:08:06,600 --> 01:08:08,350 but around the point x0. 901 01:08:13,960 --> 01:08:17,830 So simple manipulations of the characteristic function 902 01:08:17,830 --> 01:08:23,840 can shift and give you another set of moments 903 01:08:23,840 --> 01:08:25,000 around different points. 904 01:08:34,800 --> 01:08:38,560 So the Fourier transform, or characteristic function, 905 01:08:38,560 --> 01:08:42,290 is the generator of moments. 906 01:08:42,290 --> 01:08:46,240 An even more important property is 907 01:08:46,240 --> 01:08:48,986 possessed by the cumulant generating function. 908 01:08:58,229 --> 01:09:07,260 So you have the characteristic function, 909 01:09:07,260 --> 01:09:08,899 the Fourier transform. 910 01:09:08,899 --> 01:09:13,609 You take its log, so another function of k. 911 01:09:13,609 --> 01:09:16,810 You start expanding this function in powers of k. 912 01:09:21,189 --> 01:09:29,370 And the coefficients of that, you call cumulants. 913 01:09:33,420 --> 01:09:38,060 So I essentially repeated the definition that I had up there.
914 01:09:38,060 --> 01:09:44,497 I took a log, and all I did is I put this subscript c 915 01:09:44,497 --> 01:09:47,729 to go from moments to cumulants. 916 01:09:47,729 --> 01:09:53,890 And also, I have to start the series from 1 as opposed to 0. 917 01:09:53,890 --> 01:10:00,960 And essentially, I can find the relationship between cumulants 918 01:10:00,960 --> 01:10:04,370 and moments by writing this as a log 919 01:10:04,370 --> 01:10:08,980 of the characteristic function, which 920 01:10:08,980 --> 01:10:14,150 is 1 plus the sum for n from 1 to infinity 921 01:10:14,150 --> 01:10:21,190 of minus ik to the n over n factorial times the nth moment. 922 01:10:21,190 --> 01:10:26,120 So inside the log, I have the moments. 923 01:10:26,120 --> 01:10:30,020 Outside the log, I have the cumulants. 924 01:10:30,020 --> 01:10:35,910 And if I have a log of 1 plus epsilon, 925 01:10:35,910 --> 01:10:41,180 I can use the expansion of this as epsilon minus 926 01:10:41,180 --> 01:10:45,380 epsilon squared over 2 plus epsilon cubed over 3 minus epsilon 927 01:10:45,380 --> 01:10:49,500 to the fourth over 4, et cetera. 928 01:10:49,500 --> 01:10:55,940 And this will enable me to then match powers of minus ik 929 01:10:55,940 --> 01:11:00,710 on the left and powers of minus ik on the right. 930 01:11:00,710 --> 01:11:03,530 You can see that the first thing that I will find 931 01:11:03,530 --> 01:11:09,600 is that the expectation value of x-- the first power, 932 01:11:09,600 --> 01:11:13,800 the first term that I have here is minus ik times the mean. 933 01:11:13,800 --> 01:11:16,630 Take the log, I will get that. 934 01:11:16,630 --> 01:11:22,230 So essentially, what I get is that the first cumulant 935 01:11:22,230 --> 01:11:26,760 on the left is the first moment that I 936 01:11:26,760 --> 01:11:29,660 will get from the expansion on the right. 937 01:11:29,660 --> 01:11:32,460 And this is, of course, called the mean of the distribution.
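Written out, the definition and the matching procedure described above look like this. It is a transcription sketch: the subscript c for cumulants follows the lecture, while the symbol epsilon standing for the whole sum inside the log is made explicit here as an assumption about the board notation.

```latex
\ln \tilde{p}(k)
  = \sum_{n=1}^{\infty} \frac{(-ik)^n}{n!} \left\langle x^n \right\rangle_c
  = \ln\!\left[ 1 + \sum_{n=1}^{\infty} \frac{(-ik)^n}{n!} \left\langle x^n \right\rangle \right],

\ln(1+\epsilon) = \epsilon - \frac{\epsilon^2}{2} + \frac{\epsilon^3}{3} - \frac{\epsilon^4}{4} + \cdots,
\qquad
\epsilon \equiv \sum_{n=1}^{\infty} \frac{(-ik)^n}{n!} \left\langle x^n \right\rangle .
```

At order (-ik) only the first term of epsilon contributes, which gives the first cumulant equal to the first moment, the mean.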
938 01:11:34,970 --> 01:11:40,680 The second cumulant, I will have two contributions, 939 01:11:40,680 --> 01:11:45,304 one from epsilon, the other from minus epsilon squared over 2. 940 01:11:45,304 --> 01:11:48,220 And if you go through that, you will 941 01:11:48,220 --> 01:11:52,680 get that it is expectation value of x squared 942 01:11:52,680 --> 01:11:57,320 minus the average of x, the mean squared, which 943 01:11:57,320 --> 01:12:01,450 is none other than the expectation value of x 944 01:12:01,450 --> 01:12:06,910 around the mean squared, which is clearly a positive quantity. 945 01:12:06,910 --> 01:12:08,410 And this is the variance. 946 01:12:14,420 --> 01:12:16,300 And you can keep going. 947 01:12:16,300 --> 01:12:26,360 The third cumulant is x cubed minus 3 average of x squared 948 01:12:26,360 --> 01:12:32,402 average of x plus 2 average of x itself cubed. 949 01:12:32,402 --> 01:12:33,915 It is called the skewness. 950 01:12:36,900 --> 01:12:40,340 I don't write the formula for the next one 951 01:12:40,340 --> 01:12:42,050 which is called the kurtosis. 952 01:12:42,050 --> 01:12:45,910 And you keep going and so forth. 953 01:12:53,390 --> 01:13:01,220 So it turns out that this hierarchy of cumulants, 954 01:13:01,220 --> 01:13:04,710 essentially, is a hierarchy of the most important things 955 01:13:04,710 --> 01:13:09,540 that you can know about a random variable. 956 01:13:09,540 --> 01:13:17,140 So if I tell you that the outcome of some experiment 957 01:13:17,140 --> 01:13:23,409 is some number x, distributed somehow-- I 958 01:13:23,409 --> 01:13:25,450 guess the first thing that you would like to know 959 01:13:25,450 --> 01:13:28,375 is whether the typical values that you get 960 01:13:28,375 --> 01:13:33,100 are of the order of 1, are of the order of a million, whatever.
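The three cumulant formulas quoted above (mean, variance, third cumulant) can be checked on a small example. The following is a minimal sketch, not something from the lecture: the function name `cumulants_from_moments` and the fair-die example are illustrative assumptions, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def cumulants_from_moments(m1, m2, m3):
    """Return the first three cumulants from the raw moments <x>, <x^2>, <x^3>."""
    c1 = m1                              # first cumulant: the mean
    c2 = m2 - m1**2                      # second cumulant: the variance
    c3 = m3 - 3 * m2 * m1 + 2 * m1**3    # third cumulant, as quoted in the lecture
    return c1, c2, c3


# Hypothetical example: a fair six-sided die, each face with probability 1/6.
faces = [1, 2, 3, 4, 5, 6]
raw_moments = [sum(Fraction(f) ** m for f in faces) / len(faces) for m in (1, 2, 3)]

mean, variance, third = cumulants_from_moments(*raw_moments)
print(mean, variance, third)
```

The die is symmetric about its mean, so the third cumulant comes out as zero, consistent with its reading as a measure of asymmetry.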
961 01:13:33,100 --> 01:13:36,380 So somehow, the mean is something 962 01:13:36,380 --> 01:13:40,510 that tells you the most important, zeroth order 963 01:13:40,510 --> 01:13:45,030 thing that you want to know about the variable. 964 01:13:45,030 --> 01:13:47,130 But the next thing that you might want to know 965 01:13:47,130 --> 01:13:50,250 is, well, what's the spread? 966 01:13:50,250 --> 01:13:52,830 How far does this thing go? 967 01:13:52,830 --> 01:13:58,090 And then the variance will tell you something about the spread. 968 01:13:58,090 --> 01:13:59,750 So the next thing that you want to know 969 01:13:59,750 --> 01:14:02,770 is maybe, given the spread, am I 970 01:14:02,770 --> 01:14:06,080 more likely to get things that are on one side or things 971 01:14:06,080 --> 01:14:08,390 that are on the other side. 972 01:14:08,390 --> 01:14:12,780 So the measure of its asymmetry, right versus left, 973 01:14:12,780 --> 01:14:16,329 is provided by the third cumulant, which is the skewness 974 01:14:16,329 --> 01:14:16,870 and so forth. 975 01:14:20,180 --> 01:14:24,380 So typically, the very first few members 976 01:14:24,380 --> 01:14:28,340 of this hierarchy of cumulants tell you 977 01:14:28,340 --> 01:14:30,985 the most important information that you 978 01:14:30,985 --> 01:14:32,110 need about the probability. 979 01:14:37,100 --> 01:14:38,860 Now, I will mention to you, and I 980 01:14:38,860 --> 01:14:44,040 guess we probably will deal with it more next time around, 981 01:14:44,040 --> 01:14:51,700 the result that is in some sense the backbone or granddaddy 982 01:14:51,700 --> 01:14:57,080 of all graphical expansions that are carrying [INAUDIBLE]. 983 01:14:57,080 --> 01:15:00,800 And that's a relationship between the moments 984 01:15:00,800 --> 01:15:04,770 and cumulants that I will express graphically.
985 01:15:04,770 --> 01:15:14,970 So this is a graphical representation 986 01:15:14,970 --> 01:15:20,860 of moments in terms of cumulants. 987 01:15:26,470 --> 01:15:29,440 Essentially, what I'm saying is that you 988 01:15:29,440 --> 01:15:33,100 can go through the procedure as I outlined. 989 01:15:33,100 --> 01:15:37,240 And if you want to calculate minus ik to the fifth power 990 01:15:37,240 --> 01:15:41,740 so that you find the description of the fifth cumulant in terms 991 01:15:41,740 --> 01:15:44,475 of the moments, you'll have to do a lot of work 992 01:15:44,475 --> 01:15:49,460 in expanding the log and powers of this object and making sure 993 01:15:49,460 --> 01:15:53,790 that you don't make any mistakes in the coefficient. 994 01:15:53,790 --> 01:15:58,570 There is a way to circumvent that graphically 995 01:15:58,570 --> 01:16:00,460 and get the relationship. 996 01:16:00,460 --> 01:16:03,710 So how do we do that? 997 01:16:03,710 --> 01:16:16,010 You'll represent the nth cumulant as, let's say, a bag of n points. 998 01:16:20,100 --> 01:16:28,640 So let's say this entity will represent the third cumulant. 999 01:16:28,640 --> 01:16:31,700 It's a bag with three points. 1000 01:16:31,700 --> 01:16:37,520 This-- one, two, three, four, five, six-- 1001 01:16:37,520 --> 01:16:39,391 will represent the sixth cumulant. 1002 01:16:42,340 --> 01:16:54,645 Then, the nth moment is the sum of all ways 1003 01:16:54,645 --> 01:17:04,680 of distributing n points amongst bags. 1004 01:17:11,360 --> 01:17:13,664 So what do I mean? 1005 01:17:13,664 --> 01:17:19,140 So I want to calculate the first moment x. 1006 01:17:19,140 --> 01:17:22,370 That would correspond to one point. 1007 01:17:22,370 --> 01:17:25,260 And really, there's only one diagram, 1008 01:17:25,260 --> 01:17:28,470 whether I put the bag around it or not, 1009 01:17:28,470 --> 01:17:31,280 that would correspond to this.
1010 01:17:31,280 --> 01:17:35,410 And that corresponds to the first cumulant, 1011 01:17:35,410 --> 01:17:39,120 basically rewriting what I had before. 1012 01:17:39,120 --> 01:17:43,010 If I want to look at the second moment, for the second moment 1013 01:17:43,010 --> 01:17:44,650 I need two points. 1014 01:17:44,650 --> 01:17:48,770 The two points I can either put in the same bag or I 1015 01:17:48,770 --> 01:17:52,180 can put into two separate bags. 1016 01:17:52,180 --> 01:17:56,510 And the first one corresponds to calculating 1017 01:17:56,510 --> 01:17:59,650 the second cumulant. 1018 01:17:59,650 --> 01:18:02,390 The second term corresponds to two ways 1019 01:18:02,390 --> 01:18:05,460 in which the first cumulant has appeared, 1020 01:18:05,460 --> 01:18:07,090 so I have the mean squared. 1021 01:18:10,010 --> 01:18:17,080 If I want to calculate the third moment, I need three dots. 1022 01:18:17,080 --> 01:18:21,400 The three dots I can either put in one bag 1023 01:18:21,400 --> 01:18:27,950 or I can take one of them out and keep two of them in a bag. 1024 01:18:27,950 --> 01:18:30,000 And here I had the choice of three things 1025 01:18:30,000 --> 01:18:32,770 that I could've pulled out. 1026 01:18:32,770 --> 01:18:38,370 Or, I could have all of them in individual bags of their own. 1027 01:18:38,370 --> 01:18:43,290 And mathematically, the first term corresponds to x cubed c. 1028 01:18:43,290 --> 01:18:46,680 The second term corresponds to three versions 1029 01:18:46,680 --> 01:18:49,460 of the variance times the mean. 1030 01:18:49,460 --> 01:18:54,040 And the last term is just the mean cubed. 1031 01:18:54,040 --> 01:18:57,740 And you can massage this expression 1032 01:18:57,740 --> 01:19:03,200 to see that I get the expression that I have for the skewness. 1033 01:19:03,200 --> 01:19:05,860 I didn't offhand remember the relationship 1034 01:19:05,860 --> 01:19:09,790 that I have to write down for the fourth cumulant.
1035 01:19:09,790 --> 01:19:12,210 But I can graphically, immediately get 1036 01:19:12,210 --> 01:19:15,140 the relationship for the fourth moment 1037 01:19:15,140 --> 01:19:20,020 in terms of the fourth cumulant which is this entity. 1038 01:19:20,020 --> 01:19:23,860 Or, four ways that I can take one out of the bag 1039 01:19:23,860 --> 01:19:29,440 and keep three in the bag, three ways in which I have 1040 01:19:29,440 --> 01:19:37,190 two bags of two, six ways in which I can have a bag of two 1041 01:19:37,190 --> 01:19:42,165 and two things that are individually apart, and one 1042 01:19:42,165 --> 01:19:44,970 way in which there are four things that 1043 01:19:44,970 --> 01:19:47,350 are independent of each other. 1044 01:19:47,350 --> 01:19:53,700 And this becomes x to the fourth c, the fourth cumulant, 1045 01:19:53,700 --> 01:19:58,800 4 times the third cumulant times the mean, 1046 01:19:58,800 --> 01:20:03,740 3 times the square of the variance, 1047 01:20:03,740 --> 01:20:10,010 6 times the variance multiplied by the mean squared, 1048 01:20:10,010 --> 01:20:15,630 and the mean raised to the fourth power. 1049 01:20:15,630 --> 01:20:16,690 And you can keep going. 1050 01:20:24,642 --> 01:20:29,130 AUDIENCE: Is the variance not squared in the third term? 1051 01:20:29,130 --> 01:20:31,030 PROFESSOR: Did I forget that? 1052 01:20:31,030 --> 01:20:32,030 Yes, thank you. 1053 01:20:42,340 --> 01:20:43,830 All right. 1054 01:20:43,830 --> 01:20:48,010 So the proof of this is really just the two-line algebra 1055 01:20:48,010 --> 01:20:51,960 exponentiating these expressions that we have over here. 1056 01:20:51,960 --> 01:20:55,710 But it's much nicer to represent that graphically. 1057 01:20:55,710 --> 01:20:59,700 And so now you can go between things very easily.
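The bag-counting rule described above can be sketched in code by enumerating every way of splitting n labeled points into bags and multiplying one cumulant per bag. This is an illustrative sketch only: the helper names `set_partitions` and `moment_from_cumulants` are hypothetical, and the Gaussian check relies on the standard fact that a Gaussian has no cumulants beyond the second, which has not yet been shown at this point in the lecture.

```python
from math import prod


def set_partitions(items):
    """Yield every way of splitting the list `items` into non-empty bags."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Put `first` into each existing bag in turn ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or give it a new bag of its own.
        yield [[first]] + partition


def moment_from_cumulants(n, cumulants):
    """nth raw moment as a sum over all baggings of n points.

    `cumulants[j]` is the jth cumulant; index 0 is unused.
    Each bag holding j points contributes a factor of the jth cumulant.
    """
    return sum(
        prod(cumulants[len(bag)] for bag in partition)
        for partition in set_partitions(list(range(n)))
    )


# Four points can be bagged in 1 + 4 + 3 + 6 + 1 = 15 ways.
n_diagrams = sum(1 for _ in set_partitions(list(range(4))))

# Assumed example: Gaussian with mean 2 and variance 3 (higher cumulants zero).
gaussian_cumulants = [None, 2, 3, 0, 0]
fourth_moment = moment_from_cumulants(4, gaussian_cumulants)
print(n_diagrams, fourth_moment)
```

For this example only the last three diagram types survive, giving 3 times the variance squared, plus 6 times the variance times the mean squared, plus the mean to the fourth power.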
1058 01:20:59,700 --> 01:21:05,310 And what I will show next time is how, using this machinery, 1059 01:21:05,310 --> 01:21:09,960 you can calculate any moment of a Gaussian, 1060 01:21:09,960 --> 01:21:12,650 for example, in just a matter of seconds as opposed 1061 01:21:12,650 --> 01:21:16,980 to having to do integrations and things like that. 1062 01:21:16,980 --> 01:21:19,900 So what we will do next time will 1063 01:21:19,900 --> 01:21:23,120 be to apply this machinery to various probability 1064 01:21:23,120 --> 01:21:25,290 distributions, such as a Gaussian, 1065 01:21:25,290 --> 01:21:28,590 that we are likely to encounter again and again.