1
00:00:00,820 --> 00:00:03,140
PROFESSOR: The greatest
common divisor of two numbers
2
00:00:03,140 --> 00:00:04,890
is easy to compute.
3
00:00:04,890 --> 00:00:08,850
And that's a fact that will play
a crucial role in the number
4
00:00:08,850 --> 00:00:12,760
theory we're going to develop,
and the properties of some
5
00:00:12,760 --> 00:00:17,100
of the modern codes that
are based on number theory.
6
00:00:17,100 --> 00:00:20,870
The efficient way to compute
the GCD of two numbers
7
00:00:20,870 --> 00:00:22,940
is based on a
classical algorithm
8
00:00:22,940 --> 00:00:25,140
known as the Euclidean
algorithm, which
9
00:00:25,140 --> 00:00:27,840
is several thousand years old.
10
00:00:27,840 --> 00:00:32,680
And let's describe
how it works now.
11
00:00:32,680 --> 00:00:38,480
So the Euclidean algorithm is
based on the following lemma,
12
00:00:38,480 --> 00:00:40,340
which we'll call
the remainder lemma,
13
00:00:40,340 --> 00:00:43,650
and it says that if a
and b are two integers,
14
00:00:43,650 --> 00:00:46,510
then the greatest common
divisor of a and b
15
00:00:46,510 --> 00:00:49,920
is the same as the greatest
common divisor of b,
16
00:00:49,920 --> 00:00:54,720
and the remainder of a divided
by b-- providing, of course,
17
00:00:54,720 --> 00:00:58,140
b is not 0, because otherwise
you can't divide by b.
18
00:00:58,140 --> 00:01:00,622
OK how do you make
sense out of this?
19
00:01:00,622 --> 00:01:01,330
Why is this true?
20
00:01:01,330 --> 00:01:02,913
Well, it's actually
a very easy proof.
21
00:01:02,913 --> 00:01:07,090
Remember that by the
so-called division algorithm--
22
00:01:07,090 --> 00:01:12,134
or it's really a theorem--
if you divide a by b
23
00:01:12,134 --> 00:01:14,300
and we're doing integer
division, what that means is
24
00:01:14,300 --> 00:01:17,710
you find a quotient, q, of a
divided by b,
25
00:01:17,710 --> 00:01:18,626
and a remainder.
26
00:01:18,626 --> 00:01:20,000
And the quotient
has the property
27
00:01:20,000 --> 00:01:24,030
that q times b plus the
remainder is equal to a.
28
00:01:24,030 --> 00:01:27,190
The remainder is always
going to be smaller than b.
29
00:01:27,190 --> 00:01:30,950
It will be in the range from 0
up to, but not including, b.
30
00:01:30,950 --> 00:01:34,930
OK, if you look at this
simple expression, what
31
00:01:34,930 --> 00:01:39,900
becomes apparent is that if
you've got a divisor of two
32
00:01:39,900 --> 00:01:42,990
out of three of
these terms, then
33
00:01:42,990 --> 00:01:44,980
it's going to divide
the third term.
34
00:01:44,980 --> 00:01:49,420
So for example, if you
have a divisor of b and r,
35
00:01:49,420 --> 00:01:52,730
then the sum of those
two things is also
36
00:01:52,730 --> 00:01:55,180
going to have the same
divisor, which means
37
00:01:55,180 --> 00:01:57,080
that a will have that divisor.
38
00:01:57,080 --> 00:02:01,120
If something divides both
a and b, then it divides r.
39
00:02:01,120 --> 00:02:04,672
And if it divides b
and r, it divides a.
40
00:02:04,672 --> 00:02:08,495
And that means that
a and b and b and r
41
00:02:08,495 --> 00:02:10,470
have exactly the same divisors.
42
00:02:10,470 --> 00:02:13,510
They not only have the same
greatest common divisor,
43
00:02:13,510 --> 00:02:14,860
all their divisors are the same.
44
00:02:14,860 --> 00:02:18,180
So obviously, the
greatest one is the same.
45
00:02:18,180 --> 00:02:21,980
And that proves this
key remainder lemma.
46
00:02:21,980 --> 00:02:24,580
Well, the remainder lemma now
gives us a very lovely way
47
00:02:24,580 --> 00:02:25,960
to compute the GCD.
48
00:02:25,960 --> 00:02:27,620
And here's an example.
49
00:02:27,620 --> 00:02:31,370
Suppose I want to compute
the GCD of 899 and 493.
50
00:02:31,370 --> 00:02:34,650
A is 899, b is 493.
51
00:02:34,650 --> 00:02:38,030
Well, so I want this
GCD of 899 and 493.
52
00:02:38,030 --> 00:02:43,660
Well, according to the remainder
lemma, if I divide 899 by 493,
53
00:02:43,660 --> 00:02:48,620
I get a quotient of 1,
and a remainder of 406.
54
00:02:48,620 --> 00:02:52,180
So that means that
899 and 493 have
55
00:02:52,180 --> 00:03:00,790
the same GCD as 493 and 406.
56
00:03:00,790 --> 00:03:04,210
That is the original number
b, and the new remainder 406.
57
00:03:04,210 --> 00:03:07,660
But now, I can
divide 493 by 406.
58
00:03:07,660 --> 00:03:11,750
I get a quotient of 1
and a remainder of 87.
59
00:03:11,750 --> 00:03:14,990
So 406 and 87 have the same GCD.
60
00:03:14,990 --> 00:03:19,930
Dividing 406 by 87, I get that
87 and 58 have the same GCD.
61
00:03:19,930 --> 00:03:27,830
Dividing 87 by 58, I get that
58 and 29 have the same GCD.
62
00:03:27,830 --> 00:03:33,250
And now I win, because look,
when I divide 58 by 29,
63
00:03:33,250 --> 00:03:35,660
I get a remainder of 0.
64
00:03:35,660 --> 00:03:39,360
And the GCD of anything
and 0 is that thing.
65
00:03:39,360 --> 00:03:43,290
So the GCD of 29 and 0 is 29.
66
00:03:43,290 --> 00:03:46,070
I guess the only exception
is the GCD of 0 and 0,
67
00:03:46,070 --> 00:03:48,740
which is not defined.
68
00:03:48,740 --> 00:03:53,540
But if it's not 0, then
the GCD of x and 0 is x.
69
00:03:53,540 --> 00:03:54,230
And there it is.
70
00:03:54,230 --> 00:03:59,610
So I've just found that the
GCD of 899 and 493 is 29.
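The worked example above can be replayed with a short Python sketch. The function name `gcd_trace` and the idea of recording every pair are illustrative assumptions, not something from the lecture; the loop itself just applies the remainder lemma until the second number is 0.

```python
def gcd_trace(a, b):
    """Return (gcd, pairs), where pairs lists every (x, y) visited.

    Hypothetical helper for illustration; assumes a and b are
    non-negative integers that are not both 0.
    """
    x, y = a, b
    pairs = [(x, y)]
    while y != 0:
        # Remainder lemma: gcd(x, y) == gcd(y, x mod y) when y != 0.
        x, y = y, x % y
        pairs.append((x, y))
    # At this point y == 0, and gcd(x, 0) == x for x != 0.
    return x, pairs


result, pairs = gcd_trace(899, 493)
print(result)  # 29
print(pairs)   # [(899, 493), (493, 406), (406, 87), (87, 58), (58, 29), (29, 0)]
```

The printed pairs match the sequence of divisions in the example: each pair has the same GCD as the one before it.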
71
00:03:59,610 --> 00:04:01,400
And this is quite a
fast algorithm,
72
00:04:01,400 --> 00:04:03,660
because I keep dividing
the numbers that I
73
00:04:03,660 --> 00:04:07,260
have by each other,
and it gets small fast.
74
00:04:07,260 --> 00:04:09,240
We'll be more precise
about that in a minute.
75
00:04:09,240 --> 00:04:13,830
OK, it's a good exercise
in state machine thinking
76
00:04:13,830 --> 00:04:15,570
and practice in
program verification
77
00:04:15,570 --> 00:04:17,800
to reformulate the
Euclidean algorithm,
78
00:04:17,800 --> 00:04:19,759
or formulate it explicitly
as a state machine.
79
00:04:19,759 --> 00:04:22,190
It's a very simple
kind of state machine.
80
00:04:22,190 --> 00:04:24,660
The states of this Euclidean
algorithm state machine
81
00:04:24,660 --> 00:04:26,350
will be pairs of
non-negative integers.
82
00:04:26,350 --> 00:04:29,340
So the states are n cross
n, the Cartesian product
83
00:04:29,340 --> 00:04:32,300
of the non-negative
integers, with itself.
84
00:04:32,300 --> 00:04:34,650
The start state is going
to be the pair a, b,
85
00:04:34,650 --> 00:04:37,990
whose GCD I want to compute.
86
00:04:37,990 --> 00:04:42,000
And the transitions are
simply repeatedly applying
87
00:04:42,000 --> 00:04:43,050
the remainder lemma.
88
00:04:43,050 --> 00:04:47,890
Namely, if I'm in
state x, y, where
89
00:04:47,890 --> 00:04:52,050
you think of x and y as the pair
whose GCD I'm trying to compute,
90
00:04:52,050 --> 00:04:56,120
I simply convert x and y
to y, and the remainder
91
00:04:56,120 --> 00:04:58,250
of x divided by y.
92
00:04:58,250 --> 00:05:02,680
And I keep doing that
as long as y is not 0.
93
00:05:02,680 --> 00:05:05,470
OK, very simple state
machine-- really,
94
00:05:05,470 --> 00:05:08,010
just one transition rule.
95
00:05:08,010 --> 00:05:10,810
Well, according to
the lemma, since I'm
96
00:05:10,810 --> 00:05:14,960
replacing the GCD of x
and y by the GCD of y
97
00:05:14,960 --> 00:05:17,000
and the remainder
of x divided by y,
98
00:05:17,000 --> 00:05:19,850
the GCD is actually
staying constant.
99
00:05:19,850 --> 00:05:22,790
This transition
preserves the GCD
100
00:05:22,790 --> 00:05:26,480
that's left in the pair
of registers, x and y.
101
00:05:26,480 --> 00:05:29,330
So what we can say is that
since the GCD of x and y
102
00:05:29,330 --> 00:05:32,120
doesn't change from
one step to another,
103
00:05:32,120 --> 00:05:35,340
we can say that the GCD
of x and y at any point
104
00:05:35,340 --> 00:05:38,790
is equal to its original value,
which is the GCD of a and b.
105
00:05:38,790 --> 00:05:40,380
So in other words,
this equation,
106
00:05:40,380 --> 00:05:42,820
GCD of x and y in
the current state
107
00:05:42,820 --> 00:05:46,490
is equal to GCD of a and
b, the GCD of a and b
108
00:05:46,490 --> 00:05:49,550
that we started with,
is a preserved invariant
109
00:05:49,550 --> 00:05:50,160
of the state.
110
00:05:50,160 --> 00:05:54,900
So p of a state xy, the
property that GCD of x and y
111
00:05:54,900 --> 00:05:59,170
is the original GCD is
a preserved invariant
112
00:05:59,170 --> 00:06:00,990
of the state machine.
113
00:06:00,990 --> 00:06:04,220
Moreover, p of start
is trivially true,
114
00:06:04,220 --> 00:06:08,290
because at the start,
x and y are a and b.
115
00:06:08,290 --> 00:06:10,560
So p of x and y is just
saying the GCD of a and b
116
00:06:10,560 --> 00:06:12,690
is equal to GCD of a and b.
117
00:06:12,690 --> 00:06:13,280
Cool.
118
00:06:13,280 --> 00:06:16,620
So I've got that this
property is true at the start,
119
00:06:16,620 --> 00:06:18,600
and it's preserved
by the transitions.
120
00:06:18,600 --> 00:06:20,900
So the invariance
principle tells me
121
00:06:20,900 --> 00:06:24,350
that if the program
stops, I'm going
122
00:06:24,350 --> 00:06:27,820
to have the GCD of x
and y when it terminates
123
00:06:27,820 --> 00:06:30,440
is equal to the actual
GCD that I want.
124
00:06:30,440 --> 00:06:33,050
And that enables us to
prove partial correctness.
125
00:06:33,050 --> 00:06:35,450
The claim is that if
this program terminates--
126
00:06:35,450 --> 00:06:38,830
we haven't determined that it
does yet-- but at termination,
127
00:06:38,830 --> 00:06:45,595
if any, I claim that x is left
in-- that the GCD of a and b
128
00:06:45,595 --> 00:06:47,230
is left in register x.
129
00:06:47,230 --> 00:06:49,780
The value of x at
the end is going
130
00:06:49,780 --> 00:06:51,040
to be the GCD of a and b.
131
00:06:51,040 --> 00:06:51,789
Well, why is that?
132
00:06:51,789 --> 00:06:55,430
Well, look-- at termination,
what we know is that y is 0.
133
00:06:55,430 --> 00:06:57,470
That's the only way that
this procedure stops,
134
00:06:57,470 --> 00:07:01,230
because otherwise, the
transition rule is applicable.
135
00:07:01,230 --> 00:07:03,650
So that means that
when y equals 0
136
00:07:03,650 --> 00:07:09,480
at termination, what we
have is that since y is 0,
137
00:07:09,480 --> 00:07:13,420
GCD of x and y is equal
to the GCD of x and 0.
138
00:07:13,420 --> 00:07:17,046
And that's equal to x, assuming,
again, that x is positive,
139
00:07:17,046 --> 00:07:18,480
or not 0.
140
00:07:18,480 --> 00:07:20,330
So x is the GCD of x and y.
141
00:07:20,330 --> 00:07:23,170
And by the invariant,
the GCD of x and y
142
00:07:23,170 --> 00:07:25,180
is equal to the GCD of a and b.
143
00:07:25,180 --> 00:07:26,970
So I've proved this little fact.
144
00:07:26,970 --> 00:07:30,630
This procedure correctly
computes the GCD of a and b,
145
00:07:30,630 --> 00:07:34,830
leaving the answer in
register x, if it terminates.
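The state machine and its preserved invariant can be sketched in Python as follows. This is only an illustration under stated assumptions: the name `euclid_state_machine` is made up here, and the standard library's `math.gcd` is used purely as an independent reference value so the invariant can be checked with assertions at every state.

```python
import math


def euclid_state_machine(a, b):
    """Run the one-rule state machine from start state (a, b).

    Hypothetical helper for illustration. States are pairs of
    non-negative integers, as in the lecture.
    """
    if a < 0 or b < 0:
        raise ValueError("states are pairs of non-negative integers")
    target = math.gcd(a, b)  # reference value, used only to check the invariant
    x, y = a, b
    assert math.gcd(x, y) == target  # P(start) holds trivially
    while y != 0:
        # The single transition: (x, y) -> (y, remainder of x divided by y).
        x, y = y, x % y
        assert math.gcd(x, y) == target  # the transition preserves P
    # Termination means y == 0, so gcd(x, y) == gcd(x, 0) == x.
    return x


print(euclid_state_machine(899, 493))  # 29
```

The assertions never fire, which is what the invariance principle predicts: the property holds at the start and every transition preserves it, so it still holds in the final state.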
146
00:07:34,830 --> 00:07:37,710
Well, of course it terminates,
and it terminates fast.
147
00:07:37,710 --> 00:07:40,610
So let's see why.
148
00:07:40,610 --> 00:07:46,390
Notice that at each transition,
we're going to replace x by y,
149
00:07:46,390 --> 00:07:50,050
and y by the remainder
of x divided by y.
150
00:07:50,050 --> 00:07:51,650
Let's just assume
for simplicity that
151
00:07:51,650 --> 00:07:55,400
of the pair x, y,
that x is the bigger one.
152
00:07:55,400 --> 00:08:01,460
So there's two cases of why
these numbers are getting small
153
00:08:01,460 --> 00:08:02,240
fast.
154
00:08:02,240 --> 00:08:06,900
The first case is suppose
that y is less than x over 2,
155
00:08:06,900 --> 00:08:08,370
or less than or
equal to x over 2.
156
00:08:08,370 --> 00:08:15,870
Well, since at this step,
you're going to replace x by y,
157
00:08:15,870 --> 00:08:19,020
it means that you're replacing
x by something that's
158
00:08:19,020 --> 00:08:19,920
at most half of x.
159
00:08:19,920 --> 00:08:23,420
So x gets halved at this step.
160
00:08:23,420 --> 00:08:24,870
What about if y is big?
161
00:08:24,870 --> 00:08:27,330
Well, if y is bigger
than x over 2,
162
00:08:27,330 --> 00:08:32,270
then the remainder of x divided
by y is simply x minus y.
163
00:08:32,270 --> 00:08:34,490
And it's going to be
less than x over 2.
164
00:08:34,490 --> 00:08:38,870
But that's going to be the
value of y after the next step.
165
00:08:38,870 --> 00:08:40,659
So y is going to
be halved either
166
00:08:40,659 --> 00:08:43,130
at this step or the next
step when it's replaced
167
00:08:43,130 --> 00:08:45,000
by the remainder of x divided by y.
168
00:08:45,000 --> 00:08:49,460
And the net result is that
y gets cut in half,
169
00:08:49,460 --> 00:08:53,610
or even smaller, at
every other step, which
170
00:08:53,610 --> 00:08:58,290
means that this procedure can't
continue for more than twice
171
00:08:58,290 --> 00:09:02,730
the log to the base 2 of the
original value of y, which
172
00:09:02,730 --> 00:09:06,930
is b number of steps,
because that's how many
173
00:09:06,930 --> 00:09:10,200
halves you can do before
you start hitting 0.
174
00:09:10,200 --> 00:09:13,920
So we've just shown that
this procedure halts
175
00:09:13,920 --> 00:09:16,970
in a logarithmic number
of steps, which
176
00:09:16,970 --> 00:09:20,140
is the same as saying that
it's about the length of b
177
00:09:20,140 --> 00:09:24,970
in binary, or a small constant
times the length of b
178
00:09:24,970 --> 00:09:25,710
in decimal.
179
00:09:25,710 --> 00:09:30,470
The GCD algorithm is
really very efficient.
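The step bound can be checked empirically with a small Python sketch. Two assumptions are made here for illustration: the helper name `euclid_steps` is invented, and the bound is restated as twice the number of binary digits of b (`2 * b.bit_length()`), which is a slightly looser, integer-friendly version of "twice the log to the base 2 of b".

```python
def euclid_steps(a, b):
    """Count the transitions the state machine makes from (a, b).

    Hypothetical helper for illustration; assumes non-negative integers.
    """
    x, y = a, b
    steps = 0
    while y != 0:
        x, y = y, x % y
        steps += 1
    return steps


# 832040 and 514229 are consecutive Fibonacci numbers, a slow case for Euclid.
for a, b in [(899, 493), (10**9, 1), (832040, 514229)]:
    bound = 2 * b.bit_length()  # assumed restatement of the 2 * log2(b) bound
    print(a, b, "steps:", euclid_steps(a, b), "bound:", bound)
```

For 899 and 493 the machine makes 5 transitions, well under the bound of 18 that comes from 493 having 9 binary digits.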