WEBVTT

00:00:00.070 --> 00:00:02.430
The following content is
provided under a Creative

00:00:02.430 --> 00:00:03.820
Commons license.

00:00:03.820 --> 00:00:06.060
Your support will help
MIT OpenCourseWare

00:00:06.060 --> 00:00:10.140
continue to offer high quality
educational resources for free.

00:00:10.140 --> 00:00:12.690
To make a donation or to
view additional materials

00:00:12.690 --> 00:00:16.600
from hundreds of MIT courses,
visit MIT OpenCourseWare

00:00:16.600 --> 00:00:17.255
at ocw.mit.edu.

00:00:25.835 --> 00:00:26.960
PROFESSOR: All right, guys.

00:00:26.960 --> 00:00:28.800
So let's get started.

00:00:28.800 --> 00:00:31.190
Welcome back from what I
hope was an exciting holiday

00:00:31.190 --> 00:00:32.560
for everyone.

00:00:32.560 --> 00:00:35.360
So today we're going to talk
about user authentication.

00:00:35.360 --> 00:00:37.890
So the basic challenge that
we want to address today

00:00:37.890 --> 00:00:42.420
is how can human users prove
their identity to a program?

00:00:42.420 --> 00:00:45.680
In particular, the paper that
was assigned for today's class

00:00:45.680 --> 00:00:47.635
addresses an
existential question

00:00:47.635 --> 00:00:48.930
in the security community.

00:00:48.930 --> 00:00:53.240
Is there anything better than
passwords for authentication?

00:00:53.240 --> 00:00:57.430
So at a high level it seems like
passwords are a terrible idea.

00:00:57.430 --> 00:01:00.010
So they have very low entropy,
it's very easy for attackers

00:01:00.010 --> 00:01:01.380
to guess them.

00:01:01.380 --> 00:01:03.130
Also the security
questions that we

00:01:03.130 --> 00:01:05.481
use to recover
from lost passwords

00:01:05.481 --> 00:01:07.480
often have even lower
entropy than the passwords

00:01:07.480 --> 00:01:10.330
themselves, which also
seems like a problem.

00:01:10.330 --> 00:01:15.180
And even worse, users typically
will use the same password

00:01:15.180 --> 00:01:16.987
across a lot of different sites.

00:01:16.987 --> 00:01:19.195
So that means that the
vulnerability in one password,

00:01:19.195 --> 00:01:22.820
if it's easy to guess, could
expose a user's activity

00:01:22.820 --> 00:01:24.400
across a wide range of sites.

00:01:24.400 --> 00:01:27.030
So as the paper for
today's class states,

00:01:27.030 --> 00:01:28.930
I love this quote, "the
continued domination

00:01:28.930 --> 00:01:31.620
of passwords over
all other methods

00:01:31.620 --> 00:01:34.850
of end-user authentication
is a major embarrassment

00:01:34.850 --> 00:01:36.110
for security researchers."

00:01:36.110 --> 00:01:37.920
All right, so the community is
just seething out there,

00:01:37.920 --> 00:01:39.460
they want some
better alternative.

00:01:39.460 --> 00:01:41.380
But it's not clear
if there actually

00:01:41.380 --> 00:01:45.910
is an authentication scheme
that actually totally dominates

00:01:45.910 --> 00:01:48.630
passwords, that's more usable,
that's more deployable,

00:01:48.630 --> 00:01:49.830
that's more secure.

00:01:49.830 --> 00:01:52.210
So in today's lecture, we'll
basically do three things.

00:01:52.210 --> 00:01:53.710
So first of all,
we're going to look

00:01:53.710 --> 00:01:55.970
and we're going to see how
current passwords can work.

00:01:55.970 --> 00:01:58.660
Then we're going to talk
about the desirable properties

00:01:58.660 --> 00:02:01.630
at a high level for any
authentication scheme.

00:02:01.630 --> 00:02:05.112
And then we're finally going to
look at what the paper gives us

00:02:05.112 --> 00:02:07.320
in terms of metrics for
evaluating authentication

00:02:07.320 --> 00:02:08.740
schemes, and we're
going to see how

00:02:08.740 --> 00:02:10.156
some of these other
authentication

00:02:10.156 --> 00:02:12.230
schemes actually
compare to passwords.

00:02:12.230 --> 00:02:14.860
So in [INAUDIBLE]
what is a password?

00:02:14.860 --> 00:02:26.250
So a password is a
secret that is shared

00:02:26.250 --> 00:02:30.400
between a user and a server.

00:02:34.540 --> 00:02:37.800
So the naive implementation
of a password scheme

00:02:37.800 --> 00:02:41.160
is to basically
just have a table

00:02:41.160 --> 00:02:44.780
on the server side that
essentially just maps

00:02:44.780 --> 00:02:50.258
user names to passwords.

00:02:50.258 --> 00:02:52.008
That's the simplest
way for you to imagine

00:02:52.008 --> 00:02:54.980
implementing one of these
authentication schemes: the user

00:02:54.980 --> 00:02:58.280
passes in their user name
and password, the server

00:02:58.280 --> 00:02:59.826
then does a lookup
in this table, and

00:02:59.826 --> 00:03:01.700
compares the password
that the client supplied with

00:03:01.700 --> 00:03:02.360
what's in here.

00:03:02.360 --> 00:03:04.320
If everything's good,
the user's authenticated.

00:03:04.320 --> 00:03:06.176
So clearly the
problem with this is

00:03:06.176 --> 00:03:09.212
that if the attacker
compromises the server,

00:03:09.212 --> 00:03:10.670
then he can just
look at this table

00:03:10.670 --> 00:03:13.959
and then get all the users'
passwords in the clear.

00:03:13.959 --> 00:03:15.270
So that's clearly a bad thing.

00:03:15.270 --> 00:03:19.170
So perhaps an
improved solution is

00:03:19.170 --> 00:03:23.280
to have the server store
a table that looks like this.

00:03:23.280 --> 00:03:25.320
So once again, it'd
map the user name,

00:03:25.320 --> 00:03:31.055
but now it actually maps
to a hash of the password.

00:03:34.360 --> 00:03:37.000
So the user's client is gonna
supply their clear text

00:03:37.000 --> 00:03:39.590
password to the
server, the server

00:03:39.590 --> 00:03:41.480
will then take that
clear text password,

00:03:41.480 --> 00:03:43.870
hash it, do a lookup in the
table, and once again see

00:03:43.870 --> 00:03:46.620
if the user is who he or
she says that they are.
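
NOTE
Illustrative sketch, not part of the spoken lecture. It shows the hashed-table check just described: the server stores only a hash per user name, hashes the submitted clear text password, and compares. The names `table`, `h`, and `login`, the sample password, and the choice of SHA-256 are assumptions made for illustration.
```python
import hashlib
import hmac
#
def h(password: str) -> str:
    # Assumed hash function for the sketch: SHA-256, hex encoded.
    return hashlib.sha256(password.encode("utf-8")).hexdigest()
#
# Hypothetical server-side table: user name maps to hash of password.
table = {"alice": h("correct horse")}
#
def login(user: str, password: str) -> bool:
    stored = table.get(user)
    if stored is None:
        return False
    # Hash the supplied clear text password and compare in constant time.
    return hmac.compare_digest(stored, h(password))
```
For example, `login("alice", "correct horse")` succeeds while any other password for alice is rejected. The table itself never holds the clear text password.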

00:03:46.620 --> 00:03:49.490
So the advantage
of this scheme is

00:03:49.490 --> 00:03:52.080
that, by design,
these hash functions

00:03:52.080 --> 00:03:54.040
are difficult to invert.

00:03:54.040 --> 00:03:57.304
So if this table is
lost, it's leaked somehow

00:03:57.304 --> 00:03:58.928
or the attacker
compromised the server,

00:03:58.928 --> 00:04:00.969
and the attacker could
look at these things here,

00:04:00.969 --> 00:04:03.180
but it's difficult
for the attackers

00:04:03.180 --> 00:04:05.695
to say, OK, given this sort of
string of random

00:04:05.695 --> 00:04:07.460
alphanumeric characters here,

00:04:07.460 --> 00:04:10.592
here's the pre-image that
was used as the input

00:04:10.592 --> 00:04:13.660
of the hash function
[INAUDIBLE] that value there.

00:04:13.660 --> 00:04:16.089
So that at least
is the nice thing

00:04:16.089 --> 00:04:18.720
about these hashes in theory.

00:04:18.720 --> 00:04:21.370
Now in practice,
attackers don't actually

00:04:21.370 --> 00:04:23.540
have to launch
brute force attacks

00:04:23.540 --> 00:04:28.150
to figure out what the preimages
for these hash values are.

00:04:28.150 --> 00:04:30.770
So attackers can actually
take advantage of the fact

00:04:30.770 --> 00:04:36.595
that passwords in practice
have a skewed distribution.

00:04:40.200 --> 00:04:43.150
And by skewed
distributions, I mean

00:04:43.150 --> 00:04:45.850
that-- let's say that we
knew that all passwords were

00:04:45.850 --> 00:04:47.150
20 characters long.

00:04:47.150 --> 00:04:50.460
It's not like users actually
pick passwords that

00:04:50.460 --> 00:04:54.080
sort of exist in all
places in that space of

00:04:54.080 --> 00:04:55.340
possible 20-character strings.

00:04:55.340 --> 00:05:00.580
In practice, people pick
passwords like 123 or todd

00:05:00.580 --> 00:05:02.002
or things like this.

00:05:02.002 --> 00:05:03.960
So in fact there's been
these empirical studies

00:05:03.960 --> 00:05:08.180
of how passwords work
and a lot of times

00:05:08.180 --> 00:05:18.764
these studies find things
like the top 5,000 passwords

00:05:18.764 --> 00:05:21.710
cover about 20% of users.

00:05:25.032 --> 00:05:26.490
So what that means,
in other words,

00:05:26.490 --> 00:05:29.970
is that the attacker has
a database of those 5,000

00:05:29.970 --> 00:05:30.840
passwords.

00:05:30.840 --> 00:05:32.830
The attacker can
just hash those,

00:05:32.830 --> 00:05:37.050
and then when the attacker looks
at this stolen password table,

00:05:37.050 --> 00:05:39.640
can just see if one
of those things that

00:05:39.640 --> 00:05:44.408
come from this 5,000-entry
list matches over here.

00:05:44.408 --> 00:05:46.344
And so empirically
speaking, the attacker

00:05:46.344 --> 00:05:49.260
would be able to recover about
20% of passwords that way.
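
NOTE
Illustrative sketch, not part of the spoken lecture. It shows the dictionary attack just described against an unsalted hash table: hash a short list of common passwords once, then look up every stolen hash. The `stolen` table, the three-entry `common` list (a stand-in for a real top-5,000 list), and SHA-256 are all assumptions.
```python
import hashlib
#
def h(password: str) -> str:
    return hashlib.sha256(password.encode("utf-8")).hexdigest()
#
# Hypothetical stolen table and a tiny stand-in for a list of common passwords.
stolen = {"alice": h("123456"), "bob": h("todd"), "carol": h("x9!Lq0#vB2")}
common = ["123456", "password", "todd"]
#
# Hash each common password once, then match every stolen hash against the result.
precomputed = {h(p): p for p in common}
cracked = {user: precomputed[digest] for user, digest in stolen.items() if digest in precomputed}
```
Here `cracked` recovers the passwords for alice and bob, while carol's uncommon password stays unknown.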

00:05:49.260 --> 00:05:55.050
And so, folks at Yahoo
found that passwords

00:05:55.050 --> 00:06:02.832
have roughly 10 to 20 bits
of entropy, 10 to 20 bits

00:06:02.832 --> 00:06:04.760
of randomness in them.

00:06:04.760 --> 00:06:08.360
And that's actually
not that big.

00:06:08.360 --> 00:06:10.435
So, for example, if you
think about what might

00:06:10.435 --> 00:06:11.560
this hash function here be?

00:06:11.560 --> 00:06:14.620
So maybe it's something like
SHA, something like this.

00:06:14.620 --> 00:06:17.880
So modern machines
actually calculate millions

00:06:17.880 --> 00:06:20.260
of these hashes every second.
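
NOTE
Illustrative arithmetic, not part of the spoken lecture. It estimates how long it takes to try every value in a space with a given number of bits of entropy. The rate of one million hashes per second is an assumption standing in for "millions per second"; `seconds_to_exhaust` is a hypothetical helper name.
```python
def seconds_to_exhaust(bits: int, hashes_per_second: float) -> float:
    # A space with `bits` bits of entropy holds 2**bits equally likely candidates.
    return (2 ** bits) / hashes_per_second
#
ASSUMED_RATE = 1_000_000  # hashes per second, an assumption
low = seconds_to_exhaust(10, ASSUMED_RATE)
high = seconds_to_exhaust(20, ASSUMED_RATE)
```
Under that assumed rate, even the 20-bit case takes only about one second.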

00:06:20.260 --> 00:06:22.660
So the fact that hash
functions by design

00:06:22.660 --> 00:06:25.050
are supposed to be
easy to calculate,

00:06:25.050 --> 00:06:26.450
so they're fast to calculate,

00:06:26.450 --> 00:06:27.950
combined with the
fact that there are

00:06:27.950 --> 00:06:29.700
skewed password
distributions,

00:06:29.700 --> 00:06:32.500
means that in principle, this
scheme here is not as secure

00:06:32.500 --> 00:06:34.510
as it might seem.

00:06:34.510 --> 00:06:36.800
So one thing you
can imagine to try

00:06:36.800 --> 00:06:40.660
to make life more
difficult on the attacker

00:06:40.660 --> 00:06:46.860
is you could imagine that you
use an expensive key derivation

00:06:46.860 --> 00:06:47.360
function.

00:06:53.290 --> 00:06:55.280
And so by key
derivation function,

00:06:55.280 --> 00:06:58.867
I just mean this thing up here.

00:06:58.867 --> 00:07:01.200
This thing that's taking the
passwords as input and then

00:07:01.200 --> 00:07:03.505
generates something that's
stored on the server.

00:07:03.505 --> 00:07:05.213
So what's nice about
these key derivation

00:07:05.213 --> 00:07:09.915
functions is that they actually
have tunable cost.

00:07:09.915 --> 00:07:11.930
So you can basically
turn this knob

00:07:11.930 --> 00:07:14.516
and make that function
run slower or faster

00:07:14.516 --> 00:07:15.640
depending on what you want.

00:07:15.640 --> 00:07:17.525
And so the idea here
is that, let's say

00:07:17.525 --> 00:07:19.650
that you're going to use
a key derivation function.

00:07:19.650 --> 00:07:28.020
So some examples of these are
PBKDF2, or maybe bcrypt,

00:07:28.020 --> 00:07:30.901
so you can look these up using
the miracle of the internet

00:07:30.901 --> 00:07:32.400
if you care to know
more about them.

00:07:32.400 --> 00:07:34.330
But the basic idea
is let's imagine

00:07:34.330 --> 00:07:36.040
that one of these key
derivation functions

00:07:36.040 --> 00:07:40.820
took a second to calculate, as
opposed to a few milliseconds.

00:07:40.820 --> 00:07:42.490
That actually makes
the attacker's job

00:07:42.490 --> 00:07:45.760
much more difficult. Because
when the attacker is trying

00:07:45.760 --> 00:07:49.090
to, let's say, generate
values for these 5,000 topmost

00:07:49.090 --> 00:07:51.720
passwords, it's going to
take the attacker much longer

00:07:51.720 --> 00:07:52.760
to do that.

00:07:52.760 --> 00:07:55.770
So does that all make
sense how these things work?

00:07:55.770 --> 00:07:56.940
Pretty straightforward.

00:07:56.940 --> 00:07:59.260
So internally these key
derivation functions

00:07:59.260 --> 00:08:02.675
often operate by repeatedly
calling a hash multiple,

00:08:02.675 --> 00:08:03.500
multiple times.
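
NOTE
Illustrative sketch, not part of the spoken lecture. It shows the tunable-cost knob using PBKDF2 from Python's standard library, where the iteration count controls how many times the underlying hash is applied. The helper name `slow_hash`, the iteration counts, and the fixed placeholder salt are assumptions. PBKDF2 requires a salt argument, and per-user salts are discussed later in the lecture.
```python
import hashlib
#
def slow_hash(password: str, iterations: int) -> bytes:
    # More iterations means more work per guess, for the server and the attacker alike.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), b"placeholder-salt", iterations)
#
cheap = slow_hash("todd", 1)
costly = slow_hash("todd", 200_000)
```
Raising `iterations` makes each guess proportionally more expensive. The right value depends on how much login latency the server can tolerate.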

00:08:03.500 --> 00:08:05.960
So that's all pretty
straightforward.

00:08:05.960 --> 00:08:08.712
So you might say, well,
does this solve the problem?

00:08:08.712 --> 00:08:10.753
So can we just use these
expensive key derivation

00:08:10.753 --> 00:08:12.590
functions and be done with it?

00:08:12.590 --> 00:08:14.920
So since this is a security
class, the answer is no.

00:08:14.920 --> 00:08:17.820
So one problem is that the
adversary can build something

00:08:17.820 --> 00:08:23.470
called rainbow tables.

00:08:23.470 --> 00:08:29.990
And so a rainbow table
is basically just a map

00:08:29.990 --> 00:08:35.490
from a password to its hash output.

00:08:39.039 --> 00:08:43.532
And so the insight here is that
even if the system is using

00:08:43.532 --> 00:08:45.665
one of these expensive
key derivation functions,

00:08:45.665 --> 00:08:49.840
the attacker can calculate
one of these tables once.

00:08:49.840 --> 00:08:52.396
It might be a little bit painful
because each key derivation

00:08:52.396 --> 00:08:53.950
function invocation is slow.

00:08:53.950 --> 00:08:56.780
But the attacker can build
this table once and then use

00:08:56.780 --> 00:09:00.030
that to crack all subsequent
systems the attacker can

00:09:00.030 --> 00:09:04.120
break into that use that
same key derivation function.

00:09:04.120 --> 00:09:05.980
So that's how
rainbow tables work.

00:09:05.980 --> 00:09:07.827
And once again, to
maximize the cost benefit

00:09:07.827 --> 00:09:09.660
of building this rainbow
table, the attacker

00:09:09.660 --> 00:09:12.700
could take advantage of the
skewed password distributions

00:09:12.700 --> 00:09:13.450
that we can see up here.

00:09:13.450 --> 00:09:15.040
So the attacker might
only build a rainbow table

00:09:15.040 --> 00:09:17.245
for some small set of
all possible passwords.

00:09:17.245 --> 00:09:19.910
AUDIENCE: So salting makes
this much more difficult.

00:09:19.910 --> 00:09:21.410
PROFESSOR: Yeah,
yeah, that's right.

00:09:21.410 --> 00:09:24.250
So we're going to get to salting
I believe in a couple seconds.

00:09:24.250 --> 00:09:24.890
That's right.

00:09:24.890 --> 00:09:27.290
So at a high level, if
you don't use salting,

00:09:27.290 --> 00:09:29.620
rainbow tables actually
allow the attacker

00:09:29.620 --> 00:09:32.030
to spend some effort offline,
calculate this table,

00:09:32.030 --> 00:09:34.430
and then sort of
amortize the cost

00:09:34.430 --> 00:09:36.119
of calculating that
table over breaking

00:09:36.119 --> 00:09:37.535
many different
password databases.

00:09:41.455 --> 00:09:44.510
So the next thing that we can
think about to improve things

00:09:44.510 --> 00:09:45.255
is salting.

00:09:45.255 --> 00:09:46.630
I swear that guy
was not a plant,

00:09:46.630 --> 00:09:49.180
I will give you your
$20 after class.

00:09:49.180 --> 00:09:50.990
So how does salting work?

00:09:50.990 --> 00:09:52.448
So the basic thing
is you just want to

00:09:52.448 --> 00:09:54.950
input some additional
randomness into the way

00:09:54.950 --> 00:09:56.750
that the password hash is generated.

00:09:56.750 --> 00:10:02.450
So basically, you want to
take this hash function

00:10:02.450 --> 00:10:05.172
and you want to put some
salt in there-- which

00:10:05.172 --> 00:10:08.657
I'll explain in a second--
and then the password.

00:10:08.657 --> 00:10:10.865
And this is the thing that
you store on the server side

00:10:10.865 --> 00:10:11.656
in the [INAUDIBLE].

00:10:11.656 --> 00:10:12.660
So what is this salt?

00:10:12.660 --> 00:10:16.880
You can just think of it as
a string, a long string that's

00:10:16.880 --> 00:10:20.370
provided as sort of a first
part to this hash function.

00:10:20.370 --> 00:10:23.640
So why is it better
to use this scheme?

00:10:23.640 --> 00:10:25.440
And note that the
salt is actually

00:10:25.440 --> 00:10:28.500
stored in clear
text on the server side.

00:10:28.500 --> 00:10:30.879
So you might be thinking OK,
well if that salt is stored

00:10:30.879 --> 00:10:32.640
in clear text
on the server side,

00:10:32.640 --> 00:10:36.030
it seems like an attacker can both
steal the table that maps

00:10:36.030 --> 00:10:38.330
user names to passwords
and also

00:10:38.330 --> 00:10:41.109
steal the salt. So
why is that useful?

00:10:41.109 --> 00:10:43.650
AUDIENCE: Because if you picked
the top most common password,

00:10:43.650 --> 00:10:46.107
you can't just use it
once and find a new user.

00:10:46.107 --> 00:10:47.440
PROFESSOR: That's exactly right.

00:10:47.440 --> 00:10:49.180
So basically what
this does is this

00:10:49.180 --> 00:10:52.580
prevents the attacker from
building a single rainbow table

00:10:52.580 --> 00:10:56.050
and then using that rainbow
table against all instances

00:10:56.050 --> 00:10:57.930
of that hash function.

00:10:57.930 --> 00:10:59.970
And so you can
basically think of this

00:10:59.970 --> 00:11:02.776
as sort of uniquifying
passwords even if they

00:11:02.776 --> 00:11:04.810
are the same, basically.
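
NOTE
Illustrative sketch, not part of the spoken lecture. It shows salting as described: a random salt is stored in the clear next to the hash of salt plus password, so two users with the same password get different stored values. The helper names, the 16-byte salt length, and SHA-256 are assumptions. A real system would likely combine the salt with a slow key derivation function.
```python
import hashlib
import hmac
import os
#
def make_record(password: str) -> tuple[bytes, bytes]:
    salt = os.urandom(16)  # fresh random salt, kept in the clear beside the hash
    digest = hashlib.sha256(salt + password.encode("utf-8")).digest()
    return salt, digest
#
def check(record: tuple[bytes, bytes], password: str) -> bool:
    salt, digest = record
    candidate = hashlib.sha256(salt + password.encode("utf-8")).digest()
    return hmac.compare_digest(digest, candidate)
#
alice = make_record("123456")
bob = make_record("123456")
```
Both records verify against "123456", yet their stored digests differ, so one precomputed table no longer covers both users.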

00:11:04.810 --> 00:11:07.166
So this is what a lot of
systems do in practice, they

00:11:07.166 --> 00:11:09.370
use this notion of salt here.

00:11:09.370 --> 00:11:10.840
And so the best
practices for this

00:11:10.840 --> 00:11:12.360
are that you want to
choose a salt that's

00:11:12.360 --> 00:11:14.776
long, because you can
essentially think of the salt

00:11:14.776 --> 00:11:18.240
as adding more bits to
this pseudo-password, right?

00:11:18.240 --> 00:11:19.490
So more bits is always better.

00:11:19.490 --> 00:11:21.031
And the other thing
you want to do

00:11:21.031 --> 00:11:23.390
is that whenever the user
changes his or her password,

00:11:23.390 --> 00:11:25.480
you typically want to
change that salt too.

00:11:25.480 --> 00:11:29.165
So one reason for that is
let's say that users are lazy

00:11:29.165 --> 00:11:31.750
and they want to pick the
same password multiple times.

00:11:31.750 --> 00:11:34.678
Changing the salt will
ensure that the thing that's

00:11:34.678 --> 00:11:37.303
stored in the password database
will actually be different even

00:11:37.303 --> 00:11:38.440
if that password's the same.

00:11:38.440 --> 00:11:40.106
I think there was a
question somewhere.

00:11:40.106 --> 00:11:41.550
AUDIENCE: Why's it called salt?

00:11:41.550 --> 00:11:43.750
PROFESSOR: I'm actually
not sure why it's called

00:11:43.750 --> 00:11:45.060
salt, that's a good question.

00:11:45.060 --> 00:11:46.680
I'm sure there's some
answer to this though.

00:11:46.680 --> 00:11:47.450
It's like why are
cookies called cookies?

00:11:47.450 --> 00:11:49.836
The internet will know
but I actually don't know.

00:11:49.836 --> 00:11:52.800
AUDIENCE: Add some
[INAUDIBLE] to the hash number

00:11:52.800 --> 00:11:55.270
hash [INAUDIBLE].

00:11:55.270 --> 00:11:56.382
PROFESSOR: There we go.

00:11:56.382 --> 00:11:58.090
I'm glad that we're
getting this on film,

00:11:58.090 --> 00:11:59.255
'cause I feel this is
how we're going

00:11:59.255 --> 00:12:00.338
to get our Turing Awards.

00:12:00.338 --> 00:12:01.530
That's right.

00:12:01.530 --> 00:12:03.790
I'm sure there's some
answer on the internet,

00:12:03.790 --> 00:12:05.370
so I'll look that up later.

00:12:05.370 --> 00:12:08.280
But does that all
basically make sense?

00:12:08.280 --> 00:12:12.720
OK so these approaches are
fairly straightforward.

00:12:12.720 --> 00:12:16.980
So what I've assumed so far
is that somehow the client

00:12:16.980 --> 00:12:20.466
is transmitting the
password to the server.

00:12:20.466 --> 00:12:23.090
But I haven't actually specified
how that transmission's actually

00:12:23.090 --> 00:12:23.923
going to take place.

00:12:27.270 --> 00:12:35.880
So how do we transmit
these passwords?

00:12:35.880 --> 00:12:39.500
So the first idea you
might have would be,

00:12:39.500 --> 00:12:43.960
well, we'll just
send the password

00:12:43.960 --> 00:12:46.730
in the clear over the network.

00:12:46.730 --> 00:12:49.344
This is clearly
cartoonishly bad,

00:12:49.344 --> 00:12:51.510
because then there could
be a network attacker who's

00:12:51.510 --> 00:12:54.007
basically snooping
and seeing the traffic

00:12:54.007 --> 00:12:54.840
that you're sending.

00:12:54.840 --> 00:12:56.798
And they can
just take that password

00:12:56.798 --> 00:12:59.249
right off the wire and
then impersonate you.

00:12:59.249 --> 00:13:00.790
So we always start
with the straw man

00:13:00.790 --> 00:13:02.970
before I show you the other
straw men, which of course are

00:13:02.970 --> 00:13:03.840
also fatally flawed.

00:13:03.840 --> 00:13:05.815
So first thing you
think about is sending

00:13:05.815 --> 00:13:07.285
a password in the clear.

00:13:07.285 --> 00:13:08.785
Another thing you
might think, which

00:13:08.785 --> 00:13:10.860
would be a little
bit better perhaps,

00:13:10.860 --> 00:13:18.200
is perhaps we send the password
over an encrypted connection.

00:13:23.345 --> 00:13:27.464
And so we use some type
of cryptography here.

00:13:27.464 --> 00:13:29.630
Maybe there's some secret
key or something like that

00:13:29.630 --> 00:13:31.540
and that's what we
use to transform

00:13:31.540 --> 00:13:34.240
the password before we send
it over the connection.

00:13:34.240 --> 00:13:35.942
So at a high level,
encryption always

00:13:35.942 --> 00:13:37.400
seems to make things
better, right?

00:13:37.400 --> 00:13:38.200
Trademark.

00:13:38.200 --> 00:13:41.179
But the problem is that
unless you think carefully

00:13:41.179 --> 00:13:43.595
about how you're using things
like encryption and hashing,

00:13:43.595 --> 00:13:45.473
you may not be getting
the security benefits

00:13:45.473 --> 00:13:46.530
that you think you're getting.

00:13:46.530 --> 00:13:48.120
Because, for example,
what if there's

00:13:48.120 --> 00:13:50.450
someone who's sitting
between you-- the client--

00:13:50.450 --> 00:13:53.426
and the server, this proverbial
man in the middle attacker,

00:13:53.426 --> 00:13:55.050
who's actually snooping
on your traffic

00:13:55.050 --> 00:13:57.580
and pretending to be the server.

00:13:57.580 --> 00:14:00.370
If you send encrypted
data but haven't actually

00:14:00.370 --> 00:14:02.600
authenticated the
other end, then

00:14:02.600 --> 00:14:06.150
you could still be opening
up yourself to problems.

00:14:06.150 --> 00:14:07.960
Because if the client
just, let's say,

00:14:07.960 --> 00:14:10.410
picked some random key,
sends it to some entity

00:14:10.410 --> 00:14:12.970
on the other side who may
or may not be the server.

00:14:12.970 --> 00:14:15.906
It is not the
server, [INAUDIBLE].

00:14:15.906 --> 00:14:19.490
You are sending something to
some person, who will then be

00:14:19.490 --> 00:14:21.390
able to get all your secrets.

00:14:21.390 --> 00:14:23.740
And so similarly,
people might think well

00:14:23.740 --> 00:14:25.810
what if I don't send
the raw password

00:14:25.810 --> 00:14:27.615
but I send a hash
of the password?

00:14:27.615 --> 00:14:29.240
That actually doesn't
give you anything

00:14:29.240 --> 00:14:30.260
in and of itself either.

00:14:30.260 --> 00:14:32.720
Because whether you send
the password or the hash

00:14:32.720 --> 00:14:34.780
of a password-- I mean,
a hash of the password

00:14:34.780 --> 00:14:37.800
has the same sort of semantic
power as the original password

00:14:37.800 --> 00:14:38.794
itself.

00:14:38.794 --> 00:14:40.585
If you haven't
authenticated the other side

00:14:40.585 --> 00:14:43.110
if you haven't authenticated
the server or things like this.

00:14:43.110 --> 00:14:44.740
So the basic point
with this discussion

00:14:44.740 --> 00:14:49.440
here is just to stress the fact
that just adding encryption

00:14:49.440 --> 00:14:51.730
or just adding hashing
doesn't necessarily

00:14:51.730 --> 00:14:53.690
give you any additional powers.

00:14:53.690 --> 00:14:56.160
If the client can't authenticate
who he or she is sending

00:14:56.160 --> 00:14:59.620
the password to then the client
could be mistakenly divulging

00:14:59.620 --> 00:15:03.430
that password to someone they
don't intend to divulge it to.

00:15:03.430 --> 00:15:07.620
So perhaps a better
idea than these two

00:15:07.620 --> 00:15:12.155
is to use what they call a
challenge response protocol.

00:15:17.200 --> 00:15:20.070
And here's an example of a
very simple challenge response

00:15:20.070 --> 00:15:21.090
protocol.

00:15:21.090 --> 00:15:26.140
So let's say we've
got the client here,

00:15:26.140 --> 00:15:30.700
and then you've got
the server over here.

00:15:30.700 --> 00:15:36.340
So the client says,
hi, I'm Alice.

00:15:39.450 --> 00:15:45.470
And then the server responds
with some challenge C,

00:15:45.470 --> 00:15:48.900
some quantity that the
server got to pick.

00:15:48.900 --> 00:15:54.670
And then the client
is going to respond

00:15:54.670 --> 00:15:58.950
with the hash of that
server sent challenge,

00:15:58.950 --> 00:16:02.898
and then you can concatenate
that with the password.

00:16:06.350 --> 00:16:09.490
So at this point, the server
can take this quantity.

00:16:09.490 --> 00:16:11.830
The server knows the
challenge that it sent.

00:16:11.830 --> 00:16:13.950
And presumably the server
knows the password,

00:16:13.950 --> 00:16:16.530
so the server can
[INAUDIBLE] this quantity

00:16:16.530 --> 00:16:19.780
and see if it actually
matches what the user sent.
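
NOTE
Illustrative sketch, not part of the spoken lecture. It shows the simple challenge-response exchange just described: the server picks a challenge C, the client answers with the hash of C concatenated with the password, and the server recomputes and compares. `PASSWORD`, `respond`, `verify`, the 16-byte challenge, and SHA-256 are assumptions. As in the lecture, the server is assumed to know the password.
```python
import hashlib
import hmac
import os
#
PASSWORD = "correct horse"  # hypothetical secret shared by Alice and the server
#
def respond(challenge: bytes, password: str) -> bytes:
    # Client side: hash of the challenge concatenated with the password.
    return hashlib.sha256(challenge + password.encode("utf-8")).digest()
#
def verify(challenge: bytes, response: bytes) -> bool:
    # Server side: recompute the same quantity from its own copy of the password.
    return hmac.compare_digest(response, respond(challenge, PASSWORD))
#
challenge = os.urandom(16)  # the server picks a fresh C for each login attempt
```
A response computed for one challenge does not verify under a different challenge, so a recorded response cannot simply be replayed.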

00:16:19.780 --> 00:16:21.720
So what's nice
about this protocol

00:16:21.720 --> 00:16:24.950
is that if we ignore man in the
middle attacks for a second,

00:16:24.950 --> 00:16:28.985
the server is now confident
that the user's actually Alice,

00:16:28.985 --> 00:16:31.331
because only Alice would
know this password here.

00:16:31.331 --> 00:16:33.830
And what's nice about this is
that if the server is actually

00:16:33.830 --> 00:16:36.120
the attacker-- so
in other words,

00:16:36.120 --> 00:16:39.442
if Alice sent this thing
to someone who's not

00:16:39.442 --> 00:16:41.400
the person who she's
trying to authenticate to,

00:16:41.400 --> 00:16:43.957
then the attacker still
doesn't know the password.

00:16:43.957 --> 00:16:45.990
Because the attacker
got to choose C,

00:16:45.990 --> 00:16:48.126
but the attacker doesn't
know what this is.

00:16:48.126 --> 00:16:49.500
And so basically
for the attacker

00:16:49.500 --> 00:16:50.969
to figure out what
the password is,

00:16:50.969 --> 00:16:52.760
the attacker has to be
able to, once again,

00:16:52.760 --> 00:16:54.324
invert these hash functions.

00:16:54.324 --> 00:16:55.282
Do you have a question?

00:16:55.282 --> 00:16:57.282
AUDIENCE: I'm just curious,
how can you not make

00:16:57.282 --> 00:17:01.178
a client do the hashing?

00:17:01.178 --> 00:17:01.678
[INAUDIBLE]

00:17:10.329 --> 00:17:13.300
PROFESSOR: So let's see,
so your proposed scheme

00:17:13.300 --> 00:17:20.370
is that the client side is
going to call this thing?

00:17:20.370 --> 00:17:22.495
AUDIENCE: Yeah, so instead
of sending the password,

00:17:22.495 --> 00:17:26.478
and having the server hash
the password and check it,

00:17:26.478 --> 00:17:28.482
the client would just
send the hashed password.

00:17:28.482 --> 00:17:30.815
PROFESSOR: The client would
just send the hashed password.

00:17:36.430 --> 00:17:37.980
So there's a couple reasons.

00:17:37.980 --> 00:17:40.642
So one reason, as
we'll discuss later,

00:17:40.642 --> 00:17:42.350
is that there's going
to be things called

00:17:42.350 --> 00:17:43.772
anti-hammering defenses right.

00:17:43.772 --> 00:17:45.230
Anti-hammering
defenses are designed

00:17:45.230 --> 00:17:48.544
to prevent a bad client
from continually asking,

00:17:48.544 --> 00:17:50.335
is this the password,
is this the password,

00:17:50.335 --> 00:17:51.330
is this the password?

00:17:51.330 --> 00:17:53.121
So then as a result,
it's easier for things

00:17:53.121 --> 00:17:55.150
to be on the server side
than on the client side.

00:17:55.150 --> 00:17:57.340
But suffice it to
say, you can, in fact,

00:17:57.340 --> 00:17:59.882
do the hash on the client side.

00:17:59.882 --> 00:18:01.590
Using JavaScript or
something like this.

00:18:01.590 --> 00:18:03.185
But the basic idea
is that somehow you

00:18:03.185 --> 00:18:06.770
have to have the computational
expense be very, very large,

00:18:06.770 --> 00:18:10.620
because that's going to prevent
the attacker from just guessing

00:18:10.620 --> 00:18:13.617
what the password is quickly.

00:18:13.617 --> 00:18:14.700
Is there another question?

00:18:14.700 --> 00:18:16.878
AUDIENCE: Well I just
wanted to point out

00:18:16.878 --> 00:18:18.822
that if the client
does the hashing,

00:18:18.822 --> 00:18:23.196
then it's [INAUDIBLE] because
your password is the hash.

00:18:23.196 --> 00:18:25.140
PROFESSOR: So that's true.

00:18:25.140 --> 00:18:26.920
AUDIENCE: So if
somebody gets the table

00:18:26.920 --> 00:18:28.900
from the server
[INAUDIBLE] using

00:18:28.900 --> 00:18:31.251
it to hash they can log in.

00:18:31.251 --> 00:18:32.250
PROFESSOR: That's right.

00:18:32.250 --> 00:18:34.041
Yeah, it gets a little
bit subtle sometimes

00:18:34.041 --> 00:18:37.160
depending on who can
pick, for example,

00:18:37.160 --> 00:18:38.487
these challenge values.

00:18:38.487 --> 00:18:40.820
Because if both the client and the server
can pick challenge values,

00:18:40.820 --> 00:18:43.130
that makes it more or
less difficult for the client

00:18:43.130 --> 00:18:44.280
to launch those
types of attacks.

00:18:44.280 --> 00:18:46.405
So for example, like one
problem with this protocol

00:18:46.405 --> 00:18:49.700
here is that
basically the client

00:18:49.700 --> 00:18:54.000
doesn't get to inject
any randomness into this.

00:18:54.000 --> 00:18:55.500
So you can imagine
that you can make

00:18:55.500 --> 00:18:59.440
this protocol more difficult
for the server to invert

00:18:59.440 --> 00:19:01.976
if the client actually got
to choose some challenge that

00:19:01.976 --> 00:19:04.476
was put in here, so you've got the
server-side challenge versus

00:19:04.476 --> 00:19:05.720
the client-side challenge.

00:19:05.720 --> 00:19:06.886
But you're right about that.

00:19:09.110 --> 00:19:11.670
Any other questions?

00:19:11.670 --> 00:19:13.790
OK.

00:19:13.790 --> 00:19:17.240
So yeah, so this segues into the
discussion we were just having.

00:19:19.890 --> 00:19:22.960
So even though to
break this, the server

00:19:22.960 --> 00:19:25.860
would have to invert
this hash, the attacker

00:19:25.860 --> 00:19:29.132
could still try to do one of
these brute force attacks.

00:19:29.132 --> 00:19:30.840
So one way that we
can prevent the server

00:19:30.840 --> 00:19:32.160
from doing these
brute force attacks

00:19:32.160 --> 00:19:33.876
is to choose one of these
expensive hash functions

00:19:33.876 --> 00:19:35.060
like we were discussing before.

00:19:35.060 --> 00:19:36.559
Another thing, as
we just discussed,

00:19:36.559 --> 00:19:39.640
is that you could actually
allow the client to,

00:19:39.640 --> 00:19:44.070
for example, choose its
own client chosen challenge

00:19:44.070 --> 00:19:44.850
over here.

00:19:44.850 --> 00:19:46.225
And so that
essentially would act

00:19:46.225 --> 00:19:48.960
as like a client chosen salt.
So that would essentially

00:19:48.960 --> 00:19:50.950
make it more difficult
for the hacker

00:19:50.950 --> 00:19:52.760
to do things like build
up a rainbow table.

00:19:52.760 --> 00:19:56.590
Because note that if the
server is the attacker here,

00:19:56.590 --> 00:19:59.830
the server always can pick the
same challenge value again,

00:19:59.830 --> 00:20:02.190
again, and again, allowing it
to build the rainbow table.

00:20:02.190 --> 00:20:04.300
But if when the
client responded back,

00:20:04.300 --> 00:20:06.870
the client also
included some salt,

00:20:06.870 --> 00:20:09.086
some client chosen
challenge that it included,

00:20:09.086 --> 00:20:10.460
then that'll
prevent the attacker

00:20:10.460 --> 00:20:12.900
from building one of
the rainbow tables.
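
NOTE
Editor's sketch of the client-chosen challenge idea described above. This is a minimal
illustration, not the exact protocol from the lecture board. The names respond and verify
are hypothetical, and SHA-256 is only a stand-in for the expensive hash the lecture recommends.
```python
import hashlib
import hmac
import os
#
def respond(password, server_challenge):
    # Client side: pick a fresh random nonce so the server cannot force the same
    # hash input every time (which is what makes a rainbow table reusable).
    client_nonce = os.urandom(16)
    digest = hashlib.sha256(server_challenge + client_nonce + password.encode()).digest()
    return client_nonce, digest
#
def verify(password, server_challenge, client_nonce, digest):
    # Server side: recompute the hash over both challenges and compare in constant time.
    expected = hashlib.sha256(server_challenge + client_nonce + password.encode()).digest()
    return hmac.compare_digest(expected, digest)
```
Because the nonce changes on every login, a table precomputed for one fixed server
challenge does not help a malicious server with later responses.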

00:20:12.900 --> 00:20:15.361
So does that all make sense?

00:20:15.361 --> 00:20:15.860
OK.

00:20:19.580 --> 00:20:23.300
So yeah, one thing
that I mentioned

00:20:23.300 --> 00:20:26.920
that might be useful
to do is implementing

00:20:26.920 --> 00:20:28.222
these anti-hammering defenses.

00:20:33.770 --> 00:20:40.560
And so anti-hammering defenses
are basically designed to rate

00:20:40.560 --> 00:20:50.800
limit the number
of password guesses

00:20:50.800 --> 00:20:53.630
that a bad client can issue.

00:20:59.900 --> 00:21:03.210
Because the idea here is that
if you've got some client who's

00:21:03.210 --> 00:21:05.320
trying to launch one
of these brute force

00:21:05.320 --> 00:21:06.754
guesses against
the password, you

00:21:06.754 --> 00:21:08.670
don't want that client
to be able to sit there

00:21:08.670 --> 00:21:10.795
in a tight loop and just
say, is this the password,

00:21:10.795 --> 00:21:12.910
is this the password,
is this the password?

00:21:12.910 --> 00:21:14.830
So one way we can
do anti-hammering

00:21:14.830 --> 00:21:16.556
is just to do that rate limiting.

00:21:16.556 --> 00:21:18.170
So the server will
say, I will only

00:21:18.170 --> 00:21:21.150
accept let's say three
password guesses per second

00:21:21.150 --> 00:21:22.650
from any particular client.

00:21:22.650 --> 00:21:28.710
You could also imagine
implementing timeouts here.

00:21:28.710 --> 00:21:31.550
So maybe the client can issue
a bunch of password requests

00:21:31.550 --> 00:21:33.970
in a row, but then after, let's
say, 10 of them are wrong,

00:21:33.970 --> 00:21:35.594
the server says, OK
you got to hold on,

00:21:35.594 --> 00:21:39.340
I will not accept any more
requests from you for,

00:21:39.340 --> 00:21:42.770
let's say, 10 seconds,
something like that.

00:21:42.770 --> 00:21:44.610
And so both of these
things are designed

00:21:44.610 --> 00:21:46.220
for preventing
brute force attacks.

00:21:46.220 --> 00:21:48.912
And so, for example,
like some smart cards have

00:21:48.912 --> 00:21:50.860
these types of
defenses, some TPMs

00:21:50.860 --> 00:21:53.150
have these kinds of
defenses to basically stop

00:21:53.150 --> 00:21:56.000
against this brute force attack.

00:21:56.000 --> 00:21:58.250
So why is it important
for you to use

00:21:58.250 --> 00:21:59.880
these anti-hammering defenses?

00:21:59.880 --> 00:22:01.370
Well one reason
why it's important

00:22:01.370 --> 00:22:03.570
is as we discussed
these passwords have

00:22:03.570 --> 00:22:05.640
so little entropy.

00:22:05.640 --> 00:22:08.110
So because passwords typically
have so little entropy,

00:22:08.110 --> 00:22:10.337
it's really important
to prevent the attacker

00:22:10.337 --> 00:22:12.670
from just trying to cycle
through that low entropy space

00:22:12.670 --> 00:22:13.940
very, very quickly.

00:22:13.940 --> 00:22:15.940
So as you may be aware,
a lot of websites

00:22:15.940 --> 00:22:21.042
have these format constraints
that they push upon you

00:22:21.042 --> 00:22:22.630
for your passwords.

00:22:22.630 --> 00:22:24.437
They'll say things
like your password must

00:22:24.437 --> 00:22:31.036
have a punctuation, it must
have a mixture of numbers

00:22:31.036 --> 00:22:33.410
and letters, you must have
uppercase and lowercase stuff,

00:22:33.410 --> 00:22:34.546
so on and so forth.

00:22:34.546 --> 00:22:36.920
And so what those constraints
are trying to get you to do

00:22:36.920 --> 00:22:38.760
is they're trying
to get you to expand

00:22:38.760 --> 00:22:40.660
the entropy of the password.

00:22:40.660 --> 00:22:43.490
But what's problematic though
is that it's not really

00:22:43.490 --> 00:22:46.210
these format constraints
that we should be caring about.

00:22:46.210 --> 00:22:48.980
It's the actual entropy
of the password itself.

00:22:48.980 --> 00:22:51.680
So it turns out even if people
were given these constraints--

00:22:51.680 --> 00:22:52.960
like you have to use
punctuation characters,

00:22:52.960 --> 00:22:55.275
and stuff like that-- the
entropy of the resulting password

00:22:55.275 --> 00:22:56.844
is often quite low.

00:22:56.844 --> 00:22:58.885
So for example, people
will often put punctuation

00:22:58.885 --> 00:22:59.885
at the beginning or end.

00:22:59.885 --> 00:23:02.218
Because they don't want to
be troubled to remember like,

00:23:02.218 --> 00:23:04.900
do I have like a dollar sign
in the middle or something?

00:23:04.900 --> 00:23:08.720
And so as it turns out, these
format requirements oftentimes

00:23:08.720 --> 00:23:11.850
don't make dictionary
attacks much harder

00:23:11.850 --> 00:23:14.070
for a sophisticated adversary.

00:23:14.070 --> 00:23:18.240
And the reason is because,
basically, the dictionary

00:23:18.240 --> 00:23:20.540
attacker can leverage
these observations

00:23:20.540 --> 00:23:22.720
about how people
pick passwords even

00:23:22.720 --> 00:23:24.360
in the presence of constraints.

00:23:24.360 --> 00:23:26.910
So for example, if the attacker
knows that people typically

00:23:26.910 --> 00:23:28.630
put punctuation at the
beginning or the end,

00:23:28.630 --> 00:23:30.720
just incorporate that into
your dictionary attack.
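
NOTE
Editor's sketch of how an attacker might fold that observation into a dictionary attack:
for each dictionary word, also try it with punctuation at the start or the end, and with
a leading capital. The function name candidates and the punctuation set are assumptions
chosen for illustration.
```python
def candidates(words, punctuation="!$#"):
    # Yield each word plus the variants people tend to pick under format rules.
    for word in words:
        for variant in (word, word.capitalize()):
            yield variant
            for mark in punctuation:
                yield mark + variant
                yield variant + mark
```
With three punctuation marks this is only 14 guesses per dictionary word, so the format
rules add very little work for the attacker.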

00:23:30.720 --> 00:23:32.595
And so an actually really
interesting website

00:23:32.595 --> 00:23:35.995
you can go to is
called Telepathwords.

00:23:40.130 --> 00:23:41.770
And so what's neat
about this site

00:23:41.770 --> 00:23:44.390
is that it has a
little text box.

00:23:44.390 --> 00:23:46.745
So you can type a character
into that text box--

00:23:46.745 --> 00:23:48.870
you're pretending that
you're entering a password--

00:23:48.870 --> 00:23:51.070
and Telepathwords
will try to guess

00:23:51.070 --> 00:23:52.960
what your next character is.

00:23:52.960 --> 00:23:54.595
So as you type
additional characters,

00:23:54.595 --> 00:23:56.800
it'll have a little drop
down box which says,

00:23:56.800 --> 00:23:59.091
were you going to put this,
were you going to put this?

00:23:59.091 --> 00:24:02.380
It will give you a
little blurb that says,

00:24:02.380 --> 00:24:04.035
here's what I think
that you were going

00:24:04.035 --> 00:24:05.650
to enter next in this password.

00:24:05.650 --> 00:24:07.290
So how does Telepathwords work?

00:24:07.290 --> 00:24:09.350
So it basically has
a bunch of databases.

00:24:09.350 --> 00:24:11.705
It has a database
of common passwords.

00:24:15.030 --> 00:24:21.930
It also has a list
of popular phrases

00:24:21.930 --> 00:24:25.504
that it's taken from websites.

00:24:25.504 --> 00:24:28.040
And it also has this
set of heuristics

00:24:28.040 --> 00:24:36.570
which describe common user
biases in picking passwords.

00:24:36.570 --> 00:24:38.210
So for example,
one funny bias is

00:24:38.210 --> 00:24:39.796
that people will
often-- when they

00:24:39.796 --> 00:24:41.170
are forced with
these constraints

00:24:41.170 --> 00:24:43.503
to say you must use punctuation,
stuff like that-- a lot

00:24:43.503 --> 00:24:47.460
of times when they're picking
characters for the password,

00:24:47.460 --> 00:24:50.994
they will use keys that
are adjacent to each other.

00:24:50.994 --> 00:24:52.660
So in other words,
there'll be a very small

00:24:52.660 --> 00:24:54.690
edit distance in physical
space with respect

00:24:54.690 --> 00:24:56.920
to edit distance in
the actual password.

00:24:56.920 --> 00:24:59.510
So what Telepathwords does
is it has the database here,

00:24:59.510 --> 00:25:01.720
so when you type in things
it's running these models.

00:25:01.720 --> 00:25:02.670
And it's saying,
statistically speaking,

00:25:02.670 --> 00:25:05.424
here's the most likely thing
that you're going to type next.

00:25:05.424 --> 00:25:07.652
So it's almost like auto
complete for passwords.
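
NOTE
Editor's sketch of the "auto complete for passwords" idea, using only a list of common
passwords. The real site also uses popular phrases and keyboard heuristics, and its
actual implementation is not described here. The function name predict_next is hypothetical.
```python
from collections import Counter
#
def predict_next(prefix, common_passwords, top=3):
    # Count which character most often follows this prefix in the known passwords.
    counts = Counter(
        p[len(prefix)]
        for p in common_passwords
        if p.startswith(prefix) and len(p) > len(prefix)
    )
    return [ch for ch, _ in counts.most_common(top)]
```
If the characters you type keep showing up in the predictions, the password is following
a distribution an attacker can also learn.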

00:25:07.652 --> 00:25:09.235
And so what's funny
is that this shows

00:25:09.235 --> 00:25:11.151
once again that if you
have these constraints,

00:25:11.151 --> 00:25:14.150
they actually don't protect
you that much if there are some

00:25:14.150 --> 00:25:17.500
of these underlying a priori
distributions of things

00:25:17.500 --> 00:25:19.870
that the attacker
can leverage.

00:25:19.870 --> 00:25:21.766
I think there was a question?

00:25:21.766 --> 00:25:25.970
AUDIENCE: Yeah so it seems
like if an attacker is

00:25:25.970 --> 00:25:28.162
too sophisticated
that they could

00:25:28.162 --> 00:25:31.571
try guessing like a bunch
of IP addresses and things

00:25:31.571 --> 00:25:34.980
which only would prevent
hammering [INAUDIBLE].

00:25:42.684 --> 00:25:44.100
PROFESSOR: Yeah,
it's very tricky.

00:25:44.100 --> 00:25:45.100
Now that's a good point.

00:25:45.100 --> 00:25:47.659
So anti-hammering
basically asks, well,

00:25:47.659 --> 00:25:50.500
what's the scope of the attack
that you're trying to prevent?

00:25:50.500 --> 00:25:54.055
So if you're concerned
about distributed attackers

00:25:54.055 --> 00:25:57.250
in a networked system, it does
become very, very subtle.

00:25:57.250 --> 00:26:00.202
And suffice it to say that
the notion of anti-hammering

00:26:00.202 --> 00:26:02.410
or [INAUDIBLE] systems, and
also the notion of things

00:26:02.410 --> 00:26:05.080
like click fraud, for example.

00:26:05.080 --> 00:26:06.700
So in other words,
how does someone

00:26:06.700 --> 00:26:08.590
who's running an
advertising campaign online

00:26:08.590 --> 00:26:10.665
determine if someone's
actually clicking the link

00:26:10.665 --> 00:26:13.070
and actually paying someone
for those clicks, versus

00:26:13.070 --> 00:26:15.560
this is just a spammer who's
got some box just sitting

00:26:15.560 --> 00:26:17.200
there clicking on stuff.

00:26:17.200 --> 00:26:19.241
So suffice it to say
there's a lot of distributed

00:26:19.241 --> 00:26:21.690
heuristics that try to
solve those problems.

00:26:21.690 --> 00:26:23.980
And in many cases, it's
not a science, it's an art.

00:26:23.980 --> 00:26:26.480
But you're [INAUDIBLE] correct,
and in the distributed setting,

00:26:26.480 --> 00:26:30.980
things get much more
difficult. All right,

00:26:30.980 --> 00:26:32.930
so does this all make sense?

00:26:32.930 --> 00:26:35.330
AUDIENCE: What about the
cryptographic anti-hammering

00:26:35.330 --> 00:26:36.770
defenses?

00:26:36.770 --> 00:26:40.800
Most of the time you end up
sending a hash on the line

00:26:40.800 --> 00:26:44.855
[INAUDIBLE] that when
you get out of it

00:26:44.855 --> 00:26:46.595
is exactly what
you would get out

00:26:46.595 --> 00:26:48.178
the password of the
hashable password?

00:26:50.571 --> 00:26:52.490
I know there are
protocols like SRP

00:26:52.490 --> 00:26:56.160
or there are some zero
knowledge protocols.

00:26:56.160 --> 00:26:57.062
PROFESSOR: Yeah, so--

00:26:57.062 --> 00:26:58.520
AUDIENCE: That you
use in practice?

00:26:58.520 --> 00:26:59.311
PROFESSOR: They do.

00:27:01.820 --> 00:27:03.980
Those protocols
provide some stronger

00:27:03.980 --> 00:27:05.160
cryptographic guarantees.

00:27:05.160 --> 00:27:06.500
A lot of times they
are not backwards

00:27:06.500 --> 00:27:08.900
compatible with current systems,
which is why in practice you

00:27:08.900 --> 00:27:09.470
don't see them used a lot.

00:27:09.470 --> 00:27:10.928
But yeah, there
are some protocols,

00:27:10.928 --> 00:27:14.900
for example, that
allow the server to not

00:27:14.900 --> 00:27:17.840
have any notion of
the password at all.

00:27:17.840 --> 00:27:20.220
So there's some zero knowledge
type thing or whatever.

00:27:20.220 --> 00:27:21.719
So those things do
work in practice.

00:27:21.719 --> 00:27:24.505
But one of the things that this
paper says is very interesting

00:27:24.505 --> 00:27:26.880
is that they basically go
through all these authentication

00:27:26.880 --> 00:27:29.190
schemes and they say,
OK, here's passwords.

00:27:29.190 --> 00:27:30.190
Yeah, they kind of suck.

00:27:30.190 --> 00:27:31.360
Here's some other
things that are actually

00:27:31.360 --> 00:27:32.770
much stronger on
security axis,

00:27:32.770 --> 00:27:35.500
but then they all fail on
deployability or usability

00:27:35.500 --> 00:27:36.560
and things like that.

00:27:36.560 --> 00:27:39.970
And so that's one of the
interesting and slightly sad

00:27:39.970 --> 00:27:41.890
outcomes of this
paper that maybe

00:27:41.890 --> 00:27:44.185
even though we have all
these much stronger security

00:27:44.185 --> 00:27:46.680
for the protocols,
we can't deploy them

00:27:46.680 --> 00:27:50.164
for some usability reasons
or some [INAUDIBLE] reason.

00:27:54.440 --> 00:27:56.277
So that's just a fun
site to go to right.

00:27:56.277 --> 00:27:58.360
So they claim that they
don't store your passwords

00:27:58.360 --> 00:28:00.660
so you take them at their
word if you want to.

00:28:00.660 --> 00:28:03.520
But it is very interesting to
just sit down and think like,

00:28:03.520 --> 00:28:04.870
what password would I generate?

00:28:04.870 --> 00:28:07.340
And then type into this,
and see how accurate

00:28:07.340 --> 00:28:09.685
it is in guessing what
the next thing will be.

00:28:09.685 --> 00:28:12.090
It even covers things
like the popular heuristic

00:28:12.090 --> 00:28:15.760
like take a popular phrase
that has multiple words,

00:28:15.760 --> 00:28:18.180
and then only take the
first letter of each word.

00:28:18.180 --> 00:28:19.650
So this thing is
very, very good.

00:28:19.650 --> 00:28:21.100
Very, very scary too.

00:28:21.100 --> 00:28:23.402
OK so that's Telepathwords.

00:28:23.402 --> 00:28:25.110
And so one thing that
is also interesting

00:28:25.110 --> 00:28:30.070
to think about is,
in your password scheme,

00:28:30.070 --> 00:28:33.760
is it vulnerable to
offline guessing.

00:28:37.290 --> 00:28:43.740
So this was a problem
in Kerberos V4.

00:28:43.740 --> 00:28:51.550
And then also V5 without
this thing they call preauth.

00:28:51.550 --> 00:28:55.090
So the basic idea is that in
these versions of Kerberos,

00:28:55.090 --> 00:28:58.530
anyone could ask the KDC for
a ticket that would be encrypted

00:28:58.530 --> 00:29:00.610
with the user's password.

00:29:00.610 --> 00:29:04.149
So basically, the KDC did
not authenticate requests

00:29:04.149 --> 00:29:05.440
that were coming from a client.

00:29:05.440 --> 00:29:07.500
Now the thing that
the KDC would return

00:29:07.500 --> 00:29:12.180
was, in fact-- there
are some set of bits

00:29:12.180 --> 00:29:13.980
here that the KDC would return.

00:29:13.980 --> 00:29:16.275
I'm sure you don't want to
think about this ugly set

00:29:16.275 --> 00:29:17.340
of cryptographic
primitives anymore.

00:29:17.340 --> 00:29:18.839
But suffice it to
say, the KDC would

00:29:18.839 --> 00:29:21.430
return this stuff
that was encrypted

00:29:21.430 --> 00:29:24.490
with the key of the client.

00:29:24.490 --> 00:29:26.510
That's what will come
back to the client side.

00:29:26.510 --> 00:29:30.420
So the problem with this is
that because the server did not

00:29:30.420 --> 00:29:34.730
check who it was sending this
encrypted set of things to,

00:29:34.730 --> 00:29:38.520
the attacker can basically
get this thing here and then

00:29:38.520 --> 00:29:40.900
try to just guess what KC is.

00:29:40.900 --> 00:29:43.856
Just guess that KC is some
value, try to decrypt this,

00:29:43.856 --> 00:29:44.980
see if it looks reasonable.

00:29:44.980 --> 00:29:47.720
If not, try to guess
another KC, decrypt this,

00:29:47.720 --> 00:29:48.970
see if it looks reasonable.

00:29:48.970 --> 00:29:52.270
And the reason why the attacker
can launch this type of attack,

00:29:52.270 --> 00:29:54.950
is that this thing
here, this TGT actually

00:29:54.950 --> 00:29:57.370
has a known format.

00:29:57.370 --> 00:29:59.420
So it has things in
here like timestamps,

00:29:59.420 --> 00:30:02.010
and it has things in here like
various length fields that have

00:30:02.010 --> 00:30:03.870
to be internally consistent.

00:30:03.870 --> 00:30:06.970
And so that basically
helps the attacker.

00:30:06.970 --> 00:30:10.380
Because if the attacker guesses
the KC, gets this thing here,

00:30:10.380 --> 00:30:12.550
a decrypted thing, and
the internal fields

00:30:12.550 --> 00:30:14.600
don't check out,
the attacker knows

00:30:14.600 --> 00:30:16.453
that it picked the
wrong KC, so they

00:30:16.453 --> 00:30:18.480
can go on and pick another KC.
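
NOTE
Editor's sketch of the offline guessing loop just described. Everything here is a toy
stand-in: toy_cipher is an XOR keystream built from SHA-256, not Kerberos encryption,
and the "TGT|" prefix is a made-up marker standing in for the known ticket format.
All function names are hypothetical.
```python
import hashlib
#
def key_from_password(password):
    # Stand-in for deriving KC from the user's password.
    return hashlib.sha256(password.encode()).digest()
#
def toy_cipher(key, data):
    # XOR with a hash-derived keystream; the same call encrypts and decrypts.
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))
#
def offline_guess(ciphertext, dictionary):
    # No server involved: decrypt with each guessed key and check the known structure.
    for guess in dictionary:
        plaintext = toy_cipher(key_from_password(guess), ciphertext)
        if plaintext.startswith(b"TGT|"):
            return guess
    return None
```
The attacker needs only one response from the KDC; every later guess runs locally,
so no anti-hammering defense on the server can slow it down.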

00:30:18.480 --> 00:30:24.570
And so, in Kerberos V5,
basically the client

00:30:24.570 --> 00:30:30.330
has to send in this thing
that it sends over to the KDC,

00:30:30.330 --> 00:30:36.790
it basically sends a time stamp.

00:30:36.790 --> 00:30:40.900
And then this time stamp is
going to be encrypted with KC.

00:30:40.900 --> 00:30:43.230
So this is sent to the
server, and the server

00:30:43.230 --> 00:30:46.240
looks at this and validates that
before it will send something

00:30:46.240 --> 00:30:47.280
back to the client.

00:30:47.280 --> 00:30:49.930
So that gets rid of this
problem that any random client

00:30:49.930 --> 00:30:53.354
can show up and just
ask for this thing here.

00:30:56.840 --> 00:31:00.824
AUDIENCE: So is time stamp
recorded in the message?

00:31:00.824 --> 00:31:04.657
So can't the attacker just get
this message and brute force it?

00:31:04.657 --> 00:31:05.740
PROFESSOR: Let's see here.

00:31:05.740 --> 00:31:09.670
So can't the attacker
get this message here?

00:31:09.670 --> 00:31:11.902
AUDIENCE: Yeah, the
encryption [INAUDIBLE].

00:31:11.902 --> 00:31:14.360
PROFESSOR: So you're thinking
where the attacker might just

00:31:14.360 --> 00:31:15.500
spoof this, for example?

00:31:15.500 --> 00:31:19.227
AUDIENCE: No, I just brute
force it and get KC out.

00:31:19.227 --> 00:31:19.810
PROFESSOR: OK.

00:31:19.810 --> 00:31:21.185
So in other words,
you're worried

00:31:21.185 --> 00:31:22.954
someone could observe this.

00:31:22.954 --> 00:31:23.620
AUDIENCE: Right.

00:31:23.620 --> 00:31:25.090
PROFESSOR: So I
believe that this

00:31:25.090 --> 00:31:29.166
is put inside an encrypted thing
that belongs to the server,

00:31:29.166 --> 00:31:30.540
or the key belongs
to the server.

00:31:30.540 --> 00:31:32.331
I think to prevent that
attack. [INAUDIBLE]

00:31:32.331 --> 00:31:34.390
so don't quote me on that.

00:31:34.390 --> 00:31:36.250
But you're correct
it's not, for example.

00:31:36.250 --> 00:31:37.625
And if the attacker,
for example,

00:31:37.625 --> 00:31:39.890
knew something about
what the current time is,

00:31:39.890 --> 00:31:42.400
roughly, that actually
is super useful.

00:31:42.400 --> 00:31:44.190
Because then the
attacker can guess,

00:31:44.190 --> 00:31:46.815
oh, time stamp should be
roughly between here and here.

00:31:46.815 --> 00:31:48.190
And if it sees
it's in the clear,

00:31:48.190 --> 00:31:50.357
it can do the exact same
attack that we had up here.

00:31:50.357 --> 00:31:52.648
AUDIENCE: It's a little better
because the attacker has

00:31:52.648 --> 00:31:54.712
to be in the middle, but
it's still susceptible.

00:31:54.712 --> 00:31:55.670
PROFESSOR: That's true.

00:31:55.670 --> 00:31:57.150
Well, yeah, that's
right, the attacker

00:31:57.150 --> 00:31:58.770
has to be on the
network somewhere so

00:31:58.770 --> 00:32:00.370
this [INAUDIBLE] stuff.

00:32:00.370 --> 00:32:00.946
That's right.

00:32:04.070 --> 00:32:06.350
So that's offline guessing.

00:32:06.350 --> 00:32:09.130
So another thing that's
important to think about

00:32:09.130 --> 00:32:14.580
is password recovery.

00:32:18.510 --> 00:32:20.950
So this is the idea that
you lose your password,

00:32:20.950 --> 00:32:23.380
and then somehow you
have to go to the service

00:32:23.380 --> 00:32:26.636
and you have to ask
for another password.

00:32:26.636 --> 00:32:28.010
But before you
get that password,

00:32:28.010 --> 00:32:30.220
you have to prove that
you are you in some way.

00:32:30.220 --> 00:32:31.290
So how does that work?

00:32:31.290 --> 00:32:32.650
How to do password recovery?

00:32:32.650 --> 00:32:35.940
So what's interesting is
that people oftentimes

00:32:35.940 --> 00:32:39.190
focus on the entropy
of the password itself.

00:32:39.190 --> 00:32:43.430
But the problem is that
if the password recovery

00:32:43.430 --> 00:32:45.570
questions or the
password recovery scheme

00:32:45.570 --> 00:32:47.420
has little entropy,
that actually

00:32:47.420 --> 00:32:50.113
affects the entropy of the
overall authentication scheme.

00:32:50.113 --> 00:32:55.240
So in other words, the
strength of the overall scheme

00:32:55.240 --> 00:32:58.520
is basically equal
to the minimum

00:32:58.520 --> 00:33:07.440
of the password entropy and
the recovery question entropy.
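
NOTE
Editor's worked example of the minimum rule above. The numbers are hypothetical: a
password worth 40 bits, and a recovery question with 8 equally likely answers. The
function names are also assumptions made for this illustration.
```python
import math
#
def uniform_entropy_bits(choices):
    # Entropy of a secret drawn uniformly from `choices` possible values.
    return math.log2(choices)
#
def scheme_strength_bits(password_bits, recovery_bits):
    # The attacker goes after whichever path is weaker.
    return min(password_bits, recovery_bits)
```
Under these assumptions the whole scheme is worth 3 bits, no matter how strong the
password is, because 8 guesses at the recovery question are enough.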

00:33:11.589 --> 00:33:13.960
And so you see this
actually play out

00:33:13.960 --> 00:33:16.005
in a lot of real-world scenarios.

00:33:16.005 --> 00:33:18.380
There's a lot of famous cases,
like the Sarah Palin case,

00:33:18.380 --> 00:33:21.700
where basically someone
was able to recover

00:33:21.700 --> 00:33:25.300
her password fraudulently
because her recovery

00:33:25.300 --> 00:33:28.029
questions were things that
any random person could find.

00:33:28.029 --> 00:33:30.070
By looking at her Wikipedia
article, for example,

00:33:30.070 --> 00:33:32.880
find out where she went to high
school and things like that.

00:33:32.880 --> 00:33:35.840
And so often times these
password recovery questions

00:33:35.840 --> 00:33:36.950
are not very good.

00:33:36.950 --> 00:33:39.980
And they're not very good
because of a couple reasons.

00:33:39.980 --> 00:33:44.560
So sometimes these things
just have very low entropy.

00:33:44.560 --> 00:33:46.990
So if you have a password
recovery question that

00:33:46.990 --> 00:33:49.610
is something like, what's
your favorite color,

00:33:49.610 --> 00:33:52.190
the most popular answers are
going to be like blue and red.

00:33:52.190 --> 00:33:55.300
Nobody's going to say like
off white, fuchsia, magenta.

00:33:55.300 --> 00:33:57.150
So some of these
recovery questions

00:33:57.150 --> 00:34:01.035
intrinsically are very difficult
to provide a lot of entropy

00:34:01.035 --> 00:34:01.770
for.

00:34:01.770 --> 00:34:05.140
The other problem is
that sometimes these

00:34:05.140 --> 00:34:11.560
recovery questions can be
leaked via social media.

00:34:11.560 --> 00:34:14.270
So for example, if one
of the recovery questions

00:34:14.270 --> 00:34:16.020
is what's your favorite movie?

00:34:16.020 --> 00:34:18.170
So maybe this space there
is a little bit bigger,

00:34:18.170 --> 00:34:20.540
but if intrinsically I
can go look at, let's say,

00:34:20.540 --> 00:34:22.530
your IMDB profile,
your Facebook profile,

00:34:22.530 --> 00:34:24.482
and figure out like,
oh hey, you literally

00:34:24.482 --> 00:34:25.940
told me that's your
favorite movie,

00:34:25.940 --> 00:34:27.820
this isn't super useful either.

00:34:27.820 --> 00:34:29.500
And another problem--
this is actually

00:34:29.500 --> 00:34:32.270
sort of the funniest
one-- is that the user

00:34:32.270 --> 00:34:38.270
selected recovery questions
are often super weak.

00:34:38.270 --> 00:34:42.396
So for example, people
have done a survey

00:34:42.396 --> 00:34:44.520
of what some of these
recovery questions look like,

00:34:44.520 --> 00:34:46.370
and sometimes users
themselves will

00:34:46.370 --> 00:34:51.820
set recovery questions that are
things like what is 2 plus 3?

00:34:51.820 --> 00:34:55.000
And so, at the time, the user's
thinking this is a big hassle,

00:34:55.000 --> 00:34:56.409
we're going to have to use this.

00:34:56.409 --> 00:34:59.680
But trivially most humans
who pass the Turing Test

00:34:59.680 --> 00:35:01.848
can answer that
question successfully.

00:35:01.848 --> 00:35:04.842
And then therefore get
the user's password back.

00:35:04.842 --> 00:35:12.340
AUDIENCE: So [INAUDIBLE] like
using recovery passwords?

00:35:12.340 --> 00:35:16.462
It's basically like you enter in
your name and maybe the subject

00:35:16.462 --> 00:35:18.891
of some emails that you've
sent, like a small amount

00:35:18.891 --> 00:35:19.974
of additional information.

00:35:19.974 --> 00:35:21.979
But based on that,
in some cases they

00:35:21.979 --> 00:35:26.200
can-- is security of
that kind of stuff then?

00:35:26.200 --> 00:35:28.771
PROFESSOR: So I don't know of
any formal study like that.

00:35:28.771 --> 00:35:30.396
Those things are
actually a lot better.

00:35:30.396 --> 00:35:32.770
I actually know
this, because I was

00:35:32.770 --> 00:35:35.000
trying to help a friend
go through this process.

00:35:35.000 --> 00:35:38.630
So she basically lost
control of her Gmail account,

00:35:38.630 --> 00:35:40.880
and she was trying to prove
that this was her account.

00:35:40.880 --> 00:35:43.840
And so yeah, they would ask you
things like roughly speaking,

00:35:43.840 --> 00:35:46.100
when did you open this account.

00:35:46.100 --> 00:35:48.573
Roughly speaking before you
lost control of this account

00:35:48.573 --> 00:35:52.770
to hesball or whatever,
who were some of the people

00:35:52.770 --> 00:35:54.205
that you talked to?

00:35:54.205 --> 00:35:55.080
And things like that.

00:35:55.080 --> 00:35:57.187
And it's actually a
pretty laborious process.

00:35:57.187 --> 00:35:59.520
What ends up happening is
that you're generally correct,

00:35:59.520 --> 00:36:01.950
it ends up being much more
powerful than this stuff.

00:36:01.950 --> 00:36:04.920
And so actually I don't know
of any formal studies of that,

00:36:04.920 --> 00:36:06.656
but it does seem
[INAUDIBLE] much stronger

00:36:06.656 --> 00:36:07.886
than these types of things.

00:36:11.259 --> 00:36:12.550
All right, any other questions?

00:36:16.350 --> 00:36:20.810
Now we can get to
the paper for today.

00:36:20.810 --> 00:36:24.010
So in the reading for today,
the authors have basically

00:36:24.010 --> 00:36:28.610
proposed a bunch of factors
that can be used to evaluate

00:36:28.610 --> 00:36:30.465
these authentication schemes.

00:36:30.465 --> 00:36:32.506
And what's really cool
about this paper, I think,

00:36:32.506 --> 00:36:35.010
is that it basically tries
to say, look, a lot of us

00:36:35.010 --> 00:36:37.460
in the security community
are fighting just

00:36:37.460 --> 00:36:38.710
based on aesthetic principles.

00:36:38.710 --> 00:36:41.020
Like, we should pick
this because I just

00:36:41.020 --> 00:36:43.260
like the way that the curly
braces look in the proof.

00:36:43.260 --> 00:36:46.161
We should pick this because
it uses a lot of math mode.

00:36:46.161 --> 00:36:48.660
And so what they say is, look,
why don't we try to establish

00:36:48.660 --> 00:36:50.050
some type of criteria?

00:36:50.050 --> 00:36:52.510
Maybe some of the criteria
are a little bit subjective.

00:36:52.510 --> 00:36:54.630
Let's just try to have
this taxonomy of ways

00:36:54.630 --> 00:36:56.620
to evaluate the
authentication scheme.

00:36:56.620 --> 00:36:59.900
And let's just see how these
various schemes stack up.

00:36:59.900 --> 00:37:03.060
And so the authors basically
proposed three high level

00:37:03.060 --> 00:37:05.660
metrics for evaluating
these schemes.

00:37:05.660 --> 00:37:11.910
And so, the first
metric is usability.

00:37:11.910 --> 00:37:13.950
And so, the basic
idea here is how

00:37:13.950 --> 00:37:16.520
easy is it for users to interact
with this authentication

00:37:16.520 --> 00:37:17.620
scheme.

00:37:17.620 --> 00:37:20.000
So they define a couple
interesting properties.

00:37:20.000 --> 00:37:23.820
So for example, is
it easy to learn?

00:37:26.580 --> 00:37:29.679
This basically just means is
this scheme easy to learn?

00:37:29.679 --> 00:37:31.970
So some of these categories
are pretty straightforward.

00:37:31.970 --> 00:37:33.830
Some of them actually involve
a little bit of subtlety.

00:37:33.830 --> 00:37:35.512
But this one makes
a lot of sense.

00:37:35.512 --> 00:37:43.710
And so if we look at passwords,
passwords pass this test.

00:37:43.710 --> 00:37:48.460
Because everybody is used to
using passwords, so we'll say

00:37:48.460 --> 00:37:49.550
they are easy to learn.

00:37:49.550 --> 00:37:54.480
Another category is
infrequent errors.

00:37:54.480 --> 00:37:56.480
So that means when
you are trying

00:37:56.480 --> 00:37:58.583
to authenticate to
the system, if you

00:37:58.583 --> 00:38:01.189
are the actual user
in question, is it

00:38:01.189 --> 00:38:03.230
the case that you can
often authenticate yourself

00:38:03.230 --> 00:38:04.990
without generating errors?

00:38:04.990 --> 00:38:09.050
And so, here the
authors say quasi-yes.

00:38:12.970 --> 00:38:15.316
And so the quasi prefix is
one of the more entertaining

00:38:15.316 --> 00:38:17.190
aspects of the paper,
because authors kind of

00:38:17.190 --> 00:38:20.010
admit there's this element
of subjectivity to it.

00:38:20.010 --> 00:38:24.350
So we can't necessarily say with
crisp precision yes, no, things

00:38:24.350 --> 00:38:25.020
like this.

00:38:25.020 --> 00:38:26.760
So the reason why
they say quasi-yes

00:38:26.760 --> 00:38:30.120
is because, in general, you
can authenticate with a password

00:38:30.120 --> 00:38:30.700
successfully.

00:38:30.700 --> 00:38:33.109
But we've all been in that
place where it's like 3 AM,

00:38:33.109 --> 00:38:34.900
we're trying to log on
to our email server,

00:38:34.900 --> 00:38:36.060
our mind's not in
the right place,

00:38:36.060 --> 00:38:38.060
and we enter a bunch of
errors a bunch of times.

00:38:38.060 --> 00:38:41.030
So they say quasi-yes for this.

00:38:41.030 --> 00:38:46.510
Another category is
it scalable for users.

00:38:50.006 --> 00:38:54.867
And so the basic idea
here is if the user has

00:38:54.867 --> 00:38:56.950
a bunch of different
services that he or she wants

00:38:56.950 --> 00:39:01.160
to authenticate to, does
this scheme scale well?

00:39:01.160 --> 00:39:04.110
Does the user have to
remember some new thing

00:39:04.110 --> 00:39:06.290
for each one of the schemes?

00:39:06.290 --> 00:39:11.200
And so, for here,
the authors say no.

00:39:11.200 --> 00:39:14.480
Because in practice, it's
very difficult for users

00:39:14.480 --> 00:39:18.130
to remember a separate
password for every single site

00:39:18.130 --> 00:39:18.880
that they go to.

00:39:18.880 --> 00:39:21.500
This is one reason actually why
people reuse their passwords

00:39:21.500 --> 00:39:23.660
often.

00:39:23.660 --> 00:39:27.216
So another usability
property is easy recovery.

00:39:30.370 --> 00:39:34.230
So what happens if you
lose your authentication

00:39:34.230 --> 00:39:37.160
token-- in this case, your
password-- is it easy to reset?

00:39:37.160 --> 00:39:42.060
And in this case, the
answer for passwords is yes.

00:39:42.060 --> 00:39:44.670
In fact, they are probably
too easy to reset,

00:39:44.670 --> 00:39:46.620
as we just discussed
a couple minutes ago.

00:39:46.620 --> 00:39:49.690
So that's a yes.

00:39:49.690 --> 00:39:52.210
And so another interesting
one is nothing to carry.

00:39:54.730 --> 00:39:58.690
So a lot of the more baroque
authentication protocols

00:39:58.690 --> 00:40:01.190
require that you run
some smartphone app,

00:40:01.190 --> 00:40:03.880
or you have some security
token or smart card or things

00:40:03.880 --> 00:40:04.790
like that.

00:40:04.790 --> 00:40:07.370
So that's a burden.

00:40:07.370 --> 00:40:08.870
Maybe not with a
smartphone so much,

00:40:08.870 --> 00:40:11.350
but having to carry around
one of these other gadgets is

00:40:11.350 --> 00:40:12.310
probably a pain.

00:40:12.310 --> 00:40:17.300
And so this is actually one
nice feature of passwords,

00:40:17.300 --> 00:40:20.340
you basically only have to
carry them around in your brain,

00:40:20.340 --> 00:40:22.570
which is something you
should have at all moments.

00:40:22.570 --> 00:40:25.427
So that's basically what
usability looks like.

00:40:25.427 --> 00:40:27.010
It is very interesting
at a high level

00:40:27.010 --> 00:40:30.600
that a lot of times
these sorts of factors

00:40:30.600 --> 00:40:33.705
are given a little bit of
short shrift in the security

00:40:33.705 --> 00:40:36.080
community when people
are evaluating these schemes.

00:40:36.080 --> 00:40:38.770
They say, oh, this thing uses
like a million bits of entropy,

00:40:38.770 --> 00:40:41.090
and can only be broken by
the Death Star or whatever.

00:40:41.090 --> 00:40:42.464
But then people
don't necessarily

00:40:42.464 --> 00:40:46.040
remember these are actually
very important factors too.

00:40:46.040 --> 00:40:52.550
OK so the next
high level category

00:40:52.550 --> 00:40:56.210
that the authors use to
evaluate authentication schemes

00:40:56.210 --> 00:40:58.350
is deployability.

00:40:58.350 --> 00:41:00.652
So the basic idea
here is how easy

00:41:00.652 --> 00:41:05.940
is it to incorporate this system
into current web services.

00:41:05.940 --> 00:41:07.890
So one thing they
look at, for example,

00:41:07.890 --> 00:41:12.753
is, is it server compatible?

00:41:16.050 --> 00:41:18.350
And this basically means
can I easily integrate

00:41:18.350 --> 00:41:22.200
this scheme with today's
servers, which are based

00:41:22.200 --> 00:41:24.230
around text-based passwords?

00:41:24.230 --> 00:41:27.440
And so since success here
is defined with respect

00:41:27.440 --> 00:41:30.820
to passwords, passwords succeed.

00:41:30.820 --> 00:41:35.700
So another metric is
browser compatibility.

00:41:35.700 --> 00:41:37.225
Similar type of thing.

00:41:37.225 --> 00:41:41.130
Can I use this scheme with
current off-the-shelf browsers

00:41:41.130 --> 00:41:44.390
without having to install
a plug-in or something like that?

00:41:44.390 --> 00:41:48.408
Once again, passwords
win by default.

00:41:48.408 --> 00:41:50.396
And another interesting
one is accessibility.

00:41:54.870 --> 00:41:58.802
So can people who can use
passwords now, but maybe

00:41:58.802 --> 00:42:01.010
have some type of physical
disability-- maybe they're

00:42:01.010 --> 00:42:03.987
blind, or they can't hear well,
or they can't gesture well,

00:42:03.987 --> 00:42:04.820
or things like that.

00:42:04.820 --> 00:42:07.050
Can they actually
use this scheme?

00:42:07.050 --> 00:42:08.580
This is actually
pretty important.

00:42:08.580 --> 00:42:12.462
So once again, the
authors say yes.

00:42:12.462 --> 00:42:14.420
It's a little bit weird,
because it's not clear

00:42:14.420 --> 00:42:16.880
that all people with all
disabilities can use passwords,

00:42:16.880 --> 00:42:20.470
but they say yes here.

00:42:20.470 --> 00:42:22.690
So yes, so these are
three interesting things

00:42:22.690 --> 00:42:24.890
to think about with
respect to deployability.

00:42:24.890 --> 00:42:26.960
And the reason why this
deployability category

00:42:26.960 --> 00:42:29.940
is so important is because it's
very difficult to get anyone

00:42:29.940 --> 00:42:33.220
to upgrade anything ever.

00:42:33.220 --> 00:42:35.800
I mean people don't even
want to reboot their machines

00:42:35.800 --> 00:42:38.155
and get a new OS
update installed.

00:42:38.155 --> 00:42:40.780
So it's very difficult, if this
scheme requires changes

00:42:40.780 --> 00:42:42.749
on the server, to get
people on the server side

00:42:42.749 --> 00:42:44.040
to actually do different stuff.

00:42:44.040 --> 00:42:45.340
This goes back to your
question, why don't we

00:42:45.340 --> 00:42:46.480
use these better things?

00:42:46.480 --> 00:42:47.590
Because deployability
in many cases

00:42:47.590 --> 00:42:49.089
is super, super
important to people.

00:42:51.920 --> 00:42:56.450
All right, so then the final
category that we will look at

00:42:56.450 --> 00:42:57.125
is security.

00:43:00.690 --> 00:43:04.750
Right, so what kinds of attacks
can this scheme prevent?

00:43:04.750 --> 00:43:09.305
So a lot of these
security properties

00:43:09.305 --> 00:43:12.590
are resilient to foo.

00:43:12.590 --> 00:43:15.060
I'll just shorten
that to "res."

00:43:15.060 --> 00:43:21.750
So is the scheme resilient
to physical observations?

00:43:25.090 --> 00:43:27.970
So the idea here is
that an attacker cannot

00:43:27.970 --> 00:43:30.730
impersonate the
user after observing

00:43:30.730 --> 00:43:33.400
them authenticate a few times.

00:43:33.400 --> 00:43:35.540
So imagine that you
had a shoulder surfer.

00:43:35.540 --> 00:43:37.280
So you're somewhere
in a computer lab,

00:43:37.280 --> 00:43:38.821
someone's looking
over your shoulder,

00:43:38.821 --> 00:43:39.980
seeing what you type in.

00:43:39.980 --> 00:43:42.400
Someone's videotaping
you, maybe someone's

00:43:42.400 --> 00:43:44.802
got a microphone listening
to the acoustic signature

00:43:44.802 --> 00:43:46.677
of your keyboard and
trying to extract things

00:43:46.677 --> 00:43:49.630
from that, so on and so forth.

00:43:49.630 --> 00:43:53.820
So the authors say
that passwords actually

00:43:53.820 --> 00:43:55.190
fail this test.

00:43:55.190 --> 00:44:00.090
And that's because if someone can
videotape you typing things in,

00:44:00.090 --> 00:44:02.640
they can pretty easily figure
out what letters you typed.

00:44:02.640 --> 00:44:04.973
Or there's actually these
attacks where you can actually

00:44:04.973 --> 00:44:07.810
listen to the acoustic
fingerprint of the keyboard,

00:44:07.810 --> 00:44:11.840
and detect what was typed based
on what sounds that you hear.

00:44:11.840 --> 00:44:15.910
So passwords are not resistant
to physical observation.

00:44:15.910 --> 00:44:25.135
So another property is resistant
to targeted impersonation.

00:44:28.580 --> 00:44:30.630
And so the basic
idea here is,

00:44:30.630 --> 00:44:33.570
is it possible for someone
who knows you-- a friend,

00:44:33.570 --> 00:44:35.280
an acquaintance, a
spouse, a loved one,

00:44:35.280 --> 00:44:38.795
a family member,
whatever-- to impersonate

00:44:38.795 --> 00:44:44.290
you using their knowledge of
who you are and what you do?

00:44:44.290 --> 00:44:46.667
So could your friend try
to pretend to be you easily

00:44:46.667 --> 00:44:47.750
in this particular scheme?

00:44:47.750 --> 00:44:53.065
So here the authors
basically have another one

00:44:53.065 --> 00:44:53.940
of these quasi-yeses.

00:44:56.900 --> 00:44:59.610
And they say quasi-yes
because they're not

00:44:59.610 --> 00:45:03.095
aware of any studies which
show that if you know a person,

00:45:03.095 --> 00:45:05.570
you're more likely to
guess their password.

00:45:05.570 --> 00:45:07.190
So they say quasi-yes for that.

00:45:07.190 --> 00:45:10.510
And so, note that resistance
to targeted impersonation,

00:45:10.510 --> 00:45:12.260
this is where most
security backup

00:45:12.260 --> 00:45:14.135
questions fail miserably.

00:45:14.135 --> 00:45:16.010
Because if someone knows
something about you,

00:45:16.010 --> 00:45:19.595
quite easily they can guess
your security questions

00:45:19.595 --> 00:45:22.860
in many cases.

00:45:22.860 --> 00:45:27.450
So then we have two categories
that involve guessing.

00:45:27.450 --> 00:45:30.990
So the first one is resilient
to throttled guessing.

00:45:34.930 --> 00:45:42.080
And so what this means is,
if the attacker cannot

00:45:42.080 --> 00:45:47.690
issue guesses at line
rate, because, for example,

00:45:47.690 --> 00:45:51.880
the server uses
anti-hammering mechanisms,

00:45:51.880 --> 00:45:56.720
is the scheme safe
against the attacker?

00:45:56.720 --> 00:46:01.060
And so here, they say no.

00:46:01.060 --> 00:46:02.670
And so the reason
why they say no,

00:46:02.670 --> 00:46:05.480
is because in practice
passwords not only

00:46:05.480 --> 00:46:09.800
have sort of low inherent entropy
because they're not that long,

00:46:09.800 --> 00:46:12.570
but also they have that
skewed distribution.

00:46:12.570 --> 00:46:15.860
And so what that means is
that even if the attacker is

00:46:15.860 --> 00:46:18.260
throttled in some way,
typically the attacker can still

00:46:18.260 --> 00:46:20.040
make good forward
progress and crack

00:46:20.040 --> 00:46:22.140
a lot of people's passwords.
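
NOTE
Editor's sketch of the throttled-guessing point above, not taken from the lecture or the paper.
It assumes a toy model in which throttling gives the attacker a fixed guess budget per account,
and it uses made-up popularity counts. The helper name cracked_fraction is hypothetical.
```python
# Toy model: throttling limits the attacker to `budget` guesses per account,
# and the attacker spends them on the most popular passwords first.
def cracked_fraction(counts, budget):
    """Fraction of accounts whose password is among the top `budget` choices."""
    ranked = sorted(counts.values(), reverse=True)
    return sum(ranked[:budget]) / sum(ranked)
#
# Hypothetical skewed population: 1000 accounts, a few very popular passwords.
skewed = {"123456": 300, "password": 200, "qwerty": 100, "letmein": 50}
skewed.update({"rare%d" % i: 1 for i in range(350)})
# Hypothetical uniform population: 1000 accounts, 1000 distinct passwords.
uniform = {"pw%d" % i: 1 for i in range(1000)}
#
print(cracked_fraction(skewed, 3))   # 0.6
print(cracked_fraction(uniform, 3))  # 0.003
```
With the same three-guess budget, the skewed population loses most of its accounts while the
uniform one loses almost none, which is roughly why throttling alone does not rescue passwords.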

00:46:22.140 --> 00:46:26.010
So they define another
guessing property

00:46:26.010 --> 00:46:29.960
which is resistant to
unthrottled guessing.

00:46:34.030 --> 00:46:38.890
And so this is basically
saying, suppose

00:46:38.890 --> 00:46:44.110
that the attacker can issue
these authentication forgery

00:46:44.110 --> 00:46:47.280
requests as quickly
as he or she wants.

00:46:47.280 --> 00:46:49.000
So in other words,
the attacker is only

00:46:49.000 --> 00:46:51.220
limited by the speed
of their hardware.

00:46:51.220 --> 00:46:54.440
So is the authentication
scheme resilient to that type

00:46:54.440 --> 00:46:55.290
of attack?

00:46:55.290 --> 00:46:59.560
And here maybe this answer's
also no, for the same reason

00:46:59.560 --> 00:47:01.470
that the answer was no up here.

00:47:01.470 --> 00:47:04.040
So basically passwords have
a very small entropy space

00:47:04.040 --> 00:47:07.040
and they come from
a skewed distribution.
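
NOTE
Editor's back-of-the-envelope sketch for unthrottled guessing, not taken from the lecture.
The password shape (8 lowercase letters chosen uniformly) and the guess rate (one billion
per second) are assumptions picked only to make the arithmetic concrete.
```python
import math
# Assumed password shape and attacker speed, for illustration only.
alphabet_size = 26
length = 8
guesses_per_second = 1e9
#
keyspace = alphabet_size ** length
entropy_bits = length * math.log2(alphabet_size)
seconds_to_exhaust = keyspace / guesses_per_second
#
print(keyspace)                   # 208827064576
print(round(entropy_bits, 1))     # 37.6
print(round(seconds_to_exhaust))  # 209
```
Under these assumptions the whole space falls in a few minutes, and a skewed real-world
distribution would only make the attacker's job easier than this uniform case.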

00:47:07.040 --> 00:47:10.690
So that's all pretty
straightforward.

00:47:10.690 --> 00:47:13.603
One interesting
one is resiliency

00:47:13.603 --> 00:47:16.390
to internal observation.

00:47:21.890 --> 00:47:23.720
So this means that
the attacker cannot

00:47:23.720 --> 00:47:27.370
impersonate a user by
intercepting that user's input.

00:47:27.370 --> 00:47:31.770
For example, by installing
a keystroke logger

00:47:31.770 --> 00:47:34.675
on the keyboard that
the user's using,

00:47:34.675 --> 00:47:37.640
and using that logger
to steal keypresses.

00:47:37.640 --> 00:47:39.790
This also means, for
example, that there's

00:47:39.790 --> 00:47:41.450
no way for a network
attacker who's

00:47:41.450 --> 00:47:44.270
observing the things that the
client is sending over the wire

00:47:44.270 --> 00:47:48.670
to use that knowledge
of the network traffic

00:47:48.670 --> 00:47:50.710
to later impersonate the user.

00:47:50.710 --> 00:47:56.610
And so here they say passwords
do not have this property.

00:47:56.610 --> 00:47:59.640
And they essentially say
it's because passwords

00:47:59.640 --> 00:48:02.060
are static tokens.

00:48:02.060 --> 00:48:03.160
They don't change.

00:48:03.160 --> 00:48:06.500
And typically static tokens
are vulnerable to replay.

00:48:06.500 --> 00:48:08.920
So if somehow, for
example, an attacker

00:48:08.920 --> 00:48:11.680
installs a keystroke logger
and gets your password,

00:48:11.680 --> 00:48:14.280
then basically the attacker
can use that password

00:48:14.280 --> 00:48:17.020
until it's either expired or
revoked or something like that.

00:48:17.020 --> 00:48:18.470
If you just replay
it again, it'll

00:48:18.470 --> 00:48:20.960
go into that authenticating
server on the other side.

00:48:20.960 --> 00:48:22.751
So here, passwords
actually fail that test.
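
NOTE
Editor's sketch of why a static token is replayable, not taken from the lecture or the paper.
For contrast it uses a nonce-based HMAC challenge-response, which is a swapped-in technique and
not how password login works. SECRET and all function names are hypothetical. The contrast only
covers an observer of the exchange; a keylogger that captures the secret itself defeats both.
```python
import hashlib
import hmac
import secrets
# Hypothetical shared secret; every name in this sketch is illustrative.
SECRET = b"correct horse"
def static_login(sent_password):
    """Server check for a static token: the same bytes work every time."""
    return hmac.compare_digest(sent_password, SECRET)
def respond(challenge):
    """Client's answer to a fresh server challenge."""
    return hmac.new(SECRET, challenge, hashlib.sha256).digest()
def challenge_login(challenge, response):
    """Server check: the response must match the challenge it just issued."""
    return hmac.compare_digest(response, respond(challenge))
#
# An observer records one successful exchange of each kind.
captured_password = SECRET
first_challenge = secrets.token_bytes(16)
captured_response = respond(first_challenge)
# Replaying the static password succeeds.
print(static_login(captured_password))
# Replaying the old response against a new challenge is rejected.
second_challenge = secrets.token_bytes(16)
print(challenge_login(second_challenge, captured_response))
```
The first check accepts the replayed password; the second rejects the stale response because
it was computed over a different random challenge.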

00:48:25.564 --> 00:48:27.522
Another thing that we
talked about a little bit

00:48:27.522 --> 00:48:29.340
in this class is phishing.

00:48:29.340 --> 00:48:36.538
So resilience to phishing
is another security metric.

00:48:36.538 --> 00:48:40.190
And the basic idea here is that,
if the attacker can simulate

00:48:40.190 --> 00:48:43.320
a valid service-- for
example, by attacking the DNS

00:48:43.320 --> 00:48:45.870
infrastructure or
something like that--

00:48:45.870 --> 00:48:49.200
then the attacker cannot collect
credentials from the user

00:48:49.200 --> 00:48:53.300
that the attacker can then use
to pretend to be the user later

00:48:53.300 --> 00:48:53.925
on.

00:48:53.925 --> 00:48:58.300
And so this is basically
supposed to penalize sites that

00:48:58.300 --> 00:49:03.580
do not strongly tell
the user, hey, I'm

00:49:03.580 --> 00:49:06.850
this particular service, so you
can feel confident to give me

00:49:06.850 --> 00:49:07.950
your credentials.

00:49:07.950 --> 00:49:11.160
And so here passwords fail
just because phishing sites

00:49:11.160 --> 00:49:13.217
are very, very popular.

00:49:13.217 --> 00:49:15.175
So passwords don't really
intrinsically provide

00:49:15.175 --> 00:49:16.341
any protection against that.

00:49:20.620 --> 00:49:23.170
Now the next two
are particularly

00:49:23.170 --> 00:49:28.040
interesting in the context of a
large scale distributed system.

00:49:28.040 --> 00:49:30.390
So no trusted third party.

00:49:33.760 --> 00:49:35.270
This essentially
means that other

00:49:35.270 --> 00:49:38.410
than the client and the
server, there's no one else

00:49:38.410 --> 00:49:44.580
in the system that is involved
in the authentication protocol.

00:49:44.580 --> 00:49:47.719
And so, that means that
there's no third party who,

00:49:47.719 --> 00:49:49.260
if that third party
were compromised,

00:49:49.260 --> 00:49:51.310
the entire integrity of
the security scheme

00:49:51.310 --> 00:49:52.040
might fall apart.

00:49:52.040 --> 00:49:54.343
And so, this is actually
an interesting property

00:49:54.343 --> 00:49:56.780
to look at because a lot
of authentication problems

00:49:56.780 --> 00:49:59.900
would go away if we could just
store all our authentication

00:49:59.900 --> 00:50:01.863
information in one place.

00:50:01.863 --> 00:50:04.050
We just store it in one
place, it's very simple,

00:50:04.050 --> 00:50:05.690
we don't have to remember a
lot of stuff on the client,

00:50:05.690 --> 00:50:07.850
we just say, whatever
service you want to use,

00:50:07.850 --> 00:50:10.110
you always go to
this one third party,

00:50:10.110 --> 00:50:11.980
and that third
party will always be

00:50:11.980 --> 00:50:14.980
able to authenticate
you, and then

00:50:14.980 --> 00:50:17.090
allow you to go on your way.

00:50:17.090 --> 00:50:20.640
Now of course third parties are
problematic from the perspective

00:50:20.640 --> 00:50:22.777
of robustness, right,
because if you
because if you

00:50:22.777 --> 00:50:24.360
have one of these
global third parties

00:50:24.360 --> 00:50:27.750
that everybody trusts, if that
third party gets subverted then

00:50:27.750 --> 00:50:29.660
perhaps the integrity
of all the sites

00:50:29.660 --> 00:50:32.400
that use that third party to
authenticate, all those sites

00:50:32.400 --> 00:50:35.000
are potentially in danger.

00:50:35.000 --> 00:50:39.760
So they say that passwords do
not have a trusted third party

00:50:39.760 --> 00:50:43.142
because each user is forced
to have a separate password

00:50:43.142 --> 00:50:44.054
for each site.

00:50:46.790 --> 00:50:48.814
A related property is