WEBVTT

00:00:01.592 --> 00:00:03.050
JOSHUA: Hi, I'm
Joshua and have you

00:00:03.050 --> 00:00:06.430
wondered, can my Facebook
password be stolen

00:00:06.430 --> 00:00:09.010
from Facebook's databases?

00:00:09.010 --> 00:00:12.870
For example, if a hacker were to
come into the Facebook network

00:00:12.870 --> 00:00:16.490
and get into the
administrator's account,

00:00:16.490 --> 00:00:19.950
would he be able to steal my
passwords, your passwords,

00:00:19.950 --> 00:00:21.200
everyone's passwords?

00:00:21.200 --> 00:00:22.790
Mua-ha-ha-ha.

00:00:22.790 --> 00:00:26.180
Would that be the
end of the world?

00:00:26.180 --> 00:00:28.830
Well, it turns out,
no, because companies

00:00:28.830 --> 00:00:32.680
like Facebook or Google
store your passwords not as

00:00:32.680 --> 00:00:33.550
passwords.

00:00:33.550 --> 00:00:38.400
In fact, this comes to one of
the most important concepts

00:00:38.400 --> 00:00:43.920
in computer security and that
is the concept of hashing.

00:00:43.920 --> 00:00:46.590
Well, so if companies
like Facebook and Google

00:00:46.590 --> 00:00:50.510
don't store your information
like passwords as passwords,

00:00:50.510 --> 00:00:51.730
what do they store instead?

00:00:51.730 --> 00:00:55.470
Well, turns out they store
things called "hashes" instead.

00:00:55.470 --> 00:00:58.460
And hashes are kind of
like a little snapshot

00:00:58.460 --> 00:00:59.790
of the actual password.

00:00:59.790 --> 00:01:02.710
So let's say this
coffee is the password.

00:01:02.710 --> 00:01:04.680
So I just take a
little snapshot and all

00:01:04.680 --> 00:01:08.850
I see is a 2D image
of the password.

00:01:08.850 --> 00:01:12.100
So a hash is simply
a representation

00:01:12.100 --> 00:01:14.450
of the password
which proves that you

00:01:14.450 --> 00:01:16.960
know what the password is.

00:01:16.960 --> 00:01:18.460
But you'll probably
tell me, well, I

00:01:18.460 --> 00:01:21.180
can understand the idea
of a photo but what

00:01:21.180 --> 00:01:23.260
about it in computer science?

00:01:23.260 --> 00:01:25.880
What about it in
actual programming?

00:01:25.880 --> 00:01:27.440
How does it actually look?

00:01:27.440 --> 00:01:30.420
Well, the formal
definition of a hash

00:01:30.420 --> 00:01:32.400
is kind of like
what you understand

00:01:32.400 --> 00:01:36.436
as a one-way
mathematical function.

00:01:36.436 --> 00:01:37.810
This is why we
need to understand

00:01:37.810 --> 00:01:40.920
what a mathematical function
is in the first place.

00:01:40.920 --> 00:01:44.210
And so a mathematical
function is a relationship

00:01:44.210 --> 00:01:46.730
between the inputs
and the outputs.

00:01:46.730 --> 00:01:53.060
For example, f(x) equals
2x, and if x is 2,

00:01:53.060 --> 00:01:55.870
then 2x would be 4.

00:01:55.870 --> 00:01:58.470
It's just a mathematical
relationship from the left side

00:01:58.470 --> 00:02:00.490
to the right side.

00:02:00.490 --> 00:02:04.380
But at the same time, you'll
probably be wondering, well,

00:02:04.380 --> 00:02:06.650
that is not one-way.

00:02:06.650 --> 00:02:08.520
So what "one-way"
means-- well, "one-way"

00:02:08.520 --> 00:02:14.040
means that the input on the
left-hand side can well

00:02:14.040 --> 00:02:16.420
determine what is on
the right-hand side.

00:02:16.420 --> 00:02:19.730
But knowing the results
on the right-hand side

00:02:19.730 --> 00:02:23.180
doesn't mean you know what
the actual inputs were.

00:02:23.180 --> 00:02:27.410
For this case, if you see 2x and
you see 4, you know for sure,

00:02:27.410 --> 00:02:30.240
oh, it's got to
be 2 as an input.

00:02:30.240 --> 00:02:35.190
So how do computers
enable a one-way function?

00:02:35.190 --> 00:02:38.080
In other words, knowing
what the inputs are

00:02:38.080 --> 00:02:40.720
to determine the outputs
but just having the outputs

00:02:40.720 --> 00:02:44.300
would not be enough to
determine the inputs?

00:02:44.300 --> 00:02:48.100
Well, it comes to the concept
of the modulus operator.

00:02:48.100 --> 00:02:50.350
So what's a modulus operator?

00:02:50.350 --> 00:02:54.510
A modulus operator is
an operator that returns

00:02:54.510 --> 00:02:57.860
the remainder of a division.

00:02:57.860 --> 00:03:03.500
So an operator is just
like a plus or a minus.

00:03:03.500 --> 00:03:06.460
You have, for example,
7 divided by 3.

00:03:06.460 --> 00:03:12.420
It would give you the answer
of 2 with the remainder of 1.

00:03:12.420 --> 00:03:16.720
And so 7 modulus 3
would just be the answer

00:03:16.720 --> 00:03:21.290
1, which happens to be just
the remainder of a division

00:03:21.290 --> 00:03:22.960
operation.

00:03:22.960 --> 00:03:26.310
Well, then you might
say, how is that useful

00:03:26.310 --> 00:03:28.490
to become a hash function?

00:03:28.490 --> 00:03:31.740
Well, notice that 7 is not
the only number that you

00:03:31.740 --> 00:03:35.020
can modulus by 3 to give you 1.

00:03:35.020 --> 00:03:36.390
You could use 4.

00:03:36.390 --> 00:03:37.590
You could use 1.

00:03:37.590 --> 00:03:41.920
In fact, there are many numbers
that result in the modulus

00:03:41.920 --> 00:03:43.840
function returning 1.

00:03:43.840 --> 00:03:47.250
So just by telling
you the answer is 1,

00:03:47.250 --> 00:03:49.850
you probably don't
know what the input is.

00:03:49.850 --> 00:03:54.800
And hence, you can determine
what the outputs are with

00:03:54.800 --> 00:03:58.100
the input but you cannot
determine what the inputs are

00:03:58.100 --> 00:04:00.600
just with the output.
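
NOTE
Editor's sketch of the modulus idea described above, in Python.
The name toy_hash and the 1-to-10 search range are assumptions made
for illustration. A remainder modulo 3 is only a teaching toy, not a
real password hash.
```python
# Toy example only: x % 3 is not a real password hash.
def toy_hash(x, modulus=3):
    """Return the remainder of x divided by modulus."""
    return x % modulus
# Forward direction: one input always gives exactly one output.
remainder = toy_hash(7)
# Reverse direction: many different inputs share that same output.
candidates = [x for x in range(1, 11) if toy_hash(x) == remainder]
print(remainder)
print(candidates)
```
Knowing only the remainder leaves several possible inputs, which is
the one-way property the speaker is pointing at.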

00:04:00.600 --> 00:04:03.710
And so it seems that,
is your password safe?

00:04:03.710 --> 00:04:06.740
Well, it's quite safe because
most of your passwords

00:04:06.740 --> 00:04:08.640
are stored as hashes.

00:04:08.640 --> 00:04:12.350
And all the company
just needs to do

00:04:12.350 --> 00:04:15.390
is to take your password
whenever you log in

00:04:15.390 --> 00:04:17.829
and convert it into
that same snapshot

00:04:17.829 --> 00:04:21.190
and to compare the snapshots to
see whether the snapshots are

00:04:21.190 --> 00:04:22.000
equivalent.

00:04:22.000 --> 00:04:24.070
If they are, you're logged in.

00:04:24.070 --> 00:04:26.130
If they're not, too bad for you.
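
NOTE
Editor's sketch of the login check described above, in Python.
The video does not name a hash function, so SHA-256 from the standard
hashlib module is an assumption, as are the names snapshot, stored
and log_in. Production systems typically add a per-user salt and use
a deliberately slow password hash, which this sketch leaves out.
```python
import hashlib
def snapshot(password):
    """Hash a password. SHA-256 is an assumption, not from the video."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()
# At sign-up the site keeps only the snapshot, never the password.
stored = snapshot("coffee")
def log_in(attempt):
    """Re-hash the attempt and compare it with the stored snapshot."""
    return snapshot(attempt) == stored
print(log_in("coffee"))
print(log_in("tea"))
```
The site never needs the original password again: it only compares
the fresh snapshot with the one it saved.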

00:04:26.130 --> 00:04:30.280
So in that way, you
are able to be secure.

00:04:30.280 --> 00:04:33.420
At the same time, if any hacker
steals any of the hashes,

00:04:33.420 --> 00:04:36.160
they cannot determine what
the passwords are in the first

00:04:36.160 --> 00:04:37.010
place.

00:04:37.010 --> 00:04:39.850
So in that sense, your
passwords are safe.

00:04:39.850 --> 00:04:44.140
However, in the
same sense, a hacker

00:04:44.140 --> 00:04:46.660
could still brute force
his way through trying

00:04:46.660 --> 00:04:49.080
any number of passwords
and will eventually

00:04:49.080 --> 00:04:51.630
find your password some day.
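
NOTE
Editor's sketch of the brute-force idea described above, in Python.
It assumes an unsalted SHA-256 hash and a password of at most three
lowercase letters. Both are illustrative assumptions, and the names
snapshot and brute_force are hypothetical.
```python
import hashlib
from itertools import product
from string import ascii_lowercase
def snapshot(password):
    """Hash a password. SHA-256 is an assumption, not from the video."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()
def brute_force(stolen_hash, max_length=3):
    """Try every lowercase string up to max_length; return a match or None."""
    for length in range(1, max_length + 1):
        for letters in product(ascii_lowercase, repeat=length):
            guess = "".join(letters)
            if snapshot(guess) == stolen_hash:
                return guess
    return None
stolen = snapshot("cab")
print(brute_force(stolen))
```
The attacker never reverses the hash. They hash guess after guess
until one matches, so longer passwords take far more guesses.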

00:04:51.630 --> 00:04:56.060
So moral of the story--
is your password safe?

00:04:56.060 --> 00:04:59.440
Yes and no,
because eventually,

00:04:59.440 --> 00:05:02.460
someday, someone will be
able, if they try really hard,

00:05:02.460 --> 00:05:04.070
to find your password.

00:05:04.070 --> 00:05:07.720
So stay safe and change
your passwords regularly.

00:05:07.720 --> 00:05:10.090
But other than
that, your password

00:05:10.090 --> 00:05:12.500
is still safe with any
company you put it with.

00:05:12.500 --> 00:05:14.290
Thank you.