WEBVTT

00:00:00.080 --> 00:00:02.430
The following content is
provided under a Creative

00:00:02.430 --> 00:00:03.820
Commons license.

00:00:03.820 --> 00:00:06.060
Your support will help
MIT OpenCourseWare

00:00:06.060 --> 00:00:10.150
continue to offer high quality
educational resources for free.

00:00:10.150 --> 00:00:12.700
To make a donation or to
view additional materials

00:00:12.700 --> 00:00:16.600
from hundreds of MIT courses,
visit MIT OpenCourseWare

00:00:16.600 --> 00:00:17.255
at ocw.mit.edu.

00:00:27.595 --> 00:00:28.720
PROFESSOR: All right, guys.

00:00:28.720 --> 00:00:29.470
Let's get started.

00:00:29.470 --> 00:00:31.137
So today, we're going
to talk about Tor.

00:00:31.137 --> 00:00:32.761
And we actually have
one of the authors

00:00:32.761 --> 00:00:35.090
of the paper you guys read
for today, Nick Mathewson.

00:00:35.090 --> 00:00:37.080
He's also one of the
main developers of Tor.

00:00:37.080 --> 00:00:38.065
He's going to tell
you more about it.

00:00:38.065 --> 00:00:39.148
NICK MATHEWSON: Thank you.

00:00:39.148 --> 00:00:42.315
So at this point,
I could start out

00:00:42.315 --> 00:00:44.420
by saying, please
put your hands up

00:00:44.420 --> 00:00:48.490
if you didn't read the paper,
but that wouldn't work.

00:00:48.490 --> 00:00:50.920
Because it's embarrassing
not to have read a paper

00:00:50.920 --> 00:00:52.940
you're supposed to have read.

00:00:52.940 --> 00:00:56.020
So instead, what I will ask
is, think of your birthday.

00:00:56.020 --> 00:00:57.610
Think of the date of your birth.

00:00:57.610 --> 00:01:01.000
If the last digit of the
date of your birth is odd,

00:01:01.000 --> 00:01:06.080
or you didn't read the paper,
please raise your hand.

00:01:06.080 --> 00:01:09.000
OK, that's not far from half.

00:01:09.000 --> 00:01:11.810
So I'm guessing most
people read the paper.
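
NOTE
Editor's sketch, not part of the lecture: the show of hands above works like a
randomized-response survey. Assuming roughly half of all birth dates end in an
odd digit (the p_odd parameter is an assumption, as is the function name), the
share of non-readers can be estimated from the hand count alone.
```python
def estimate_nonreaders(hands_up, class_size, p_odd=0.5):
    """Estimate the fraction of people who did not read the paper."""
    raised = hands_up / class_size
    # P(hand raised) = p_odd + (1 - p_odd) * q, solved here for q.
    q = (raised - p_odd) / (1.0 - p_odd)
    # Sampling noise can push the estimate outside [0, 1], so clamp it.
    return min(1.0, max(0.0, q))
```
A count "not far from half" gives an estimate near zero, which is the
speaker's conclusion, and no individual's raised hand reveals their answer.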

00:01:14.720 --> 00:01:19.190
Means of communicating that
preserve our privacy enable

00:01:19.190 --> 00:01:23.620
us to communicate more
honestly to gather better

00:01:23.620 --> 00:01:26.570
information about the
world when we are less

00:01:26.570 --> 00:01:32.210
inhibited from speaking
because of possibly justified,

00:01:32.210 --> 00:01:37.540
possibly unjustified social
and other consequences.

00:01:37.540 --> 00:01:41.570
So this brings us to
Tor, which is an anonymity

00:01:41.570 --> 00:01:44.210
network that I've been working
on for the last 10 years

00:01:44.210 --> 00:01:48.080
with some friends and
colleagues and so on.

00:01:48.080 --> 00:01:51.170
[INAUDIBLE] there's a set of
volunteer-operated servers,

00:01:51.170 --> 00:01:52.930
about 6,000 of them.

00:01:52.930 --> 00:01:55.290
At first, it was
just friends of ours

00:01:55.290 --> 00:01:58.310
that Roger Dingledine
and I knew from MIT.

00:01:58.310 --> 00:02:01.660
After that, we built
up more publicity.

00:02:01.660 --> 00:02:04.810
More people started
running servers.

00:02:04.810 --> 00:02:08.360
Now it's run by nonprofits,
private individuals,

00:02:08.360 --> 00:02:12.370
some university teams, possibly
some of you here today,

00:02:12.370 --> 00:02:17.820
and no doubt some
very sketchy people.

00:02:17.820 --> 00:02:19.140
We've got about 6,000 nodes.

00:02:19.140 --> 00:02:21.540
We're serving on the order
of hundreds of thousands

00:02:21.540 --> 00:02:24.060
to millions of users
depending on how you count.

00:02:24.060 --> 00:02:26.310
It's kind of hard to count,
because they're anonymous.

00:02:26.310 --> 00:02:29.142
So you have to use statistical
techniques to estimate.

00:02:29.142 --> 00:02:30.850
And we're doing on
the order of terabytes

00:02:30.850 --> 00:02:34.500
per second worth of traffic.

00:02:34.500 --> 00:02:39.190
Lots of people need anonymity
for their regular work.

00:02:39.190 --> 00:02:40.670
Not everyone who
needs anonymity,

00:02:40.670 --> 00:02:43.980
though, thinks of
it as anonymity.

00:02:43.980 --> 00:02:46.380
Some people say, I
don't need anonymity.

00:02:46.380 --> 00:02:48.520
I'm perfectly fine
identifying myself.

00:02:48.520 --> 00:02:52.590
But there's a broad
perception that privacy

00:02:52.590 --> 00:02:55.330
is necessary or useful.

00:02:55.330 --> 00:02:57.982
And when regular citizens
use anonymity stuff,

00:02:57.982 --> 00:03:00.750
they tend to be doing it
because they want privacy

00:03:00.750 --> 00:03:04.455
in search results, privacy in
doing research on the internet.

00:03:04.455 --> 00:03:07.900
They want to be able to
engage in local politics

00:03:07.900 --> 00:03:12.180
while not offending local
politicians, and so on.

00:03:12.180 --> 00:03:15.210
Researchers frequently
use anonymizing tools

00:03:15.210 --> 00:03:21.800
to avoid gathering biased data,
biased by geolocation based

00:03:21.800 --> 00:03:23.685
services that might
be serving them

00:03:23.685 --> 00:03:26.500
in particular different
versions of things.

00:03:26.500 --> 00:03:29.700
Companies use
anonymity technologies

00:03:29.700 --> 00:03:32.620
for protection of
sensitive data.

00:03:32.620 --> 00:03:38.730
For instance, if I can
track all of the movements

00:03:38.730 --> 00:03:42.600
of the legal team for some
major internet company,

00:03:42.600 --> 00:03:49.360
I can probably, just by tracking
when they're visiting their web

00:03:49.360 --> 00:03:52.085
server from different
places around the world,

00:03:52.085 --> 00:03:54.126
or where they're visiting
the company [INAUDIBLE]

00:03:54.126 --> 00:03:56.000
different places
around the world,

00:03:56.000 --> 00:03:58.996
learn a lot about which teams
are collaborating with which.

00:03:58.996 --> 00:04:00.370
And this is
information companies

00:04:00.370 --> 00:04:02.370
would like to keep private.

00:04:02.370 --> 00:04:07.540
Companies also use anonymity
technology for doing research.

00:04:07.540 --> 00:04:12.130
So a major router
manufacturer for a while--

00:04:12.130 --> 00:04:13.800
I don't know if this
is still the case--

00:04:13.800 --> 00:04:17.200
would regularly serve different
versions of its product sheets

00:04:17.200 --> 00:04:20.200
to IP addresses associated
with its competitors

00:04:20.200 --> 00:04:23.851
in order to make reverse
engineering trickier.

00:04:23.851 --> 00:04:26.142
And they found this out by
using our software and said,

00:04:26.142 --> 00:04:28.308
hey, wait a minute, we got
a different product sheet

00:04:28.308 --> 00:04:32.407
when we came in from Tor
than we did coming directly.

00:04:32.407 --> 00:04:34.365
And it's also kind of
normal for some companies

00:04:34.365 --> 00:04:36.679
to serve other companies
versions of their websites

00:04:36.679 --> 00:04:38.720
to emphasize the employment
opportunity sections.

00:04:41.660 --> 00:04:46.910
Regular law enforcement needs
anonymity technologies as well

00:04:46.910 --> 00:04:49.900
to avoid tipping off people
during investigations.

00:04:49.900 --> 00:04:51.955
You do not want the
local police station

00:04:51.955 --> 00:04:57.290
to appear in the web logs of
somebody you're investigating.

00:04:57.290 --> 00:05:00.960
And regular folks
need it, as I said,

00:05:00.960 --> 00:05:04.640
for avoiding harassment
because of online activities,

00:05:04.640 --> 00:05:07.600
to research stuff that
might be embarrassing.

00:05:07.600 --> 00:05:13.390
If you live in a country with
uncertain health care laws,

00:05:13.390 --> 00:05:16.420
you might want to avoid
creating too much public record

00:05:16.420 --> 00:05:19.070
of what diseases you think
you might have and so on,

00:05:19.070 --> 00:05:21.920
or what dangerous
hobbies you might have.

00:05:21.920 --> 00:05:27.400
And also lots of criminal or bad
folks use anonymity technology.

00:05:27.400 --> 00:05:28.710
It's not their only option.

00:05:28.710 --> 00:05:33.650
But if you are willing to
purchase time on a botnet,

00:05:33.650 --> 00:05:35.360
you can buy some
pretty good privacy

00:05:35.360 --> 00:05:38.329
that is not available
to people who

00:05:38.329 --> 00:05:39.620
think that botnets are immoral.

00:05:39.620 --> 00:05:43.890
And Tor, and anonymity
stuff in general,

00:05:43.890 --> 00:05:49.482
are not the only dual-use
technology out there.

00:05:49.482 --> 00:05:51.690
Let's see, the average age
of an undergraduate is about 20.

00:05:51.690 --> 00:05:56.706
So around when you were
born-- have you talked

00:05:56.706 --> 00:05:58.327
about crypto wars at all?

00:05:58.327 --> 00:05:59.181
PROFESSOR: No.

00:05:59.181 --> 00:06:00.270
NICK MATHEWSON: No.

00:06:00.270 --> 00:06:02.700
During the 1990s, it was sort
of an up-in-the-air question

00:06:02.700 --> 00:06:06.590
in the United States about
to what extent civilian use

00:06:06.590 --> 00:06:09.120
of non-backdoored cryptography
should be legal,

00:06:09.120 --> 00:06:11.320
and to what extent it
should be exported.

00:06:11.320 --> 00:06:13.200
That kind of came
down pretty decisively

00:06:13.200 --> 00:06:17.090
on the side of cryptography
should be legal and exportable

00:06:17.090 --> 00:06:20.310
during the '90s and early 2000s.

00:06:20.310 --> 00:06:24.350
And although there's some debate
about anonymity technology,

00:06:24.350 --> 00:06:27.100
it's more or less
the same debate.

00:06:27.100 --> 00:06:30.750
And I think it's going to end
in more or less the same way.

00:06:30.750 --> 00:06:33.264
So here's an outline of my talk.

00:06:33.264 --> 00:06:35.680
I'm going to give you that
little introduction I gave you,

00:06:35.680 --> 00:06:37.721
talk a little bit about
what we mean by anonymity

00:06:37.721 --> 00:06:40.235
in a technical sense, talk a
little about our motivations

00:06:40.235 --> 00:06:41.068
for getting into it.

00:06:41.068 --> 00:06:44.970
Then I'm going to kind
of walk you through step

00:06:44.970 --> 00:06:47.450
by step how you start
with the idea of,

00:06:47.450 --> 00:06:50.445
we ought to have some
anonymity, and how

00:06:50.445 --> 00:06:52.902
do you wind up with the
design of Tor from that point.

00:06:52.902 --> 00:06:54.660
And I'll mention
some branching off

00:06:54.660 --> 00:06:56.990
points where you might
wind up with other designs.

00:06:56.990 --> 00:06:59.780
I'll pause to answer some
of the cool questions

00:06:59.780 --> 00:07:04.220
that everyone has sent in
for their class assignment.

00:07:04.220 --> 00:07:06.710
I'll talk a little bit about
how node discovery works,

00:07:06.710 --> 00:07:08.394
which is an important topic.

00:07:08.394 --> 00:07:10.150
And then I'll sort
of by show of hands

00:07:10.150 --> 00:07:12.856
pick which of these
advanced topics to cover.

00:07:12.856 --> 00:07:15.230
I guess we're calling them
advanced because they're later

00:07:15.230 --> 00:07:16.360
in the lecture.

00:07:16.360 --> 00:07:19.750
And I can't read them all,
but they're all really cool.

00:07:19.750 --> 00:07:21.655
I'll mention some
related systems

00:07:21.655 --> 00:07:23.905
whose designs you
ought to check out

00:07:23.905 --> 00:07:26.370
if this is a topic that
interests you and you'd like

00:07:26.370 --> 00:07:27.286
to know more about it.

00:07:27.286 --> 00:07:30.340
I'll talk about future work
that we want to have done at Tor

00:07:30.340 --> 00:07:32.860
and I hope that we'll
have time to do some day.

00:07:32.860 --> 00:07:35.870
And if there's time for
questions, then I'll take them.

00:07:35.870 --> 00:07:38.930
And I've got nowhere I need
to be for the next hour or so.

00:07:38.930 --> 00:07:43.307
So I and my colleague David over
there-- can you wave your hand,

00:07:43.307 --> 00:07:47.340
David-- will be hanging
around somewhere and talking

00:07:47.340 --> 00:07:48.613
to anyone who wants to talk.

00:07:48.613 --> 00:07:52.750
So right, anonymity--
what do we mean

00:07:52.750 --> 00:07:54.183
when we talk about anonymity?

00:07:54.183 --> 00:07:57.210
There are lots of
informal notions

00:07:57.210 --> 00:08:03.390
that get used in informal
discussions, online, and so

00:08:03.390 --> 00:08:03.890
on.

00:08:03.890 --> 00:08:05.390
Some people use
anonymous to mean,

00:08:05.390 --> 00:08:06.598
I didn't write my name on it.

00:08:06.598 --> 00:08:10.900
Some people use
anonymous to mean, well,

00:08:10.900 --> 00:08:12.290
no one can actually
prove it's me

00:08:12.290 --> 00:08:15.230
even if you suspect strongly.

00:08:15.230 --> 00:08:18.200
What we mean is a number
of notions expressed

00:08:18.200 --> 00:08:25.560
in terms of the
ability of an observer

00:08:25.560 --> 00:08:32.590
or attacker on a network to
link participants to actions.

00:08:32.590 --> 00:08:35.870
These notions come out
of a terminology paper

00:08:35.870 --> 00:08:38.659
by [INAUDIBLE] that
you can find a link

00:08:38.659 --> 00:08:43.929
to on freehaven.net/anonbib/,
the anonymity bibliography that

00:08:43.929 --> 00:08:46.790
I help maintain.

00:08:46.790 --> 00:08:49.423
It should list most of the
good papers in the field.

00:08:49.423 --> 00:08:51.550
We need to bring it
up to date to 2014,

00:08:51.550 --> 00:08:53.390
but it's pretty useful.

00:08:53.390 --> 00:08:55.840
So when I say anonymity,
generally what I mean

00:08:55.840 --> 00:09:01.080
is Alice is doing some activity.

00:09:01.080 --> 00:09:05.132
She's-- what should
Alice be doing?

00:09:05.132 --> 00:09:06.215
Alice is buying new socks.

00:09:10.270 --> 00:09:11.820
And there's some attacker here.

00:09:11.820 --> 00:09:14.770
Let's call her Eve for now.

00:09:14.770 --> 00:09:18.999
Eve can tell that Alice
is doing something.

00:09:18.999 --> 00:09:21.040
Preventing that is not
what we mean by anonymity.

00:09:21.040 --> 00:09:22.890
That's called unobservability.

00:09:22.890 --> 00:09:26.550
Eve can tell possibly that
someone is buying socks.

00:09:26.550 --> 00:09:28.850
Again, that's not what
we mean by anonymity.

00:09:28.850 --> 00:09:33.480
But what we hope is that
Eve cannot tell that Alice

00:09:33.480 --> 00:09:36.310
in particular is buying socks.

00:09:36.310 --> 00:09:40.190
And we mean that both
on a categorical level--

00:09:40.190 --> 00:09:42.935
Eve should not be
able to conclude

00:09:42.935 --> 00:09:45.060
through rigorous mathematical
proof, this is Alice,

00:09:45.060 --> 00:09:48.430
she's buying socks--
but also, Eve should not

00:09:48.430 --> 00:09:52.180
be able to conclude
probabilistically it's likelier

00:09:52.180 --> 00:09:56.030
that Alice is buying socks than
some randomly selected person.

00:09:56.030 --> 00:09:59.080
And also, we would
like Eve not to be

00:09:59.080 --> 00:10:02.250
able to conclude after
observing many Alice activities,

00:10:02.250 --> 00:10:05.280
Alice sometimes buys
socks, even if I

00:10:05.280 --> 00:10:08.260
don't know some particular
activity of Alice's is

00:10:08.260 --> 00:10:09.045
a socks purchase.

00:10:12.650 --> 00:10:14.560
There are other ideas
that are related.

00:10:14.560 --> 00:10:17.030
One is unlinkability.

00:10:17.030 --> 00:10:23.876
Unlinkability is about a
long-term profile of Alice.

00:10:23.876 --> 00:10:26.210
So for instance,
Alice has been posting

00:10:26.210 --> 00:10:33.050
as-- I'm never good at
picking names for my example.

00:10:33.050 --> 00:10:39.060
Alice has been posting as Bob
and writing a political blog

00:10:39.060 --> 00:10:43.315
that would disrupt
her career, that

00:10:43.315 --> 00:10:45.490
would offend her
department head and disrupt

00:10:45.490 --> 00:10:49.820
her career as a computer
security [INAUDIBLE].

00:10:49.820 --> 00:10:53.280
So she's been writing as Bob.

00:10:53.280 --> 00:10:59.650
Unlinkability is Eve's
inability to link Alice

00:10:59.650 --> 00:11:01.950
to a particular profile.

00:11:01.950 --> 00:11:05.910
Final notion--
unobservability, some systems

00:11:05.910 --> 00:11:12.540
try to make it impossible to
even tell that Alice is online,

00:11:12.540 --> 00:11:15.620
that Alice is connecting to
anybody at all, that Alice

00:11:15.620 --> 00:11:17.190
is doing any activity.

00:11:17.190 --> 00:11:20.660
These are rather hard to build.

00:11:20.660 --> 00:11:22.660
I'll talk a little bit
more about to what extent

00:11:22.660 --> 00:11:25.650
they are useful later.

00:11:25.650 --> 00:11:27.610
Something that is
useful in that area

00:11:27.610 --> 00:11:29.745
is you might want to
conceal that Alice

00:11:29.745 --> 00:11:32.240
is using an anonymity
system, but not

00:11:32.240 --> 00:11:33.630
that she is on the internet.

00:11:33.630 --> 00:11:35.910
That's more achievable
than concealing the fact

00:11:35.910 --> 00:11:39.070
that Alice is on the
internet entirely.

00:11:39.070 --> 00:11:42.710
So why did I start working
on this in the first place?

00:11:42.710 --> 00:11:45.177
Well, partially because
of the engineer's itch.

00:11:45.177 --> 00:11:46.010
It's a cool problem.

00:11:46.010 --> 00:11:47.870
It's an interesting problem.

00:11:47.870 --> 00:11:50.805
Nobody else was
actually working on it.

00:11:50.805 --> 00:11:52.480
And my friend Roger
got a contract

00:11:52.480 --> 00:11:56.940
to finish up a stalled
research project

00:11:56.940 --> 00:11:58.490
before the grant expired.

00:11:58.490 --> 00:12:03.585
And he did it well enough that
I said, hey, I'll join up.

00:12:03.585 --> 00:12:05.200
And [INAUDIBLE].

00:12:05.200 --> 00:12:06.720
I'll join in.

00:12:06.720 --> 00:12:09.740
After a while, we
formed a nonprofit

00:12:09.740 --> 00:12:13.310
and released everything
as open source.

00:12:13.310 --> 00:12:14.870
So that's part of it.

00:12:14.870 --> 00:12:18.810
But for deeper
motivations, I think

00:12:18.810 --> 00:12:21.800
humanity has got a lot
of problems that can only

00:12:21.800 --> 00:12:25.760
be solved through better
and more dedicated

00:12:25.760 --> 00:12:30.530
communication, freer expression,
and more freedom of thought.

00:12:30.530 --> 00:12:33.890
And I don't know how to
solve these problems.

00:12:33.890 --> 00:12:37.360
All I think I can do
is try to make sure

00:12:37.360 --> 00:12:40.880
that what I see as
inhibiting discussion,

00:12:40.880 --> 00:12:44.780
thought, speech,
becomes harder to do.

00:12:44.780 --> 00:12:46.275
So that's [INAUDIBLE].

00:12:46.275 --> 00:12:47.188
Yeah.

00:12:47.188 --> 00:12:49.604
STUDENT: So I know there are
many good reasons to use Tor.

00:12:49.604 --> 00:12:51.062
Please don't see
this as criticism.

00:12:51.062 --> 00:12:53.036
I'm just curious,
what is your opinion

00:12:53.036 --> 00:12:55.297
as far as criminal activity?

00:12:55.297 --> 00:12:57.630
NICK MATHEWSON: What is my
opinion on criminal activity?

00:12:57.630 --> 00:12:58.430
Some laws are good.

00:12:58.430 --> 00:12:59.532
Some laws are bad.

00:12:59.532 --> 00:13:01.490
My lawyers would tell me
never to advise anyone

00:13:01.490 --> 00:13:02.860
to break the law.

00:13:05.750 --> 00:13:08.253
My goal was not to enable
criminal activity against most

00:13:08.253 --> 00:13:10.550
of the laws I agree with.

00:13:10.550 --> 00:13:13.140
In places where criticising
the government is illegal,

00:13:13.140 --> 00:13:17.399
then I'm in favor of criminal
activity of that kind.

00:13:17.399 --> 00:13:19.190
So in that case, I
suppose I was supporting

00:13:19.190 --> 00:13:21.330
that kind of criminal activity.

00:13:21.330 --> 00:13:24.946
My stance on whether it's
a problem that an anonymity

00:13:24.946 --> 00:13:26.570
network gets used
for criminal activity

00:13:26.570 --> 00:13:29.215
in general, to the extent
that there are good laws,

00:13:29.215 --> 00:13:31.660
I would prefer that
people not break them.

00:13:31.660 --> 00:13:36.980
I would, however, think that any
computer security system that

00:13:36.980 --> 00:13:40.830
does not get used by criminals
is probably a very bad computer

00:13:40.830 --> 00:13:43.720
security system if
the criminals are

00:13:43.720 --> 00:13:46.770
making any kind of good
decision making policy.

00:13:46.770 --> 00:13:49.820
I think that if we go
around banning security

00:13:49.820 --> 00:13:54.619
that works for criminals, we
wind up with insecure systems.

00:13:54.619 --> 00:13:56.410
So that's more or less
where I stand on it.

00:13:56.410 --> 00:13:58.284
I'm not really the
philosopher of it, though.

00:13:58.284 --> 00:13:59.620
I'm more of the programmer.

00:13:59.620 --> 00:14:01.760
So I'm going to be giving
really trite answers

00:14:01.760 --> 00:14:03.362
to philosophical
and legal questions.

00:14:03.362 --> 00:14:05.570
Also, I'm not a lawyer and
cannot offer legal advice.

00:14:05.570 --> 00:14:08.510
Do not take anything
I say as legal advice.

00:14:08.510 --> 00:14:14.464
That said, [INAUDIBLE], a lot
of these research problems

00:14:14.464 --> 00:14:15.880
that I'm going to
be talking about

00:14:15.880 --> 00:14:17.484
weren't even close
to being solved.

00:14:17.484 --> 00:14:19.650
So why did we start anyway
instead of going straight

00:14:19.650 --> 00:14:21.099
into research?

00:14:21.099 --> 00:14:23.140
One of the reasons, we
thought that a lot of them

00:14:23.140 --> 00:14:27.300
wouldn't get solved unless
there was a test bed to work on.

00:14:27.300 --> 00:14:29.250
And that's kind
of been borne out.

00:14:29.250 --> 00:14:33.590
Because Tor has kind of become
the research platform of choice

00:14:33.590 --> 00:14:36.530
for lots of work on low
latency anonymity systems.

00:14:36.530 --> 00:14:38.580
And it's helped the
field a lot in that way.

00:14:38.580 --> 00:14:41.120
But also, 10 years on, a
lot of the big problems

00:14:41.120 --> 00:14:42.650
still aren't solved.

00:14:42.650 --> 00:14:45.740
So if we had waited 10 years
for everything to get fixed,

00:14:45.740 --> 00:14:48.290
we would have been
waiting in vain.

00:14:48.290 --> 00:14:51.760
So why do it then?

00:14:51.760 --> 00:14:58.740
Partially because we thought
that having a system out there

00:14:58.740 --> 00:15:03.041
would improve long-term
outcomes for the world.

00:15:03.041 --> 00:15:05.290
That is, it's really easy
to argue that something that

00:15:05.290 --> 00:15:08.250
doesn't exist should be banned.

00:15:08.250 --> 00:15:10.440
Arguments against civilian
use of cryptography

00:15:10.440 --> 00:15:13.230
were much easier to
make in public in 1990

00:15:13.230 --> 00:15:14.361
than they are today.

00:15:14.361 --> 00:15:15.860
Because there was
almost no civilian

00:15:15.860 --> 00:15:18.240
use of strong cryptography then.

00:15:18.240 --> 00:15:23.050
And you could argue that if
anything stronger than DES

00:15:23.050 --> 00:15:28.010
is legal, then
civilization will collapse.

00:15:28.010 --> 00:15:34.900
Criminals will never be
caught, and organized crime

00:15:34.900 --> 00:15:36.525
will take over everything.

00:15:36.525 --> 00:15:38.150
But you couldn't
really argue that that

00:15:38.150 --> 00:15:41.410
was the inevitable consequence
of cryptography in 2000.

00:15:41.410 --> 00:15:43.440
Because cryptography had
already been out there,

00:15:43.440 --> 00:15:46.160
and it turned out
not to end the world.

00:15:46.160 --> 00:15:49.420
Further, it was harder to argue
for a cryptography ban in 2000

00:15:49.420 --> 00:15:54.270
because there was a large
constituency in favor

00:15:54.270 --> 00:15:56.090
of the use of cryptography.

00:15:56.090 --> 00:15:59.150
That is, if someone
in 1985 says,

00:15:59.150 --> 00:16:01.630
let's ban strong
cryptography, well, banks

00:16:01.630 --> 00:16:02.880
are using strong cryptography.

00:16:02.880 --> 00:16:04.860
So they'll ask for an exemption.

00:16:04.860 --> 00:16:05.580
But other than
that, there weren't

00:16:05.580 --> 00:16:07.121
a lot of users of
strong cryptography

00:16:07.121 --> 00:16:08.384
in the civilian space.

00:16:08.384 --> 00:16:09.800
But if someone in
2000 said, let's

00:16:09.800 --> 00:16:12.180
ban strong
cryptography, that would

00:16:12.180 --> 00:16:14.900
be every internet company.

00:16:14.900 --> 00:16:18.885
Everyone running an HTTPS page
would start waving their hands

00:16:18.885 --> 00:16:20.050
and shouting about it.

00:16:20.050 --> 00:16:21.690
And nowadays, strong
cryptography bans

00:16:21.690 --> 00:16:24.610
are probably unfeasible,
although people

00:16:24.610 --> 00:16:26.000
keep bringing back the idea.

00:16:26.000 --> 00:16:27.470
And again, I'm not
the philosopher

00:16:27.470 --> 00:16:29.980
or political scientist
of the movement.

00:16:29.980 --> 00:16:34.860
So some folks ask me,
what's your threat model?

00:16:34.860 --> 00:16:37.390
It's good to be thinking
in terms of threat models.

00:16:37.390 --> 00:16:40.280
Unfortunately, our threat
model is kind of weird.

00:16:40.280 --> 00:16:43.570
We started not with an
adversary requirement.

00:16:43.570 --> 00:16:46.202
But we started with a
usability requirement.

00:16:46.202 --> 00:16:48.700
The usability requirement
we gave ourselves to begin

00:16:48.700 --> 00:16:52.395
is, this has to be
useful for web browsing.

00:16:52.395 --> 00:16:58.910
This has to be useful for
interactive protocols.

00:16:58.910 --> 00:17:01.110
And it actually
needs to see use.

00:17:01.110 --> 00:17:04.800
Subject to that, we want
to maximize security.

00:17:04.800 --> 00:17:07.369
So our threat model has
lots of weird corners

00:17:07.369 --> 00:17:10.050
in it if you actually
write it out as,

00:17:10.050 --> 00:17:13.410
what can an attacker do, under
what circumstances, and how?

00:17:13.410 --> 00:17:15.780
And that's because we've
set ourselves the goal of,

00:17:15.780 --> 00:17:17.810
it has to work for the web.

00:17:17.810 --> 00:17:20.443
And I'll return to that
in a minute or two.

00:17:20.443 --> 00:17:23.180
But let's sort of
talk about now how

00:17:23.180 --> 00:17:29.810
we can use forward anonymity,
how we build forward anonymity.

00:17:29.810 --> 00:17:32.580
So here's Alice.

00:17:32.580 --> 00:17:35.890
She wants to buy socks.

00:17:35.890 --> 00:17:42.170
So OK, let's say that
Alice runs a computer.

00:17:42.170 --> 00:17:43.820
Let's call it R for relay.

00:17:43.820 --> 00:17:47.670
And this computer relays
her traffic to-- I

00:17:47.670 --> 00:17:50.600
want to say socks.com,
but I'm afraid that'll

00:17:50.600 --> 00:17:53.690
turn out to be something
horrible, so zappos.com.

00:17:53.690 --> 00:17:55.000
Yeah, they sell socks, too.

00:17:55.000 --> 00:17:58.470
All right, so Alice wants to
buy some socks from zappos.com.

00:17:58.470 --> 00:18:00.930
And she's going through a relay.

00:18:00.930 --> 00:18:04.530
Well, I said Alice runs a relay.

00:18:04.530 --> 00:18:07.910
Any eavesdropper who's
looking at this will say,

00:18:07.910 --> 00:18:09.240
that's Alice's computer.

00:18:09.240 --> 00:18:11.097
It's probably Alice.

00:18:11.097 --> 00:18:13.180
All right, so let's have
somebody else run a relay

00:18:13.180 --> 00:18:17.340
and have lots of other
users all visit it.

00:18:17.340 --> 00:18:20.200
I'll call them A2
and A3, because there

00:18:20.200 --> 00:18:26.720
aren't enough standard
cryptography person names-- buy

00:18:26.720 --> 00:18:34.332
books, tweet cat pictures.

00:18:37.600 --> 00:18:42.670
This is like 80% of what people
do on the internet, right?

00:18:42.670 --> 00:18:46.650
So now we have three people all
going into this relay, three

00:18:46.650 --> 00:18:47.600
streams exiting.

00:18:47.600 --> 00:18:51.090
Someone who's watching the
relay can't easily correlate--

00:18:51.090 --> 00:18:54.290
should not be able to, we hope, but
we'll return to that later--

00:18:54.290 --> 00:18:58.300
that this Alice is buying
socks, this Alice is buying books,

00:18:58.300 --> 00:19:00.860
this Alice is tweeting cat pictures.

00:19:00.860 --> 00:19:06.090
Well, except if they're watching
this side of the connections,

00:19:06.090 --> 00:19:08.530
they can see Alice
telling the relay,

00:19:08.530 --> 00:19:10.554
please connect me to zappos.com.

00:19:10.554 --> 00:19:12.220
All right, so we'll
add some encryption.

00:19:12.220 --> 00:19:15.200
We'll maybe do TLS on
all of these links.

00:19:15.200 --> 00:19:18.015
So to the extent that you
can't break TLS, to the extent

00:19:18.015 --> 00:19:20.200
you can't correlate
this to this,

00:19:20.200 --> 00:19:22.630
then they get some privacy.

00:19:22.630 --> 00:19:25.830
Well, that's still not
good enough, though.

00:19:25.830 --> 00:19:31.040
Because first off, we're
assuming that this relay

00:19:31.040 --> 00:19:32.619
is fully trusted.

00:19:32.619 --> 00:19:34.410
I assume you know the
definition of trusted

00:19:34.410 --> 00:19:36.460
and why it doesn't
actually mean trusted.

00:19:36.460 --> 00:19:37.940
OK, good.

00:19:37.940 --> 00:19:39.440
This is trusted in
the sense that it

00:19:39.440 --> 00:19:41.720
can break the whole system,
trusted in the sense

00:19:41.720 --> 00:19:44.845
that you can't help but trust
it, not trusted in the sense

00:19:44.845 --> 00:19:46.620
that it's actually trustworthy.

00:19:46.620 --> 00:19:49.720
So all right, we can
introduce multiple relays.

00:19:49.720 --> 00:19:53.410
We can have different relays
run by different people.

00:19:53.410 --> 00:20:00.120
We can have-- this is not
actually the topology we use.

00:20:00.120 --> 00:20:01.885
But my blackboard
technique is terrible,

00:20:01.885 --> 00:20:04.225
and I don't want
to redraw anything.

00:20:07.190 --> 00:20:09.720
We can imagine tumbling
these connections

00:20:09.720 --> 00:20:11.680
through multiple
relays, each of which

00:20:11.680 --> 00:20:14.170
removes a single
layer of encryption.

00:20:14.170 --> 00:20:19.770
So all this relay sees is
Alice is doing something.

00:20:19.770 --> 00:20:23.610
All this relay sees is
someone is buying socks.

00:20:23.610 --> 00:20:26.240
But this one just sees
someone is buying socks.

00:20:26.240 --> 00:20:28.562
The connection came
from this relay.

00:20:28.562 --> 00:20:30.395
This one just sees Alice
is doing something,

00:20:30.395 --> 00:20:32.320
and it forwards onto this relay.

00:20:32.320 --> 00:20:35.505
And no single party ought
to be able to correlate

00:20:35.505 --> 00:20:37.450
the whole thing.

00:20:37.450 --> 00:20:42.780
Now we come to a
major design point.

00:20:42.780 --> 00:20:50.090
Let's suppose that Eve is
watching here and here.

00:20:50.090 --> 00:20:52.250
Nothing I've said
so far does anything

00:20:52.250 --> 00:20:57.860
to obscure the timing and
volume of Alice's packets.

00:20:57.860 --> 00:21:01.140
Oh sure, there'll be
some trivial noise

00:21:01.140 --> 00:21:03.690
added from all the
computation and decryption

00:21:03.690 --> 00:21:06.220
these things do, from
network latency, and so on.

00:21:06.220 --> 00:21:11.600
But ultimately, if Alice
is sending a kilobyte in,

00:21:11.600 --> 00:21:13.500
then in the design I've
sketched out so far,

00:21:13.500 --> 00:21:16.315
a kilobyte is coming out.

00:21:16.315 --> 00:21:21.650
And if the socks web
page is 64k long,

00:21:21.650 --> 00:21:26.340
and is served by this
web server at 11:26,

00:21:26.340 --> 00:21:27.870
then Alice is going
to get something

00:21:27.870 --> 00:21:33.460
about 64k long at
11:26 or 11:27 or so.

00:21:33.460 --> 00:21:38.400
Now, with some
statistics, Eve can

00:21:38.400 --> 00:21:42.540
correlate some of these
streams if we don't obscure

00:21:42.540 --> 00:21:44.726
volume and timing information.
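
NOTE
Editor's sketch, not part of the lecture: a deliberately naive version of the
correlation Eve could attempt when she watches both ends. The observation
tuples, the 120-second window, and the 10 percent size slack are all
assumptions made for illustration; real attacks use better statistics.
```python
def correlate_by_time_and_size(served, received, max_delay=120, size_slack=0.1):
    """Match (site, time, bytes) responses to (user, time, bytes) downloads."""
    matches = {}
    for user, t_recv, n_recv in received:
        for site, t_sent, n_sent in served:
            # The download must arrive shortly after the response left the site.
            close_in_time = 0 <= t_recv - t_sent <= max_delay
            # Encryption hides content but leaves the volume roughly unchanged.
            close_in_size = abs(n_recv - n_sent) <= size_slack * n_sent
            if close_in_time and close_in_size:
                matches.setdefault(user, []).append(site)
    return matches
```
A 64k page served at 11:26 and a roughly 64k download reaching Alice a minute
later would be paired, even though every link is encrypted.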

00:21:44.726 --> 00:21:46.850
There are designs that do
obscure volume and timing

00:21:46.850 --> 00:21:48.190
information.

00:21:48.190 --> 00:21:52.230
The good ones usually
come out of [INAUDIBLE],

00:21:52.230 --> 00:21:55.140
although there's
some work on DC-nets.

00:21:55.140 --> 00:21:58.040
You could have something
where each of these nodes

00:21:58.040 --> 00:22:00.600
received a large number of
requests, just [INAUDIBLE]

00:22:00.600 --> 00:22:03.030
up all the requests
they got for an hour,

00:22:03.030 --> 00:22:06.970
reordered them, and
transmitted them all at once.

00:22:06.970 --> 00:22:10.260
And you could also say all
requests must be the same size.

00:22:10.260 --> 00:22:13.670
Requests are 1k,
responses are 1 megabyte.
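
NOTE A rough sketch of the batching idea just described: pad every request to one size, hold the requests, then release them together in random order. The BatchMix class, the 2-byte length prefix, and CELL_SIZE are illustrative assumptions, and a real mix would also strip a layer of encryption so that inputs and outputs cannot be matched byte for byte.
```python
import random
CELL_SIZE = 1024  # assumed fixed request size ("requests are 1k")
def pad(message: bytes) -> bytes:
    """Pad a message to CELL_SIZE using a 2-byte length prefix (toy framing)."""
    if len(message) > CELL_SIZE - 2:
        raise ValueError("message too long for one cell")
    body = len(message).to_bytes(2, "big") + message
    return body + bytes(CELL_SIZE - len(body))
def unpad(cell: bytes) -> bytes:
    """Recover the original message from a padded cell."""
    length = int.from_bytes(cell[:2], "big")
    return cell[2:2 + length]
class BatchMix:
    """Collects padded messages and releases them together in random order."""
    def __init__(self):
        self._pool = []
        self._rng = random.SystemRandom()  # OS randomness for the shuffle
    def accept(self, message: bytes) -> None:
        self._pool.append(pad(message))
    def flush(self) -> list:
        batch, self._pool = self._pool, []
        self._rng.shuffle(batch)
        return batch
mix = BatchMix()
sent = [b"hello", b"a much longer request body", b"x"]
for msg in sent:
    mix.accept(msg)
released = mix.flush()
```
Every released cell has the same length, so size alone no longer tells an observer which input became which output.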

00:22:13.670 --> 00:22:15.680
And with some more
work on that, we

00:22:15.680 --> 00:22:22.440
get something that would let you
send an email that would arrive

00:22:22.440 --> 00:22:29.220
in order of hours, or get a web
page in order of to end time,

00:22:29.220 --> 00:22:32.610
assuming that you optimize
it to a single round trip.

00:22:32.610 --> 00:22:36.500
These systems exist, and existed
when we started doing Tor.

00:22:36.500 --> 00:22:38.675
They don't get a
lot of use, though.

00:22:38.675 --> 00:22:40.740
I actually wrote
one called Mixminion

00:22:40.740 --> 00:22:44.010
that was a successor to
the Mixmaster remailer.

00:22:44.010 --> 00:22:46.510
I have not gotten a remailer
message in the last three

00:22:46.510 --> 00:22:47.010
years.

00:22:49.620 --> 00:22:51.350
Tor has millions of users.

00:22:51.350 --> 00:22:54.293
Remailers, it's unclear
whether they've got more than

00:22:54.293 --> 00:22:55.477
on the order of hundreds.

00:22:55.477 --> 00:22:57.310
So you might think,
well, still though, it's

00:22:57.310 --> 00:22:59.830
better anonymity for the
people who really need it.

00:22:59.830 --> 00:23:03.120
Except if you've only got on
the order of hundreds of users,

00:23:03.120 --> 00:23:05.655
then you're not
really providing them

00:23:05.655 --> 00:23:08.630
all that much anonymity against
this kind of adversary anyway.

00:23:08.630 --> 00:23:10.260
Because this adversary
can simply go,

00:23:10.260 --> 00:23:12.250
OK, there's 100 people.

00:23:12.250 --> 00:23:14.080
Well, the message I
want to investigate

00:23:14.080 --> 00:23:15.630
was looking at a
Bulgarian website.

00:23:15.630 --> 00:23:17.040
How many of them
speak Bulgarian?

00:23:17.040 --> 00:23:20.170
OK, that's five.
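
NOTE A tiny illustration of how one side fact shrinks a small user base. The population below is hypothetical, and measuring anonymity as log2 of the candidate count assumes every remaining candidate is equally likely.
```python
from math import log2
# Hypothetical population: 100 users, 5 of whom read Bulgarian.
users = [{"id": i, "reads_bulgarian": i < 5} for i in range(100)]
def anonymity_bits(candidates) -> float:
    """Bits of uncertainty if every remaining candidate is equally likely."""
    return log2(len(candidates))
everyone = users
suspects = [u for u in users if u["reads_bulgarian"]]
print(len(everyone), round(anonymity_bits(everyone), 2))  # 100 6.64
print(len(suspects), round(anonymity_bits(suspects), 2))  # 5 2.32
```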

00:23:20.170 --> 00:23:22.950
The saying is,
anonymity loves company.

00:23:22.950 --> 00:23:25.615
Unless you have a
large user base,

00:23:25.615 --> 00:23:28.230
no system can actually
provide anonymity.

00:23:28.230 --> 00:23:31.970
And that's why also in this
design, if these Alices all

00:23:31.970 --> 00:23:33.770
belong to an
organization, they ought

00:23:33.770 --> 00:23:38.830
to have a shared public system
rather than a private one.

00:23:38.830 --> 00:23:45.130
If they all work for
MIT legal, and they're

00:23:45.130 --> 00:23:50.120
investigating some
fake MIT website that's

00:23:50.120 --> 00:23:54.663
offering fake diplomas,
then if they're just

00:23:54.663 --> 00:23:58.800
using the MIT legal anonymizer,
then it's not really

00:23:58.800 --> 00:24:00.370
concealing who they are.

00:24:00.370 --> 00:24:02.495
But if you have a large
number of different parties

00:24:02.495 --> 00:24:06.590
all using this, then it actually
can provide some privacy.

00:24:06.590 --> 00:24:13.830
So we'll return one more time
to resisting these correlation

00:24:13.830 --> 00:24:14.330
attacks.

00:24:14.330 --> 00:24:16.996
But for now let's say that we're
not resisting these correlation

00:24:16.996 --> 00:24:17.720
attacks.

00:24:17.720 --> 00:24:23.070
And instead, we assume that
an attacker who sees both ends

00:24:23.070 --> 00:24:25.850
wins, and we're trying to
minimize the probability

00:24:25.850 --> 00:24:28.220
that that happens over time.

00:24:28.220 --> 00:24:31.150
All right, so I've just
talked about message passing.

00:24:35.464 --> 00:24:37.880
The way you would build that
with something like a mix net

00:24:37.880 --> 00:24:45.630
is you give each of these relays
a public key-- K3, K2, K1.

00:24:45.630 --> 00:24:48.480
And when Alice wants to
send something through here,

00:24:48.480 --> 00:24:55.110
she would say, encrypt
with K3, socks,

00:24:55.110 --> 00:24:59.350
and then encrypt
that with K2-- I'm

00:24:59.350 --> 00:25:01.430
leaving off routing
information for now--

00:25:01.430 --> 00:25:04.320
and then encrypt with K1.

00:25:04.320 --> 00:25:05.894
But public key, as
you know, is kind

00:25:05.894 --> 00:25:08.310
of expensive enough that you
don't want to use it for bulk

00:25:08.310 --> 00:25:10.000
traffic.

00:25:10.000 --> 00:25:17.610
So instead what you
do is you negotiate

00:25:17.610 --> 00:25:20.110
a set of keys with each server.

00:25:20.110 --> 00:25:23.350
So Alice shares a symmetric
key with this relay,

00:25:23.350 --> 00:25:25.100
a different symmetric
key with this relay,

00:25:25.100 --> 00:25:28.395
and a different symmetric key
with this relay associated

00:25:28.395 --> 00:25:32.110
in what we call a circuit, which
is a path through the network.

00:25:32.110 --> 00:25:38.677
And after the initial public-key
setup to create those keys,

00:25:38.677 --> 00:25:40.135
Alice can then use
symmetric crypto

00:25:40.135 --> 00:25:41.551
to send stuff
through the network.
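
NOTE A toy sketch of sending one message through a three-hop circuit with a separate symmetric key per relay. The SHAKE-256 keystream is only a stand-in chosen so the example runs with the standard library; it is not the cipher Tor uses, and the key values and the names keystream and xor_layer are made up.
```python
import hashlib
def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream from SHAKE-256; a stand-in for a real stream cipher."""
    return hashlib.shake_256(key).digest(length)
def xor_layer(key: bytes, data: bytes) -> bytes:
    """Add or remove one layer (XOR with a keystream is its own inverse)."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))
# Hypothetical per-hop symmetric keys Alice negotiated for one circuit.
s1, s2, s3 = b"key-with-relay-1", b"key-with-relay-2", b"key-with-relay-3"
payload = b"socks"
# Alice wraps for the last relay first, so the first relay's layer is outermost.
cell = xor_layer(s1, xor_layer(s2, xor_layer(s3, payload)))
# Each relay peels exactly one layer and forwards what is left.
after_r1 = xor_layer(s1, cell)
after_r2 = xor_layer(s2, after_r1)
after_r3 = xor_layer(s3, after_r2)
```
Only the last relay ends up holding the plaintext; the earlier relays see data that is still wrapped in keys they do not have.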

00:25:41.551 --> 00:25:43.920
If you stop at that
point, then you

00:25:43.920 --> 00:25:47.250
have onion routing as it
was designed in the 1990s

00:25:47.250 --> 00:25:51.955
by Syverson,
Goldschlag, and Reed.

00:25:51.955 --> 00:25:54.811
And I hope I get
the names right.

00:25:54.811 --> 00:25:56.060
Paul Syverson is still active.

00:25:56.060 --> 00:25:59.210
The other two are
working on other things.

00:25:59.210 --> 00:26:03.390
Also, once you've added circuits
like that, medium term paths

00:26:03.390 --> 00:26:06.910
through the network, you can
have an easy reply channel

00:26:06.910 --> 00:26:09.310
where things sent
back this way get

00:26:09.310 --> 00:26:13.155
to Alice being encrypted at
each step instead of decrypted

00:26:13.155 --> 00:26:15.770
at each step.

00:26:15.770 --> 00:26:21.660
And of course you need some
kind of integrity checking,

00:26:21.660 --> 00:26:24.430
either node by
node or end to end.

00:26:24.430 --> 00:26:26.280
Because if you don't
do integrity checking,

00:26:26.280 --> 00:26:31.855
then-- well, let's say you're
using an XOR based stream

00:26:31.855 --> 00:26:33.622
cipher for your encryption.

00:26:33.622 --> 00:26:35.080
If you don't do
integrity checking,

00:26:35.080 --> 00:26:39.230
then this node can XOR in
Alice, Alice, Alice, Alice,

00:26:39.230 --> 00:26:42.410
Alice to the encrypted message.

00:26:42.410 --> 00:26:44.970
And then when it's finally
decrypted over here,

00:26:44.970 --> 00:26:47.310
because that's a
malleable crypto

00:26:47.310 --> 00:26:56.410
scheme, if the same attacker is
controlling this node as well,

00:26:56.410 --> 00:26:58.970
or if the attacker
is observing it here,

00:26:58.970 --> 00:27:01.870
the attacker will see Alice,
Alice, Alice, Alice, Alice

00:27:01.870 --> 00:27:03.820
XORed with a
reasonable plain text

00:27:03.820 --> 00:27:05.320
and be able to use
that to identify,

00:27:05.320 --> 00:27:08.580
ah, this is the stream
that came from Alice.
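
NOTE A small demonstration of the tagging problem just described, using a toy XOR stream cipher (SHAKE-256 keystream as a stand-in). The keys, the plaintext, and the helper names are assumptions for illustration. It also shows one possible integrity check, an HMAC over the ciphertext, which is only one way to do it and not necessarily what Tor does.
```python
import hashlib
import hmac
def stream_xor(key: bytes, data: bytes) -> bytes:
    """Toy XOR stream cipher; malleable because it has no integrity check."""
    pad = hashlib.shake_256(key).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, pad))
def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))
key = b"hypothetical-hop-key"
mac_key = b"hypothetical-mac-key"
plaintext = b"GET /index.html HTTP/1.0\r\n"
ciphertext = stream_xor(key, plaintext)
mac = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()
# A hostile relay XORs a marker into the ciphertext without knowing the key.
tag = (b"Alice" * 6)[:len(ciphertext)]
tampered = xor_bytes(ciphertext, tag)
# After decryption the output is plaintext XOR tag.
decrypted = stream_xor(key, tampered)
recovered_tag = xor_bytes(decrypted, plaintext)  # attacker guesses the plaintext
# With a MAC over the ciphertext, the receiver can notice the change.
tamper_detected = not hmac.compare_digest(
    mac, hmac.new(mac_key, tampered, hashlib.sha256).digest()
)
```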

00:27:08.580 --> 00:27:12.370
So let's do a little more
about how the protocol works.

00:27:12.370 --> 00:27:14.870
Because it would be a shame to
have everybody read the paper

00:27:14.870 --> 00:27:16.245
and then not talk
about the stuff

00:27:16.245 --> 00:27:17.680
that the paper is focused on.

00:27:24.011 --> 00:27:26.840
Again, I apologize for
my blackboard technique.

00:27:26.840 --> 00:27:32.120
Most of the time, I'm
sitting at home on a desktop.

00:27:32.120 --> 00:27:35.385
This is alien tech.

00:27:35.385 --> 00:27:38.315
So here's a relay.

00:27:38.315 --> 00:27:41.580
Here's Alice.

00:27:41.580 --> 00:27:43.610
Here's another relay.

00:27:43.610 --> 00:27:44.270
Here's Bob.

00:27:44.270 --> 00:27:45.843
Now Alice wants to talk to Bob.

00:27:48.460 --> 00:27:52.720
So first thing Alice has
to do is build a circuit

00:27:52.720 --> 00:27:55.210
through these relays to Bob.

00:27:55.210 --> 00:27:57.130
Let's say she's picked
these two, R1 and R2.

00:27:59.900 --> 00:28:08.050
So Alice first makes
a TLS link to R1.

00:28:08.050 --> 00:28:10.660
R1, let's say, already
has a TLS link to R2.

00:28:13.550 --> 00:28:16.335
First thing Alice
does is she does

00:28:16.335 --> 00:28:25.250
a one-way authenticated, one-way
anonymous key negotiation.

00:28:25.250 --> 00:28:28.340
The old one in
Tor is called TAP.

00:28:28.340 --> 00:28:30.280
The new one is called ntor.

00:28:30.280 --> 00:28:31.980
They both have proofs.

00:28:35.032 --> 00:28:36.490
They both even have
correct proofs,

00:28:36.490 --> 00:28:41.540
although the original proof
in the paper had a flaw in it.

00:28:41.540 --> 00:28:45.780
But when that's done,
she sends a create cell.

00:28:45.780 --> 00:28:47.690
And she picks a circuit ID.

00:28:47.690 --> 00:28:52.023
Let's say she picks
3, and says, create 3.

00:28:54.650 --> 00:28:55.650
The relay says, created.

00:29:00.010 --> 00:29:05.575
And now R1 and Alice share a
secret key, a symmetric key,

00:29:05.575 --> 00:29:06.866
which they're going to call S1.

00:29:10.280 --> 00:29:16.234
And they both have this stored
as 3 with respect to this link.

00:29:19.020 --> 00:29:23.810
Now Alice can use that key
to send messages to R1.

00:29:23.810 --> 00:29:27.265
So she says, on 3-- that's
the circuit ID that everything

00:29:27.265 --> 00:29:38.760
was talking about in the
paper-- send a relay extend

00:29:38.760 --> 00:29:41.210
with some contents.

00:29:41.210 --> 00:29:44.326
The extend cell basically
contains the first half

00:29:44.326 --> 00:29:47.130
of the create handshake.

00:29:47.130 --> 00:29:50.965
But this time, it's not
encrypted with R1's public key.

00:29:50.965 --> 00:29:53.070
It's encrypted with
R2's public key.

00:29:53.070 --> 00:29:56.130
And it also says, and
this one goes to R2.

00:29:56.130 --> 00:30:01.941
So R1 knows to open a new
circuit to R2, and says,

00:30:01.941 --> 00:30:02.440
create.

00:30:05.770 --> 00:30:09.480
And it passes the initial
part of the handshake

00:30:09.480 --> 00:30:12.120
as it came from Alice along.

00:30:12.120 --> 00:30:14.550
And it picks its own circuit ID.

00:30:14.550 --> 00:30:17.185
Because circuit IDs identify
the different circuits

00:30:17.185 --> 00:30:19.122
on this TLS connection.

00:30:19.122 --> 00:30:20.830
And Alice doesn't know
what other circuit

00:30:20.830 --> 00:30:22.120
IDs are in use on this one.

00:30:22.120 --> 00:30:24.390
Because this one is
private to R1 and R2.

00:30:24.390 --> 00:30:28.270
So it might pick 95.

00:30:28.270 --> 00:30:30.020
It actually is very
unlikely to pick that,

00:30:30.020 --> 00:30:36.270
because they're randomly
chosen from a 4 byte space.

00:30:36.270 --> 00:30:40.780
But I don't want to write
out any 32-bit numbers today.

00:30:40.780 --> 00:30:43.975
And this says,
created in response.

00:30:43.975 --> 00:30:48.590
So this one sends back an
extended encrypted with S1.

00:30:48.590 --> 00:30:58.480
And now Alice and
R2 share S2.

00:30:58.480 --> 00:31:01.050
So now Alice can send
messages encrypted

00:31:01.050 --> 00:31:06.480
first with S2, and then
with S1 as relay cells.

00:31:06.480 --> 00:31:08.000
So she sends a
message like that.

00:31:08.000 --> 00:31:12.960
R1 removes the S1 encryption
and forwards it on.

00:31:12.960 --> 00:31:17.750
It says, OK, it came
in on circuit 3.

00:31:17.750 --> 00:31:20.370
I know that 3 goes
to 95 on this one.

00:31:20.370 --> 00:31:23.075
So I send it on 95.

00:31:23.075 --> 00:31:25.852
And I say whatever I
got after decrypting.
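
NOTE A rough sketch of the bookkeeping R1 does in this walk-through: circuit IDs are local to each link, so the relay keeps a table from (incoming link, ID) to (outgoing link, ID). The Relay class and link names are hypothetical, and the sketch ignores any reserved values or other ID rules the real protocol has.
```python
import secrets
class Relay:
    """Toy relay that maps (link, circuit ID) pairs between neighbors."""
    def __init__(self, name):
        self.name = name
        self.next_hop = {}   # (in_link, in_circ_id) -> (out_link, out_circ_id)
        self.in_use = set()  # (link, circ_id) pairs already taken
    def new_circ_id(self, link):
        """Pick an unused random ID from a 4-byte space for this link."""
        while True:
            circ_id = secrets.randbelow(2 ** 32)
            if (link, circ_id) not in self.in_use:
                self.in_use.add((link, circ_id))
                return circ_id
    def extend(self, in_link, in_circ_id, out_link):
        """Handle an extend request: allocate an outgoing ID and remember it."""
        out_circ_id = self.new_circ_id(out_link)
        self.next_hop[(in_link, in_circ_id)] = (out_link, out_circ_id)
        return out_circ_id
    def route(self, in_link, in_circ_id):
        """Look up where a relay cell arriving on this circuit goes next."""
        return self.next_hop[(in_link, in_circ_id)]
r1 = Relay("R1")
out_id = r1.extend("link-to-Alice", 3, "link-to-R2")
destination = r1.route("link-to-Alice", 3)
```
Alice only ever learns the ID she picked (3 here); the ID on the R1 to R2 link is chosen by R1 and stays private to that link.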

00:31:25.852 --> 00:31:28.980
OK, and this one says,
ah, it came in on 95.

00:31:28.980 --> 00:31:33.290
95 corresponds to
the shared key S2.

00:31:33.290 --> 00:31:34.740
So I'll decrypt with that.

00:31:34.740 --> 00:31:38.340
Oh, that says, open
a connection to Bob.

00:31:38.340 --> 00:31:41.650
And relay 2 opens a
TCP connection to Bob

00:31:41.650 --> 00:31:45.270
and tells Alice that it did
it through the same process.

00:31:45.270 --> 00:31:47.150
And Alice says, great.

00:31:47.150 --> 00:31:58.440
Tell Bob HTTP 1.0, GET /index.html,
and the world goes on.

00:31:58.440 --> 00:32:00.120
Let's see, what did I leave out?

00:32:00.120 --> 00:32:03.040
I'll skip that, skip
that, skip that.

00:32:03.040 --> 00:32:04.930
So what do we actually relay?

00:32:04.930 --> 00:32:07.210
Some designs in this area
say, well, you should

00:32:07.210 --> 00:32:08.980
send IP packets back and forth.

00:32:08.980 --> 00:32:12.006
This should just be a way
to transmit IP packets.

00:32:12.006 --> 00:32:15.980
One of the problems
with that is we

00:32:15.980 --> 00:32:19.070
want to support as many
users as possible, which

00:32:19.070 --> 00:32:21.580
means we have to run on all
kinds of operating systems.

00:32:21.580 --> 00:32:23.920
And operating system
TCP stacks do not

00:32:23.920 --> 00:32:26.020
act anything like each other.

00:32:26.020 --> 00:32:27.960
If you've ever used
Nmap, or if you've ever

00:32:27.960 --> 00:32:30.610
used any kind of network
traffic analysis tool,

00:32:30.610 --> 00:32:34.635
you can trivially tell
Windows TCP from FreeBSD

00:32:34.635 --> 00:32:36.880
from Linux TCP.

00:32:36.880 --> 00:32:38.990
And you can even tell
different versions apart.

00:32:38.990 --> 00:32:41.870
And moreover, if you
can send raw IP packets

00:32:41.870 --> 00:32:45.560
to a chosen host,
you can provoke

00:32:45.560 --> 00:32:49.810
different responses
in part based

00:32:49.810 --> 00:32:51.637
on what the host is doing.

00:32:51.637 --> 00:32:53.458
So if you're doing
IP, you would actually

00:32:53.458 --> 00:32:55.900
need an IP normalization
layer if IP is what

00:32:55.900 --> 00:32:58.630
you transport back and forth.

00:32:58.630 --> 00:33:03.730
And it seems that anything less
than a full IP stack is not

00:33:03.730 --> 00:33:07.017
actually going to work
for IP normalization.

00:33:07.017 --> 00:33:08.350
So you wouldn't want to do that.

00:33:10.880 --> 00:33:13.560
Instead, what we just chose
is-- and this is largely

00:33:13.560 --> 00:33:15.960
because this is the
easiest way-- you take

00:33:15.960 --> 00:33:18.230
the contents of TCP streams.

00:33:18.230 --> 00:33:25.390
So you just assume
each of these things

00:33:25.390 --> 00:33:27.610
is reliable and in order.

00:33:27.610 --> 00:33:31.430
You have the computer on Alice's
end, the program Alice is

00:33:31.430 --> 00:33:35.400
running to do all
this stuff for her,

00:33:35.400 --> 00:33:38.120
accept TCP connections
from Alice's applications,

00:33:38.120 --> 00:33:40.720
and then just relay
their contents

00:33:40.720 --> 00:33:44.229
and don't do anything
trickier on the network level.

00:33:44.229 --> 00:33:46.020
You might be able to
get better performance

00:33:46.020 --> 00:33:46.970
by trying some other means.

00:33:46.970 --> 00:33:48.428
And there are some
papers examining

00:33:48.428 --> 00:33:49.880
how you would do that.

00:33:49.880 --> 00:33:52.820
But this is the one that we
could actually implement.

00:33:52.820 --> 00:33:54.392
Because we paid a
lot more attention

00:33:54.392 --> 00:33:56.100
in security and
compilers classes than we

00:33:56.100 --> 00:33:58.860
did in networking classes.

00:33:58.860 --> 00:34:00.760
Now we have networking people.

00:34:00.760 --> 00:34:04.285
But in 2003, 2004, we did not
have any networking experts.

00:34:07.250 --> 00:34:09.030
TCP also seems like
the right level.

00:34:09.030 --> 00:34:11.594
Higher level
protocols-- like in some

00:34:11.594 --> 00:34:13.210
of the original
[INAUDIBLE] designs,

00:34:13.210 --> 00:34:16.389
there were separate proxies
at this end for HTTP,

00:34:16.389 --> 00:34:19.000
for FTP, and so on.

00:34:19.000 --> 00:34:21.889
That seems to be
mostly a bad idea.

00:34:21.889 --> 00:34:24.060
Because any
interesting protocol is

00:34:24.060 --> 00:34:26.880
going to have end to end
encryption from Alice

00:34:26.880 --> 00:34:28.650
all the way to Bob.

00:34:28.650 --> 00:34:32.800
That is if we're lucky, Alice
is doing a TLS connection

00:34:32.800 --> 00:34:40.800
over this to Bob so that TLS
properties get her integrity

00:34:40.800 --> 00:34:44.110
and secrecy.

00:34:44.110 --> 00:34:46.909
But if that's the case,
then any kind of anonymizing

00:34:46.909 --> 00:34:50.840
transformations you want to
apply to the encrypted data

00:34:50.840 --> 00:34:53.139
need to happen in
the application

00:34:53.139 --> 00:34:56.710
Alice is using before
the TLS happens entirely.

00:34:56.710 --> 00:34:58.637
So you can't really
do that in a proxy.

00:34:58.637 --> 00:35:00.220
And that's kind of
why we came out to,

00:35:00.220 --> 00:35:03.370
OK, the sweet spot
is TCP contents.

00:35:03.370 --> 00:35:08.070
Somebody asked me, OK, but
where are your security proofs?

00:35:08.070 --> 00:35:11.530
We do have security proofs for a
lot of the cryptography that we

00:35:11.530 --> 00:35:15.760
use, standard reductions.

00:35:15.760 --> 00:35:19.510
For the protocol
as a whole, there

00:35:19.510 --> 00:35:23.069
are proofs in the field about
certain aspects of onion

00:35:23.069 --> 00:35:23.710
routing.

00:35:23.710 --> 00:35:27.310
But the models that they
have to use in order

00:35:27.310 --> 00:35:31.170
to prove that this
provides anonymity

00:35:31.170 --> 00:35:36.890
make assumptions about
the universe, the network,

00:35:36.890 --> 00:35:41.930
or the attacker's abilities
that are so weird as

00:35:41.930 --> 00:35:45.710
to satisfy no one but certain
program committees of more

00:35:45.710 --> 00:35:49.070
theoretical conferences.

00:35:49.070 --> 00:35:54.580
The kind of things you can prove
is that an attacker who sees

00:35:54.580 --> 00:36:02.890
this, who sees a number of
streams here all of equal

00:36:02.890 --> 00:36:07.140
volume and equal timing, cannot
tell which one goes to which

00:36:07.140 --> 00:36:11.650
Bob simply by looking
at the bytes coming out.

00:36:11.650 --> 00:36:14.630
But that's hardly
a useful result.

00:36:14.630 --> 00:36:17.880
Also, the kind of guarantee you
can get from anonymity systems

00:36:17.880 --> 00:36:20.319
that we know how to
build today-- OK,

00:36:20.319 --> 00:36:21.360
I should be careful here.

00:36:21.360 --> 00:36:24.780
There are some where you
have very strong guarantees

00:36:24.780 --> 00:36:26.930
that we do know how to
build that you would never

00:36:26.930 --> 00:36:28.010
actually want to use.

00:36:28.010 --> 00:36:32.490
Like classical
[INAUDIBLE] DC-nets,

00:36:32.490 --> 00:36:35.200
for instance, provide
guaranteed anonymity.

00:36:35.200 --> 00:36:37.450
Except any participant can
shut down the whole network

00:36:37.450 --> 00:36:39.550
by not participating.

00:36:39.550 --> 00:36:41.400
That does not scale.

00:36:41.400 --> 00:36:42.820
But for the things
that we do want

00:36:42.820 --> 00:36:46.880
to build these days,
for the most part,

00:36:46.880 --> 00:36:49.960
the anonymity properties
are probabilistic rather

00:36:49.960 --> 00:36:52.670
than categorically
guarantee-able.

00:36:52.670 --> 00:36:56.070
So instead of asking,
does this protect

00:36:56.070 --> 00:36:58.650
Alice, the kind of
questions you could ask

00:36:58.650 --> 00:37:02.600
are, under this assumption
about attacker capabilities, how

00:37:02.600 --> 00:37:04.260
much traffic can
Alice safely send

00:37:04.260 --> 00:37:10.370
if she wants a 99% chance of not
being linked to her activities?
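
NOTE A back-of-the-envelope version of that question, under loud assumptions that are not from the lecture: each circuit is picked independently, and the attacker sees both ends of any one circuit with a fixed probability p_bad. Then the chance of never being linked over n circuits is (1 - p_bad) ** n. The function name and the 1% figures are made up, and path-selection details such as guard nodes change the real numbers.
```python
from math import floor, log
def max_circuits(p_bad: float, target: float = 0.99) -> int:
    """Largest n with (1 - p_bad) ** n >= target, assuming independent circuits."""
    if not (0.0 < p_bad < 1.0 and 0.0 < target < 1.0):
        raise ValueError("p_bad and target must be strictly between 0 and 1")
    n = floor(log(target) / log(1.0 - p_bad))
    # Guard against floating-point rounding at the boundary.
    while n > 0 and (1.0 - p_bad) ** n < target:
        n -= 1
    return n
# Assumption: the attacker watches 1% of entry and 1% of exit capacity,
# so one circuit has both ends observed with probability 0.01 * 0.01.
p_both_ends = 0.01 * 0.01
n = max_circuits(p_both_ends)
```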

00:37:10.370 --> 00:37:13.070
So will anyone actually
run these things?

00:37:13.070 --> 00:37:15.430
That was an open
question when we started.

00:37:15.430 --> 00:37:17.430
We didn't know whether
the system would actually

00:37:17.430 --> 00:37:18.320
take off or not.

00:37:18.320 --> 00:37:25.450
So the only [INAUDIBLE]
try to see what happens.

00:37:25.450 --> 00:37:28.920
We got a fair amount
of volunteer operators.

00:37:28.920 --> 00:37:33.410
A fair number of non-profits
have formed whose sole purpose

00:37:33.410 --> 00:37:36.440
is just to take donations and
use it to buy bandwidth and run

00:37:36.440 --> 00:37:38.890
Tor nodes.

00:37:38.890 --> 00:37:40.450
And there are also universities.

00:37:40.450 --> 00:37:42.609
There's also private companies.

00:37:42.609 --> 00:37:44.650
For a while, [INAUDIBLE]
was running a Tor server

00:37:44.650 --> 00:37:47.689
out of their security
team because they

00:37:47.689 --> 00:37:48.480
thought it was fun.

00:37:52.360 --> 00:37:54.760
The legal issues there--
again, I'm not a lawyer.

00:37:54.760 --> 00:37:55.910
I can't offer legal advice.

00:37:55.910 --> 00:37:58.035
But five different people
asked about legal issues.

00:38:00.192 --> 00:38:01.900
As far as I can tell,
in the US at least,

00:38:01.900 --> 00:38:04.800
there's no legal impediment
to running a Tor server.

00:38:04.800 --> 00:38:07.690
And that seems to be the case
throughout most of Europe

00:38:07.690 --> 00:38:09.580
as far as I'm aware.

00:38:09.580 --> 00:38:12.970
In places that generally
have less internet freedom,

00:38:12.970 --> 00:38:14.670
it's a dicier proposition.

00:38:14.670 --> 00:38:16.670
The issues to be
concerned about are not,

00:38:16.670 --> 00:38:19.180
is it illegal to
run a Tor server,

00:38:19.180 --> 00:38:24.635
but if somebody does something
illegal or undesirable

00:38:24.635 --> 00:38:28.180
with my Tor server, will
my ISP shut me down,

00:38:28.180 --> 00:38:32.846
and will law
enforcement believe, oh,

00:38:32.846 --> 00:38:34.220
you're just running
a Tor server,

00:38:34.220 --> 00:38:37.336
or will they seize the
computer to make sure?

00:38:37.336 --> 00:38:39.710
For those, I would suggest
not running the Tor server out

00:38:39.710 --> 00:38:42.720
of your dorm room.

00:38:42.720 --> 00:38:45.670
Excuse me, don't run an
exit out of your dorm room,

00:38:45.670 --> 00:38:48.460
or really out of your dorm room,
assuming the network policy

00:38:48.460 --> 00:38:49.460
allows that.

00:38:49.460 --> 00:38:50.650
I have no idea.

00:38:50.650 --> 00:38:52.400
They've changed so
much since I was a kid.

00:38:55.266 --> 00:38:57.890
Running an exit out of your dorm
room could get you in trouble.

00:38:57.890 --> 00:39:01.620
But running a non-exit relay
that doesn't deliver traffic

00:39:01.620 --> 00:39:05.282
to the internet is less
likely to create those issues

00:39:05.282 --> 00:39:05.865
in particular.

00:39:10.140 --> 00:39:12.010
But if you do it in
a nice co-lo site,

00:39:12.010 --> 00:39:14.730
and you get your
ISP's permission,

00:39:14.730 --> 00:39:19.840
then it's a pretty
reasonable thing to do.

00:39:19.840 --> 00:39:23.311
Let's see, someone asked,
well, what if users

00:39:23.311 --> 00:39:24.560
don't trust a particular node?

00:39:24.560 --> 00:39:29.670
And this brings me
to my next topic.

00:39:29.670 --> 00:39:32.750
So the software the clients
use, you can tell it,

00:39:32.750 --> 00:39:35.780
don't use this one, don't use
this one, only use this one.

00:39:35.780 --> 00:39:39.130
But remember that anonymity
loves company principle.

00:39:39.130 --> 00:39:43.631
If I'm only using
three nodes, and you're

00:39:43.631 --> 00:39:45.256
using three different
nodes, and you're

00:39:45.256 --> 00:39:49.550
using three different nodes,
our traffic will not mix at all.

00:39:49.550 --> 00:39:52.280
To the extent that we partition
off which parts of the network

00:39:52.280 --> 00:39:55.740
we use, we are distinguishable
from one another.

00:39:55.740 --> 00:39:57.800
Now, if I just exclude
one or two nodes,

00:39:57.800 --> 00:40:00.040
and you just exclude
one or two nodes,

00:40:00.040 --> 00:40:03.120
that's not a big partitioning,
and that doesn't help

00:40:03.120 --> 00:40:05.270
distinguishability that much.

00:40:05.270 --> 00:40:08.700
But it would be good to
the extent possible to have

00:40:08.700 --> 00:40:12.290
everyone using the same nodes.

00:40:12.290 --> 00:40:14.880
So all right, how do
we accomplish that?

00:40:14.880 --> 00:40:16.780
So version one, in the
first version of Tor,

00:40:16.780 --> 00:40:18.730
we just shipped a list
of all of the nodes.

00:40:18.730 --> 00:40:21.525
I think there were three of
them, or five, or something.

00:40:21.525 --> 00:40:22.900
No, I think there
were about six,

00:40:22.900 --> 00:40:25.910
of which three were all
running on the same computer

00:40:25.910 --> 00:40:30.142
in a closet at LCS
in Tech Square.

00:40:30.142 --> 00:40:32.560
All right, so that
wasn't a good idea.

00:40:32.560 --> 00:40:34.090
Because nodes can
go up and down.

00:40:34.090 --> 00:40:35.067
Nodes change.

00:40:35.067 --> 00:40:36.442
You don't want to
have to put out

00:40:36.442 --> 00:40:39.005
a new release of your
software every time somebody

00:40:39.005 --> 00:40:41.160
joins or leaves the network.

00:40:41.160 --> 00:40:44.260
So you could just
have every node keep

00:40:44.260 --> 00:40:46.677
a list of all the other nodes
that are connected to it

00:40:46.677 --> 00:40:48.010
and all advertise to each other.

00:40:48.010 --> 00:40:50.193
And then when a client
connects, a client just

00:40:50.193 --> 00:40:51.790
has to know one
node and then says,

00:40:51.790 --> 00:40:53.189
hey, who's on the network?

00:40:53.189 --> 00:40:54.730
And actually, a lot
of designs people

00:40:54.730 --> 00:40:57.320
have built work this way.

00:40:57.320 --> 00:40:59.500
A lot of early peer to
peer anonymity designs work

00:40:59.500 --> 00:41:00.360
this way.

00:41:00.360 --> 00:41:01.771
But it's a terrible idea.

00:41:01.771 --> 00:41:04.270
Because if you go to one node
and say, who's on the network,

00:41:04.270 --> 00:41:07.240
and you believe them, well,
if I'm that node, I can say,

00:41:07.240 --> 00:41:11.070
yes, I'm on the network,
and my friend over here

00:41:11.070 --> 00:41:14.130
is on the network, and my friend
over here is on the network,

00:41:14.130 --> 00:41:15.920
and no one else
is on the network.

00:41:15.920 --> 00:41:18.895
And I can tell you any
number of fake nodes

00:41:18.895 --> 00:41:22.790
that are all operated by me
and capture all of your traffic

00:41:22.790 --> 00:41:25.160
that way with what's called
a route capture attack.

00:41:25.160 --> 00:41:28.480
OK, so maybe we just
have a single directory

00:41:28.480 --> 00:41:30.470
operated by a trusted party.

00:41:30.470 --> 00:41:33.730
That's not so good as a
single point of failure.

00:41:33.730 --> 00:41:38.210
So OK, let's have
multiple trusted parties.

00:41:38.210 --> 00:41:41.750
And clients go to these
multiple trusted parties

00:41:41.750 --> 00:41:43.990
and get a list of all of
the nodes from all of them

00:41:43.990 --> 00:41:47.010
and combine those lists.

00:41:47.010 --> 00:41:49.813
Then you're
actually-- first off,

00:41:49.813 --> 00:41:51.560
you're partitioned in that case.

00:41:51.560 --> 00:41:54.060
If I choose these three,
and you choose those three,

00:41:54.060 --> 00:41:55.975
and they say anything
different, then we'll

00:41:55.975 --> 00:41:57.350
be using different
sets of nodes.

00:41:57.350 --> 00:41:58.820
So that's still not good.

00:41:58.820 --> 00:42:01.800
Also, there's
still a [INAUDIBLE]

00:42:01.800 --> 00:42:08.820
where if I use the intersection
of the sets they tell me,

00:42:08.820 --> 00:42:11.520
then any one of them can keep
me from using a node they

00:42:11.520 --> 00:42:13.360
don't like by not listing it.

00:42:13.360 --> 00:42:16.700
If I use the union,
anyone can flood me

00:42:16.700 --> 00:42:21.630
by making 20,000 fake servers
that are all on the list.

00:42:21.630 --> 00:42:24.545
I might compute the result
of some sort of vote

00:42:24.545 --> 00:42:26.930
on them, which would
solve those two problems.

00:42:26.930 --> 00:42:28.890
But I'd still be
partitioned from everyone

00:42:28.890 --> 00:42:32.580
who's using different
trusted parties.
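
NOTE A small sketch comparing the three ways of combining node lists mentioned here: intersection, union, and a vote. The directory contents are hypothetical, and "more than half" is just one possible voting rule chosen for the example.
```python
from collections import Counter
def intersection(lists):
    """Nodes that every directory lists."""
    return set.intersection(*map(set, lists))
def union(lists):
    """Nodes that at least one directory lists."""
    return set.union(*map(set, lists))
def majority_vote(lists):
    """Nodes listed by more than half of the directories."""
    counts = Counter(node for nodes in lists for node in set(nodes))
    return {node for node, votes in counts.items() if votes > len(lists) / 2}
# Hypothetical directories: dir_b hides relay "r3", dir_c floods fake relays.
dir_a = ["r1", "r2", "r3"]
dir_b = ["r1", "r2"]
dir_c = ["r1", "r2", "r3", "fake1", "fake2"]
views = [dir_a, dir_b, dir_c]
print(sorted(intersection(views)))   # r3 is missing: one directory vetoed it
print(sorted(union(views)))          # the fake relays got in
print(sorted(majority_vote(views)))  # r1, r2, r3
```
The vote resists both the veto and the flood, but two clients that ask different sets of directories can still end up with different answers.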

00:42:32.580 --> 00:42:35.270
We could do a magical DHT.

00:42:35.270 --> 00:42:36.859
Have we done
distributed hash tables?

00:42:36.859 --> 00:42:39.150
All right, we could do some
sort of magical distributed

00:42:39.150 --> 00:42:43.930
structure run across
all of the nodes.

00:42:43.930 --> 00:42:50.140
I say magical, because although
there are designs in this area,

00:42:50.140 --> 00:42:54.320
and some better than
others, none of them

00:42:54.320 --> 00:42:58.624
really seem to have solid
security evidence for it

00:42:58.624 --> 00:43:00.040
at this point to
the point where I

00:43:00.040 --> 00:43:04.260
would be comfortable in saying,
yes, this is actually secure.

00:43:04.260 --> 00:43:06.900
So the solution we
wound up with is

00:43:06.900 --> 00:43:10.610
have multiple hardened
trusted authorities run

00:43:10.610 --> 00:43:14.040
by trusted parties that
collect lists of nodes

00:43:14.040 --> 00:43:17.690
that vote hourly on
which nodes are running,

00:43:17.690 --> 00:43:21.870
that can vote to exclude nodes
that seem to be misbehaving,

00:43:21.870 --> 00:43:25.920
that are all running on the
same slash 16, that are doing

00:43:25.920 --> 00:43:29.120
strange things to
traffic, and have

00:43:29.120 --> 00:43:34.190
them form a consensus that's
a result of their votes.

00:43:34.190 --> 00:43:36.017
And everybody signs
the consensus.

00:43:36.017 --> 00:43:37.517
And clients don't
use it unless it's

00:43:37.517 --> 00:43:39.490
signed by enough authorities.
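
NOTE A sketch of the client-side check described here: accept a consensus only if enough of the authorities whose keys shipped with the client have validly signed it. The function name, the "more than half" threshold, and the verify callback are assumptions for this example; fake_verify is a placeholder for the demo and is not a signature scheme.
```python
def consensus_is_acceptable(signatures, trusted_keys, verify):
    """True if more than half of the trusted authorities validly signed.
    signatures: dict mapping authority name to signature bytes.
    trusted_keys: dict mapping authority name to the public key the client ships.
    verify: callable(public_key, signature) returning bool, from a crypto library.
    """
    valid = sum(
        1
        for name, key in trusted_keys.items()
        if name in signatures and verify(key, signatures[name])
    )
    return valid > len(trusted_keys) / 2
# Placeholder check for the demo only; it is NOT a signature scheme.
def fake_verify(public_key, signature):
    return signature == b"signed-by-" + public_key
trusted = {"auth%d" % i: b"key%d" % i for i in range(5)}
names = sorted(trusted)
three_sigs = {name: b"signed-by-" + trusted[name] for name in names[:3]}
two_sigs = {name: b"signed-by-" + trusted[name] for name in names[:2]}
# A signature from someone the client does not trust must not count.
padded_sigs = dict(two_sigs, stranger=b"signed-by-key-of-stranger")
```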

00:43:39.490 --> 00:43:40.940
This is not the final design.

00:43:40.940 --> 00:43:44.670
But it's the best we've
managed to come up with so far.

00:43:44.670 --> 00:43:46.630
And this way, all you
need to distribute

00:43:46.630 --> 00:43:51.880
with clients is a list of all
of the authorities' public keys

00:43:51.880 --> 00:43:54.210
and some places to
get the directories.

00:43:54.210 --> 00:43:58.120
You want to have all the nodes
cache these directory things.

00:43:58.120 --> 00:44:00.604
Because if you don't, the
bandwidth load on authorities

00:44:00.604 --> 00:44:01.270
is catastrophic.

00:44:04.320 --> 00:44:06.050
So I'm going to skip over that.

00:44:06.050 --> 00:44:11.260
Because I would love
to talk about how

00:44:11.260 --> 00:44:13.295
clients should
choose which paths

00:44:13.295 --> 00:44:14.800
to build through the network.

00:44:14.800 --> 00:44:17.560
I would love to talk
about application issues

00:44:17.560 --> 00:44:20.382
and making applications
not deanonymize themselves.

00:44:20.382 --> 00:44:21.590
I'd love to talk about abuse.

00:44:21.590 --> 00:44:24.470
I'd love to talk about hidden
services and how they work.

00:44:24.470 --> 00:44:27.210
I'd love to talk about
censorship resistance.

00:44:27.210 --> 00:44:30.540
And I'd like to talk about
attacks and defenses.

00:44:30.540 --> 00:44:34.230
But I've only got 35 minutes.

00:44:34.230 --> 00:44:36.280
And I can't possibly
cover all of these.

00:44:36.280 --> 00:44:38.490
So show of hands
for how many people

00:44:38.490 --> 00:44:42.500
think the most important--
think about what you think

00:44:42.500 --> 00:44:45.584
are the two most important
topics on this list.

00:44:45.584 --> 00:44:47.250
If one of your two
most important topics

00:44:47.250 --> 00:44:49.041
is path selection and
how you choose nodes,

00:44:49.041 --> 00:44:51.500
please raise your hand.

00:44:51.500 --> 00:44:53.550
If one of your two
most important topics

00:44:53.550 --> 00:44:57.370
is application issues and
how to make applications not

00:44:57.370 --> 00:45:00.044
bust your anonymity,
please raise your hand.

00:45:00.044 --> 00:45:02.020
If one of your most
important issues

00:45:02.020 --> 00:45:05.700
is abuse and what kind of abuse
we see, how you can prevent it,

00:45:05.700 --> 00:45:08.294
and how that works out,
please raise your hand.

00:45:08.294 --> 00:45:11.651
OK, that one's popular.

00:45:11.651 --> 00:45:13.150
If one of your most
important topics

00:45:13.150 --> 00:45:14.566
is how hidden
services work and how

00:45:14.566 --> 00:45:17.280
they can be made to work
better, please raise your hand.

00:45:17.280 --> 00:45:19.530
Wow, that's much more popular
on this side of the room

00:45:19.530 --> 00:45:20.654
than that side of the room.

00:45:20.654 --> 00:45:23.162
What's going on?

00:45:23.162 --> 00:45:24.820
You guys in a club?

00:45:24.820 --> 00:45:26.926
Are you up to something?

00:45:26.926 --> 00:45:29.610
Censorship, who's
interested in censorship?

00:45:29.610 --> 00:45:32.880
OK, that's fairly popular.

00:45:32.880 --> 00:45:36.170
Attacks and defenses?

00:45:36.170 --> 00:45:39.530
OK, so we're not doing paths
and we're not doing apps.

00:45:39.530 --> 00:45:44.600
So paths-- guard nodes, guard
nodes, see guard node designs,

00:45:44.600 --> 00:45:46.240
select by bandwidth.

00:45:46.240 --> 00:45:48.230
You need to actually
weight by bandwidth,

00:45:48.230 --> 00:45:51.200
but you also need a trusted
way to measure bandwidth.

00:45:51.200 --> 00:45:55.025
And that's the too long,
didn't lecture of what

00:45:55.025 --> 00:45:56.150
would be on path selection.

00:45:56.150 --> 00:45:59.555
For application issues,
almost no protocol

00:45:59.555 --> 00:46:03.630
is actually designed
to provide anonymity.

00:46:03.630 --> 00:46:06.530
Because almost every
protocol that's widely used

00:46:06.530 --> 00:46:08.324
has the assumption
in it, well, you

00:46:08.324 --> 00:46:09.740
know, anyone who
wants to can just

00:46:09.740 --> 00:46:12.500
see the IPs on this traffic.

00:46:12.500 --> 00:46:16.030
So there's no point in
trying to conceal identity.

00:46:16.030 --> 00:46:18.900
So in a particularly
complex protocol,

00:46:18.900 --> 00:46:22.320
like the whole stack of
protocols a web browser uses,

00:46:22.320 --> 00:46:24.020
there's no real way
to anonymize that

00:46:24.020 --> 00:46:27.400
just by anonymizing the traffic
with something like Tor.

00:46:27.400 --> 00:46:30.150
You need to hack the
web browser pretty hard

00:46:30.150 --> 00:46:32.810
to make it stop doing things
like leaking the list of fonts

00:46:32.810 --> 00:46:34.830
that are installed
on your system,

00:46:34.830 --> 00:46:38.540
leaking your exact
window size, allowing

00:46:38.540 --> 00:46:41.780
all kinds of permanent
cookie-like structures,

00:46:41.780 --> 00:46:44.740
leaking what's in the cache
and what's not in the cache,

00:46:44.740 --> 00:46:46.250
and so on.

00:46:46.250 --> 00:46:48.680
So your choices
there are basically

00:46:48.680 --> 00:46:52.180
isolate everything and restart
from a fresh VM all the time,

00:46:52.180 --> 00:46:53.514
or rewrite the browser, or both.

00:46:53.514 --> 00:46:55.513
Other things are a lot
easier than web browsers,

00:46:55.513 --> 00:46:56.460
but still problematic.

00:46:56.460 --> 00:47:00.780
That's all I'm going to
say about app issues.

00:47:00.780 --> 00:47:02.850
Let's see, I think
I got the most

00:47:02.850 --> 00:47:05.142
hands-- did you see what
I got the most hands for,

00:47:05.142 --> 00:47:06.624
any opinions?

00:47:06.624 --> 00:47:08.083
STUDENT: Abuse and
hidden services?

00:47:08.083 --> 00:47:09.832
NICK MATHEWSON: Abuse
and hidden services.

00:47:09.832 --> 00:47:12.277
All right, I'll talk about
abuse and hidden services.

00:47:12.277 --> 00:47:15.200
And if I've still got time,
I'll do censorship and attacks.

00:47:15.200 --> 00:47:19.185
So let's go to abuse--
abuse, abuse, abuse.

00:47:22.420 --> 00:47:26.960
So one problem that
we've fortunately not

00:47:26.960 --> 00:47:30.707
had all that much of-- so when
we were working on this stuff,

00:47:30.707 --> 00:47:32.490
the problem that
everybody was afraid of

00:47:32.490 --> 00:47:34.698
was this horrible stuff that
would get you kicked off

00:47:34.698 --> 00:47:37.580
of any ISP, and it would
create tremendous legal issues

00:47:37.580 --> 00:47:38.750
and ruin your lives.

00:47:38.750 --> 00:47:41.360
I speak of course
of file sharing.

00:47:41.360 --> 00:47:43.540
We were terrified
that people would

00:47:43.540 --> 00:47:48.200
try to BitTorrent or Gnutella
or whatever over this thing.

00:47:48.200 --> 00:47:49.760
Yes, it was a long time ago.

00:47:49.760 --> 00:47:52.990
And we thought about
how we'd do that.

00:47:52.990 --> 00:47:55.470
Well, you'll see in the paper
that we talk a lot about exit

00:47:55.470 --> 00:47:58.140
policies, about
letting exit nodes say,

00:47:58.140 --> 00:48:03.040
I only allow connections
to port 80 and port 443.
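
NOTE A minimal sketch of the port-based exit policy just described, assuming a first-match-wins rule list. The names POLICY and exit_allows are hypothetical, and this is an illustration rather than Tor's actual policy code.
```python
# Each rule is (action, port); a port of None means "any port".
# Rules are checked in order and the first match decides (an assumption of this sketch).
POLICY = [("accept", 80), ("accept", 443), ("reject", None)]
def exit_allows(port: int, policy=POLICY) -> bool:
    """Return True if this hypothetical exit would carry traffic to the given port."""
    for action, rule_port in policy:
        if rule_port is None or rule_port == port:
            return action == "accept"
    return False  # assumed default when no rule matches
web_ok = exit_allows(443)    # HTTPS is allowed by the second rule
irc_ok = exit_allows(6667)   # a typical IRC port falls through to the reject rule
```
The sketch filters on the port number alone, which is why a policy like this cannot tell an ordinary web request from abuse carried over port 80.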

00:48:03.040 --> 00:48:05.850
This doesn't actually
help with abuse at all.

00:48:05.850 --> 00:48:15.800
Because you can try to
spread worms over port 80.

00:48:15.800 --> 00:48:21.897
You can post abusive stuff
to IRC channels over web

00:48:21.897 --> 00:48:23.710
to IRC interfaces.

00:48:23.710 --> 00:48:26.140
Everything's got a web
interface these days.

00:48:26.140 --> 00:48:29.340
So you can't really
say, it's only web.

00:48:29.340 --> 00:48:30.400
It's safe.

00:48:30.400 --> 00:48:33.040
If it's useful,
it can be abused.

00:48:33.040 --> 00:48:35.450
That said, there
are people who are

00:48:35.450 --> 00:48:39.000
willing to run exits
that deliver 80 and 443

00:48:39.000 --> 00:48:42.547
who would not be willing to
run exits delivering all ports.

00:48:42.547 --> 00:48:43.880
So it did turn out to be useful.

00:48:43.880 --> 00:48:45.588
It just didn't turn
out to be a solution.

00:48:49.010 --> 00:48:54.699
Another thing that creates
problems is criminal activity

00:48:54.699 --> 00:48:56.740
generally doesn't create
problems for the network

00:48:56.740 --> 00:48:58.560
operators so much.

00:48:58.560 --> 00:49:01.750
From time to time, somebody's
server gets seized and returned

00:49:01.750 --> 00:49:04.550
six months later, and they
have to wipe the thing.

00:49:04.550 --> 00:49:07.430
That's still an infrequent
enough occurrence

00:49:07.430 --> 00:49:12.950
that it's somewhat
surprising when it happens.

00:49:12.950 --> 00:49:16.050
And so yeah, don't run
an exit node on a server

00:49:16.050 --> 00:49:19.185
that you need to graduate.

00:49:23.165 --> 00:49:23.665
What else?

00:49:27.670 --> 00:49:31.210
The biggest problem that
we have for abuse of stuff

00:49:31.210 --> 00:49:34.260
is that many websites
around the world,

00:49:34.260 --> 00:49:36.200
and many IRC
services and so on,

00:49:36.200 --> 00:49:42.210
use IP-based blocking in
order to deter and mitigate

00:49:42.210 --> 00:49:50.680
abusive behavior-- people
posting road kill pictures

00:49:50.680 --> 00:49:56.160
on My Little Pony sites,
people flaming everybody

00:49:56.160 --> 00:49:59.690
on IRC channels,
people making lots of

00:49:59.690 --> 00:50:05.300
leave, join requests, people
replacing entire Wikipedia

00:50:05.300 --> 00:50:08.896
pages with racial slurs.

00:50:08.896 --> 00:50:09.770
This stuff, it's real.

00:50:09.770 --> 00:50:10.478
It's problematic.

00:50:10.478 --> 00:50:13.560
It's unacceptable to the
websites and services

00:50:13.560 --> 00:50:15.580
that use IP-based blocking.

00:50:15.580 --> 00:50:18.140
They need a way to keep
this from happening.

00:50:18.140 --> 00:50:21.950
And IP-based blocking is a
cheap way for them to do that.

00:50:21.950 --> 00:50:27.230
So it's pretty frequent that
Tor users get banned completely

00:50:27.230 --> 00:50:30.340
from some sites.

00:50:30.340 --> 00:50:36.370
There's some work on trying to
say, well, why does IP-based

00:50:36.370 --> 00:50:37.330
blocking really work?

00:50:37.330 --> 00:50:40.690
Is it because IPs are people?

00:50:40.690 --> 00:50:41.310
No.

00:50:41.310 --> 00:50:44.295
Everybody in this room knows
how to get a different IP

00:50:44.295 --> 00:50:45.710
if they need one.

00:50:45.710 --> 00:50:49.540
Everybody in this room knows how
to get like tens of thousands

00:50:49.540 --> 00:50:51.550
of different IPs
if they need one,

00:50:51.550 --> 00:50:53.180
if they need tens of thousands.

00:50:53.180 --> 00:50:56.680
But for most people,
getting more IPs

00:50:56.680 --> 00:50:59.720
is at least a little time
consuming and at least

00:50:59.720 --> 00:51:03.265
a little challenging to the
extent that it imposes a rate

00:51:03.265 --> 00:51:05.660
limit and a resource
cost on abuse

00:51:05.660 --> 00:51:08.940
if you don't want a botnet
and if they've already blocked

00:51:08.940 --> 00:51:12.110
Tor and all the
other proxy services.

00:51:12.110 --> 00:51:16.850
So for that, you need to
look at different ways

00:51:16.850 --> 00:51:20.380
to provide other resource costs.

00:51:20.380 --> 00:51:24.970
You can either say, well--
have you done blind signatures?

00:51:24.970 --> 00:51:28.740
Oh, you can construct
things so that you

00:51:28.740 --> 00:51:31.210
need an IP to make an account.

00:51:31.210 --> 00:51:33.620
But what account
you make with an IP

00:51:33.620 --> 00:51:37.250
is not linkable to your IP.

00:51:37.250 --> 00:51:39.277
And then later on if
the account gets banned,

00:51:39.277 --> 00:51:41.670
you need to create a new
account from a different IP.

00:51:41.670 --> 00:51:44.211
That's something you can build,
and we're working with people

00:51:44.211 --> 00:51:47.890
to work on it, although it needs
more hacking on the integration

00:51:47.890 --> 00:51:48.630
side.

00:51:48.630 --> 00:51:51.213
Something else that needs more
hacking on the integration side

00:51:51.213 --> 00:51:54.387
is anonymous
blacklistable credentials.

00:51:54.387 --> 00:51:55.470
They're a little esoteric.

00:51:55.470 --> 00:52:02.220
But the idea is that
you get something

00:52:02.220 --> 00:52:05.780
that allows you to participate
on an IRC server, for example.

00:52:05.780 --> 00:52:08.080
You can use this as
many times as you want.

00:52:08.080 --> 00:52:12.380
Your using it is not linkable
until you are banned.

00:52:12.380 --> 00:52:14.580
Once you are banned,
future attempts

00:52:14.580 --> 00:52:18.000
from the same person with the
same credential don't work.

00:52:18.000 --> 00:52:21.840
But past activities do not
become linkable to one another.

00:52:21.840 --> 00:52:24.090
These can be built
pretty easily.

00:52:24.090 --> 00:52:26.730
The problem is convincing people
who are more or less satisfied

00:52:26.730 --> 00:52:29.300
with IP blocking to
actually use them

00:52:29.300 --> 00:52:32.965
and actually integrating
them with services.

00:52:32.965 --> 00:52:36.170
Someone inevitably asks
me-- it's kind of neat.

00:52:36.170 --> 00:52:43.310
So I started these lecture notes
based on my lecture from 2013.

00:52:43.310 --> 00:52:46.110
And there was something
about the inevitable question

00:52:46.110 --> 00:52:48.660
about Silk Road
1 getting busted.

00:52:48.660 --> 00:52:50.885
There's the inevitable
question about Silk Road 2

00:52:50.885 --> 00:52:51.510
getting busted.

00:52:51.510 --> 00:52:55.880
Silk Road 2 was a hidden service
operating on the Tor network

00:52:55.880 --> 00:52:58.650
where people would get
together to buy and sell

00:52:58.650 --> 00:53:03.480
illegal things,
mostly illegal drugs.

00:53:03.480 --> 00:53:06.360
So as far as we know, as
far as we can find out,

00:53:06.360 --> 00:53:10.050
the guy got busted
through bad OPSEC.

00:53:10.050 --> 00:53:13.810
Like he made a public
posting with his actual name,

00:53:13.810 --> 00:53:17.430
and then went and deleted it
and put his pseudonym on it.

00:53:17.430 --> 00:53:20.363
Tor can't help people
against that kind of stuff.

00:53:20.363 --> 00:53:23.520
On the other hand, if you've
been looking at the NSA leaks,

00:53:23.520 --> 00:53:26.640
you know that law enforcement
has been getting information

00:53:26.640 --> 00:53:29.620
from intelligence and
then sanitizing it

00:53:29.620 --> 00:53:33.495
through a process called
dual construction where

00:53:33.495 --> 00:53:36.120
the intelligence agency will say
to the law enforcement agency,

00:53:36.120 --> 00:53:38.390
OK, look, it's Fred over there.

00:53:38.390 --> 00:53:39.480
He did it.

00:53:39.480 --> 00:53:41.482
But that's not
admissible in a court,

00:53:41.482 --> 00:53:43.190
and you can never
admit that we told you.

00:53:43.190 --> 00:53:46.125
Just find some other way to
find out that Fred did it,

00:53:46.125 --> 00:53:48.120
but Fred did it.

00:53:48.120 --> 00:53:50.210
According to some
of the Snowden leaks

00:53:50.210 --> 00:53:52.380
and some of the leaks
from the other guy, who

00:53:52.380 --> 00:53:59.910
has still not been caught,
that's done sometimes.

00:53:59.910 --> 00:54:05.960
So OK, at this point, you use
your basic Bayesian reasoning

00:54:05.960 --> 00:54:08.850
skills, and you say,
well OK, would I

00:54:08.850 --> 00:54:11.090
see this evidence
if the guy actually

00:54:11.090 --> 00:54:13.040
got caught because of OPSEC?

00:54:13.040 --> 00:54:14.490
Yes, I would.

00:54:14.490 --> 00:54:15.720
I would see bad OPSEC.

00:54:15.720 --> 00:54:19.880
I would see reports that he got
caught because of bad OPSEC.

00:54:19.880 --> 00:54:24.410
But what would I see if it
were a dual construction case?

00:54:24.410 --> 00:54:27.100
I would also see
reports that the guy

00:54:27.100 --> 00:54:29.450
got caught by bad OPSEC.

00:54:29.450 --> 00:54:32.100
Because the evidence that
would be available to

00:54:32.100 --> 00:54:33.970
us is the same in either case.

00:54:33.970 --> 00:54:38.185
We can't really conclude
much from any public reports

00:54:38.185 --> 00:54:39.940
of that.
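
NOTE The Bayesian argument above can be written in odds form. This is a sketch with assumed symbols: E is the public report of bad OPSEC, H_opsec is "caught through bad OPSEC", and H_dual is "caught through dual construction". Because E is about equally likely under either hypothesis, the likelihood ratio is close to 1 and the posterior odds stay close to the prior odds.
```latex
\frac{P(H_{\text{opsec}} \mid E)}{P(H_{\text{dual}} \mid E)}
= \frac{P(E \mid H_{\text{opsec}})}{P(E \mid H_{\text{dual}})}
  \cdot \frac{P(H_{\text{opsec}})}{P(H_{\text{dual}})}
\approx 1 \cdot \frac{P(H_{\text{opsec}})}{P(H_{\text{dual}})}
```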

00:54:39.940 --> 00:54:44.521
That said, it does look like
the guy got busted by bad OPSEC.

00:54:44.521 --> 00:54:46.145
It does look like
the kind of bad OPSEC

00:54:46.145 --> 00:54:48.000
that you would be
looking for if you

00:54:48.000 --> 00:54:51.210
were trying to catch somebody
running something like this.

00:54:51.210 --> 00:54:54.620
Nevertheless, earlier I
suggested that please do not

00:54:54.620 --> 00:54:58.130
use my software to break any laws.

00:54:58.130 --> 00:55:05.380
Also if your life or
freedom is at stake from using

00:55:05.380 --> 00:55:09.665
Tor or any security
product, do not

00:55:09.665 --> 00:55:11.180
use that product in isolation.

00:55:11.180 --> 00:55:14.810
Think of ways to
use it to construct

00:55:14.810 --> 00:55:21.330
a series of redundant
defenses for yourself

00:55:21.330 --> 00:55:23.830
if your life or
freedom is at stake,

00:55:23.830 --> 00:55:27.050
or if having the
system broken is

00:55:27.050 --> 00:55:28.780
completely unacceptable to you.

00:55:28.780 --> 00:55:30.024
And I'll say that about Tor.

00:55:30.024 --> 00:55:31.190
And I'll say that about TLS.

00:55:31.190 --> 00:55:33.590
And I'll say that about PGP.

00:55:33.590 --> 00:55:38.620
Software is a work in progress.

00:55:38.620 --> 00:55:41.065
So that's the abuse section.

00:55:41.065 --> 00:55:44.870
I've got 25 minutes--
hidden services.

00:55:47.750 --> 00:55:50.490
Where's hidden services?

00:55:50.490 --> 00:55:53.620
So responder anonymity
is a much harder problem

00:55:53.620 --> 00:55:55.640
than initiator anonymity.

00:55:55.640 --> 00:55:57.300
Initiator anonymity
is what you get

00:55:57.300 --> 00:56:00.210
when Alice wants to
buy socks, and Alice

00:56:00.210 --> 00:56:02.580
wants to stay anonymous
from the sock vendor.

00:56:02.580 --> 00:56:05.200
Responder anonymity
is when Alice

00:56:05.200 --> 00:56:09.300
wants to publish her
poetry online and run a web

00:56:09.300 --> 00:56:11.190
server that has
her poetry on it,

00:56:11.190 --> 00:56:14.150
but not let anyone know
where that web server is

00:56:14.150 --> 00:56:16.680
because the poetry
is so embarrassing.

00:56:16.680 --> 00:56:19.360
And yes there actually
is a hidden service

00:56:19.360 --> 00:56:21.710
out there of mine
with bad poetry on it.

00:56:21.710 --> 00:56:24.070
No, I don't think anybody's
actually published it yet.

00:56:24.070 --> 00:56:26.490
No, I'm not going to
tell anybody where it is.

00:56:26.490 --> 00:56:27.990
I'm waiting for it to go public.

00:56:31.390 --> 00:56:37.920
So all right, one thing
you could do is-- let's

00:56:37.920 --> 00:56:39.351
see, how much time?

00:56:39.351 --> 00:56:43.650
OK, I can do this.

00:56:43.650 --> 00:56:46.622
So now Alice wants to
publish her poetry.

00:56:46.622 --> 00:56:48.205
So I'm going to put
Alice on this end,

00:56:48.205 --> 00:56:49.450
because she's the responder.

00:56:49.450 --> 00:56:54.080
Alice could build a path-- this
represents a lot of relays--

00:56:54.080 --> 00:56:59.052
through the Tor network, and
then just say to this relay,

00:56:59.052 --> 00:57:00.135
please accept connections.

00:57:02.660 --> 00:57:05.600
So now anyone who goes
to this relay could say,

00:57:05.600 --> 00:57:07.770
hey, I want to talk to Alice.

00:57:07.770 --> 00:57:10.180
And there have been
designs that work this way.

00:57:10.180 --> 00:57:12.620
It has some challenges, though.

00:57:12.620 --> 00:57:15.185
One challenge is this relay
could man in the middle

00:57:15.185 --> 00:57:19.920
all the traffic unless there
is a well known TLS key.

00:57:19.920 --> 00:57:22.400
Another thing is
maybe this relay

00:57:22.400 --> 00:57:24.396
is also embarrassed
by the poetry

00:57:24.396 --> 00:57:26.020
and doesn't want to
be a public contact

00:57:26.020 --> 00:57:31.160
point for poetry so terrible.

00:57:31.160 --> 00:57:35.280
So this relay could also be
pressured by other people who

00:57:35.280 --> 00:57:37.760
hate the poetry to censor it.

00:57:37.760 --> 00:57:41.940
This relay could also make
itself an attack target.

00:57:41.940 --> 00:57:45.130
So you want some way where Alice
can go to different relays over

00:57:45.130 --> 00:57:51.170
time and no single relay is
touching unencrypted traffic

00:57:51.170 --> 00:57:52.480
of Alice's.

00:57:52.480 --> 00:57:56.620
All right, that's doable.

00:57:56.620 --> 00:57:58.510
But once you have a lot
of different relays,

00:57:58.510 --> 00:58:01.790
what does Alice
actually tell people?

00:58:01.790 --> 00:58:04.490
It's kind of got
to be a public key.

00:58:04.490 --> 00:58:08.250
Because if she just says, relay
x, relay y, relay z, but x, y,

00:58:08.250 --> 00:58:11.530
and z are changing
every five minutes,

00:58:11.530 --> 00:58:13.920
that's kind of challenging
to know you actually

00:58:13.920 --> 00:58:15.570
got the right relay.

00:58:15.570 --> 00:58:17.590
So let's say she tells
everybody a public key,

00:58:17.590 --> 00:58:22.550
and once she gets over here,
she says, hey, this is Alice.

00:58:22.550 --> 00:58:24.090
I'll prove it with
my public key.

00:58:24.090 --> 00:58:33.960
So this relay knows
that public key z is

00:58:33.960 --> 00:58:35.380
running a hidden service here.

00:58:35.380 --> 00:58:38.330
And so if anyone else says,
hey, connect me to public key z,

00:58:38.330 --> 00:58:41.130
they can do a
handshake and wind up

00:58:41.130 --> 00:58:43.170
with a shared key with Alice.

00:58:43.170 --> 00:58:46.260
And it's the same handshake as
the Tor circuit extension uses.

00:58:46.260 --> 00:58:48.590
And now Bob can
read Alice's poetry

00:58:48.590 --> 00:58:52.190
by going another path through
the Tor network over here.

00:58:52.190 --> 00:58:57.045
Bob has to know PKz, and Bob can
say, hey, connect me with PKz.

00:58:57.045 --> 00:58:59.170
Send this thing that's sort
of like a create cell--

00:58:59.170 --> 00:59:01.380
really it's an introduce
cell, but let's

00:59:01.380 --> 00:59:03.380
forget that-- over to Alice.

00:59:03.380 --> 00:59:05.820
They do the same
handshake that relays do.

00:59:05.820 --> 00:59:07.913
And now they have a
shared key that they can

00:59:07.913 --> 00:59:10.100
use for end to end encryption.

00:59:10.100 --> 00:59:11.915
Well, there's something
I left out, though,

00:59:11.915 --> 00:59:15.120
which is, how does Bob
know how to go here?

00:59:15.120 --> 00:59:17.082
And can we do anything
about the fact

00:59:17.082 --> 00:59:22.480
that this relay has to
learn this public key?

00:59:22.480 --> 00:59:23.070
Well, we can.

00:59:23.070 --> 00:59:27.730
We can add some [INAUDIBLE]
directory system

00:59:27.730 --> 00:59:32.745
where Alice uploads a signed
statement anonymously over Tor

00:59:32.745 --> 00:59:38.725
saying PKz is at relay x.

00:59:41.590 --> 00:59:44.620
And then Bob says, hey,
give me a signed statement

00:59:44.620 --> 00:59:46.520
to ask the directory
system, hey, give me

00:59:46.520 --> 00:59:49.940
a signed statement about PKz.

00:59:49.940 --> 00:59:52.376
And Bob finds out where to go.

00:59:52.376 --> 00:59:56.740
And we could even do one
better and have Alice give

00:59:56.740 --> 00:59:59.250
a different public key here.

00:59:59.250 --> 01:00:00.890
So this could be PKw.

01:00:04.660 --> 01:00:09.840
And the statement she uploads
to the directory can say,

01:00:09.840 --> 01:00:12.730
if you want to talk to the
service with public key z,

01:00:12.730 --> 01:00:16.560
then go to relay x
and use public key w.

01:00:16.560 --> 01:00:21.820
And now public key z
isn't published here.

01:00:21.820 --> 01:00:26.590
You could even go one
farther and encrypt this

01:00:26.590 --> 01:00:29.480
with some shared secret
known to Alice and Bob.

01:00:29.480 --> 01:00:32.330
And if you do that, then
the directory service

01:00:32.330 --> 01:00:34.990
and people who can contact
the directory service

01:00:34.990 --> 01:00:39.530
can't learn how to connect
to Alice with that.

01:00:39.530 --> 01:00:40.030
Yeah.

01:00:40.030 --> 01:00:42.190
STUDENT: Just a
quick question there.

01:00:42.190 --> 01:00:44.850
If that's not encrypted,
then Rx can still

01:00:44.850 --> 01:00:48.010
find out that it's running
a service for Alice, right?

01:00:48.010 --> 01:00:48.890
NICK MATHEWSON: Yep.

01:00:48.890 --> 01:00:49.934
Well, not for Alice.

01:00:49.934 --> 01:00:51.475
It can find out that
it's running PKz

01:00:51.475 --> 01:00:53.060
if this is not encrypted.

01:00:53.060 --> 01:00:55.680
We have a design for that
that I'm actually going

01:00:55.680 --> 01:00:56.950
to get to at the end of this.

01:00:56.950 --> 01:00:58.740
But it's not built yet.

01:00:58.740 --> 01:01:01.040
But it's pretty cool.

01:01:01.040 --> 01:01:03.535
So OK, and you don't want to
use a centralized directory

01:01:03.535 --> 01:01:04.460
for this.

01:01:04.460 --> 01:01:12.280
So we actually do use a DHT,
which is, again, not perfect,

01:01:12.280 --> 01:01:14.370
and has some censorship
opportunities.

01:01:14.370 --> 01:01:16.966
But we are trying to
make those less and less.

01:01:16.966 --> 01:01:19.700
And I might cover more stuff,
so I can't do the whole details.

01:01:22.510 --> 01:01:24.860
So one of the
problems there though

01:01:24.860 --> 01:01:28.090
is if you are running one
of these directory services,

01:01:28.090 --> 01:01:35.960
you've got a complete list of
these keys pretty-- over time,

01:01:35.960 --> 01:01:37.800
you run a directory
service [INAUDIBLE].

01:01:37.800 --> 01:01:39.551
You get a complete
list of all these keys,

01:01:39.551 --> 01:01:41.300
and you can try
connecting to all the ones

01:01:41.300 --> 01:01:43.830
that don't have encrypted
stuff to find out what's there.

01:01:43.830 --> 01:01:45.509
That's called an
enumeration attack.

01:01:45.509 --> 01:01:47.050
And we didn't list
that in our paper,

01:01:47.050 --> 01:01:49.690
because we weren't
thinking of that.

01:01:49.690 --> 01:01:51.270
We didn't.

01:01:51.270 --> 01:01:53.630
But it is something
we'd like to resist.

01:01:53.630 --> 01:01:57.680
So in the design I hope
to be hacking together

01:01:57.680 --> 01:02:02.020
sometime in 2015, we're going
to move towards a key blinding

01:02:02.020 --> 01:02:18.770
approach where Alice
and Bob share PKz,

01:02:18.770 --> 01:02:22.190
but this statement is
not signed with PKz.

01:02:22.190 --> 01:02:24.780
This statement is
signed with PKz prime

01:02:24.780 --> 01:02:33.380
where PKz prime is
derived from PKz

01:02:33.380 --> 01:02:44.490
and, say, the date such that
if you know PKz and the date,

01:02:44.490 --> 01:02:47.240
you can derive PKz prime.

01:02:47.240 --> 01:02:51.810
If like Alice you
know secret Kz,

01:02:51.810 --> 01:02:56.550
you can generate messages
that are signed by PKz prime.

01:02:56.550 --> 01:03:01.410
But if you only see PKz
prime, even knowing the date,

01:03:01.410 --> 01:03:04.440
you cannot re-derive PKz.
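
NOTE A minimal sketch of the key blinding idea just described, assuming a toy discrete-log group and a SHA-256 based blinding factor. The group parameters, the function names, and the date label are all hypothetical, and this is not the construction Tor uses. It only shows that PKz prime can be derived from PKz and the date, and that the holder of the secret key can derive a matching blinded secret.
```python
import hashlib
# Toy group for illustration only: g = 2 has prime order q = 11 modulo p = 23.
P, Q, G = 23, 11, 2
def blinding_factor(public_key: int, date: str) -> int:
    """Hash the long-term public key and the date to a nonzero exponent modulo Q."""
    digest = hashlib.sha256(f"{public_key}|{date}".encode()).digest()
    return 1 + int.from_bytes(digest, "big") % (Q - 1)
def blind_public(public_key: int, date: str) -> int:
    """Anyone who knows PKz and the date can compute PKz prime."""
    return pow(public_key, blinding_factor(public_key, date), P)
def blind_secret(secret_key: int, date: str) -> int:
    """Only the holder of the secret key can compute the matching blinded secret."""
    public_key = pow(G, secret_key, P)
    return (secret_key * blinding_factor(public_key, date)) % Q
secret_z = 7                      # hypothetical long-term secret key
public_z = pow(G, secret_z, P)    # PKz
date = "2014-11-19"               # hypothetical time period label
blinded_public = blind_public(public_z, date)
blinded_secret = blind_secret(secret_z, date)
# The blinded secret corresponds to the blinded public key.
keys_match = pow(G, blinded_secret, P) == blinded_public
```
The sketch does not demonstrate the one-way property (that PKz cannot be recovered from PKz prime); with a group this small it would not hold, and in a real design it rests on the hash and on a large group.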

01:03:04.440 --> 01:03:06.170
We've got a proof.

01:03:06.170 --> 01:03:10.495
And if you'd like to find out
how this works, then ping me

01:03:10.495 --> 01:03:11.960
and I'll send you the paper.

01:03:11.960 --> 01:03:15.700
It's a cool trick.

01:03:15.700 --> 01:03:18.590
We weren't the first
ones to invent this idea.

01:03:18.590 --> 01:03:22.900
But that is how we're going
to solve enumeration attacks

01:03:22.900 --> 01:03:26.790
sometime this coming year
assuming that I can actually

01:03:26.790 --> 01:03:29.253
get the time to build it.

01:03:29.253 --> 01:03:30.336
So that's hidden services.

01:03:34.730 --> 01:03:41.630
Attacks and
defenses-- so so far,

01:03:41.630 --> 01:03:44.600
the biggest category
of attacks we've seen

01:03:44.600 --> 01:03:47.370
is attacks at the
application level.

01:03:47.370 --> 01:03:50.810
So if you're running an
application over Tor,

01:03:50.810 --> 01:03:56.146
and it's sending unencrypted
traffic, like regular HTTP,

01:03:56.146 --> 01:03:59.450
then a hostile exit
node, just like anyone

01:03:59.450 --> 01:04:02.470
else who touches HTTP traffic,
can observe and modify

01:04:02.470 --> 01:04:04.830
the traffic.

01:04:04.830 --> 01:04:08.240
This is the number one
attack on the whole system.

01:04:08.240 --> 01:04:10.120
The solution is
encrypted traffic.

01:04:10.120 --> 01:04:13.060
Fortunately, we're kind of
in an encryption renaissance

01:04:13.060 --> 01:04:14.520
over the last few years.

01:04:14.520 --> 01:04:16.650
And more and more
traffic is getting

01:04:16.650 --> 01:04:21.520
encrypted with the nifty
free certificate authority

01:04:21.520 --> 01:04:25.550
that EFF and Mozilla and Cisco
and I forget who else announced

01:04:25.550 --> 01:04:26.740
a day or two ago.

01:04:26.740 --> 01:04:29.632
There will be even less excuse
for unencrypted traffic in 2015

01:04:29.632 --> 01:04:31.420
than there was this year.

01:04:31.420 --> 01:04:33.210
So that solves that.

01:04:33.210 --> 01:04:37.580
More interesting attacks include
things like traffic tagging.

01:04:37.580 --> 01:04:44.090
So we made a mistake in our
early integrity checking

01:04:44.090 --> 01:04:44.870
implementation.

01:04:44.870 --> 01:04:47.870
Our early integrity
checking implementation

01:04:47.870 --> 01:04:55.098
did end to end checking between
Alice's program and the exit

01:04:55.098 --> 01:04:56.410
node.

01:04:56.410 --> 01:04:58.900
But it turns out that
that's not enough.

01:04:58.900 --> 01:05:02.410
Because if the
first relay messes

01:05:02.410 --> 01:05:07.290
with the traffic in a way that
creates a pattern that the exit

01:05:07.290 --> 01:05:10.330
node can detect, then
that's an easy way

01:05:10.330 --> 01:05:12.800
for the first relay
and the last relay

01:05:12.800 --> 01:05:17.860
to learn that they are on the
same path and identify Alice.

01:05:17.860 --> 01:05:20.220
Of course, if the first
relay and the last relay

01:05:20.220 --> 01:05:23.390
happen to be on the
same path, happen

01:05:23.390 --> 01:05:25.950
to be collaborating anyway,
then they can already

01:05:25.950 --> 01:05:30.000
identify Alice through traffic
correlation, we believe.

01:05:30.000 --> 01:05:34.944
But perhaps it should not
be so easy for them as that.

01:05:34.944 --> 01:05:36.610
Perhaps traffic
correlation will someday

01:05:36.610 --> 01:05:38.330
be harder than we think.

01:05:38.330 --> 01:05:41.460
It would be good to actually
solve that attack for real.

01:05:41.460 --> 01:05:43.700
We've got two
solutions for that.

01:05:43.700 --> 01:05:46.220
One is the expected
result of this attack

01:05:46.220 --> 01:05:48.350
is that periodically
circuits will fail.

01:05:48.350 --> 01:05:50.750
Because the attacker
on the first hop

01:05:50.750 --> 01:05:53.570
guessed wrong about
controlling the last hop.

01:05:53.570 --> 01:05:59.130
So every Tor client checks
for weird failure rates.

01:05:59.130 --> 01:06:00.910
The real long-term
fix is to make it

01:06:00.910 --> 01:06:04.570
so that messing with the
pattern on the first hop

01:06:04.570 --> 01:06:07.890
doesn't create more than 1 bit
of information on the last hop.

01:06:07.890 --> 01:06:10.790
You can't avoid sending
1 bit of information,

01:06:10.790 --> 01:06:13.830
because the first hop can always
just shut down the connection.

01:06:13.830 --> 01:06:17.097
But you can limit it
to 1 bit-- OK, 2 bits.

01:06:17.097 --> 01:06:19.430
Because then they'll have the
choice to corrupt the data

01:06:19.430 --> 01:06:20.740
or shut down the connection.

01:06:23.700 --> 01:06:25.716
Oh, I had an idea of
how to make that better.

01:06:25.716 --> 01:06:28.980
I'll have to think about that.

01:06:28.980 --> 01:06:32.610
Let's see, DoS is
actually pretty important.

01:06:32.610 --> 01:06:34.610
There was a paper the
other year about something

01:06:34.610 --> 01:06:36.640
that the authors called
the sniper attack

01:06:36.640 --> 01:06:39.986
where you see traffic
coming from a Tor node

01:06:39.986 --> 01:06:41.850
that you don't control.

01:06:41.850 --> 01:06:44.230
You want to kick everybody
off that Tor node.

01:06:44.230 --> 01:06:45.490
So you connect to it.

01:06:45.490 --> 01:06:50.217
You fill up all its memory
buffers, and it crashes.

01:06:50.217 --> 01:06:52.050
Then you see whether
the traffic in question

01:06:52.050 --> 01:06:54.055
gets rerouted to a node
you control or not,

01:06:54.055 --> 01:06:55.235
and you repeat as necessary.

01:06:59.020 --> 01:07:02.575
For that, our best
options are first off,

01:07:02.575 --> 01:07:05.310
no longer have memory DoSes.

01:07:05.310 --> 01:07:10.550
I think we have all of the
good memory DoSes fixed now.

01:07:10.550 --> 01:07:13.080
There are some bad ones that
still need to get addressed.

01:07:13.080 --> 01:07:16.430
But they're screamingly
inefficient.

01:07:16.430 --> 01:07:19.770
The other option for
resolving this kind of thing

01:07:19.770 --> 01:07:23.020
is make sure relays
are high capacity.

01:07:23.020 --> 01:07:25.720
Don't accept low capacity
relays on the network.

01:07:25.720 --> 01:07:26.700
We do that, too.

01:07:26.700 --> 01:07:30.130
If you're trying to run
a relay on your phone,

01:07:30.130 --> 01:07:31.570
the authorities won't list it.

01:07:35.950 --> 01:07:39.350
And another thing is to try
to pick our circuit scheduling

01:07:39.350 --> 01:07:45.710
algorithms so that it's
hard to starve out circuits

01:07:45.710 --> 01:07:46.820
that you don't control.

01:07:46.820 --> 01:07:50.605
That's very hard,
though, and it's as yet

01:07:50.605 --> 01:07:52.830
an unsolved problem.

01:07:52.830 --> 01:07:55.660
Let's see, should I do
an interesting attack

01:07:55.660 --> 01:07:58.342
or an important attack?

01:07:58.342 --> 01:07:59.216
STUDENT: Interesting.

01:07:59.216 --> 01:08:01.600
NICK MATHEWSON: Interesting, OK.

01:08:01.600 --> 01:08:03.094
So show of hands,
how many people

01:08:03.094 --> 01:08:04.510
might like to write
a program that

01:08:04.510 --> 01:08:07.130
uses cryptography some day?

01:08:07.130 --> 01:08:08.770
Cool, here's what
you must learn.

01:08:08.770 --> 01:08:12.540
Never trust your
cryptography implementation.

01:08:12.540 --> 01:08:15.670
So even when it's
correct, it's wrong.

01:08:15.670 --> 01:08:21.825
So long ago-- I think this may
be one of the worst security

01:08:21.825 --> 01:08:24.430
bugs that we've had.

01:08:24.430 --> 01:08:25.805
Any relay could
man in the middle

01:08:25.805 --> 01:08:32.420
any circuit because we assumed
that a correct Diffie-Hellman

01:08:32.420 --> 01:08:38.120
implementation would verify
that it was not being passed 0

01:08:38.120 --> 01:08:40.600
as one of the inputs.

01:08:40.600 --> 01:08:42.770
The authors of our
Diffie-Hellman implementation

01:08:42.770 --> 01:08:44.758
assumed the proper
application would never

01:08:44.758 --> 01:08:49.470
pass zero to a Diffie-Hellman
implementation.

01:08:49.470 --> 01:08:56.229
So Diffie-Hellman, when I say
g to the x, you say g to the y.

01:08:56.229 --> 01:08:57.340
I know x.

01:08:57.340 --> 01:08:58.310
You know y.

01:08:58.310 --> 01:09:01.332
And we can both compute
g to the xy now.

01:09:01.332 --> 01:09:02.540
You follow me?

01:09:02.540 --> 01:09:03.100
Good.

01:09:03.100 --> 01:09:06.640
Well, if instead the
man in the middle

01:09:06.640 --> 01:09:10.990
replaces my g to the x with
0 and your g to the y with 0,

01:09:10.990 --> 01:09:13.100
and then I happily
compute 0 to the x,

01:09:13.100 --> 01:09:16.890
and you compute 0 to the y,
we will have the same key.

01:09:16.890 --> 01:09:18.719
We will happily
talk to each other.

01:09:18.719 --> 01:09:22.740
But this will be a key that the
attacker knows, because it's 0.

01:09:22.740 --> 01:09:25.149
1 also works.

01:09:25.149 --> 01:09:27.290
p also works.

01:09:27.290 --> 01:09:29.729
p plus 1 also works.

01:09:29.729 --> 01:09:33.110
So you basically just need to
make sure that your values here

01:09:33.110 --> 01:09:37.120
are within the range 2 to p minus
2 if you're doing Diffie-Hellman

01:09:37.120 --> 01:09:38.439
in z sub p.
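
NOTE Editorial sketch, not part of the lecture. The range check described above
can be illustrated in a few lines of Python. The toy prime P = 23, the generator
G = 5, and the function names are assumptions chosen for readability; this is
not Tor's code. The sketch also rejects p - 1, which has order 2 and so yields
a shared key of only 1 or p - 1.
```python
# Toy parameters for illustration only (assumptions); real use needs a large prime.
P = 23  # small prime modulus
G = 5   # generator of the multiplicative group mod 23
def is_valid_public_value(value: int, p: int) -> bool:
    """Return True only for public values in the range 2 .. p - 2."""
    # 0, 1, p and p + 1 reduce to 0 or 1 mod p; p - 1 has order 2.
    return 2 <= value <= p - 2
def shared_secret(peer_public: int, my_private: int, p: int) -> int:
    """Compute peer_public ** my_private mod p, refusing degenerate inputs."""
    if not is_valid_public_value(peer_public, p):
        raise ValueError("degenerate Diffie-Hellman public value")
    return pow(peer_public, my_private, p)
# Honest exchange: both sides derive the same secret.
x, y = 6, 15
assert shared_secret(pow(G, y, P), x, P) == shared_secret(pow(G, x, P), y, P)
# Without the check, a substituted 0 forces both sides to the known key 0.
assert pow(0, x, P) == pow(0, y, P) == 0
```
With the check in place, a substituted 0, 1, p - 1, p, or p + 1 raises
ValueError instead of silently producing a key the attacker can predict.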

01:09:41.010 --> 01:09:47.090
OK, let's see, I would love
to talk more about censorship.

01:09:47.090 --> 01:09:49.609
Because actually,
it's one of the areas

01:09:49.609 --> 01:09:51.460
where we can do the most good.

01:09:51.460 --> 01:09:55.260
Generally, the summarized
version of that

01:09:55.260 --> 01:09:57.240
was, in the earliest
paper you read,

01:09:57.240 --> 01:09:59.880
and in some of the updates,
we were still on the idea

01:09:59.880 --> 01:10:01.880
that we would try to
make Tor look just

01:10:01.880 --> 01:10:05.275
like a web client talking
to a web server over HTTPS

01:10:05.275 --> 01:10:06.869
and make that hard to block.

01:10:06.869 --> 01:10:08.660
It turns out that's
fantastically difficult

01:10:08.660 --> 01:10:10.820
and probably not worth doing.

01:10:10.820 --> 01:10:12.250
Instead, the
approach we take now

01:10:12.250 --> 01:10:15.190
is using different
plug-in programs

01:10:15.190 --> 01:10:21.030
that a non-listed relay
called a bridge can use,

01:10:21.030 --> 01:10:23.930
and a client can use
to do different traffic

01:10:23.930 --> 01:10:25.440
transformations.

01:10:25.440 --> 01:10:28.675
And we manage to keep
adding new ones of those

01:10:28.675 --> 01:10:30.800
faster than the censors
have been able to implement

01:10:30.800 --> 01:10:32.380
blocking for them.

01:10:32.380 --> 01:10:38.560
And that's actually a case
where none of the solutions

01:10:38.560 --> 01:10:42.320
are categorically workable.

01:10:42.320 --> 01:10:44.030
That's not a
well-formed sentence.

01:10:44.030 --> 01:10:47.170
None of these plug-ins
are inherently

01:10:47.170 --> 01:10:50.651
unblockable by any
imaginable technique so far.

01:10:50.651 --> 01:10:53.150
But they're good enough to keep
traffic unblocked for a year

01:10:53.150 --> 01:10:56.390
or two in most places,
and six or seven

01:10:56.390 --> 01:10:59.460
months at a time in China.

01:10:59.460 --> 01:11:02.760
China currently has the most
competent censors in the world,

01:11:02.760 --> 01:11:04.580
largely because China
doesn't outsource.

01:11:04.580 --> 01:11:08.330
Most other censoring countries
outsource their censorship

01:11:08.330 --> 01:11:12.680
to dishonest European, American,
and Asian companies whose

01:11:12.680 --> 01:11:15.410
incentives are not actually
to sell them good censorship,

01:11:15.410 --> 01:11:17.820
but to keep them on
an upgrade treadmill.

01:11:17.820 --> 01:11:21.130
So if you were buying
your censorship software

01:11:21.130 --> 01:11:24.470
from the United States--
which technically speaking

01:11:24.470 --> 01:11:27.220
US companies aren't allowed
to make censorship software

01:11:27.220 --> 01:11:29.140
for nations.

01:11:29.140 --> 01:11:32.620
But they just make
corporate firewall software

01:11:32.620 --> 01:11:34.650
that happens to scale
to 10 million people.

01:11:37.240 --> 01:11:39.116
Yeah, I think that's unethical.

01:11:39.116 --> 01:11:41.900
But again, I'm not the political
scientist of the organization,

01:11:41.900 --> 01:11:43.729
or the philosopher.

01:11:43.729 --> 01:11:46.020
Paul Syverson, one of the
original [INAUDIBLE] authors,

01:11:46.020 --> 01:11:47.790
does have a degree
in philosophy,

01:11:47.790 --> 01:11:50.090
for what that's worth,
which means that he can't

01:11:50.090 --> 01:11:50.886
answer these questions either.

01:11:50.886 --> 01:11:52.761
But he takes a lot longer
not to answer them.

01:11:56.720 --> 01:11:58.550
Right, where was I?

01:11:58.550 --> 01:12:01.380
90 minutes is a long time.

01:12:01.380 --> 01:12:05.200
Censorship-- right, so what
the censorware providers

01:12:05.200 --> 01:12:10.020
do is once Tor gets
around their censorship,

01:12:10.020 --> 01:12:13.510
they will block the most
recent version of Tor.

01:12:13.510 --> 01:12:17.480
But they do it in a way that
is the weakest possible block.

01:12:17.480 --> 01:12:20.470
So if we change 1 bit in
one identifier somewhere,

01:12:20.470 --> 01:12:22.150
we get around it.

01:12:22.150 --> 01:12:25.050
We can't prove that they're
doing this on purpose

01:12:25.050 --> 01:12:30.890
to ensure that Tor will evade
their version so that they can

01:12:30.890 --> 01:12:34.370
sell Tor blocking and then have
it not work so they can sell

01:12:34.370 --> 01:12:36.360
the upgrade, and then
sell the next upgrade,

01:12:36.360 --> 01:12:37.640
and sell the next upgrade.

01:12:37.640 --> 01:12:39.480
But it sure does seem that way.

01:12:39.480 --> 01:12:42.614
So that's another reason not to
work for censorship providers.

01:12:42.614 --> 01:12:44.530
They're tremendously
unethical, and they don't

01:12:44.530 --> 01:12:45.654
provide very good software.

01:12:48.180 --> 01:12:50.920
If you're interested
in writing any

01:12:50.920 --> 01:12:52.584
of these pluggable
transport things,

01:12:52.584 --> 01:12:54.000
that is an excellent
kind of thing

01:12:54.000 --> 01:12:56.877
to do as a student
project-- loads of fun,

01:12:56.877 --> 01:12:58.460
learn a little bit
about crypto, learn

01:12:58.460 --> 01:13:00.076
a little bit about networking.

01:13:00.076 --> 01:13:02.200
And so long as you do it
in a memory-safe language,

01:13:02.200 --> 01:13:04.240
you can't screw
it up that badly.

01:13:04.240 --> 01:13:06.350
The worst thing
that happens is it

01:13:06.350 --> 01:13:10.496
gets censored after a month
instead of after a year.

01:13:10.496 --> 01:13:17.600
And that's what I want to-- oh,
the addenda: related work.

01:13:17.600 --> 01:13:21.680
Tor is the most popular
system of its kind,

01:13:21.680 --> 01:13:23.110
but it's not the only one.

01:13:23.110 --> 01:13:24.740
Lots of others have
really good ideas,

01:13:24.740 --> 01:13:26.820
and you should
check them out too

01:13:26.820 --> 01:13:29.822
if you're interested
in learning all

01:13:29.822 --> 01:13:31.280
of the stuff I'm
not thinking about

01:13:31.280 --> 01:13:33.770
and all the reasons I'm wrong.

01:13:33.770 --> 01:13:37.330
freehaven.net/anonbib/
lists the academic research

01:13:37.330 --> 01:13:39.290
and publications in this area.

01:13:39.290 --> 01:13:42.240
But not all the research
in this area is academic.

01:13:42.240 --> 01:13:48.680
You should also
look at I2P; GNUnet;

01:13:48.680 --> 01:13:52.090
Freedom, which is
currently defunct,

01:13:52.090 --> 01:14:09.640
no pun intended; Mixmaster;
Mixminion; Sphynx with a Y,

01:14:09.640 --> 01:14:17.280
Sphinx with an I is
something different; DC-nets,

01:14:17.280 --> 01:14:25.950
particularly the work of Bryan
Ford, and also of the team

01:14:25.950 --> 01:14:28.645
at Technical University
Dresden, in trying

01:14:28.645 --> 01:14:30.240
to make DC-nets practical.

01:14:30.240 --> 01:14:32.770
They're very strong [INAUDIBLE],
not actually deployable

01:14:32.770 --> 01:14:35.245
yet-- and many others.

01:14:41.040 --> 01:14:44.230
Why these get less use
or attention than Tor

01:14:44.230 --> 01:14:48.270
is an open topic
of some interest

01:14:48.270 --> 01:14:50.910
that I don't have
a solid answer for.

01:14:50.910 --> 01:14:55.120
Future work-- so
one of the reasons

01:14:55.120 --> 01:14:58.940
I do these is not just
because I would like everybody

01:14:58.940 --> 01:15:00.700
to know about the cool
software I work on.

01:15:00.700 --> 01:15:02.820
But also because I
know students have

01:15:02.820 --> 01:15:05.090
lots and lots of free time.

01:15:05.090 --> 01:15:07.360
And I'm kind of
looking to recruit.

01:15:07.360 --> 01:15:09.180
OK, you may think I'm joking.

01:15:09.180 --> 01:15:12.730
But when I was just getting
started in this field,

01:15:12.730 --> 01:15:16.790
I was complaining about how I
was so busy reviewing papers

01:15:16.790 --> 01:15:19.254
for one conference, writing
software, fixing a bug,

01:15:19.254 --> 01:15:19.920
answering email.

01:15:19.920 --> 01:15:21.920
I was complaining to some
senior faculty member.

01:15:21.920 --> 01:15:27.060
And he told me, you will
never have so much free time

01:15:27.060 --> 01:15:27.850
as you do today.

01:15:29.955 --> 01:15:31.330
You actually have
a lot more free

01:15:31.330 --> 01:15:33.050
time now than you
will in 10 years.

01:15:33.050 --> 01:15:37.580
So this is a great time to work
on crazy software projects.

01:15:37.580 --> 01:15:39.680
So let me tell you about
future work in Tor.

01:15:39.680 --> 01:15:43.710
There's this key blinding
thing and a complete revamp

01:15:43.710 --> 01:15:45.670
of our hidden
services system, which

01:15:45.670 --> 01:15:47.900
was the best we could design
when we came up with it.

01:15:47.900 --> 01:15:49.816
But there's been a lot
of research since then.

01:15:49.816 --> 01:15:52.710
Maybe some of it will turn
out to be a good idea.

01:15:52.710 --> 01:15:54.480
We're also revamping
most of our crypto.

01:15:54.480 --> 01:15:58.320
We chose schemes that
seemed like a good security

01:15:58.320 --> 01:16:03.140
performance trade-off
in 2003, like RSA-1024.

01:16:03.140 --> 01:16:05.720
We've replaced the
really important uses

01:16:05.720 --> 01:16:09.580
of RSA-1024 with stronger stuff,
currently [INAUDIBLE] 25519.

01:16:09.580 --> 01:16:11.080
But there's still
some cases that we

01:16:11.080 --> 01:16:14.797
want to replace in the protocol
that we need some work on.

01:16:14.797 --> 01:16:16.630
I didn't talk too much
about path selection,

01:16:16.630 --> 01:16:19.910
so I can't talk too much about
improvements in path selection.

01:16:19.910 --> 01:16:24.410
But our path selection
algorithms were [INAUDIBLE].

01:16:24.410 --> 01:16:26.140
And there's been
some awesome research

01:16:26.140 --> 01:16:31.750
in the past five or six years on
that that we need to integrate.

01:16:31.750 --> 01:16:33.900
There's a little
work that's been

01:16:33.900 --> 01:16:38.500
done on mixing high latency
and low latency traffic so

01:16:38.500 --> 01:16:41.345
that the low latency
traffic can provide cover

01:16:41.345 --> 01:16:44.270
for the high latency traffic
in terms of providing lots

01:16:44.270 --> 01:16:47.960
of users while the high latency
traffic is still very well

01:16:47.960 --> 01:16:50.500
anonymized.

01:16:50.500 --> 01:16:53.970
It's not clear whether
this would work or not.

01:16:53.970 --> 01:16:57.600
It's not clear whether
anyone would use this or not.

01:16:57.600 --> 01:17:01.080
And it is clear that
unless something changes,

01:17:01.080 --> 01:17:03.879
or unless some major funding
for that in particular shows up,

01:17:03.879 --> 01:17:05.920
I'm not going to have time
to work on it in 2015.

01:17:05.920 --> 01:17:08.045
But if somebody else wants
to hack on that, my god,

01:17:08.045 --> 01:17:08.860
that would be fun.

01:17:08.860 --> 01:17:10.920
Our congestion
control algorithms

01:17:10.920 --> 01:17:15.030
were chosen questionably based
on what we could hack together

01:17:15.030 --> 01:17:17.050
in a week.

01:17:17.050 --> 01:17:20.360
We've improved them, but they
could use a bigger revamp.

01:17:20.360 --> 01:17:23.070
There's some research on
scaling to hundreds of thousands

01:17:23.070 --> 01:17:24.170
of nodes.

01:17:24.170 --> 01:17:26.800
So in the current
design, we can probably

01:17:26.800 --> 01:17:29.630
get up to 10,000 or
20,000 with no problem.

01:17:29.630 --> 01:17:33.330
But because we assume that every
client knows about every node,

01:17:33.330 --> 01:17:35.670
and every node may be
connected to every other node,

01:17:35.670 --> 01:17:38.350
that's going to stop
scaling before 100,000.

01:17:38.350 --> 01:17:41.250
And we need to do
something about that.

01:17:41.250 --> 01:17:43.070
That opens up some
classes of attacks

01:17:43.070 --> 01:17:47.680
based on attackers learning
which clients know which nodes

01:17:47.680 --> 01:17:50.574
and using that to
distinguish clients.

01:17:50.574 --> 01:17:52.740
So most of the naive
approaches are a bad idea here.

01:17:52.740 --> 01:17:56.960
But it may be that less naive
approaches might work out.

01:17:56.960 --> 01:17:59.585
Another thing you might want to
do if you're increasing to 100,000

01:17:59.585 --> 01:18:02.230
nodes is get rid of those
centralized directory

01:18:02.230 --> 01:18:05.840
authorities and go to some
kind of peer to peer design.

01:18:05.840 --> 01:18:10.010
I don't have extremely
high confidence

01:18:10.010 --> 01:18:12.530
in the peer to peer
designs I know of so far.

01:18:12.530 --> 01:18:16.940
But it could be that
somebody's about to advance

01:18:16.940 --> 01:18:17.690
the next good one.

01:18:20.230 --> 01:18:23.400
Let's see, I don't
know what that means.

01:18:23.400 --> 01:18:26.566
Oh, somebody asked a
question about adding

01:18:26.566 --> 01:18:33.013
padding traffic or fake
traffic to try to deceive end

01:18:33.013 --> 01:18:34.360
to end traffic correlation.

01:18:34.360 --> 01:18:36.150
This is an exciting
research field

01:18:36.150 --> 01:18:40.422
that needs someone smarter
to work on it or someone

01:18:40.422 --> 01:18:42.630
with a more practical attitude
to work on it than has

01:18:42.630 --> 01:18:44.230
previously worked on it.

01:18:44.230 --> 01:18:47.260
Too many of the results
in the research literature

01:18:47.260 --> 01:18:51.345
there are only about
distinguishing the traffic

01:18:51.345 --> 01:18:55.229
of two users on a network
containing one relay,

01:18:55.229 --> 01:18:57.020
because that's how the
math was easy to do.

01:19:00.230 --> 01:19:02.230
So because of this
kind of stuff,

01:19:02.230 --> 01:19:04.240
all of the traffic
analysis defenses

01:19:04.240 --> 01:19:06.200
that we know of in this
area that are still

01:19:06.200 --> 01:19:10.109
compatible with web
browsing, they sound good

01:19:10.109 --> 01:19:11.150
if you read the abstract.

01:19:11.150 --> 01:19:14.790
You'll say, hooray, this
one forces the attacker

01:19:14.790 --> 01:19:17.510
to gather three times as
much traffic before they

01:19:17.510 --> 01:19:19.020
can correlate users.

01:19:19.020 --> 01:19:20.950
Except when you
actually read the paper,

01:19:20.950 --> 01:19:23.510
previously the attacker needed
two seconds worth of traffic,

01:19:23.510 --> 01:19:24.485
and then they won.

01:19:24.485 --> 01:19:26.940
Now they need six seconds.

01:19:26.940 --> 01:19:29.430
That's not really a
defense in this model,

01:19:29.430 --> 01:19:33.699
although perhaps
against a real network,

01:19:33.699 --> 01:19:35.740
the numbers would be
different and it might work.

01:19:35.740 --> 01:19:38.930
So we would actually like
to see some stuff done

01:19:38.930 --> 01:19:40.470
with padding and fake traffic.

01:19:40.470 --> 01:19:43.645
But we don't like to
add voodoo defenses

01:19:43.645 --> 01:19:45.580
that we conjecture to
maybe do some good,

01:19:45.580 --> 01:19:47.340
although we can't prove that.

01:19:47.340 --> 01:19:48.715
We actually like
to have evidence

01:19:48.715 --> 01:19:50.590
that any changes
we're going to make

01:19:50.590 --> 01:19:51.790
are going to help something.

01:19:51.790 --> 01:19:53.240
I think I'm out of time.

01:19:53.240 --> 01:19:55.439
And there may be a
class in here after us?

01:19:55.439 --> 01:19:55.980
There is not?

01:19:55.980 --> 01:19:58.104
All right, so I'm going to
hang around for a while.

01:19:58.104 --> 01:20:00.140
And thanks for coming to listen.

01:20:00.140 --> 01:20:02.290
I would take questions now.

01:20:02.290 --> 01:20:06.089
But it's 12:25, and folks
may have another class.

01:20:06.089 --> 01:20:07.380
But I'll be around [INAUDIBLE].

01:20:07.380 --> 01:20:08.880
Thank you very much for coming.

01:20:08.880 --> 01:20:11.952
[APPLAUSE]