6.033 | Spring 2018 | Undergraduate

Computer System Engineering

Week 12: Security Part II

Lecture 21: Authentication and Passwords

Lecture 21 Outline

  1. Introduction
  2. Authentication via Passwords
  3. Implementing Passwords
  4. Session Cookies
  5. Phishing
  6. Bootstrapping/Resetting
  7. Password Alternatives

Lecture Slides

Reading

  • Book section 11.2

Recitation 21: Why Cryptosystems Fail

Lecture 22: Secure Channels

Lecture 22 Outline

  1. Today’s Threat Model
  2. Secure Channel Primitives
  3. Secure Channel Abstraction
  4. Key Exchange
  5. Cryptographic Signatures for Message Authentication
  6. Key Distribution
  7. TLS: A Protocol That Does All of This
  8. Discussion

Lecture Slides

Reading

  • Book sections 11.3, 11.4, and 11.5

Recitation 22: Domain Name System Security Extensions (DNSSEC)

Tutorial 12: Final Design Project Report

Having now had two rounds of feedback on your design, you’re working on your final report. Unlike the proposal document, the report should contain enough detail that it could feasibly be turned over to Facilities for implementation. It should also contain an evaluation of your design. See the Design Project section for detailed information.

Read “Security Vulnerabilities in DNS and DNSSEC (PDF)” by Suranjith Ariyapperuma and Chris Mitchell. This paper is about DNSSEC. DNS, as it stands, is an insecure system; DNSSEC is a proposed extension to DNS that mitigates some of the security concerns. It is not yet widespread.

  • Section 2 gives an overview of DNS. Read it if you need a refresher on the protocol, but if not, you can skip it.
  • Section 3 details some of the vulnerabilities to which DNS is open.
  • Section 4 describes DNSSEC, which addresses some of the vulnerabilities in Section 3. DNSSEC has its own problems, however, which are detailed in Section 5.

As you read, think about

  • What are the consequences for users (such as yourself) of the vulnerabilities of DNS?
  • Why must DNSSEC be backwards-compatible with DNS?
  • Why are chains of trust necessary?
  • Who should be in charge of the root key?
Questions for Recitation

Before you come to this recitation, write up (on paper) a brief answer to the following (really—we don’t need more than a couple sentences for each question). 

Your answers to these questions should be in your own words, not direct quotations from the paper.

  • From a security standpoint, what does DNSSEC provide? (e.g., confidentiality, authentication, etc.)
  • How does it provide that?
  • Why is DNSSEC necessary (or is it necessary?), and why hasn’t it been fully deployed?

As always, there are multiple correct answers to each of these questions.

Disclaimer: This is part of the security section in 6.033. Only use the information you learn in this portion of the class to secure your own systems, not to attack others.

  1. Introduction 
    • Current security guidelines:
      • Be explicit about our policy and threat model.
      • Use the guard model to provide complete mediation.
      • Make as few components trusted as possible.
    • Guard (in guard model) commonly provides authentication and authorization.
      • Commonly, but not always; some systems let users be anonymous.
    • Today: Principal authentication, primarily via passwords.
      • Later, we’ll discuss principal authentication via something other than passwords.
      • We are also not dealing with message authentication today; we’ll get to that in a later lecture.
  2. Authentication via Passwords 
    • Goal of authentication: Verify that the user is who they say they are. An attacker should *not* be able to impersonate the user.
    • Why passwords?
      • In theory, lots of options: A random 8-letter password => 26^8 possibilities (more like 60^8 if you allow lowercase/caps/numbers/symbols). Longer passwords are even better.
      • Guessing is expensive; brute-force attack is infeasible.
  3. Implementing Passwords 
    • Scenario: Logging into an account on a shared computer system.
    • Threat model: Attacker has some access to the server on which password information is stored.
      • Attacker does *not* have access to the network between client and server; that comes in a future lecture.
    • Attempt 1: Store plaintext passwords on server. Very bad idea.
      • If adversary has access to the server (example: They are a sysadmin), they can just read passwords straight from the accounts table.
      • If adversary has access to the server but not the table, they could exploit a buffer overflow to read it.
      • Lesson: Don’t store secure information in plaintext.
    • Attempt 2: Store hashes of passwords on the server.
      • A hash function H takes an input string of arbitrary size and outputs a fixed-length string.
      • If two input strings, x and y, are different, the probability that H(x) = H(y) is virtually zero (hash functions are “collision resistant”).
      • Cryptographic hash functions are one-way: Given H(x), it’s (very) hard to recover x.
      • If adversary gets access to table, they just have hashes, not passwords.
      • But… can compare that to hashes of popular passwords.
        • Rainbow table: Map common passwords (e.g., “123456”) to their hashes.
          • Actually more complex in practice.
        • With a rainbow table, adversary can figure out who has one of the most common passwords, which is a lot of people.
        • Hash functions that are fast to compute make this data structure very easy to create. “Slow hashes” (key-derivation functions) take longer, but it’s still possible to create rainbow tables of the most common passwords.
      • Lesson: Think about human factors when designing secure systems.
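The rainbow-table idea above can be sketched in a few lines. This is greatly simplified (a real rainbow table uses hash chains to trade time for space), and the SHA-256 choice and sample passwords are illustrative assumptions:

```python
# Simplified "rainbow table": a precomputed map from hashes of common
# passwords back to the passwords themselves.
import hashlib

def H(password: str) -> str:
    return hashlib.sha256(password.encode()).hexdigest()

common_passwords = ["123456", "password", "qwerty", "letmein"]
table = {H(p): p for p in common_passwords}

# Suppose the adversary obtains a stored (unsalted) hash:
leaked = H("qwerty")
print(table.get(leaked))  # the password is recovered instantly
```

Because the table is keyed by hash, cracking a leaked hash is a single dictionary lookup; no per-victim work is needed.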
    • Attempt 3: Salt the hashes.
      • Store username, “salt” (a random number), and the hash of the password concatenated with the salt.
      • Adversary *will* see the salt if they get this table, but to build a rainbow table, they’d have to compute the hash of every common password concatenated with every possible salt. It’s impractical to build that table.
      • They could build a rainbow table for a particular user (i.e., for a particular salt value). If they’re targeting one specific user, this might be worth it, but often isn’t.
        • The goal of many attacks is to get as many accounts as possible.
        • The nice thing about rainbow tables is that you can build them once and use them forever (they do take *some* time to create). One per user per salt is much more onerous.
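A minimal sketch of the salted scheme above, assuming SHA-256 as the hash and a dictionary as the accounts table (a real system would use a slow key-derivation function such as bcrypt or scrypt instead of a fast hash):

```python
import hashlib
import os

def store_password(db: dict, username: str, password: str) -> None:
    salt = os.urandom(16)  # fresh random salt per user
    digest = hashlib.sha256(salt + password.encode()).hexdigest()
    db[username] = (salt, digest)  # salt is stored in the clear

def check_password(db: dict, username: str, password: str) -> bool:
    salt, digest = db[username]
    return hashlib.sha256(salt + password.encode()).hexdigest() == digest

db = {}
store_password(db, "alice", "hunter2")
print(check_password(db, "alice", "hunter2"))  # True
print(check_password(db, "alice", "123456"))   # False
```

Note that two users with the same password get different digests, because each gets a different salt; that is what defeats a single precomputed table.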
  4. Session Cookies 
    • Typically we use passwords to bootstrap authentication, but don’t continuously authenticate with our password for every command.
      • Security: Typing, storing, transmitting, checking password is a risk.
      • Convenience (sometimes). No one wants to type their password for every command. We could try to automate this process, but that means we have to store our password somewhere, and you’ve seen where that got us.
    • Web apps often exchange passwords for session cookies: Like temporary passwords that are good for a limited time.
    • Basic idea: Client sends username/password to server. If it checks out, the server sends back a cookie, e.g., the username and an expiration time, plus a hash of those values computed with a secret serverkey.
    • No need to store password in (client) memory or re-enter it.
    • Why use serverkey in hash?
      • Ensure that users can’t fabricate the hash themselves.
      • Server can change serverkey, invalidate old cookies.
    • Can user change expiration?
      • No. To do that, they’d also have to change the hash, which they can’t do (they don’t know serverkey).
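The cookie scheme above might be sketched as follows. The field layout and the serverkey value are illustrative assumptions, with HMAC-SHA256 standing in for the keyed hash:

```python
import hashlib
import hmac
import time

serverkey = b"server-secret-key"  # known only to the server

def make_cookie(user: str, lifetime: int = 3600) -> str:
    expiration = str(int(time.time()) + lifetime)
    tag = hmac.new(serverkey, f"{user}|{expiration}".encode(),
                   hashlib.sha256).hexdigest()
    return f"{user}|{expiration}|{tag}"

def check_cookie(cookie: str) -> bool:
    user, expiration, tag = cookie.split("|")
    expected = hmac.new(serverkey, f"{user}|{expiration}".encode(),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(tag, expected) and time.time() < int(expiration)

c = make_cookie("alice")
print(check_cookie(c))  # True
print(check_cookie(c.replace("alice", "admin")))  # False: tag doesn't verify
```

Changing the username or the expiration invalidates the tag, and rotating serverkey invalidates every outstanding cookie at once.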
  5. Phishing 
    • Phishing attacks: Adversary tricks users into visiting a legitimate-looking site (that the adversary owns), which asks for their username/password.
    • Has nothing to do with whether the network is secure: We just handed the password to the adversary.
    • Solution 1: Challenge-response protocol.
      • Assume (for now) the server stores plaintext passwords.
        • Instead of asking for the password, the server chooses a random value r, sends it to the client.
        • Client computes H(r | password), sends that back to the server.
        • Server checks whether this matches its computation of the hash with the expected password.
        • If the server didn’t already know the password, it still doesn’t.
      • If server stores (salted) hashes, we could have the client compute H(r | H(p)) (or H(r | H(s | p))) and send that. But then H(p) is effectively the password. And by storing hashes, the server is storing passwords.
      • Solution: SRP (“Secure Remote Password”) protocol.
        • No details in 6.033, but allows server to store hashes of passwords and still do a challenge-response.
      • Lesson: Make the server prove that it knows a secret without revealing that secret.
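A sketch of the basic challenge-response exchange, assuming the plaintext-password variant discussed first; the names (server_db, client_response) and the SHA-256 choice are illustrative:

```python
import hashlib
import os

def H(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

server_db = {"alice": "hunter2"}  # plaintext, for this variant only

# Server: pick a fresh random challenge r and send it to the client.
r = os.urandom(16)

# Client: prove knowledge of the password without transmitting it.
def client_response(r: bytes, password: str) -> str:
    return H(r + password.encode())

# Server: recompute with the expected password and compare.
def server_check(user: str, r: bytes, response: str) -> bool:
    return response == H(r + server_db[user].encode())

print(server_check("alice", r, client_response(r, "hunter2")))  # True
```

A phishing server that runs this protocol learns only H(r | password) for an r it chose, not the password itself; and because r is fresh each time, that response can't be replayed later.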
  6. Bootstrapping/Resetting 
    • How do we initially set a password for an account? If an adversary can subvert this process, there’s virtually nothing we can do.
      • MIT: Admissions office vets each student, hands out account codes.
      • Many web sites: Anyone with an email can create a new account.
    • How do we change our password, e.g., after compromise?
      • MIT: Walk over to accounts office, show ID, admin can reset password.
      • Many web sites: Additional “security” questions used to reset password.
    • Why does this matter?
      • Password bootstrap / reset mechanisms are part of the security system, important that they are not weak.
      • Anecdote: Sarah Palin’s Yahoo account was compromised by an attacker guessing her security questions. Personal information can be easy to find online.
    • Lesson: Don’t forget the bootstrapping/resetting parts of a system when designing it.
  7. Password Alternatives 
    • Password Managers:
      • Automatically generate “good” passwords for you.
      • Securely keep track of your passwords, protected via one *really* good password (that you choose).
      • Pros: Keeps users from picking bad passwords/reusing passwords.
      • Cons: Less convenient, what happens if you lose the one good password? Do you trust the authors of the password manager?
    • Two-step verification:
      • Server texts you a code that you have to input (along with your password) when you log in.
      • Pros: Adversaries need your password and your phone to mount attack.
      • Cons: Inconvenient, slow.
    • Biometrics:
      • E.g., retina scans, fingerprints.
      • Pros: Adversaries have to be you (or near you) to log in.
      • Cons: Can you reset the “password”? Also hard to be anonymous.
    • Passwords aren’t perfect. Many alternatives are more secure in some senses. But all have trade-offs for complexity and convenience.

  1. Today’s Threat Model 
    • Last time: Adversary with access to server.
    • Today: Adversary in the network.
    • What can adversary in the network do? 
      • Observe packets
      • Corrupt packets
      • Inject packets
      • Drop packets
    • Some can be combated with techniques you already know. 
      • TCP senders retransmit dropped packets.
      • Corrupt packets get dropped (at a router, usually), and thus also retransmitted.
    • Need a plan, though, for carefully corrupted, injected, or sniffed packets.
    • This lecture: Focus on preventing an adversary in the network from observing/tampering with contents of packets. 
      • So NOT injecting new packets; that’s next time.
    • Goals (policy) 
      1. Confidentiality: Adversary cannot learn message contents.
      2. Integrity: Adversary cannot tamper with message contents. (More accurately: if the adversary tampers with the message contents, the sender and/or receiver will detect it.)
    • Result is known as a “secure channel.”
  2. Secure Channel Primitives 
    • Ensure confidentiality by encryption.
      • Encrypt(k, m) -> c ; Decrypt(k, c) -> m.
        • k = key (secret; unknown to adversary, never transmitted)
        • m = message
        • c = ciphertext
        • Property: Given c, it is (virtually) impossible to obtain m without knowing k.
      • Encryption alone does not provide integrity.
    • Ensure integrity via message authentication codes (MAC).
      • MAC(k, m) -> t
        • k = key
        • m = message
        • t = output
      • Similar to hash functions. Difference: Uses a key.
        • Alternate name: “Keyed hash function.”
        • Adversary can’t compute the MAC of a message; needs key. (This is not true for regular hash functions.)
        • There are other subtle differences we won’t get into. One example: MACs are not always subject to the same mathematical requirements as cryptographic hash functions.
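Python's standard library happens to provide such a keyed hash (HMAC), which can stand in for the MAC primitive above:

```python
import hashlib
import hmac

def MAC(k: bytes, m: bytes) -> bytes:
    return hmac.new(k, m, hashlib.sha256).digest()

k = b"shared-secret"
t = MAC(k, b"transfer $100 to bob")

# Receiver recomputes the tag with the shared key and compares;
# any change to the message yields a different tag.
print(hmac.compare_digest(t, MAC(k, b"transfer $100 to bob")))  # True
print(hmac.compare_digest(t, MAC(k, b"transfer $999 to bob")))  # False
```

Without k, the adversary cannot produce a valid tag for a tampered message, which is exactly the property a plain hash lacks.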
  3. Secure Channel Abstraction 
    • So far: Sender computes c = Encrypt(k, m) and h = MAC(k, m), then sends [c | h].
    • If adversary intercepts [c|h] and tampers with it, receiver will know; MAC won’t check out.
    • Aside: Instead of [c | h], the sender could combine encryption and the MAC in other ways (e.g., computing the MAC over the ciphertext instead of the plaintext).
    • Problem: Adversary can intercept, and then retransmit message (“replay” message).
    • Solution: Include a sequence number in every message, and choose a new random sequence number for every connection.
    • If adversary intercepts a message, they can’t replay it: the receiver won’t accept a repeated sequence number.
      • Assume sequence numbers don’t wrap around.
      • Aside: In reality, if a conversation is long enough to exhaust the sequence-number space, the session is “renegotiated” between the sender and the receiver. (You could, for instance, imagine that whenever a session is renegotiated, the sender and receiver both change their keys. In reality, they change a particular random value known as the session ID.)
    • But if receiver is also sending to the sender (i.e., if they’re both sending), the receiver might use that sequence number. So adversary could replay in the other direction (a “reflection” attack).
    • Solution: Use different keys in each direction.
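The two fixes above (sequence numbers in every MAC, plus a separate key per direction) can be sketched as follows; the message format and the Receiver class are illustrative assumptions:

```python
import hashlib
import hmac

def tag(k: bytes, seq: int, m: bytes) -> bytes:
    # The MAC covers the sequence number as well as the message.
    return hmac.new(k, seq.to_bytes(8, "big") + m, hashlib.sha256).digest()

k_ab = b"key-alice-to-bob"  # one key per direction defeats reflection

class Receiver:
    def __init__(self, k: bytes, first_seq: int):
        self.k, self.next_seq = k, first_seq

    def accept(self, seq: int, m: bytes, t: bytes) -> bool:
        ok = (seq == self.next_seq
              and hmac.compare_digest(t, tag(self.k, seq, m)))
        if ok:
            self.next_seq += 1  # a replayed seq will no longer match
        return ok

bob = Receiver(k_ab, first_seq=1000)
t = tag(k_ab, 1000, b"hello")
print(bob.accept(1000, b"hello", t))  # True
print(bob.accept(1000, b"hello", t))  # False: replay rejected
```

A reflected message fails for a different reason: it carries a tag computed under the other direction's key, so the MAC check itself fails.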
  4. Key Exchange 
    • How do sender/receiver get keys in the first place? Can’t just send them in the clear in the beginning.
    • Diffie-Hellman key exchange:
      • Two parties: Alice and Bob (“sender” and “receiver” before).
      • Alice and Bob pick:
        • a prime number p
        • a “generator” g
          • Aside: For g to be a generator, it has to be a “primitive root modulo p”. In 6.033, don’t worry about that; we’ll always tell you g and p. If you want to know more about primitive roots, take a cryptography, number theory, or abstract algebra class.
        • p and g don’t need to be secret; assume adversary knows them.
      • Alice picks random number a (secret).
      • Bob picks random number b (secret).
      • Alice sends g^a mod p to Bob.
      • Bob sends g^b mod p to Alice.
      • Alice computes (g^b mod p) ^ a mod p = g^ab mod p.
      • Bob computes (g^a mod p) ^ b mod p = g^ab mod p.
      • Secret key = g^ab mod p.
      • Adversary can learn p, g, g^a mod p, and g^b mod p. From these, they cannot calculate g^ab mod p; doing so requires knowing either a or b.
        • Trust me on that; won’t prove it in 6.033.
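The exchange above, with textbook-sized numbers (p = 23, g = 5 is a classic classroom example; real deployments use primes of 2048 bits or more):

```python
import secrets

p, g = 23, 5  # public: prime and generator; the adversary may know these

a = secrets.randbelow(p - 2) + 1  # Alice's secret
b = secrets.randbelow(p - 2) + 1  # Bob's secret

A = pow(g, a, p)  # Alice sends g^a mod p
B = pow(g, b, p)  # Bob sends g^b mod p

k_alice = pow(B, a, p)  # (g^b mod p)^a mod p = g^ab mod p
k_bob = pow(A, b, p)    # (g^a mod p)^b mod p = g^ab mod p
print(k_alice == k_bob)  # True: both derive the same secret key
```

Only A and B cross the network; each side combines the other's public value with its own secret exponent to arrive at the shared key.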
    • Problem: Man-in-the-middle attack
      • Adversary in middle of network intercepts (and responds to) messages in both directions; Alice thinks she has established a connection with Bob, and vice versa; in reality, they’ve both established a connection with the adversary.
  5. Cryptographic Signatures for Message Authentication 
    • Problem with the above is that messages aren’t authenticated; Alice doesn’t know if she’s really talking to Bob and vice versa.
    • Before: Shared key between the two parties. Known as symmetric key cryptography.
    • For signatures: Public-key cryptography.
      • Each user generates a key pair: (PK, SK).
      • PK is public: Known to everyone, adversaries included.
      • SK is secret: Known only to user.
      • PK and SK are related mathematically; we will not get into that here.
        • RSA is a scheme that generates a key-pair for you.
      • SK lets you sign messages; PK lets you verify signatures (but NOT perform the signing).
    • Primitives
      • Sign(SK, m) -> sig.
        • SK = secret key
        • m = message
        • sig = signature
      • Verify(PK, m, sig) -> yes/no.
        • PK = public key
        • m = message
        • sig = signature
        • “yes/no” -> yes if signature is verified, no otherwise.
    • This is all similar to MACs. Signatures don’t require parties to share a key.
  6. Key Distribution 
    • How do we distribute public keys? Lots of ideas.
    • Alice remembers the key she used to communicate with Bob the last time.
      • Easy to implement, effective against subsequent man-in-the-middle attacks.
      • Doesn’t protect against MITM attacks the first time around, doesn’t allow parties to change keys.
    • Consult some authority that knows everyone’s public key.
      • Doesn’t scale (client asks the authority for a PK every time).
      • Alice needs server’s public key beforehand.
    • Authority, but pre-compute responses. Authority creates signed messages: {Bob, PK_bob}_{SK_as}. Anyone can verify that the authority signed this message, given PK_as. When Alice wants to talk to Bob, she needs a signed message from the authority, but it doesn’t matter where this message comes from as long as the signature checks out (i.e., Alice could retrieve the message from a different server).
      • This signed message is a certificate.
      • More scalable.
    • Certificate authorities bring up questions:
      • Who should run the certificate authority?
      • How does the browser get this list of CAs?
        • Generally they come with the browser.
      • How does the CA build its table of names <-> public keys?
        • Have to agree on how to name principals, and need a mechanism to check that a key corresponds to a name.
      • What if a CA makes a mistake?
        • Need a way to revoke certificates.
        • Expiration date? Useful in long term, not for immediate problems.
        • Publish certificate revocation list? Works in theory, not as well in practice (CRLs sometimes incorrect, not always updated immediately).
        • Query online server to check certificate freshness? Not a bad idea.
      • Alternative: Avoid CAs by using public keys as names (protocols: SPKI/SDSI). Works well for names that users don’t have to remember/enter.
  7. TLS: A Protocol That Does All of This 
    • Lots of parts to this protocol; its complexity can cause problems (frequently implemented incorrectly).
    • Notice that client/server use public-key crypto to exchange a secret, which they use to generate keys for symmetric crypto.
      • Symmetric crypto is much faster than public-key crypto.
  8. Discussion 
    • Why isn’t traffic encrypted by default?
      • Can be computationally expensive.
      • Complex to implement.
      • Wasn’t a well-known thing for most users until relatively recently.
        • Historically just applied to transactions that obviously need to be secured. E.g., banking.
      • Maybe we’re at a point now where these arguments no longer apply?
    • Open vs. Closed Design
      • Should system designers keep details of encrypt/decrypt/MAC/ etc. a secret?
      • No: Make the weakest practical assumptions about the adversary. Assume they know the algorithms, but not the secret keys.
        • Also lets us reuse well-tested, proven algorithms.
        • Plus, if key is compromised, we can change it (unlike the algorithms).

Read “Why Cryptosystems Fail (PDF)” by Ross Anderson. This paper is about a philosophy of cryptosystem design, with a focus on their use in financial institutions and particularly in ATM (Automated Teller Machine) networks.

  • Skim the abstract, introduction, and conclusion first, because they will help you to focus on the parts of the paper that support the author’s main claims.
  • Section 3 is devoted to examples of ways in which ATM networks could fail or have failed. This part of the paper is very entertaining, but it can be difficult to keep the big picture in mind while reading about the individual exploits and problems. Pay attention to the section headings (which you may wish to skim before diving into the text) in order to keep your bearings. For each incident, before moving on, spend a few moments thinking about the lessons that it teaches and how the problem could have been avoided.
  • Sections 4 and 5 conclude with a broader discussion.
  • As always, you should read critically and be on the lookout for additional gems and for arguments that are missing or whose framing de-emphasizes certain points.

As you read, think about how this paper relates to other papers we’ve read in 6.033, despite the fact that we’ve covered nothing else on ATM networks.

Questions for Recitation

Before you come to this recitation, write up (on paper) a brief answer to the following (really—we don’t need more than a couple sentences for each question). 

Your answers to these questions should be in your own words, not direct quotations from the paper.

  • In your mind, what is the root cause of the majority of the attacks detailed in Section 3?
  • Pick one of the attacks. How was that root cause exploited in that attack?
  • Why wasn’t the attack prevented in the first place? What could have been done to prevent it, if anything?

As always, there are multiple correct answers for each of these questions.
