6.033 | Spring 2018 | Undergraduate

Computer System Engineering

Week 7: Networking Part III

Lecture 12: In-Network Resource Management

Lecture 12 Outline

  1. Introduction
  2. DropTail
  3. RED
  4. ECN
  5. RED/ECN vs. DropTail
  6. Traffic Differentiation
  7. Delay-based Scheduling
  8. Bandwidth-based Scheduling
  9. Round-Robin
  10. Weighted Round-Robin
  11. Deficit Round-Robin
  12. Discussion

Lecture Slides

Reading

No readings assigned

Recitation 12: Data Center Transmission Control Protocol (DCTCP)

Lecture 13: Networking: P2P Networks + Content Distribution Network (CDN)

Lecture 13 Outline

  1. Introduction
  2. File-Sharing
  3. Peer-To-Peer (P2P) Networks for File-Sharing
  4. BitTorrent: How to Incentivize Peers to Upload
  5. VoIP: Voice over IP
  6. Video-Streaming (Briefly)

Lecture Slides

Reading

  • No readings assigned

Recitation 13: Content Distribution Networks (CDNs)

Tutorial 7: [No Tutorial this Week]

Read “The Akamai Network: A Platform for High-Performance Internet Applications (PDF)” by Erik Nygren, Ramesh Sitaraman, and Jennifer Sun; skim Section 9. This paper, from 2010, describes the Akamai platform, which improves the performance of technologies that the Internet was not designed for (e.g., streaming video). Incidentally, Akamai’s headquarters are right down the street from MIT.

The first six sections of this paper give context and motivation. Akamai’s actual platform is not described until Section 7. In Section 8, the authors walk through an example of how Akamai’s platform maintains availability in the face of different types of failure. (Similarly, the system you design for the design project (DP) needs to handle multiple types of route failures!)

As you read, think about the following:

  • What are Akamai’s design goals?
  • What happens when a user visits a particular URL? How does the content get to their machine?
  • The paper differentiates between a “content delivery network” and an “application delivery network.” What is the difference between “content” and “applications,” and why does this difference matter to Akamai?
  • Why wouldn’t a peer-to-peer network suffice for Akamai’s purposes?

Questions for Recitation

Before you come to this recitation, write up (on paper) a brief answer to the following (really—we don’t need more than a couple sentences for each question).  

Your answers to these questions should be in your own words, not direct quotations from the paper.

  • What aspect(s) of the Internet’s infrastructure is Akamai’s platform designed to overcome?
  • How is the platform designed to overcome those aspects?
  • Why is it necessary for Akamai to overcome those aspects?

As always, there are multiple correct answers for each of these questions.

  • Read “Data Center TCP (DCTCP) (PDF - 2.98MB)” by Mohammad Alizadeh, Albert Greenberg, et al.
  • Skip section 3.3 except for the final paragraph, which gives an estimate for the parameter K.
  • Skim section 4 (Results)
  • Closely observe figures 15 and 19, which show the queue occupancy as a function of time, and number of sources.

DCTCP customizes the TCP congestion control algorithm for datacenters. It uses Explicit Congestion Notification (ECN) to get early congestion feedback from routers and switches, before a queue fills and starts dropping packets. DCTCP also reacts to congestion in proportion to its extent: when congestion is mild, it reduces its congestion window by a small amount; when congestion is severe, it reduces its congestion window by a large amount.
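
The proportional reaction described above can be sketched as follows. This is a minimal illustration of the update rule given in the DCTCP paper, in which the sender keeps a running estimate, alpha, of the fraction of ECN-marked packets and shrinks its window by a factor of alpha/2. The function names and the default gain g = 1/16 are assumptions made for this sketch, not something these notes specify.

```python
def update_alpha(alpha, marked, total, g=1 / 16):
    """Fold the latest fraction of ECN-marked packets into the running estimate."""
    frac = marked / total if total else 0.0
    return (1 - g) * alpha + g * frac


def dctcp_cwnd(cwnd, alpha):
    """DCTCP-style reaction: shrink the window in proportion to alpha."""
    return cwnd * (1 - alpha / 2)


def tcp_cwnd(cwnd):
    """Standard TCP reaction, for comparison: halve on any congestion signal."""
    return cwnd / 2


# Mild congestion (alpha near 0) barely changes the window;
# severe congestion (alpha = 1) halves it, just as TCP would.
print(dctcp_cwnd(100, 0.0), dctcp_cwnd(100, 1.0), tcp_cwnd(100))
```

With alpha = 0 the window is untouched, and with alpha = 1 the reduction matches TCP's halving; everything in between is a gentler cut.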

To help you as you read:

  • Section 1 introduces the paper. Section 2 describes communication in datacenter networks. After this section, you should understand how datacenter traffic differs from “normal” Internet traffic.
  • Section 3 describes the DCTCP algorithm. After this section, you should understand how DCTCP compares to TCP. Does it react sooner or later to congestion than TCP does? What does a DCTCP sender do when it infers that there is congestion on the network as compared to a TCP sender? What are queues in a datacenter running DCTCP like (empty? full? etc.)?
  • Section 4—which you should skim—gives the results of the authors’ experiments. Check that the empirical results match your expectations.

As you read, think about:

  • Would DCTCP work on the Internet?
  • Is there a trade-off between the generality of a protocol and its performance?

Questions for Recitation

Before you come to this recitation, write up (on paper) a brief answer to the following (really—we don’t need more than a couple sentences for each question). 

Your answers to these questions should be in your own words, not direct quotations from the paper.

  • What is the goal of DCTCP?
  • How does DCTCP differ from TCP?
  • Why does DCTCP differ from TCP?

As always, there are multiple correct answers for each of these questions.

  1. Introduction 
    • Last time: TCP congestion control (CC). Massive success. Doesn’t require us to change the network, is something machines can opt in to (don’t have to have reliable transport if you don’t need it), lets us prevent congestion in a distributed manner.
    • But:
      • Can result in long delays when routers have too much buffering.
      • Doesn’t work well in some scenarios (e.g., datacenters; see DCTCP).
      • Most important for today: Doesn’t react to congestion until queues are full.
    • Full queues = long delay.
    • Queues = necessary to absorb bursts.
    • Goal: Transient queues, not persistent queues.
    • Idea: Drop packets *before* the queues are full. TCP senders will back off before congestion is too bad.
  2. DropTail 
    • The original queue management scheme. When a packet arrives, if the queue is full, drop it; else, enqueue it.
    • Simple (+).
    • Only drops packets when it needs to (+/-).
      • Remember: Dropped packet => retransmission, which wastes resources.
    • Synchronizes sources (-).
    • Not very fair (-).
    • Tends to result in mostly-full queues (-).
    • Bad for bursty traffic (-).
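
The DropTail rule above (drop an arriving packet only if the queue is full, otherwise enqueue it) can be sketched in a few lines. The class name, the packet-count capacity, and the drop counter are assumptions made for illustration; real routers typically bound queues in bytes as well.

```python
from collections import deque


class DropTailQueue:
    """Fixed-capacity FIFO queue that drops arrivals only when it is full."""

    def __init__(self, capacity):
        self.capacity = capacity  # maximum number of queued packets (assumed unit)
        self.packets = deque()
        self.dropped = 0

    def enqueue(self, packet):
        """Return True if the packet was queued, False if it was dropped."""
        if len(self.packets) >= self.capacity:
            self.dropped += 1
            return False
        self.packets.append(packet)
        return True

    def dequeue(self):
        """Remove and return the oldest packet, or None if the queue is empty."""
        return self.packets.popleft() if self.packets else None


# A burst of five packets into a three-packet queue: the tail of the burst is lost.
q = DropTailQueue(3)
print([q.enqueue(i) for i in range(5)], q.dropped)
```

Because nothing is dropped until the queue is completely full, senders get no signal until delay is already at its maximum, which is the weakness the next scheme targets.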
  3. RED 
    • Active queue management scheme.
    • Idea: Drop packets before the queue is full to give senders an early signal.
    • Requires a measure of the average queue size, q_avg.
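
As a rough sketch of how q_avg might drive early drops, the code below follows the commonly described form of RED: an exponentially weighted moving average of the queue length, no drops below a minimum threshold, certain drops above a maximum threshold, and a linearly increasing drop probability in between. The thresholds, max_p, and the averaging weight are illustrative assumptions, and this simplified version omits refinements found in the full algorithm.

```python
import random


class REDQueue:
    """Simplified RED: probabilistic early drops based on average queue length."""

    def __init__(self, min_th, max_th, max_p=0.1, weight=0.002, rng=None):
        self.min_th = min_th    # below this average, never drop (assumed value)
        self.max_th = max_th    # at or above this average, always drop
        self.max_p = max_p      # drop probability just below max_th
        self.weight = weight    # EWMA weight for the average
        self.rng = rng or random.Random()
        self.q_avg = 0.0
        self.queue = []

    def drop_probability(self):
        if self.q_avg < self.min_th:
            return 0.0
        if self.q_avg >= self.max_th:
            return 1.0
        span = self.max_th - self.min_th
        return self.max_p * (self.q_avg - self.min_th) / span

    def enqueue(self, packet):
        """Update q_avg, then queue the packet unless RED decides to drop it."""
        self.q_avg = (1 - self.weight) * self.q_avg + self.weight * len(self.queue)
        if self.rng.random() < self.drop_probability():
            return False
        self.queue.append(packet)
        return True
```

Using the average rather than the instantaneous queue length lets short bursts through while still signalling senders when the queue stays long.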
  1. Introduction
    • “New” technologies on the Internet. How do they work? Are they overcoming any problems in the existing architecture? Do they invalidate any of our assumptions? Do they provide opportunities?
    • Today: File-sharing, VoIP, and video-streaming.
    • Commonalities: All deal with P2P networks, or related constructs (CDNs).
  2. File-Sharing: Getting a File from One Person (Machine) to Another
    • Can use client/server:
      • Client requests file, server responds with the data.
      • HTTP, FTP work this way.
    • Downsides: Single point of failure, expensive, doesn’t scale.
    • Could use CDNs:
      • Buy multiple servers, put them near clients to decrease latency.
      • No single point of failure, scales better.
      • See the next recitation for more discussion.
  3. Peer-To-Peer (P2P) Networks for File-Sharing
    • Distribute the architecture to the extreme.
    • Once a client downloads (part of) the file from the server, that client can upload (part of) the file to others. Put clients to work!
    • In theory: Infinitely scalable.
    • P2P networks create overlays on top of the underlying Internet (so do CDNs).
    • Problem: What if users aren’t willing to upload?
  4. BitTorrent: How to Incentivize Peers to Upload
    • Basics of original BitTorrent (BT) protocol:
      • Create a .torrent file, which contains meta-information about the file (file name, length, info about pieces that comprise the file, URL of tracker).
      • Have a tracker. A server that knows the identity of all the peers involved in your file transfer.
      • To download:
        • Peer contacts tracker.
        • Tracker responds with list of other peers involved in transfer.
        • Peer connects to these other peers, begins to transfer blocks (see below).
        • Some peers are seeders: Already have the entire file (maybe servers that host the file, or just nice peers who are sticking around).
    • In the actual download, peers request blocks: pieces of pieces.
      • Details/terminology don’t matter. Just know that blocks are small (~16KB) chunks of the file.
      • Request blocks in a random order (more or less).
    • What incentivizes users to upload (UL) rather than just download (DL)?
      • High-level: Users aren’t allowed to DL from a user unless they’re also ULing to that user.
        • So peers want mutual interest: A has to have blocks that B needs, and vice versa.
      • Protocol is divided into rounds. In round n, some number of peers upload blocks to Peer X. In round n+1, Peer X will send blocks to the peers that uploaded the most in round n. (Typically, to the top four peers.)
      • How do peers get started?  Each peer reserves some (small) amount of bandwidth to give away freely.
    • This method of incentivizing peers is part of what allowed P2P file-sharing to take off.
    • Lingering problem: tracker is central point of failure.
    • Most BT clients today are “trackerless”, and use Distributed Hash Tables (DHTs) instead.
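
The round-based incentive scheme above can be sketched as a selection function: in round n+1, Peer X uploads to the peers that gave it the most in round n, plus a small free allocation so newcomers can get started. The function name, the byte-count input, and the single free slot are assumptions for illustration; BitTorrent clients usually call the free slot optimistic unchoking.

```python
import random


def choose_upload_targets(uploaded_last_round, top_n=4, free_slots=1, rng=None):
    """Pick the peers to upload to in the next round.

    uploaded_last_round maps a peer id to the number of bytes that peer
    uploaded to us in the previous round (hypothetical bookkeeping).
    """
    rng = rng or random.Random()
    ranked = sorted(uploaded_last_round, key=uploaded_last_round.get, reverse=True)
    # Reciprocate with the best uploaders, ignoring peers that sent nothing.
    top = [p for p in ranked if uploaded_last_round[p] > 0][:top_n]
    # Give a little bandwidth away freely so new peers can earn a place.
    others = [p for p in ranked if p not in top]
    free = rng.sample(others, min(free_slots, len(others)))
    return set(top) | set(free)


uploads = {"a": 50, "b": 10, "c": 40, "d": 0, "e": 30, "f": 20}
print(sorted(choose_upload_targets(uploads)))
```

Peers "a", "c", "e", and "f" are always chosen here; one of "b" or "d" gets the free slot at random.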
  5. VoIP: Voice over IP
    • Talking specifically about Skype, a proprietary system.

    • Skype used to use a P2P network for two things: to improve performance, and to allow certain connections to work at all.

    • Recall the first networking lecture. Internet bred NATs: Network Address Translators.

      • Consider client A behind a NAT, who wants to initiate a connection to server S. A’s IP is private (can’t route to it); S’s and N’s are public.

      A --- N --- S

      • A sends a packet: [to:S from:A].
      • N rewrites the header: [to:S from:N].
        • and stores some state.
      • S receives it, sends response back to N: [to:N from:S].
      • N uses stored state to figure out that this packet is really meant for A.
        • N will keep track of the port(s) that A is communicating on. Communication via those ports is then meant for A.
    • Now imagine two clients, both behind NATs:

      A --- N1 --- N2 --- S

      • Now A doesn’t even know S’s IP (private IPs aren’t routable). It also doesn’t know N2’s IP; it has no way to get that.
      • For Skype: Means that A and S can’t call each other.
      • Skype provides a directory service, so assume we can get N2’s public IP. But when N2 gets a packet destined for S, it has no idea what to do with it: S never initiated a connection, so N2 has no stored state for it.
      • (See Lecture 13 slides (PDF) for example.)
    • Skype will employ an additional node—a “supernode”—P, with a public IP, and route A and S’s calls through P: 

      [Figure: connection between client A and client S through nodes.]
      • P keeps a bunch of state to get this to work, and A and S must both be registered users of Skype. A and S will connect to P as part of starting up their Skype client (so private IP users initiate connections to public IPs).
        • In reality, there is not one supernode, but a network of supernodes. A, S are both connected to nodes in that network, and the overlay network routes data between them.
    • Seems like this will affect performance, so Skype only let you be a supernode if your memory/CPU is sufficient (and you have a public IP).

    • Good idea?

      • A/S might not want their (encrypted) call routed through someone else.
      • P might not want to pay to transit traffic for A and S.
    • Today: Microsoft owns all of the supernodes, making this less of a P2P network and more of a hierarchy.
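
The NAT behavior described in this section can be sketched as a small translation table. The sketch below is a toy model under stated assumptions: the class name, the starting port number, and the example addresses (taken from documentation ranges) are all hypothetical, and a real NAT also tracks the remote endpoint and times out old entries.

```python
class NAT:
    """Toy NAT: rewrites outbound packets and remembers a port mapping."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self.next_port = first_port  # arbitrary starting port (assumption)
        self.by_private = {}         # (private_ip, private_port) -> public port
        self.by_port = {}            # public port -> (private_ip, private_port)

    def outbound(self, src_ip, src_port, dst_ip, dst_port):
        """Rewrite [to:S from:A] as [to:S from:N] and store state for replies."""
        key = (src_ip, src_port)
        if key not in self.by_private:
            self.by_private[key] = self.next_port
            self.by_port[self.next_port] = key
            self.next_port += 1
        return (self.public_ip, self.by_private[key], dst_ip, dst_port)

    def inbound(self, dst_port):
        """Return the private (ip, port) a reply belongs to, or None if unknown."""
        return self.by_port.get(dst_port)


n1 = NAT("198.51.100.1")
print(n1.outbound("10.0.0.2", 5000, "203.0.113.9", 443))  # A's packet, rewritten
print(n1.inbound(40000))   # reply on the mapped port is delivered to A
n2 = NAT("198.51.100.2")
print(n2.inbound(40000))   # None: nobody behind n2 initiated a connection
```

The last call shows why two clients behind NATs cannot reach each other directly, and why having both clients initiate connections outward to a node with a public IP makes relaying possible.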

  6. Video-Streaming (Briefly)
    • Can we just use BitTorrent to stream (live) video?
      • Streaming requires getting blocks (roughly) in order.
      • Also requires certain amount of bandwidth at all times.
    • Probably not:
      • BT works because peers can acquire blocks in any order.
      • Moreover, most BT peers are on residential links, which have underwhelming upload bandwidth.
    • What’s good for streaming? CDNs!
      • Thursday’s recitation: What CDNs bring to the table that P2P networks don’t.
      • Also think about whether you want to reconsider CDNs for file-sharing.
