6.171 | Fall 2003 | Undergraduate

Software Engineering for Web Applications

Rules

This is an MIT “ex-camera” exam, which means that you can take it where you want to and use any resources you like, except that you must do the exam by yourself. In order to do the exam, you will need access to a Web browser. We suggest that you find a place to take the exam where you have quiet, private use of a web browser, a text editor, and email.

When you are through with the exam, send your work by email (in plain text, please) to Instructor. At the beginning of the message, please include a statement affirming that you worked on the exam completely by yourself.

Even though we have given you six hours to work on this exam, you should be able to complete it very quickly if you’ve you’ve been coming to lecture and doing the readings. Note in particular that we’ve specified an expected length of answer for each question: don’t feel that you have to write reams of stuff just because you have the time – a clear, succinct answer is better than an extensive ramble.

Resources

Here are some resources that will help you answer the questions below.

Question 1: Normalizing a Data Model

Here is a data model for storing a collaboratively collected set of MP3s (digitized music clips). These users are presumed to be registered in a single ArsDigita Community System. List two reasons why this table is not even in First Normal Form. Then rewrite this data model to get it into Third Normal Form. Hint: your normalized data model will contain more than one SQL table.

create table mp3s (
 mp3_id   integer primary key,
 the_music  blob,
 submitting_user_id references users,
 playing_minutes  number,
 musicians  varchar(4000),
 composers  varchar(4000),
 date_recorded  date,
 – user IDs separated by spaces
 – (these are users who clicked “add to my personal list”
 personal_list_ids clob,
 – user IDs separated by spaces
 – (these are all the users who’ve played a song
 downloading_user_ids clob,
 – ratings, separated by spaces
 – each rating is a user ID,integer (between 1 and 10)
 ratings   clob,
 original_lyrics  clob,
 – what language are the lyrics in (use codes)
 original_lyric_lang char(2),
 – let users upload translations into other languages
 lyric_translation1 clob,
 lyric_translation1_lang clob,
 lyric_translation2 clob,
 lyric_translation2_lang clob,
 lyric_translation3 clob,
 lyric_translation3_lang clob,
 – if this came from a Compact Disc, store disc info
 – is this disc on file at cddb.com?  If so, record key.
 cddb_key  varchar(200),
 barcode_on_disc  varchar(100),
 record_label  varchar(200),
 catalog_number  varchar(200),
 total_playing_minutes number

);

Expected length of answer: 1 page.

Question 2: Usability

You’re sick of MIT. You’re sick of the sub-freezing temperatures (and it isn’t even officially winter yet!). You’re sick of being so thin and want to look more like your instructors. Where can you find warm temperatures and the unlimited meals that you’ll need to put some flesh on your bones? Clearly a Caribbean cruise is in order.

Here are your constraints:

  • you’ve got to spend Christmas with your parents or they’ll refuse to pay tuition for next term; i.e., you can’t leave until December 26
  • you’ve got to be back at MIT on January 8 for the start of IAP; more than 200 people have signed up for the “Joy of PL/SQL” seminar that you’re teaching
  • being a student in the coolest course at MIT has been great for your love life; you’ve got a new romantic partner and you’ll need two spots on the cruise

Start by visiting the Carnival site and spend no more than 3 minutes attempting to answer the question “Do they have two free spots on any cruise departing on or after December 26 and returning before January 8?”

Question 2a

Suggest five things that the designers of this site could have done differently to improve your experience. (Expected length of answer: 1-3 sentences per suggestion.)

Now we’ll add a twist: you’re gay. You don’t want to end up on a Patrick Buchanan-themed cruise.

Spend another three minutes surfing around the Carnival site to get a feel for life on board one of the ships (maybe start by picking the ship Paradise). See if you can find any information about whether Carnival offers special gay-themed cruises or gathering places for gays.

Question 2b

What could Carnival have done to give you a better answer to your question? A better feel for life on board their ships? Could collaborative authorship play a role? Sketch how this could be done. (Expected length of answer: 1 or 2 paragraphs.)

Question 2c

Critique the user interface of gay.com’s Q&A forum software. If you wish, you may do this in the form of a pluses and minuses comparison to the standard ACS forum software. Expected length of answer: 2 paragraphs.

Visit the PlanetOut site (CEO is MIT Course 2 graduate Megan Smith) and search for “Carnival cruise”. Look at enough results to figure out whether or not Carnival is gay-friendly. Google for the search string “Carnival cruise gay”. Again, look at enough results to figure out whether or not Carnival is gay-friendly.

Question 2d

Compare the user interfaces of the preceding search engines for usability. (Expected length of answer: 2 paragraphs.)

Question 3: Oracle

Question 3a: Fun with Oracle

You’ve been handed a running Web service that relies on an Oracle database. The original developers weren’t strong on documentation and they’ve not left you any .sql files with CREATE TABLE statements. Using only SQL queries (i.e., not any SQL*Plus commands such as “DESCRIBE”), how can you find out (1) what tables have been defined, (2) what columns are in those tables, (3) which of those columns are nullable (i.e., can contain the value “NULL”), and (4) which of those table columns are constrained. Your SQL queries need not be exactly right but you should have found the tables and columns that contain the information you need.

Expected length of answer: four queries, each no more than a line or two.

Question 3b: More Fun with Oracle

You’ve written an application program that:

  1. queries a row of information from Oracle (using “SELECT * FROM FOOBAR”)
  2. performs some computation on the resulting data
  3. based on the computation, updates the information that was queried in Step 1 (using UPDATE)

Explain the concurrency problem that arises when two copies of this program are running simultaneously (expected length of answer: 1 paragraph). Give two modifications to the program that would eliminate the concurrency problem and explain the advantages of each approach. Your modifications should include explicit SQL statements. (Expected length of answer: 1 paragraph).

Parting Words

Percentage of MBA students who say they would hire a competitor’s employee to steal trade secrets: 73%

Percentage of convicted criminals who say they would do this: 60%

-- Wall Street Journal, May 3rd 1999

It takes more than engineering to make a successful Web service. Pick your partners carefully.

At the beginning of the message, please include a statement affirming that you worked on the exam completely by yourself.

Who Wrote This and When

This exam was written by Hal Abelson and Philip Greenspun in December 2000. It is copyright 2000 by them but may be reused provided credit is given to the original authors.

Rules

This is a take-home exam. You can take it where you want to and use any resources you like, except that you must do the exam by yourself (no collaboration, not even with your team members). If you are relying on a section of a book or Web page, cite the source. An 80/100 will be a great score on this exam so if there is one part that is taking forever and doesn’t interest you, write up your thoughts and move on.

When you are through with the exam, send your work by email (in plain text, please) to Instructor. At the beginning of the message, please include a statement affirming that you worked on the exam completely by yourself.

You have one week to complete the exam.

Resources

Here are some resources that will help you answer the questions below.

Problems or questions? Email Instructor.

Question 1: Developing a Data Model

Visit friendster.com. If you’re not already a member, sign up for an account.

  1. Find Ben, using a “user search,” and add him as a friend (Ben said that he went into Computer Science because he wanted to meet a lot of cool people). You should do this earlier rather than later so that you have time to complete the exercises.
  2. Look at a page describing a user. Look carefully at the information presented on this profile, including friends’ comments.
  3. Add a few other friends (people in the class are a good idea).
  4. Change your favorite movies or TV shows to try to find people who have the same tastes. Observe how the site presents this information.
  5. Add some testimonials on your friends.
  6. Search for people in your network who live in specific cities other than Boston.

Now develop a data model for the friendster.com site. Your data model should be complete enough to handle all of the preceding actions and operate the core friendster.com service. (Ignore the “Shopping” and “Events” sections). Assume the existence of a users table, keyed with user_id. The data model should include appropriate integrity constraints and indices to facilitate the queries that you believe are necessary for the pages that you just experienced. The data model should contain appropriate in-line comments.

Test your data model by feeding it to an RDBMS and making sure that the CREATE TABLE and index definition statements are accepted without errors. You will probably want to create a new database user so that your table definitions don’t conflict with any tables that you might have defined for your team project.

Expected length of answer: if your data model contains more than 15 or 20 tables, you’re probably handling too many special cases. The important thing is to get the structure of the data model right and the relations. Don’t spend a lot of time making sure that you have every possible column in every table. For example it would be okay to say “– … address columns …” to imply that there would be a way to store a member’s street address, city, state, etc.

Question 2: Fancy SQL Queries

Using the data model you defined in Question 1 write two computer programs:

Part a: a computer program that generates the page “find people in my network of friends that live within 50 miles of me and who say that they are interested in meeting people for Dating someone of my sex”. The program should list those people ordered by ascending geographically distance.

Part b: a computer program that generates the page “how am I related to this person?” and displays all the possible paths between two people.

Ideally each program will be a single SQL query. It is acceptable to use procedural language crutches, e.g., stored procedures. Hint: you can postulate the existence of a table or function that will give you the latitude/longitude of the centroid of a U.S. zip code.

Test your queries by feeding your data model from Question 1 into the RDBMS that you’ve been using this semester (though you can use a separate tablespace/database/namespace). Fill the tables with at least the following test data (you may add more if you like):

  • George is 57, likes pretzels, Segway Human Transporters, Texas, and SUVs. George lives in the U.S. at zip code 20500.
  • Dick is 60, likes bacon double cheese burgers, SUVs, and energy policy. Dick won’t tell us where he lives.
  • Ken is 55, likes energy policy, Texas, and body building. Ken lives at the United States District Court in Houston 77002.
  • Arnold is 55, likes California, body building, and SUVs. Arnold is interested in Dating Women. (Your data model need not include “groping” as a possible object of meeting people.) Arnold lives in zip code 90210.
  • Britney is 22, likesSemiconductor Physics and is interested in dating both men and women (as long as they are rich and famous and/or MIT Course 6 grads). Britney put down 90210 as her home zip code because she’s spending a lot of time out there working on a movie.
  • George is friends with Dick.
  • Dick is friends with Ken.
  • Ken is friends with Arnold.
  • George is friends with Britney.

Print out a transcript of a database session in which you use your queries (for (b) you could query to find the friend path between George and Arnold).

Note that it is possible that you’ll start doing these queries and discover that you want to make changes to your data model. That’s a perfectly natural way to refine a system. See if a data model supports some queries then go back and modify it until it does.

Don’t spend more than a few hours on this question. You can get partial credit if you show that you’re thinking along the right lines, e.g., by coming up with some smaller queries that give you reports that could be useful in solving the larger problem.

Question 3: Usability Analysis

Based on your experience in Question 1 identify three usability problems with Friendster and write instructions to their programmers on how to fix them. Extra credit: email these ideas to the Friendster folks and attach their response.

Expected length of answer: 1 page.

Question 4: Extending the Application with Distributed Computing

Jose Frio Cuarenta, senior MIT administrator, has decided that friendship networks are the next hot thing and wants to bring the magic of Friendster to MIT by building MITster. Being the total Organization Man, he conceives this as a standalone application in which every MIT-affiliated person registers and comes to visit the site. Philip ‘82 and Gerry ‘68 can discover that they both live in the same city, share some interests, and are connected through Hal ‘XX [we could tell you but we’d have to kill you].

Jose brings you into the project to lay out the system requirements and your first question is Why can’t they just use Friendster? Jose responds that he isn’t satisfied with Friendster’s community features. A group of friends should be able to start an ad-hoc group with a private discussion forum, collaborative workspace, event planner, etc.

You point out that standalone servers were great in the 1990s but this century is all about distributed computing. Maybe the goal should be to augment rather than supplant Friendster. An MIT affiliate could log into a single server and get all of the services of Friendster plus additional services for groups of MIT friends.

Clearly you’re going to have to set up some sort of server to provide the additional community support features. What kind of SOAP interface would you want from Friendster? Write a 1-2-page English description of the interface that you want and one WSDL file for a portion of the interface.

Here again we’re looking for a reasonable structure not a complete solution with every possible field defined. It is okay to have some parts elided and commented on.

Question 5: Fun with Data Warehousing

Using the Levi Strauss data model defined in the “Real World Example” section of Data Warehousing, write SQL queries that answer the following questions:

  • For each region in the U.S., how many pairs of pants were sold to repeat versus first-time customers?
  • Starting from the inception of this business, what are the aggregate sales for each month, ordered from least recent to most recent months?

Suppose now that you are asked to prepare a report of average waist size of a pair of shipped pants, broken down by region of the country. To which table would you add a waist_size column and why?

Turning it in

Please email your answers to the questions to Instructor.

At the beginning of the message, please include a statement affirming that you worked on the exam completely by yourself.

Rules

This exam is open-book and take-home. Do not discuss your answers or the problems with other students, however. If you are relying on a section of a book or Web page, cite the source.

When you are through with the exam, send your work by email (in plain text, please) to Instructor.

You have one week to complete the exam.

Resources

Here are some resources that will help you answer the questions below.

Question 1: Thinking About the Overall Application

Suppose that you are part of committee designing a Web service for intramural sports at MIT. Visit the  MIT IM sports site to get an idea of what an attempt to build a site like this looks like but do not be constrained in your thinking by what you see there.

What are the user classes for an ideal site like this? What should each class of user be able to do? What are the core benefits of doing this as a Web application instead of adding a section to the Tech?

Question 2: Building the Application

How would you apply the latest version of the ACS to the problem of constructing the ideal IM sports site?

How would you configure existing modules? Which ones would you have to extend via writing new code? What new modules would you have to write?

Question 3: Mollifying the Client

Suppose that the publisher of the IM sports site hires a Chief Technology Officer. This will be a person who cannot program. In order to bolster his or her ego, you’ll get questioned:

  • Why are you using Oracle? Wouldn’t it be cheaper and faster to use MySQL?
  • Why are you using Oracle? Wouldn’t it be cheaper to use Microsoft SQL Server?
  • Why are you using AOLserver? Wouldn’t Apache be better?
  • Why are you using this ACS toolkit?
  • Why are you using Solaris instead of Linux?
  • Why do you want to host this at AboveNet instead of in your dorm room? Or underneath my desk in the Athetics Department?

Question 4: User Activity Analysis

Once you’ve launched your new IM Sports site, the publisher will want to know how users are interacting with the service.

Discuss the relative merits of the following approaches to user activity analysis:

  • standard server log analysis
  • recording predefined important events, such as “user clicked through to a foreign site” into the RDBMS
  • a dimensional data warehouse of clickstream data

Question 5: Usability Analysis

Every successful Web project has at least one participant who is passionate about the end-user experience. It is important to become skilled at working through a usage scenario and noting specifically what should be changed on each page. We haven’t built our ideal IM sports site so we’ll shift gears to a real running system:  http://photo.net/ (the image sharing service portion of the site was built as a 6.170 project in Spring 1999).

Work through the following scenarios:

  • Take the role of a novice photographer who has just taken a single picture a digital camera and wishes to use the system to share that photo with a relative. Upload one photo into the system.
  • Take the role of the relative and come to the site to find the picture.
  • Take the role of a serious photographer. Upload a number of photos, a few captions for photos that one does not have scanned, and create a presentation.

Write up a usability analysis stating, for each page, what you’d change to make the experience better. Note that your analysis may suggest changing the flow, adding new pages, eliminating pages, or adding configuration switches for users.

Question 6: Workflow

A really effective use for a Web-based system is to support cooperative work, i.e., a purposeful community. When people seek an information system to support their work together, it is usually because the workflow is at least a little bit complex, e.g., someone in the support group must write an answer to a customer inquiry, Amanda or Barney must approve it but if they disapprove the answer, the inquiry gets kicked back into the poor of questions to be answered.

A good way to support workflow is with a finite-state machine modelled in the database. See the Oracle Workflow product description in the Oracle documentation for some ideas about how engineers have approached this problem.

Build the SQL data model for a finite-state machine-based workflow system. It should be powerful enough to handle the approval process for purchasing in a company, with states such as “employee_wants_thing”, “supervisor_has_approved_thing”, “thing_is_on_order”, “thing_was_received”, “thing_was_paid_for”. Have a journal where data accumulated during state transitions, e.g., how much was paid for the thing, may be logged.

Who Wrote This and When

This problem set was written by Philip Greenspun in April 2000. It is copyright 2000 by him but may be reused provided credit is given to the original author.

Rules

This is a take-home exam. You can take it where you want to and use any resources you like, except that you must do the exam by yourself. If you are relying on a section of a book or Web page, cite the source.

When you are through with the exam, send your work by email (in plain text, please) to instructor. At the beginning of the message, please include a statement affirming that you worked on the exam completely by yourself.

You have one week to complete the exam.

Resources

Here are some resources that will help you answer the questions below.

Problems or questions? Email instructor.

Question 1: Developing a Data Model

Visit netflix.com. Sign up for the free 10-day trial membership and take the following steps:

  1. Look at a page describing the movie “Strictly Ballroom”. Look carefully at the information presented on this movie, including user reviews and the categorization of user reviews.
  2. Add a few movies to your rental queue.
  3. Reorder the queue so that the last movie you chose will be shipped next.
  4. Delete some movies from the queue.
  5. Look into the options for reporting a shipping problem but don’t actually report a movie lost or delayed.
  6. Rate a few movies.
  7. Ask for recommended movies based on ratings that you previously entered.

Now develop a data model for the netflix.com site. Your data model should be complete enough to handle all of the preceding actions and operate the core netflix.com service. You may assume the existence of a users table, keyed with user_id. Your data model should include appropriate integrity constraints and indices to facilitate the queries that you believe are necessary for the pages that you just experienced. Your data model should contain appropriate in-line comments.

Test your data model by feeding it to an RDBMS and making sure that the CREATE TABLE and index definition statements are accepted without errors. You will probably want to create a new database user so that your table definitions don’t conflict with any tables that you might have defined for your team project.

Expected length of answer: if your data model contains more than 15 or 20 tables, you’re probably handling too many special cases.

Question 2: Fancy SQL Queries

Write a computer program that generates the page “here are the movies that we recommend based on your ratings of movies compared to other users’ ratings of movies in the database” by querying the data model that you defined in Question 1. Ideally your program will be a single SQL query that returns rows in descending order of likeliness that the user will like a movie. It is acceptable to use procedural language crutches, e.g., PL/SQL functions in Oracle.

Test your query by feeding your data model from Question 1 into the RDBMS that you’ve been using this semester. Fill the tables with at least the following test data (you may add more if you like):

  • Schlomo Lifschitz loved “Terminator 2”, liked “Terminator” and “The Replacement Killers”, was neutral on “Crouching Tiger Hidden Dragon”, and hated “Gone With the Wind.”
  • Schlomo’s wife Rachael hated T2, disliked Terminator, was neutral on “Crouching Tiger”, and loved “GWTW.”
  • Their divorce attorney Biff Hutchinson hated “GWTW” and liked “Gaslight.”
  • The judge handling the case, Ronald Heidegger, loved “GWTW” and liked “Emma.”

Print out a transcript of a database session in which you use your query to generate movie recommendations to Biff and Ronald. Note that your query should never recommend a movie that the person has already seen.

Question 3: Usability Analysis

Visit photo.net  with the goal of uploading three photos and sharing them with six friends (by emailing them a URL).

Take the following steps:

  1. Find the photo sharing service on photo.net.
  2. Upload your three photos (if you used my photos, note that they were taken with a Canon EOS-5 camera, 70-200/2.8L lens, and Fuji Astia slide film).
  3. Try to find a facility within the photo sharing service for preparing an exhibit in which those three photos can appear.
  4. Prepare an exhibit (photos plus some explanatory text).

Critique the user experience. Your critique should be clear concerning what is wrong with the current system. Your critique should be explicit about what to change, such that a junior programmer could implement your improvements without depending on his or her own taste and judgement.

Expected length of answer: 1-2 pages.

Question 4: Thinking About the Overall Application

Suppose that someone notices the inconsistencies and prevailing lameness of the average MIT campus organization’s Web site (see Activities and Clubs at MIT for a list of offenders). Here we are at the school that invented the Web (with a bit of help from the Swiss physicists and NCSA) and this is the best we can do?

Imagine that you’ve been asked by the Associate Dean for Internet Self-Esteem to develop a central server that will provide outsourced IT to MIT student organizations. Each organization will get its own corner of this server rather than having to build a standalone Web site. What services should this central facility provide to organizations? To a club leader? To a club member? To a new student trying to figure out which clubs to join? To an MIT Dean trying to figure out which clubs to fund?

Expected length of answer: 2-3 pages.

Question 5: Justifying an Engineering Decision

Pick some hardware and software tools to support the outsource-IT-for-MIT-clubs system that you designed for Question 4. Explain these decisions in terms of required user capacity, any special circumstances attributable to the service being run at MIT, and with an eye toward long-term maintainability. Try to anticipate and answer questions that someone coming fresh to the problem might have, such as the following:

  • Why are you using [your chosen RDBMS]? Wouldn’t it be cheaper and faster to use MySQL?
  • Which Web server program is the best?
  • Would it be a lot easier and cheaper to do all of this with Microsoft Sharepoint or some other prepackaged software?
  • Does it make more sense to have a 2-tier system with SQL queries embedded in page scripts or something like Java® 2 Enterprise Edition where Enterprise Java® Beans are made persistent via machine-generated SQL transactions?
  • What training will be required before a programmer can safely take over maintenance on this system? How will we know if a new programmer is adequately prepared to start modifying the system?

Expected length of answer: 2 pages total.

Turning it in

Please email your answers to the questions to instructor.

At the beginning of the message, please include a statement affirming that you worked on the exam completely by yourself.

Course Info

Learning Resource Types
Online Textbook
Exams
Design Assignments