6.171 | Fall 2003 | Undergraduate

Software Engineering for Web Applications

Assignments

All of the assigned problems are from the online course textbook. Unless specified, all of the exercises in the chapter should be completed.

Problem Set 1: Basics
Problem Set 2: User Registration and Management
Problem Set 3: Planning and exercises 1-3 in Content Management
Problem Set 4: Remaining exercises in Content Management, including client sign-off
Problem Set 5: Software Modularity
Problem Set 6: Discussion, up to, but not including, the usability test
Problem Set 7: Remaining exercises in Discussion, except execution of the refinement plan
Problem Set 8: Scaling Gracefully
Problem Set 9: Exercises 1 and 2 from Search
Problem Set 10: Remaining exercises in Search
Problem Set 11: Planning Redux
Problem Set 12: Distributed Computing
Problem Set 13: Metadata
Problem Set 14: User Activity Analysis

Assignments from Previous Offerings of the Course

Basic db-backed Web sites, AOLserver, Tcl, SQL
Content Management
Metadata, Knowledge Management

File decompression software, such as Winzip® or StuffIt®, is required to open the .tar files in this section. Any number of software tools can be used to import the .csv files in this section.

Reading for This Week

Helper and example files (if you’re not doing this at MIT): 6916.ps1.tar (TAR) (This .tar file includes 1 .txt, 2 .adp, 1 .csv, 10 .tcl files.)

Objectives

We’re trying to make sure that everyone knows:

  • How to log into his or her development server
  • Rudiments of Tcl
  • How to run Tcl via tclsh
  • about the ADP templating facility of AOLserver
  • How to create, execute, test, and debug a .tcl page
  • How to write a .tcl page that queries a foreign server
  • Rudiments of SQL
  • How to query Oracle from the shell using SQL*Plus
  • How to write a .tcl page that queries Oracle
  • How to personalize Web services by issuing and reading cookies
  • How to read and write data in XML
  • How to use SQL*Loader

This first problem set requires you to learn a lot of new software, so make sure you get started early: plan to spend at least two or three sessions on it. There is nothing difficult here, but we do want to lead you through the mechanics of using Tcl, SQL, and running the Web server.

Getting Started with Tcl    
Getting Started with SQL*Plus    
Working with AOLserver and Oracle    
Personalizing Web Services with Cookies    
Sharing Data with XML    
The Wide World of Oracle    
Information Architecture and User Interface

Getting Started with Tcl

Start by reading Using the LCS Web/db Computing Facility and following its instructions to log into your server machine.

Exercise 1: Running Tcl from the Shell

Run Emacs. Type “m-x shell” to get a Unix® shell. Type “tclsh” to start the Tcl shell program. Define a recursive Fibonacci procedure in Tcl. Execute and test.

Hint: If you’re writing Tcl programs of more than two or three lines, you may find it convenient to type the code into a separate Emacs buffer (set to tcl mode) and cut and paste from there into the Tcl shell buffer.

Type info tclversion at the tclsh prompt to make sure that you’re running Tcl 8.2, the same version that is compiled into AOLserver.

Exercise 2: Running Tcl from an (almost) HTML Page

Look at two-plus-two.adp ([HTM](/courses/6-171-software-engineering-for-web-applications-fall-2003/pages/assignments/twoplustwo.htm)) source (TXT). This is an example of the ADP templating facility in AOLserver.

Augment the page so that (1) you add a $4000 South American Cichlid aquarium as an option, (2) you use a constructor procedure to build each aquarium element (instead of simply calling list), (3) you add an element to the aquarium for how many of each type of aquarium will be installed (4) you use procedures to extract type, cost and quantity from an aquarium element (instead of simply calling lindex), (5) you print out quantity-dependent subtotals and the grand total at the bottom.

Exercise 3: Simple Tcl Pages

Using the Web browser running on your local machine, visit the URL http://yourvirtualserver/psets/ps1/simple-tcl-page.tcl. Using Emacs running on the server machine, examine the source code for this page in /web/yourvirtualserver/www/psets/ps1/simple-tcl-page.tcl. Also look at the source code for the target of the form in /web/yourvirtualserver/psets/ps1/simple-tcl-page-2.tcl. (If these files are missing, download them from 6916.ps1.tar (TAR) (This .tar file includes 1 .txt, 2 .adp, 1 .csv, 10 .tcl files.) and put them in /web/yourvirtualserver/www). Notice how we use Tcl to read the form variables. Try out the form a couple of times, using your browser. Now debug the regular expression in simple-tcl-page-2.tcl so that it properly handles the names “Tammy Faye Baker” and “William H. Gates III”.

Hint 1: it is easier if you don’t try to do this in one regexp. Use if then elseif then elseif …

Hint 2: regexp has a side-effect. If you use a multi-clause if statement, make sure that you wrap your calls to regexp in braces so that they don’t all get evaluated immediately.

Exercise 4: Tcl Pages that Query Foreign Servers

Using the Web browser running on your local machine, visit the Seeq. Read the discussion of this program in Chapter 10 of Philip and Alex’s Guide to Web Publishing. Drawing upon that program as a model, build a new web service that takes the ISBN of a book from a form and then uses ns_httpget to query several online bookstores to find price and stock information and displays the results in an HTML table. Save your program in files called /web/yourvirtualserver/www/psets/ps1/books.tcl and books-2.tcl so people can access your service over the web.

We suggest querying wordsworth.com, barnesandnoble.com, and www.1bookstreet.com (amazon.com tends to respond with a 302 redirect you if the client doesn’t give them a session ID in the query). Your program should be robust to timeouts, errors at the foreign sites, and network problems. You can ensure this by wrapping a Tcl catch statement around your call to ns_httpget. Test your program with the following ISBNs: 0385494238, 0062514792, 0140260404, 0679762906.

Extra credit: From which of the preceding books is the following quote taken?

“The obvious mathematical breakthrough would be development of an easy way to factor large prime numbers.”

This would be a good time to take break.

Getting Started with SQL*Plus

Start up again with Emacs (you took a break, right?) and start a Tcl shell as before (“M-x shell” then “tclsh”). Type “M-x rename-buffer” to rename the shell to “tcl-shell”. Type “M-x shell” to then get a new Unix® shell. Rename this buffer “sql-shell”. In the SQL shell, type “sqlplus” to start SQL*Plus, the Oracle shell client. It’s convenient to work like this using two shells, one for Tcl and one for SQL.

Exercise 5: Talking to Oracle from the Shell

Type the following at SQL*Plus to a table for keeping track of the classes you’re taking this semester:

create table my_courses ( course_number varchar(20));

Note that you have to end your SQL commands with a semicolon in SQL*Plus. These are not part of the SQL language and you shouldn’t use these when writing SQL in your Tcl progams for AOLserver.

Insert a few rows, e.g.,

insert into my_courses (course_number) values ('6.916');   

Commit your changes:

commit;   

See what you’ve got:

select * from my_courses;

One of the main benefits of using an RDBMS is persistence. Everything that you create stays around even after you log out. Normally, that’s a good thing, but in this case you probably want to clean up after your experiment:

drop table my_courses;

Quit SQL*Plus by typing “c-c c-d”.

Exercise 6: Tcl Pages that Talk to Oracle

Look at the file /web/yourvirtualserver/www/psets/ps1/quotations.tcl, which is the source code for a page that displays quotations that have been stored in the Oracle database. Visit this page with your Web browser and you should get an error. The reason for the error is that the program is calling a procedure that doesn’t exist: ad_header (“ArsDigita Header”). You can confirm this suspicion by using Emacs to read /home/nsadmin/log/yourvirtualserver-error.log, which is where AOLserver logs any notices or problems.

To get AOLserver to load procedure definitions at server startup, you have to put .tcl files in your server’s private Tcl library: /web/yourvirtualserver/tcl/. Create a file called “ps1-defs.tcl” in this directory and define the following Tcl procedures:

ad_header page_title – returns HTML, HEAD, TITLE, and BODY tags, with argument enclosed within the TITLE tags

ad_footer – returns a string that will close the BODY and HTML tags

Reload the quotations.tcl page and you get … the same error! AOLserver doesn’t know that you’ve added a file to the private library; this is only checked at server startup. Go to a Unix® shell and “restart-aolserver yourservername” (this is the big hammer; it kills your server’s Unix® process so that Unix® will restart AOLserver automatically). If restart-aolserver does not come back with “Killing 10234” or some other process ID, you’ll know that you did not succeed (perhaps you made a typo when specifying your server name).

Reload the quotations.tcl page and you get … a slightly different error! The program is trying to query a table that doesn’t exist: quotations.

Go back to your sql shell and restart SQL*Plus. Copy the table definition from the comments at the top of the file quotations.tcl and feed this definition to Oracle. Go back to your Web browser and reload the page that previously gave you an error. Things should now work, although the quotations table is empty.

Use the form on the web page to manually add the following quotation, under an appropriate category of your choice: “640K ought to be enough for anybody” (Bill Gates). Note that it would be funnier if our table had a column for recording the date of the quotation (1981) but we purposely kept our data model as simple as possible.

Return to SQL*Plus and SELECT * from the table to see that your quotation has been recorded. The horrible formatting is an artifact of your having declared the quote column to be 4000 characters long.

In SQL*Plus, insert a quotation with some hand-coded SQL (if you are feeling lazy, you can cut and paste some SQL from the AOLserver error log; all SQL statements that AOLserver sends to Oracle are logged here). Now reload the quotations.tcl URL from your Web browser. If you don’t see your new quote here, that’s because you didn’t type COMMIT; at SQL*Plus. This is one of the big features of a relational database management system: simultaneously connected users are protected from seeing each other’s unfinished transactions.

Loading Tables from .csv Files

Now it is time to preload your quotations database with some interesting material. Load /web/yourvirtualserver/www/psets/ps1/quotations.csv into Emacs and look at the format of the file (this is a standard kind of output that you can get from any desktop spreadsheet program). Using SQL*Loader (see Oraexercise 1 below), load these data into your quotations table.

Working with AOLserver and Oracle

Let’s look at how to access the database from Tcl programs. The basic idea is that AOLServer includes a data abstraction called a set, defined by the operations listed under the ns_set API. A set is a collection of (key,value) pairs, which should be a familar idea from 6.001. Selecting from a table with ns_db select returns an identifier for a set, whose keys are the names of the selected columns. Subsequent successive calls with ns_db getrow will fill in the values in this set with the values from successive selected rows. For example, suppose you obtain a set identifier by selecting the following table with ns_select:

WRITERS BOOKS
Tolstoy Anna Karenina
Steinbeck Grapes of Wrath
Greenspun Guide to Web Publishing

Then, after the first call to ns_db getrow the set will be

{{writer Tolstoy} {book "Anna Karenina"}}

After the second call to ns_db getrow the set will will be

{{writer Steinbeck} {book "Grapes of Wrath"}}

and after the third call the set will be

{{writer Greenspun} {book "Guide to Web Publishing"}}

The programs in the files quotations.tcl and quotation-add.tcl illustrate these ideas. It will be well worth your while to study these programs until you understand how they work, because you’ll be doing a lot of this kind of programming throughout the semester.

Exercise 6a: Eliminating the Lock Table via a Sequence

Read about Oracle’s sequence database object in Data Modeling. By creating a sequence, you should be able to edit quotation-add.tcl to

  • eliminate the lock table
  • eliminate the begin and end transaction (since you’re no longer tying multiple SQL statements together)
  • generate a key for the new quotation within the INSERT statement

Exercise 7: Improving the User Interface for Data Entry

Go back to the main quotations.tcl page and modify it so that categories entry is done via a select box of existing categories (you will want to use the “SELECT DISTINCT” SQL command). For new categories, provide an alternative text entry box labeled “new category”. Make sure to modify quotation-add.tcl so that it recognizes when a new category is being defined.

Exercise 8: Searching

Add a small form at the top of quotations.tcl that takes a single query word from the user. Build a target for this form that returns all quotes containing the specified word. Your search should be case-insensitive and also look through the authors column. Hint: like ‘%foo%’.

Personalizing Web Services with Cookies

We’d like you to build a system that implements per-user personalization of the quotation database. The overall goal should be

  • A user can “kill” a quotation and have it never show up again either from the top-level page or the search page.
  • Killing a quotation is persistent and survives the quitting and restarting of a browser.
  • Quotations killed by one user have no effect on what is seen by other users.
  • Users can erase their personalizations and see the complete quotation database again by clicking on an “erase my personalization” link on the main page. This link should appear only if the user has personalized the quotation database.

You can personalize Web services with the aid of magic cookies. A cookie issued by the server directs the browser to store data in browser’s computer. To issue a cookie, the server includes a line like

Set-Cookie: cookie_name=value; path=/ ; expires=Fri, 01-Jan-2010 01:00:00 GMT

in the HTTP header sent to the browser. Here cookie_name is the name for this cookie, and value is the associated value, which can contain any character or format except for semicolon, which terminates a cookie. The path specifies which URLs on the server the cookie applies to. Designating a path of slash (/) includes all URLs on the server.

After the browser has accepted a server’s cookie, it will include the cookie name and value as part of its HTTP requests whenever it asks that server for an applicable URL. Your Tcl programs can read this information using the AOLServer API

[ns_set get [ns_conn headers] Cookie]

After the expiration date, the browser no longer sends the cookie information. The server can also issue cookies with no specified expiration date, in which case, the cookie is not persistent – the browser uses it only for that one session.

You can see an example of how cookies are issued and read, by visiting the URL http://yourvirtualserver/psets/ps1/set-cookies.tcl and examining the Tcl for file and the associated URLs check-cookies.tcl and expire-cookies.tcl. Observe how expire-cookies gets rid of cookies by reissuing them with an expiration date that has already past.

Reference: The magic cookie spec is available from Persistent Client State HTTP Cookies.

Exercise 9

Implement the personalized quotation system described above.

Hint 1: it is possible to build this system using an ID cookie for the browser and keeping the set of killed quotations in Oracle. However, if you’re not going to allow users to log in and claim their profile, there really isn’t much point in keeping data on the server. In fact, by keeping killed quotation IDs in your users’ browser cookies, you’ve achieved the holy grail of academic database management system researchers: a distributed database!

Hint 2: it isn’t strictly copacetic with the cookie spec, but you can have a cookie value containing spaces. Tcl stores a list of integers internally as those numbers separated by spaces. So the easiest and simplest way to store the killed quotations is as a space-separated list.

Hint 3: don’t filter the quotations in Tcl; it is generally a sign of incompetent programming when you query more data from Oracle than you’re going to display to the end-user. SQL is a very powerful query language. You can use the NOT IN feature to exclude a list of quotations.

How about taking another break?

Sharing Data with XML

As you learned above from querying bookstores, data on the Web have not traditionally been formatted for convenient use by computer programs. In theory, people who wish to exchange data over the Web can cooperate using XML, a 1997 standard from the Web Consortium. In practice, hardly anybody uses XML right now (1999). Fortunately for your sake in completing this problem set, you can cooperate with your fellow students: the overall goal is to make quotations in your database exportable in a structured format so that other students’ applications can read them.

Here’s what we need in order to cooperate:

  • an agreed-upon URL at everyone’s server where the quotations database may be obtained: “/quotations-xml.tcl”
  • an agreed-upon format for the quotations.

We’ll format the quotations using XML, which is simply a conventional notation for describing structured data. XML structures consist of data strings enclosed in HTML-like tags of the form <foo> and </foo>, describing what kind of thing the data is supposed to be.

Here’s an informal example, showing the structure we’ll use for our quotations:

<quotations>
<onequote>
<quotation_id>1</quotation_id>
<insertion_date>1999-02-04</insertion_date>
<author_name>Bill Gates</author_name>
<category>Computer Industry Punditry</category>
<quote>640K ought to be enough for anybody.</quote>
</onequote>
<onequote>
.. another row from the quotations table …
</onequote>
… some more rows 
</quotations>

Notice that there’s a separate tag for each column in our SQL data model:

<quotation_id>
<insertion_date>
<author_name>
<category>
<quote>

There’s also a “wrapper” tag that identifies each row as a <onequote> structure, and an outer wrapper that identifies a sequence of <onequote> stuctures as a <quotations> document.

Building a DTD

We can give a formal decription of our XML structure, rather than an informal example, by means of an XML Document Type Definition (DTD).

Our DTD will start with a definition of the quotations tag:

<!ELEMENT quotations (onequote)+>

This says that the quotations element must contain at least one occurrence of onequote but may contain more than one. Now we have to say what constitutes a legal onequote element:

<!ELEMENT onequote (quotation_id,insertion_date,author_name,category,quote)>

This says that the sub-elements, such as quotation_id must each appear exactly once and in the specified order. Now we have to define an XML element that actually contains something other than other XML elements:

<!ELEMENT quotation_id (#PCDATA)>

This says that whatever falls between <quotation_id> and </quotation_id> is to be interpreted as raw characters rather than as containing further tags (PCDATA stands for “parsed character data”).

Here’s our complete DTD:

<!-- quotations.dtd --> <!ELEMENT quotations (onequote)+>
<!ELEMENT onequote (quotation_id,insertion_date,author_name,category,quote)>
<!ELEMENT quotation_id (#PCDATA)> <!ELEMENT insertion_date (#PCDATA)>
<!ELEMENT author_name (#PCDATA)>
<!ELEMENT category (#PCDATA)>
<!ELEMENT quote (#PCDATA)>

You will find this extremely useful… Hey, actually you won’t find this DTD useful at all for completing this part of the problem set. The only reasons that DTDs are ever useful is for feeding to XML parsers because they can then automatically tokenize an XML document. For implementing your quotations-xml.tcl page, you will only need to look at informal example.

Exercise 10: Generating XML

Create a Tcl program that queries the quotations table, produces an XML document in the preceding form, and returns it to the client with a MIME type of “application/xml”. Place this in a file quotations-xml.tcl, so that other users can retrieve the data by visiting that agreed upon URL.

To get you started, we’ve provided /psets/ps1/example-xml.tcl. Requesting this URL with a Web browser should offer to let you to save the document to a local file, and you can then examine it with a text editor on your local machine. (This assumes that you haven’t defined some special behavior for your browser for MIME type application/xml.) The differences between our example and your program is that you’ll need to produce a document containing the entire table and you’ll need to generate it on the fly.

Exercise 11: Importing XML

Write a program to import a quotations database from another student’s XML output page (if you have completed Exercise 9 and your peers have not, this might be a good time to exhort them to greater efforts). Your program must

  • Grab /psets/ps1/quotations-xml.tcl from another student’s database using ns_httpget
  • Parse the file into records and the records into fields
  • If a quote from the foreign server has identical author and content as a quote in your own database, ignore it; otherwise, insert it into your database with a new quotation_id (you don’t want keys from the foreign server conflicting with what is already in your database)

Hints: You might want to set up a temporary table using create table quotations_temp as select * from quotations and then drop it after you’re done debugging. You should use DoubleApos when presenting data to Oracle for comparisons.

Rather than having you link in a 100,000-line C program (or a 5,000-line Lisp program) that parses XML documents based on a DTD, we’ve gone for simplicity here by predefining for you a parser in Tcl that understands only this particular DTD for quotations. The procedure is parse_all (TXT) (you have to install this file in your server’s private Tcl library, /web/yourvirtualserver/tcl/, for this function to be callable by .tcl and .adp pages) . The parse_all proc takes an XML quotation structure as argument and returns a Tcl list, showing the parts and subparts of the structure. To see an example of the format, use your browser to visit the page http://yourvirtualserver/psets/ps1/xml-parse-test.tcl.

Note: these exercises are designed to familiarize you with XML. In most cases, sophisticated XML processing should be done inside Oracle using Java® libraries.

Exercise 12: Tracking a Book’s Popularity

Neurotic authors will constantly check amazon.com to see where their book is ranked in terms of sales. That these figures are updated hourly only makes the habit more destructive. Write a program to track a very neurotic author’s work (ISBN: 1558605347). You will need to

  1. Define an Oracle table to hold ISBN, date-time, sales rank
  2. Write a procedure that will grab the Amazon page, REGEXP out the sales rank, and stuff it into the Oracle table
  3. Use the AOLserver API call ns_schedule_proc to schedule your procedure to run once every hour
  4. Build a .tcl page to look at the popularity over time

One of the interesting things about Amazon is that they often lose control of their server farm and database (they write a lot of C code and one programmer’s sloppiness can generate a catastrophic failure of the entire service). You might want to build your system so that you can record (a) times when amazon.com is unreachable, and (b) for which of those times the page served contains the string “Our store is closed temporarily for scheduled maintenance” (you’ll sometimes get this during the middle of weekdays when they would definitely not have intentionally scheduled any maintenance).

Exercise 13: Becoming a Chartoonist

Why print a table of a book’s popularity when you can print a chart? You’re going to learn about the wonders of single-pixel GIFs and WIDTH and HEIGHT tags now. Grab the software and put it in your server’s private Tcl directory (/web/yourservername/tcl/). Read the docs and then write code to generate a pretty chart of the data from Exercise 12.

Note that you’re dipping into the ArsDigita Community System toolkit here, the software with which you’ll be occupied in Problem Set 2.

The Wide World of Oracle

We’re going to shift gears now into a portion of the problem set designed to teach you more about Oracle and SQL.

Oraexercise 1: SQL*Loader

  • create a tab-separated file in Emacs containing five lines, each line to contain your favorite stock symbol, an integer number of shares owned, and a date acquired (in the form MM/DD/YYYY)
  • create an Oracle table to hold these data: 
create table my_stocks ( symbol varchar(20) not null, n_shares integer not null, 
date_acquired date not null );
  • use the sqlldr shell command to invoke SQL*Loader to slurp up your tab-separated file into the my_stocks table (see page 1183 of Oracle8: The Complete Reference and the official Oracle docs.)

Oraexercise 2: Copying Data from One Table to Another

This exercise exists because we found that, when faced with the task of moving data from one table to another, programmers were dragging the data across SQL*Net from Oracle into AOLserver, manipulating it in Tcl, then pushing it back into Oracle over SQL*Net. This is not the way! SQL is a very powerful language and there is no need to bring in any other tools if what you want to do is move data around within Oracle.

  • using only one SQL statement, create a table called stock_prices with three columns: symbol, quote_date, price. After this one statement, you should have created the table and filled it with one row per symbol in my_stocks. The date and price columns should be filled with the current date and a nominal price. Hint: select symbol, sysdate as quote_date, 31.415 as price from my_stocks;.
  • create a new table: 
create table newly_acquired_stocks ( symbol varchar(20) not null, n_shares integer not null, 
date_acquired date not null );
  • using a single insert into .. select … statement (with a WHERE clause appropriate to your sample data), copy about half the rows from my_stocks into newly_acquired_stocks

Oraexercise 3: JOIN

With a single SQL statement JOINing my_stocks and stock_prices, produce a report showing symbol, number of shares, price per share, and current value.

Oraexercise 4: OUTER JOIN

Insert a row into my_stocks. Run your query from Oraexercise 3. Notice that your new stock does not appear in the report. This is because you’ve JOINed them with the constraint that the symbol appear in both tables.

Modify your statement to use an OUTER JOIN instead so that you’ll get a complete report of all your stocks, but won’t get price information if none is available.

Oraexercise 5: PL/SQL

Inspired by Wall Street’s methods for valuing Internet companies, we’ve developed our own valuation method for this problem set: a stock is valued at the sum of the ascii characters making up its symbol. (Note that students who’ve used lowercase letters to represent symbols will have higher-valued portfolios than those will all-uppercase symbols; “IBM” is worth only $216 whereas “ibm” is worth $312!)

  • define a PL/SQL function that takes a trading symbol as its argument and returns the stock value (hint: Oracle’s built-in ASCII function will be helpful)
  • with a single UPDATE statement, update stock_prices to set each stock’s value to whatever is returned by this PL/SQL procedure
  • define a PL/SQL function that takes no arguments and returns the aggregate value of the portfolio (n_shares * price for each stock). You’ll want to define your JOIN from Oraexercise 3 (above) as a cursor and then use the PL/SQL Cursor FOR LOOP facility. Hint: when you’re all done, you can run this procedure from SQL*Plus with select portfolio_value() from dual;.

SQL*Plus Tip: though it is not part of the SQL language, you will find it very useful to type “/” after your PL/SQL definitions if you’re feeding them to Oracle via the SQL*Plus application. Unless you write perfect code, you’ll also want to know about the SQL*Plus command “show errors”. For exposure to the full range of this kind of obscurantism, see the SQL*Plus User’s Guide and Reference.

Oraexercise 6: Buy More of the Winners

Rather than taking your profits on the winners, buy more of them!

  • use SELECT AVG() to figure out the average price of your holdings
  • Using a single INSERT with SELECT statement, double your holdings in all the stocks whose price is higher than average (with date_acquired set to sysdate)

Rerun your query from Oraexercise 4. Note that in some cases you will have two rows for the same symbol. If what you’re really interested in is your current position, you want a report with at most one row per symbol.

  • use a SELECT … GROUP BY query from my_stocks to produce a report of symbols and total shares held
  • use a SELECT .. GROUP BY query JOINing with stock_prices to produce a report of symbols and total value held per symbol
  • use a SELECT .. GROUP BY .. HAVING query to produce a report of symbols, total shares held, and total value held per symbol restricted to symbols in which you have at least two blocks of shares (i.e., the “winners”)

Oraexercise 7: Encapsulate your Queries with a View

Using the final query above, create a view called stocks_i_like that encapsulates the final query.

Information Architecture and User Interface

You’ve got a database table filled with stock data. There are a couple of ways to provide a Web interface to these data. The loser Web developer presents a page with options for retrieving stock data:

  • show recent acquisitions
  • show best performers
  • show highest value stocks
  • show entire portfolio

The bottom line is that the user doesn’t see anything on the first page. From an information presentation point of view, the first page is therefore a waste.

An alternative is to show the user a table of holdings right on the top-level page, with a sensible subset of the data by default, and provide controls to adjust what is included in the display. What kind of controls? You could have the ones above, plus whatever other views the publisher of the site and the users eventually decide are necessary. Suppose, though, that you can organize the controls along orthogonal dimensions. If you can do that, with just a handful of “dimensional sliders”, the user will have many options.

For example, the ArsDigita ticket tracking system is used to store bug reports and feature requests. The following dimensions are employed

  • involvement of connected user (values: mine/everyone’s)
  • ticket status (values: open/+deferred/+closed)
  • ticket age (values: last day/last week/last month/all)

Note that even though these are modeled as continuous dimensions, the user is not presented with continuous sliders. The user picks one of several discrete points on each dimension. This interface is compatible with the Netscape 1.1 browser and provides the user with 24 choices in total. Yet instead of seeing 24 options in a big list, the user sees one line across the top of the browser, with nine buttons arranged logically into three dimensions.

If those 24 options aren’t enough, the ticket tracking system lets the user re-sort the table by any of the columns by clicking on the column heading.

Try building the same sort of thing for your stock portfolio. You want a .tcl page that shows the contents of my_stocks with stock_prices. Provide controls across the top (hint: TABLE WIDTH=100% and TD ALIGN= RIGHT will be useful) for the following dimensions:

  • recency of acquistions (within last week/last month/last year/all)
  • value of holding (more than 10% of total portfolio/more than 2%/all)

Provide the ability for users to sort by any of the columns presented (hint: export_ns_set_vars in $SERVER_HOME/tcl/00-ad-utilities.tcl will be useful for this, notably because of the exclusion_list argument), e.g., symbol, number of shares, price per share, value of holding.

You can build this from scratch or use the ArsDigita Community System toolkit API calls as a building block.

Who Wrote This and When

This problem set was written by Philip Greenspun and Hal Abelson in January 1999. It is copyright 1999 by them but may be reused provided credit is given to the original authors.

Reading for This Week

Objectives

Teach people to understand the dimensions of the site design and content management problems. They first need to understand what challenges a site publisher faces then build software to support the publisher, programmers, and authors.

The Huge Picture
The Big Picture
The Picture
Functional Spec
Assignment Details

The Huge Picture

Building a Web site is trivial. Maintaining a Web site is hard.

The Big Picture

It is pretty easy to build and maintain a Web site if:

  • One person is publisher, author, and programmer
  • The site comprises only a few pages
  • Nobody cares whether these few pages are formatted consistently
  • Nobody cares about retrieving old versions or figuring out how a version got to be the way that it is

Sadly for you, the Web service developer, the preceding conditions seldom obtain. What is more typical are the following conditions:

  • Labor is divided among publishers, information designers, graphic designers, authors, and programmers
  • The site contains thousands of pages
  • Pages must be consistent within sections and sections must have a unifying theme
  • Version control is critical

In fact, the project that you’ll be working on later this semester will probably require a content management system of some sort.

The Picture

We need a system that:

  • Records who is contributing to a site and in what role
  • Records publishing and design decisions
  • Collects programs and content

Functional Spec

The publisher decides (1) what major content sections are available, (2) when a content section goes live, (3) relative prominence to be assigned the content sections.

The information designer decides (1) what navigational links are available from every document on the page, (2) how to present the available content sections, (3) what graphic design elements are required.

The graphic designer contributes drawings, logos, and other artwork in service of the information designer’s objectives. The graphic designer also produces mock-up templates (static HTML files) in which these artwork elements are used. You won’t have to demonstrate this role in your prototype, but make sure that your data model supports this.

The programmer builds production templates (HTML with embedded Tcl and SQL) that reflect the instructions of publisher, information designer, and graphic designer.

In keeping with their relative financial compensation, we consider the needs and contributions of authors second to last. Authors stuff fragments of HTML, plain text, photographs, music, sound, into the database. These authored entities will be viewed by users only through the templates developed by the programmers.

In keeping with their relative popularity among authors and publishers, we place the perennially kicked-around editor last. Editors approve content and decide when specific pages go live. Editors assign relative prominence among pages within sections.

Assignment Details

Your “practice project” will be a content management system to support a guide to Boston, along the lines of Boston. You will need to produce a design document and a prototype implementation. The prototype implementation should be able to support the following scenario:

  1. Log in as publisher and visit /admin/content-sections/
  2. Build a section called “movies” at /movies
  3. Build a section called “dining” at /dining
  4. Build a section called “news” at /news that simply uses the existing ACS news system
  5. Log out
  6. Log in as information designer and visit /cm and specify navigation. From anywhere in dining, readers should be able to get to movies. From movies, readers should be able to dining or news.
  7. Log out
  8. Log in as programmer and visit /cm
  9. Make two templates for the movie section, one called movie_review and one called actor_profile; make one template for the dining section called restaurant_review
  10. Log out
  11. Log in as author and visit /cm
  12. Add two movie reviews and two actor profiles to the movies section and a review of your favorite restaurant to the dining section
  13. Log out
  14. Log in as editor and visit /cm
  15. Approve two of the movie reviews, one of the actor profiles, and the restaurant review
  16. Log out
  17. Without logging in (i.e., you’re just a regular public Web surfer now), visit the /movies section and, ideally, you should see that the approved content has gone live
  18. Follow a hyperlink from a movie review to the dining section and note that you can find your restaurant review
  19. Log in as author and visit /cm
  20. Eedit the restaurant review to reflect a new and exciting dessert
  21. Log out
  22. Visit the /dining section and note that the old (approved) version of the restaurant review is still live
  23. Log in as editor and visit /cm and approve the edited restaurant review
  24. Log out
  25. Visit the /dining section and check that the new (with dessert) version of the restaurant review is being served

Exercise 1: Build the Data Model

Using the file naming and placement conventions set forth, create a SQL data model file. Before you can do this, you have to come up with a name for your module. In order to make life easier for us in looking over your shoulder, please refrain from being creative and call your module “cm”.

Here are some guidelines for your data model:

  • Any time you are representing a person, you should do it by referencing the users table.

  • Any time you have to lump users together because of a special property (e.g., “authority to make publishing decisions”), make sure that you do it with the user-groups facility of the ArsDigita Community System (goes by the short name of “ug”).

  • For the actual content of the site (the authored articles), make sure that you have an audit trail. There are two classical ways to do this. The first is to set up separate audit tables, one for each production table. Every time an update is made to a production table, the old row is written out to an audit table, with a time stamp. This can be accomplished transparently via RDBMS triggers (see Triggers). We aren’t going to do things this way! Instead, you should adopt the second classical approach: keep current and archived information in the same table. This is more expensive in terms of computing resources required because the information that you want for the live site is interspersed with seldom-retrieved archived information. But it is easier if you want to program in the capability to show the site as it was on a particular day. Your templates won’t have to query a different table, they will merely need a different WHERE clause.

    Note: this is the kind of design trade-off that you have to make every day as a Web service developer. In general, if something greatly simplifies programming at the cost of additional computing resources, you should adopt the simper approach. Computers are cheap and getting cheaper. Programming is hard and expensive. This isn’t an argument for profligacy. In this case, we’re assured by the fact that the tables we’re archiving contain information typed in by users and that versioning only happens when users take time-consuming actions such as filling out forms and hitting “Submit”. It is very difficult to fill up a modern disk drive with information typed by even a large collection of human beings.

  • For modeling the major areas of the Web service, use the  content_sections table in the community-core.sql file. You’ll have to augment it at the very least with a templated_p column to indicate that this content section is generated by the cm system.

  • For capturing the navigation decisions of the information designer, call your table cm_navigation (one row for each link from one section to another)

  • For templates, rely on AOLserver’s ADP facility (see Example 6 in Chapter 10 of Philip and Alex’s Guide to Web Publishing and AOLserver Tcl Developer’s Guide, Chapter 2)

  • Keep everything in the relational database, including ADP scripts; synchronizing data in a file system with tables in a relational database adds a tremendous amount of complexity

  • Generally you’ll want to associate one individual with a content element as the owner (from the users table) and then allow modifications by a group of users (from the user_groups table).

Deliverable 1: The Design Document

When doing real Web projects, you have to coordinate multiple contributors. The best way to do this is with a design document. The document should include:

  1. A data model (/doc/sql/cm.sql)
  2. What sections of the ACS will need to be modified and how
  3. What new directories you intend to create
  4. Functional specs for the Tcl scripts you expect to create
  5. Work plan: who is going to do what and in what order

This problem set itself specifies a work plan to some extent and you should read the entire problem set carefully, but you shouldn’t take our plan as gospel.

Before working further on this problem set, discuss and refine your design document with a TA.

Now that you’ve gotten agreement that the design document is reasonable, it is time to build the prototype. We suggest a work plan below, but if you’ve gotten approval to proceed in some other order, that’s fine.

Exercise 2: Create User Groups

Using the Web forms in /admin/ug/ create a new user group type: “cm”. Create six groups of this type: publisher, information designer, graphic designer, programmer, author and editor. Put a couple of users in each group.

Make sure that you’re using appropriate Tcl API calls to check user group membership before allowing users to take particular actions.

Exercise 3: Extend /admin/content-sections/

Warm yourself up by visiting the /admin/content-sections/ directory and augmenting the admin pages for the content_sections table, to which your data model has presumably added at least one column.

Define brand-new movies and dining content sections; also define a non-templated news content section that points to the existing /news facility.

Exercise 4: Navigation

Create the /cm directory and a subdirectory for information designers (/cm/id/). The index.tcl page in the /cm directory should check a user’s role assignments (memberships in the content management user groups). If a user is in the information designer group, a link to the /cm/id/ subdirectory should be presented. The scripts in the /cm/id/ directory should let the information designer specify which content sections are to have links to other content sections.

Define a procedure called cm_navbar that a programmer can insert into a template. This procedure takes no arguments and returns the HTML fragment for a section-appropriate navigation bar. The procedure should:

  • Use ns_conn url to figure out from where it is being called
  • Grab a database handle from the subquery pool (ns_db gethandle subquery)
  • Look at the content_sections table to see which content section the current URL is part of
  • Look at the cm_navigation table to see what links should be offered from the current content section
  • Release the database handle (you must do this before the return statement)
  • Return the HTML string

Define a link from dining to movies. Define links from movies to dining and news.

Exercise 5: Templates

You have to define forms to let programmers create templates. Each template is an ADP script that pulls information for a page from a single database table.

Create a table to hold all the templates (cm_templates is a reasonable name). Make sure that this table can hold at least the following information, which will be supplied by the programmers when they define new templates:

  • A human-readable but Oracle-friendly key. For example, a template to display movie reviews would be known as a “movie_review” rather than “3”. (Your program should trap the error where a user tries to create a new template with the same key as an old one and return a page explaining the problem with a link to \“edit old template\”)
  • The ADP template to execute (in a CLOB column)
  • The name of the table that will hold information for individual pages using this template; by convention this should probably be ${key}_info, e.g., “move_review_info”. Each row of this table will hold enough information to fill the template (i.e., each row of this table will correspond to a distinct viewable page on the Web).
  • The Oracle CREATE TABLE statement for the template’s _info table

Programmers who write templates and create these _info table definitions will be required to remember that each _info table must contain a fixed set of cm system columns for page_key, version, modification_date, approved_p, approved_by, and author_id.

Note that it would probably be better to have a separate table with one row for each column of a template and then generate the CREATE TABLE for the _info automatically, more or less like user_group_type_fields. We aren’t doing that in this problem set because we want you to be able to get through it quickly.

Put your scripts for adding and editing templates in a directory called /cm/templates/. Users in the programmer group who visit /cm/index.tcl should be offered the option of visiting /cm/templates/. If a user who isn’t in the programmer group happens to visit /cm/templates/, they shouldn’t be able to add or edit templates.

Define templates for movie reviews, actor profiles, and restaurant reviews.

At this point, you’re probably screaming with rage at having to edit source code using Web forms. Unfortunately, right now this is pretty much the only way that you have to get information into and out of Oracle. Powerful tools like Emacs aren’t set up to talk directly to relational databases, which is why people still use the Unix® file system despite its myriad shortcomings. Relief is in sight towards the middle of 2000 with a patch to the 8.1.6 release of Oracle. This version of the database can pretend to be a Windows®-protocol file server. So you could run Emacs on a PC and have it write files to and from the “O: drive”. You and Emacs would think that these were plain ordinary files, but the information would actually be stored in a CLOB column of a database table. Oracle calls this “iFS”.

Exercise 6: Template-Section Mapping

Return to the /admin/content-sections/ directory and add some Tcl scripts to allow the publisher to decide which templates are to be included in which content sections. Presumably you’ll need a cm_template_section_map table to reflect these choices.

Associate the movies section with movie_review and actor_profile templates. Associate the dining section with the restaurant_review template.

Exercise 7: Supporting Authors

When a user in the authors group requests the /cm/ page, he or she should be invited to contribute an article for one of the templates. The templates should be organized by content section for presentation.

Build forms that let an author add or edit an article. Your Tcl scripts will have to automatically generate forms by

  1. Looking in the cm_templates table to find the _info table name associated with a template
  2. Using the ns_column API call to find out what fields are required for a particular template (ignoring the key, version, modification_date, approved_p, and author_id system columns)

After an author adds or edits a rows in an _info table, your scripts must generate keys and set system columns appropriately.

Add two movie reviews, two actor profiles, and one restaurant review.

Exercise 8: Supporting Editors

When a user in the editors group requests the /cm/ page, he or she should see a list of all the currently contributed-but-unapproved content. For your prototype, you don’t have to worry about letting the editors actually edit; it is enough that they can toggle the approved_p column and that your software records the user_id of the editor who approved the item.

Approve everything except one of the actor profiles.

Exercise 9: Build the Public Pages

Now we have enough information in the database that we can consider serving pages to the public. We could “compile” a Web site from the information in the database. This would entail looping through the content_sections and cm_templates tables and building Unix file system directories filled with .adp files. The drawback to this approach is that there is a risk of the database and the file system becoming out of sync. Furthermore, we’d like ultimately to version and approve templates much as we version and approve content. That argues for a completely virtual approach to serving public pages. The fundamental AOLserver facility that supports this is ns_register_proc, which tells the server to run a Tcl script whenever it sees a matching URL pattern.

Put a file of Tcl scripts in a file called “cm.tcl” in your server’s private Tcl directory (/web/yourservername/tcl/). These scripts will be evaluated by your AOLserver when it starts up and when you request “re-initialize private Tcl” from the /NS/Admin pages.

Your file must:

  1. Get a database connection
  2. Query the content_sections table for those sections that require virtual directories (i.e., dining and movies but not news)
  3. Use ns_register_proc to tell AOLserver to send any request starting with “/dining” or “/movies” to a Tcl procedure (call it cm_serve_section)
  4. Release the database connection
  5. Define the cm_serve_section Tcl procedure to actually deliver a content section

Remember that the same cm_serve_section procedure is called for every content section. So it must use ns_conn url to figure out which content section has been requested. Also, the same procedure is called for every subpage within a section. So cm_serve_section has to be smart enough to either display a list of articles (e.g., if the user requests /movies) or display one article by evaluating a template. For example, if the user requests /movies/movie_review/39, cm_serve_section will have to look up page_key 39 in movie_review_info, set local Tcl variables to the values from the database, pull the movie_review ADP template from cm_templates, then evaluate the ADP template in the context of these local variables with ns_adp_parse.

Remember that cm_serve_section should only offer approved articles, only offer one version of each article, and that the version displayed should be the most recent approved version. Implement Hint: you can query from a table, ordered by modification_date desc, get the first row, then use ns_db flush $db to throw away the rest of the query results.

Visit the /dining/ and /movies/ URLs on your site and verify that appropriate articles are being presented to the public. If it doesn’t work perfectly the first time, edit the code in your cm.tcl file and restart your AOLserver.

Exercise 10: What’s for Dessert?

Now we just have make sure that all the steps in the “more concretely” section work properly. Demonstrate the final steps of editing and approving a change to the restaurant review.

Who Wrote This and When

This problem set was written by Philip Greenspun in February 1999. It is copyright 1999 by him but may be reused provided credit is given to the original author.

Reading for This Week

  • Transactions chapter from Oracle Concepts document.

Objectives

Teach students the virtues of metadata. More specifically, they learn how to formally represent the requirements of a Web service and then build a computer program to generate the computer programs that implement that service.

The Huge Picture   
The Big Picture   
The Medium-Sized Picture   
A Data Model to Represent your Data Model   
Optional Assignment

The Huge Picture

Organizations have complex requirements for their information systems. They also have insane schedules and demand that you build their system within weeks. Finally, they are fickle and have no compunction about changing the requirements mid-stream.

The Big Picture

Corporations all have knowledge management systems even though generally they may not have any knowledge. Universities claim to have knowledge and yet none have knowledge management systems. You are going to build a knowledge management system for your university’s computer science department. In order to ensure that you can get a job after this course is over, you must refer to your final product as “a KM system”.

Another issue is a perennial side-show in the world-wide computer programming circus: the spectacle of nerds arguing over programming tools. The data model can’t represent the information that the users need, the application doesn’t do what what the users need it to do, and instead of writing code, the “engineers” are arguing about Java® versus Lisp versus Perl versus Tcl. If you want to know why computer programmers get paid less than medical doctors, consider the situation of two trauma surgeons arriving at an accident scene. The patient is bleeding profusely. If surgeons were like programmers, they’d leave the patient to bleed out in order to have a really satisfying argument over the merits of two different kinds of tourniquet.

If you’re programming one Web page at a time, you can switch to the Language du Jour in search of higher productivity. But you won’t achieve significant gains unless you switch from writing code for one page. You need to think about ways to write down a formal description of the application and user experience, then let the computer generate the application automatically.

The Medium-Sized Picture

Knowledge is text, authored by a user of the community. The user may attach a document, photograph, or spreadsheet to this text. Other users can comment on the knowledge, submitting text and optional attachments of their own. What distinguishes a knowledge management system from the standard /bboard module of the ACS? The following:

  • Links among knowledge objects
  • Facilities for browsing and searching that are more powerful than in the standard /bboard system, where browse-by-category is the only option beyond full-text search
  • Tracking and rewarding of contributions and reuse

A Data Model to Represent your Data Model

Business people like to talk about “objects” rather than “rows in tables”. This doesn’t mean that the MBA curriculum includes inheritance and method combination in the Common Lisp Object System. What seems to have happened is that

  1. Xerox PARC and MIT developed object-oriented programming systems in the 1970s, including Smalltalk and the Lisp Machine
  2. People hailed these systems as rather advanced
  3. The business community finally picked up the object-oriented buzz during the 1990s

When a business person talks about an “object”, what he or she generally means is “a row in a relational database table”.

In order to make your system comprehensible to the CIO/CTO types who will be adopting it, we’ll use their vocabulary. A table row is an “object” and each column is an “element” of that object.

We need a way to represent the kinds of objects that our system will represent first. Let’s assume that we’ll have at least the following object types:

  • Person
  • Publication
  • Data structure
  • System
  • Algorithm
  • Problem
  • Language

To say that “John McCarthy developed the Lisp programming language”, the user would create two objects: one of type language and one of type person. Why not link to the users table instead? John McCarthy might not be a registered user of the system. Some of the people you’ll be referencing, e.g., John Von Neumann, are dead. Characteristics of the Lisp language would be stored as elements of the language object.

For an example of what a completed system of this nature might look like to the casual reader, visit Paul Black’s Dictionary of Algorithms and Data Structures.

For each object type we’ll be creating an Oracle table. For each Oracle table we create, we store one row in the metadata table:

create table km_metadata_object_types (
        table_name              varchar(21) primary key,
        pretty_name             varchar(100) not null,
        pretty_plural             varchar(100)
);

We need to store information about what elements will be kept for each type of object. Note that some elements are common across object types:

  • Name (a short description of the thing)
  • Overview (a longer description)
  • Who created this object
  • When they created it
  • Who modified it last
  • When they modified it
  • Who has the right to modify it
  • Who has the right to view it
  • Does it need approval?
  • Has it been approved?
  • If so, by whom and when?
  • Iif so, under what section?

You won’t be writing code to implement fancy permissioning so not all of this information will be useful in this problem set. However, it is good to have it in your data model even if you aren’t going to build .tcl pages to update or query it. Notice that you may need to define extra tables to support some of this many-to-one information, e.g., the access control list for an object.

For elements that are unique to an object type (i.e., elements other than the standard ones listed above), we need to insert one row in a metadata table per element:

create table km_metadata_elements (
        metadata_id             integer primary key,
        table_name              not null references km_metadata_object_types,
        column_name             varchar(30) not null,
        pretty_name             varchar(100) not null,
        abstract_data_type      varchar(30) not null,  – ie. “text” or “shorttext” “boolean” “user”
 – this one is not null except when abstract_data_type is “user”
        oracle_data_type        varchar(30),   – “varchar(4000)”
        – e.g., “not null” or “check foobar in (‘christof’, ‘patrick’)”
        extra_sql               varchar(4000),
        – values are ‘text’, ‘textarea’, ‘select’, ‘radio’,
 – ‘selectmultiple’, ‘checkbox’, ‘checkboxmultiple’, ‘selectsql’
        presentation_type       varchar(100) not null,
        – e.g., for textarea, this would be “rows=6 cols=60”, for select, Tcl list,
        – for selectsql, an SQL query that returns N district values
        – for email addresses mailto:
        presentation_options    varchar(4000),
        – pretty_name is going to be the short prompt,
 – e.g., for an update page, but we also need something
 – longer if we have to walk the user through a long form
        entry_explanation       varchar(4000),
 – if they click for yet more help
        help_text               varchar(4000),
        – note that this does NOT translate into a “not null” constraint in Oracle
        – if we did this, it would prevent users from creating rows incrementaly
        mandatory_p             char(1) check (mandatory_p in (’t',‘f’)),
        – ordering in Oracle table creation, 0 would be on top, 1 underneath, etc.
        sort_key                integer,
        – ordering within a form, lower number = higher on page
        form_sort_key           integer,
        – if there are N forms, starting with 0, to define this object,
 – on which does this go?  (relevant for very complex objects where
 – you need more than one page to submit)
        form_number             integer,
        – for full text index
        include_in_ctx_index_p  char(1) check (include_in_ctx_index_p in (’t',‘f’)),
        – add forms should be prefilled with the default value
        default_value           varchar(200),
 check ((abstract_data_type not in (‘user’) and oracle_data_type is not null)
                or
              (abstract_data_type in (‘user’))),
        unique(table_name,column_name)
);

Notice that the km_metadata_elements table contains specifications for 1) generating the CREATE TABLE sql commands that you’ll need to build the database data structures for storing knowledge and 2) building user interfaces to manipulate data in those tables.

If you have trouble feeding the above table definition to SQL*Plus running under an Emacs shell, cut and paste it into a file called “foo.sql”. Then run

 sqlplus student23/thepassword < foo.sql

from the shell.

Exercise 1: Use Prototype Builder to Construct Admin Pages for your Metadata

Create a directory /admin/km/ under your Web server page root. Go to a Unix® shell and type

> > cd /web/yourservername/www/admin/
chmod a+w km

So that the /admin/km/ directory will be writable by the Web server. Now use the prototype builder (available on your own server at /admin/prototype/) to generate admin pages for the   
km_metadata_elements and km_metadata_object_typestables.

Exercise 2: Fill your Metadata Tables with Info

Fill your meta data tables with some info. Add one entry to the objects table for each of the object types listed above. For each object type, fill in some elements. Remember that each object will have the default fields specified above so you don’t need things like name or overview. Here are some examples:

For the person type

date_of_birth, title

For the language type

syntax_example, garbage_collection_p (whether the language has automatic storage allocation like Lisp or memory leaks like C), strongly_typed_p, lexical_scoping_p, date_first_used

For the publication type

this is for storing references to books and journal articles so you want all the fields that you’d expect to see when referencing something, include also an abstract field

For the data structure type

complexity_for_insertion, complexity_for_retrieval (varchars containing “O(1)”, “O(N)”, etc.)

For the system type

examples of systems are “Multics”, “Xerox Alto”, “TCP/IP”, “MIT Lisp Machine”, “Apple Macintosh”, “World Wide Web”. These are usually developed over a period of several years by an organization. date_of_conception, date_of_birth, organization_name (use links to objects of type person in order to represent the prime developers of the system)

For the problem type

this would be for storing something like “traveling salesman” or “dining philosophers”. Binary search". You need elements for pseudo_code and high_level_explanation. In general, objects of type problem will be linked to objects of type algorithm (which algorithm solves the problem), publication (papers that set forth the problem), and person (people who were involved in stating or solving the problem)

For the algorithm type

this would be for storing something like “Quicksort” or “Binary search”. You need elements for pseudo_code and high_level_explanation. In general, objects of type algorithm will be linked to objects of type problem (what need the algorithm addresses), publication (papers that describe the algorithm or implementations of it) person (people who were involved in developing the algorithm)

Exercise 3: Write a Program to Generate DDL Statements

Write a script called /admin/km/generate-ddl.tcl that will generate CREATE TABLE statements from the meta data tables. It can simply output this to the Web browser with a MIME type of text/plain (then you can save it to your local file system as km-generated.sql) or write it back into the Unix® file system as /admin/km/km-generated.sql (in which case you’ll have to be sure that the /admin/km/ directory is writable by the Web server user, which is generally “nsadmin”).

Each object table should have an object_id column. Use a single Oracle sequence to generate keys for this column.

In addition to the metadata-driven object table definitions we’ll define a generalized mapping table to support links between knowledge objects:

create table km_object_object_map (
 table_name_a  not null references km_metadata_object_types,
 – Assume all object type tables have integer primary keys.
        – For a generalized mapping table we can’t put an integrity
        – constraint on this field.
        table_id_a  integer not null,
        table_name_b  not null references km_metadata_object_types,
        table_id_b  integer not null,
        – User-entered reason for relating two objects, e.g.
        – to distinguish between John McCarthy the developer of
 – Lisp and Gerry Sussman and Guy Steele, who added lexical scoping
        – in the Scheme dialect
 map_comment  varchar(4000),
 creation_user  not null references users,
 creation_date  date default sysdate not null,
        primary key (table_name_a, table_id_a, table_name_b, table_id_b)
);

Notice that this table allows the users to map an object to any other object in the system, regardless of type.

For simplicity, assume that associations are bidirectional. If a user associates the Huffman encoding algorithm (used in virtually every compression scheme, including JPEG) with the person David A. Huffman (MIT Course VI grad student and then faculty member), we should also interpret that to mean that the person David A. Huffman is associated with the algorithm for Huffman encoding. This is why the columns in km_object_object_map have names like “table_name_a” instead of “from_table”.

The primary key constraint above has the side effect of creating an index that makes it fast to ask the question “is object A related to object B?”. For efficiency in querying “is object B related to object A?”, create a concatenated index on the columns in the other order. (The trees chapter of SQL for Web Nerds, gives some examples of concatenated indices. Also read the composite indices section of the Oracle Tuning manual. See also the Oracle SQL Reference section.)

Exercise 4: Write a Program to Generate a “Drop All Tables” Script

Write a script called /admin/km/generate-drop-tables.tcl that will generate DROP TABLE statements from the meta data tables. You probably won’t get your data model right the first time so you might as well be ready to clear out Oracle and start over.

Exercise 5: Build the Knowledge Capture Pages

Create a directory /km under the Web server page root. The index.tcl page should display an unordered list of object types and, next to each type, options to “browse” or “create”. You don’t have any information in the database, so you should build /km/object-create.tcl first. This page will query the metadata tables to build a data entry form to create a single object of a particular type. Build /km/object-create-2.tcl to process the results of this form. You may find util_prepare_update from /tcl/00-ad-utilities.tcl useful in building object-create-2.tcl.

When object-create-2.tcl is done inserting the row into the database, it should ns_returnredirect to object-display.tcl. This page should have small hyperlinks to edit single fields at a time (all linking to object-edit-element.tcl with different arguments). This page should show all the currently linked objects and have an “add link” hyperlink to object-add-link.tcl.

The page returned by object-add-link.tcl will look virtually identical to /km/index.tcl and will in fact link to the same URL: object-browse-one-type.tcl. When called with only table_name, this page will display a table of object names with dimensional controls at the top. The dimensions should be “mine|everyone’s” and “creation date”. The user ought to be able to click on a table header and sort by that column.

When called with extra arguments, object-browse-one-type.tcl will pass those arguments through to object-view-one.tcl and, if the user clicks a confirmation button, will eventually result in object-add-link-2.tcl being invoked. The extra arguments should be table_name_a and table_id_a, which will ultimately end up in the corresponding columns of a row in km_object_object_map.

Exercise 6: Gather Statistics

You want to know when people are looking at and reusing knowledge. Create a table to hold object views:

-- we will be updating the reuse_p column of views so it 
-- will be easier to have a primary key 
create sequence km_object_view_id;
create table km_object_views (
 object_view_id integer primary key,
 – which user
 user_id  not null references users,
 – two columns to specify which object
 object_id integer not null,
 table_name varchar(21) not null,
 view_date date not null,
 reuse_p  char(1) default ‘f’ check(reuse_p in (’t',‘f’))  
);

Modify object-view-one.tcl so that you explicitly close the TCP connection to the user (using ns_conn close). This will stop the Netscape icon to stop spinning but the AOLserver thread will remain alive so that you can log.

After the ns_conn close, insert a row into the km_object_views table iff there isn’t already a log row for this user/object pair within 24 hours. You could do this with a

  1. Open a transaction
  2. Lock the table
  3. Count the number of matching rows within the last 24 hours
  4. Compare the result to 0 and insert if necessary
  5. Close the transaction

However, you can also do this with a single ns_db dml statement. Here’s an example of an INSERT statement that only has an effect if there isn’t already a row in the table.

insert into msg_id_generator (last_msg_id)
select ('000000') from dual
where 0 = (select count(last_msg_id) from msg_id_generator);

Apply this idea to the problem of thread-safe logging if and only if there isn’t an identical row logged within the last 24 hours.

Date/Time Arithmetic

Exercise 7: Gather More Statistics

Modify object-view-one.tcl to add a “I reused this knowledge” button. This should link to object-mark-reused.tcl, a page that updates the reuse_p flag of the most recent relevant row in km_object_views. The page should raise an error if it can’t find a row to update.

Exercise 8: Explain the Concurrency Problem in Exercise 7

Explain the concurrency problem in Exercise 7 and talk about ways to address it.

Exercise 9: Do a Little Performance Tuning

Create an Oracle index on km_object_views that will make the code in exercises 6 and 7 go fast.

Exercise 10: Display Statistics

Build /admin/km/statistics.tcl to show, by day, the number of objects viewed and reused. This report should be broken down by object type and all the statistics should be links to “drill-down” pages where the underlying data are exposed, e.g., which actual users viewed or reused knowledge and when.

Exercise 11: Build a Site-wide Index

Build an index of the content in the KM system objects.

Create a file /admin/km/generate-sws-triggers.tcl to read the meta data tables and

  1. Generate database triggers that will automatically put indexable object content into the site_wide_index table
  2. Insert one row into sws_table_to_section_map for every object type defined in km_metadata_object_types

Add some new objects and verify that they are being automatically copied by the triggers into the index table.

Exercise 12: Query Pages for the Site-wide Index

Create a page /km/search.tcl that displays a form letting the user search for a phrase either through all objects or only in one type of object. For /km/search-2.tcl you’ll either want to use Intermedia or the DBMS_LOB functions and the pseudo_contains source code.

Optional Assignment

Show your system to a business school professor.

Who Wrote This and When

This problem set was written by Philip Greenspun in October 1999. It is copyright 1999 by him but may be reused provided credit is given to the original author.

Reading for This Week

Objectives

Students should learn:

  • The fundamentals of the ArsDigita Community System (the “ACS”, a toolkit that we’ll be living with for the rest of the semester),
  • How to build, install, and document a new module in the ACS,
  • Rudiments of ecommerce; how to bill someone’s credit card.

In the first problem set, you learned how to use lots of new software tools. Mostly you needed to learn commands and procedure calls. You didn’t care about the internals of Unix®, Oracle, AOLserver, or the Tcl interpreter. In this second problem set, you will learn how to extend a library of source code. Thus, you’ll have to learn about the internal structure of the toolkit. If you thought looking at your own source code was unpleasant, here’s your chance to look at 200,000 lines of someone else’s :-)

The Big Picture
The Big (Software) Picture
Making it Real
The Wide World of Oracle
The Narrow World of AOLserver

The Big Picture

Most software development projects start with a dialog between the users and the application programmers (you guys). From this dialog, the application programmers form a big picture of what needs to be done.

Users: We’re the world’s leading experts on computers. We have a big building full of researchers, hundreds of advanced computer systems, and a handful of conference rooms. We’d like a Web-based system for scheduling conference rooms.

You: How are you doing it now?

Users: Secretaries, pencils, and paper appointment books.

You: Ah. So how do you envision the computer-based system working?

Users: Jane Nerd comes to the server and asks for a room from Time X to Time Y, capable of holding N people. The server offers Jane a choice of rooms, each one accompanied by a description of the AV facilities, and she can pick whichever she likes the best.

You: Anything else special?

Users: There is a set of Uber-Users. If Alice Uber can’t find a room that she likes, we want her to be able to boot some grunts out of their reserved spot. Alice should not be able to preempt another Uber-User, though. The booted grunt should get an email alert from the server saying “Alice booted you; come back and pick another room.”

You: So any user can reserve any room?

Users**:** No. Most of the conference rooms are reservable by anyone. However, some rooms require approval of reservations by a designated administrator (who can be any user in the community), unless the reservation was requested by an Uber-User. The designated administrator is notified by email any time that a reservation is requested for a particular room, and should be able to approve or deny the request by clicking on one of two encoded URLs in the email message.

The Big (Software) Picture

You’re building a program that keeps track of users, groups of users, rooms, and room reservations. You could build this program from scratch, but we already have a documented debugged toolkit, the ArsDigita Community System, designed to support communities of users, subsets of which are collected into groups. You can probably get your program finished faster if you start with the toolkit, but more importantly you will achieve higher reliability (e.g., the user registration stuff has been debugged already) and higher maintainability (because the user and user groups portions of the software are documented in a book chapter residing at a permanent URL: Chapter 3: Scalable Systems for Online Communities).

Some of what you’ll be doing in this problem set is creating RDBMS tables from scratch and writing Tcl procedures to perform transactions on those tables, just as you did in problem set 1. Much of what you’re doing here, though, is figuring out clean ways of using and extending the toolkit. Sometimes this is as simple as visiting the toolkit admin pages (/admin) of your server and using Web forms to add user groups.

Exercise 0: Installing the Arsdigita Community System

The first thing you need to do is install the ACS. It is the base of code from which you will be building your service. Here are the steps for you to follow:

  • Move your code from problem set 1 to your home directory.

    mkdir /home/stdXX/ps1
    cd /web/studentXX/
    mv * /home/stdXX/ps1

  • Fetch the current ACS, and install it in your web directory. You might find the following commands helpful.

    cd /web/studentXX/
    wget http://software.arsdigita.com/dist/acs....tar.gz
    gunzip acs….tar.gz
    tar -xf acs….tar
    mv acs/*.

Read the installation instructions. (Which should also now be available in your /web/studentXX/www/doc/ directory.) Your AOLServers have already been configured so you can start at “Feeding Oracle the Data Model,” and skip the “Configuring AOLServer” section. Remember to run the “load-geo-tables” script before loading the ACS data model.

Exercise 0.5: Give Your Server a Personality

The “parameters” subdirectory contains server personality constants in a file called student**.ini. This has been customized with your server’s name (e.g., “yourservername” will be changed to “student51”). You will also find the original ad.ini in your parameters file. You can use this for reference but if you want to change the behavior of your server, you must edit server**.ini.

One horrible bug with AOLserver (at least in 2.3.2): If there are Emacs backup files in the /parameters directory that start with #, you might find that your AOLserver is unable to start. If all seems hopeless, do a m-x Dired on the /parameters directory and look for backup files to delete.

You should consider at least setting the system name, the publisher name, the email address of the system owner to your actual personal email address. For example, suppose you set the first parameter (SystemName) to “Joan’s Scholarly Center” and then restart your server (with restart-aolserver student** from the shell). Subsequently, you’ll see many pages of your site display “Joan’s Scholarly Center”, including the very top-level page.

Exercise 0.7: Become Your Own Best Friend

You now have an installed ArsDigita Community System but there aren’t any users. Visit the top-level page on your site (e.g., http://lcsweb51.lcs.mit.edu/) and note that you’re asked to log in. Type your standard personal email address and register as a user.

Exercise 0.8: Notice that Your Server is Self-documenting

Use a Web browser to connect to the /doc subdirectory of your server and notice that you have a complete set of ArsDigita Community System documentation. You might find it useful to read the installation instructions and you will certainly want to read the developer’s guide (developers.html). The latter starts off with an admonition to use proc_doc instead of proc when defining externally called procedures. Please follow this convention for the rest of the semester.

Exercise 0.9: Create Some Users

As noted in the installation instructions, you’ll want to add yourself as a user, log in as the “system” user, add yourself to the site administrators user group, and change the passwords of system and anonymous users.

Repeatedly visit the login pages to create a bunch of users for your community, i.e., pretend you are “billg@microsoft.com”, “president@whitehouse.gov”, “wile_e_coyote@iq207.supergenius.com” and fill out the registration form. Remember that these will be the folks who are reserving conference rooms. Make sure that at least a couple of them have real email addresses because you want to be sure that you can test your email alerts. If you don’t have any cooperative friends, you might wish to make yourself a hotmail or yahoo email account. Sadly, you can’t make several users with the same email address because we constrain the email column in the users table to be unique.

Exercise 0.95: Create a User Group

Read the documentation for the permissions package at /doc/permissions.html. You’re going to use this package to figure out if someone is an uber user or not.

Using the Web forms in /admin/ug/ create a new user group of type “Administration.” Call it “Reservation System Uber Users.” The module field should be “reserve” and leave the submodule field blank.

This is the only user group that you will need to complete the pset. Using the admin forms, place one or two users in this group. They don’t need to have any special role within the group, simple membership is fine.

Exercise 1: Build the Data Model

Using the file naming and placement conventions set forth create a SQL data model file to hold new table definitions for your room reservation system. Since part of the naming convention is that your data model file must be called “module-name.sql”, you have to come up with a name for your module. In order to make life easier for us in looking over your shoulder, please refrain from being creative and call your module “reserve”.

A good thing to do now is to step away from your computer with hardcopies of this problem set and Data Modeling. Read the problem set cover-to-cover before writing anything down. Otherwise, halfway through the pset, you might find yourself having to back up, alter tables, rewrite .tcl scripts, and rewrite documentation.

Here are some guidelines for your data model. Please note that this is not intended to be a complete listing of all the necessary steps.

  • Remember that users of your system are already represented in the ArsDigita Community System users table; an Uber User will be recognized by membership in the Uber Users user group that you created.

  • In order that your data model be comprehensible to other ArsDigita Community System programmers, try to follow variable and table-naming conventions. Start by skimming Chapter 3 of Philip and Alex’s Guide to Web Publishing, looking only at create table statements. Then surf around inside your own Web server’s /doc/sql/ directory and look at community-core.sql, news.sql, and other files. When you’re done, you should have picked up items such as

    • boolean data are represented in columns ending in “_p” and are char(1) with a check constraint limiting them to “f” and “t” (the “_p” comes from “predicate” and is an ancient Lisp programmer’s convention)
    • keys in the database are all-lowercase, no spaces, underscores for separation, e.g., “q_and_a” for a bboard presentation type.
  • Any time you are representing a person, you should do it by referencing the users table (defined in the /doc/sql/community-core.sql file). For example, here is how the bboard system records the identity of the person posting:

    create table bboard (
            msg_id          char(6) primary key,
     …
            user_id         integer not null references users,
     …

You will need to do something similar in the rooms table to record which user is the designated administrator.

  • You probably want to use generated keys for your table that holds rooms, rather than using the MIT classroom number or name. This will enable your system to handle gracefully situations in which a room is renamed because a rich person has donated money.

  • To represent room reservations, you can use start_time and end_time columns (both of type date) or with start_time and duration columns (the latter would be of type number and would represent the fraction of a day during which the room was reserved).

  • Instead of simply recording that a reservation was made (or approved), record the times of the users’ actions. Columns of type date may be used for this (and testing for NULL in the approval_time can be used to find an unapproved reservation).

Now that you’re written all of your create table statements and put them in a file named “/doc/sql/reserve.sql”, feed the definitions to Oracle. If you don’t want to cut and paste into SQL*Plus, you can use Unix® shell I/O redirection:

sqlplus / < reserve.sql

If you slept through 6.170 and didn’t learn how to write bug-free code, you might find yourself running this command repeatedly. Although Oracle provides powerful transaction management, data definition language statements such as create table can not be rolled back. So if your file defines three tables and there is an error in the definition of the second table, you will usually end up succeeding in defining two tables with the preceding command. You can go into Emacs and edit reserve.sql and then rerun the shell command. Oracle will ignore the first and third table definitions since the tables already exist and define only your second table.

You can check your work by using the SQL*Plus “describe” command, e.g., “describe users” or “describe rooms”.

Write a design spec for your system.

Include in this file a list of the legal transactions in your system. For each transaction, sketch the SQL and note any external implications, e.g., “administrator is sent email alert”.

Exercise 3a: Get the Basic Admin Pages Done

We’re not going to worry about being able to edit room names or designated room administrators for the moment. Instead, let’s build the simplest possible system that enables the site owner to create rooms:

  • Build an index.tcl page under /admin/reserve/ that shows an HTML unordered list (UL) of the names of all the rooms, a blank line, and then an “add a room” link. This page should simply handle the situation in which there are no rooms registered.

  • Build /admin/reserve/room-add.tcl (or .adp) and /admin/reserve/room-add-2.tcl that will insert a room into the database.

Note: By placing these pages under the /admin directory, you’ve restricted them to use by the site owner. This is acceptable because we presume adding a room to be an infrequent activity. Therefore, it isn’t worth building user submission pages for new rooms and admin approval pages for those submissions.

Exercise 3b: Write the Basic User Pages

We’re not going to worry about uber-users or room approval for the moment. Instead, let’s build a basic system:

  • Build /reserve/index.tcl, the gateway to the room reservation system. The script should start out by checking for a user_id cookie (use ad_verify_and_get_user_id, defined in /tcl/ad-security.tcl) and, if not present, redirecting users over to /register/index.tcl.

    For a logged-in user, the page starts out with an HTML form for requesting a new reservation. The form contains the following elements:

    • start and end date-times (or start time and duration; the user interface does not have to be consistent with the data model). If you want to make life easier for yourself, it is perfectly acceptable to force the user to enter date/times in ANSI timestamp format: “YYYY-MM-DD HH24:MI:SS”. Then you can drop this right into Oracle using a to_date.
    • number of people who will be occupying the room

    Underneath this form, the page displays a personalized list of the rooms that the authenticated user has already reserved for the future, ordered by the start times. By showing this list automatically, you free yourself from having to write special pages to confirm reservations or check reservation history.

  • Build /reserve/new.tcl that will take input from the form in index.tcl and find available rooms. This page should refrain from showing rooms that are booked for any part of the time period requested or rooms that aren’t large enough for the number of people specified. For your query, it is wise to remember that

    • you can select from the rooms table with a WHERE clause that subqueries the room_reservations table.
    • Although SQL contains a BETWEEN operator, e.g., t1.start_time between t2.start_time and t2.end_time, it might not be the best way to catch all cases of overlapping rooms.
    • a professional SQL programmer would probably end up coding this with a where not exists (select 1 from …)
    • don’t worry about efficiency; it is okay to have a query that Oracle executes with an O[N^2] plan. It might be fast enough and, in any case, we’re not paying you enough to build a scalable system.

    You’ll have to pass the user’s input from index.tcl through to new.tcl. You’ll either be doing this with URL variables (if you’ve chosen an interface based on simple links) or HTML form hidden variables (if you’ve chosen a form interface with method=POST). In /home/nsadmin/modules/tcl/utilities.tcl we’ve defined export_url_vars and export_form_vars that facilitate this. If you have start_time and end_time defined as Tcl local variables, you can export them to the next page with new-2.tcl?[export_url_vars start_time end_time] or [export_form_vars start_time end_time] inside a form.

  • Build /reserve/new-2.tcl to take input from new.tcl and ns_returnredirect back to index.tcl (where the user will see that the room has in fact been reserved). This page should make sure that there isn’t a time conflict, then insert the reservation. If there is a time conflict, return an message to that effect and a link back to new.tcl with the user’s original parameters in place (presumably this time new.tcl will not offer the already-booked room).

How could there be a time conflict given that the preceding page only showed free rooms? Here’s the concurrency situation which we’re worried about:

Olin Egghead says that he needs a room for 8 people at 3:00 pm tomorrow, for a 45-minute meeting.

Your module serves Olin the new.tcl page, showing that the Boesky Room is available. Dorothy Alvelda says that she needs a room for 5 people tomorrow at 3:30 pm.

Olin is about to confirm the Boesky Room when the phone rings. A Wall Street firm is offering him $250,000 to become a C programmer in their commodities trading division. Olin immediately replies that he wouldn’t consider abandoning his sacred academic principles for a minute.

An hour later, Olin is still on the phone.

Since Olin has not yet booked any rooms, new.tcl shows Dorothy that the Boesky Room is among the available rooms at 3:30 tomorrow.

Dorothy confirms the Boesky Room and new-2.tcl inserts a row into the database.

Olin hangs up and gets preoccupied with copying impressive-looking equations from the pizza delivery section of the Thessaloniki Yellow Pages into his latest research paper.

Olin remembers that he needs that room for tomorrow and unburies his browser window, but does not reload new.tcl (i.e., he is looking at the page generated a couple of hours ago). Oh yes, the Boesky Room will be fine. Olin clicks to reserve the Boesky Room at 3:00 pm tomorrow.

If you want to act like a professional Web programmer, you therefore have to run the same query in new-2.tcl that you ran in new.tcl to make sure that the room is in fact still free.

What do the users get from their professionally programmed systems? Double booked rooms. The average professional Web programmer probably won’t handle the case where two users submit room requests at precisely the same second.

Hold yourself to a higher standard by making new-2.tcl lock the reservations table before querying to confirm room availability. Read the transactions chapter of SQL for Web Nerds to find out how to lock tables.

Making it Real

We’ve got the rudiments. The site owner can define rooms. Users can book rooms. Reservations won’t conflict. If you were a West Coast software company, you’d declare victory at this point. Issue some press releases, ship the product in a nice-looking shrink-wrapped box (as Release 3.2), and sell some shares to the public.

One of the ugly facets of offering a Web service, though, is that users will hammer you with email until you make the thing real.

Exercise 4: Finish the Admin Pages

Here’s a step-by-step plan:

  • Link each room in the /admin/reserve/index.tcl listing to room-edit.tcl where the site owner can change a room’s name or mark a room as requiring approval. Part of marking a room as requiring approval is using /user-search.tcl to let the site owner choose a room administrator from the users table.

  • Build a /admin/reserve/reports/ directory. The index.tcl page in this new directory should list the rooms and, for each room, show the total number of hours that the room has been reserved and the average number of people in a reservation for that room. This list should be sortable by name or number of hours booked. All the stats should be limited to reservations in the preceding 12 months (use the ADD_MONTHS function in SQL).

    Hints: You can do this report with a 4-line SQL query. You don’t need to do any date arithmetic, summing, rounding, or averaging in Tcl (Oracle SQL has functions for these things). Keys to solving the problem are reading up on the SQL GROUP BY clause. Also note that order by 3 will order by the 3rd item in the select list.

    If you want to really do something nice, make the report include rooms for which there haven’t been any reservations (showing the total hours as 0; hint: use NVL). If you want to preserve the one-year limit, you’ll end up with a SQL query of the form

    select …
    from table1, (select … from table2 where …) as table2
    where table1.col1 = table2.col2(+)

  • Here you’re querying from a real table and a view created on-the-fly from table2.

  • Make the room names in the /admin/reserve/reports/index.tcl page hyperlink anchors. The targets for those anchors should be a details.tcl page where all the reservations are shown, most recent at the top.

Exercise 5: Finish the User Pages

  • Edit /reserve/new.tcl so that a member of the Uber User group sees both unreserved and reserved-but-bumpable (i.e., not reserved by another Uber User) rooms. A user who is not an Uber User should see a flag by rooms that require approval.

  • Edit /reserve/new-2.tcl to recognize the following cases:

    • Uber User bumping regular user (delete original reservation; send email to regular user; place Uber User’s reservation)

    • Regular user requesting approval-required room (insert reservation, email administrator, serve special page to user saying “your request has been submitted to … and we’ll let you know if it is approved”)

    See the AOLserver Tcl Developer’s Guide at AOLserver for an explanation of the ns_sendmail API call.

  • We would like room administrators to be able to approve or deny room requests directly from their email client (“one-click approval”). So the email notice to a room administrator must contain a URL which, if visited, will approve the room request without the room administrator having to log in. An alternative URL in the same email message would allow the room admin to deny a reservation request. This mechanism should be reasonably secure, i.e., a user should not be able to approve his or her own request by doing some URL surgery.

    Hint: You can write a secret key into the reservations table at the time of the request and require that a one-click approval URL contain the secret key. Tcl doesn’t contain a random number generator, though you can call randomRange, defined in our /home/nsadmin/modules/tcl/utilities.tcl file. Alternatively, you can simply use the integer returned by ns_time.

You now have a working system as envisioned by your clients. Congratulations.

Exercise 6: Making it Pay

If you’re not doing this on an ArsDigita-maintained machine, you’ll probably find it more trouble than it is worth to install the CyberCash software on your machine (painful), install our CyberCash interface from AOLserver (annoying), and get a merchant credit card account with CyberCash capability (takes 6 weeks and costs around $1000). You can still do the exercise, though, by using Eve Andersson’s CyberCash emulator.

Occasionally folks in Academia ask themselves “What is there that distinguishes us from McDonalds, Microsoft®, and Oracle Corporation, aside from a lower growth rate and smaller revenues?” If the answer turns out to be “not much” then why not apply commercial logic to the conference room scheduler?

Assume that the admin reports show that the Boesky Room is in heavy demand. Let’s make people pay to use the room. We’ll bill their credit card numbers in real-time. Best of all: no refunds (even if the fee-based room has a room administrator and that person doesn’t approve the request).

We’re not kidding. Your code will be putting real transactions into the real banking system using CyberCash. If you set a $1000 price for a room and use your credit card to test your software, you will find a $1000 charge on your next statement. Ergo, you might prefer to set prices more in the range of 50 cents.

Here’s the plan:

  • Add a column to your data model so that a room can have an associated charge (a number; name the column fee unless you are passionately devoted to something else). Assume that we’re only using US dollars so you don’t need to record units.

    Remember that one of the beauties of Oracle is that you can add columns to tables and legacy software will continue to work. Visit the data modeling chapter of SQL for Web Nerds if you need to be reminded of the syntax of alter table.

  • Alter the reservations table so that it has some of the ecommerce fields mentioned in Chapter 14: ecommerce, e.g., order_state, name_on_card, billing_zip_code.

  • Augment /admin/reserve/room-edit.tcl so that the administrator can adjust the prices of various rooms and then add a small fee to one room (we recommend making this 0.25 or 0.50; remember that these are real credit cards and real money).

  • Augment /reserve/new-2.tcl so that it will refuse to record a reservation for a room that has a non-NULL or non-0 fee. You’ll be sending fee-based reservations into a separate ordering pipeline (see below) and you don’t want clever nerds doing URL surgery to be able to bypass your billing system.

  • Augment /reserve/new.tcl so that it mentions the fee for rooms that have one and links reservations for those over to http://yourservername/reserve/new-2-with-fee.tcl

  • Augment /reserve/index.tcl so that it can display the following:

    • future rooms reserved for which a fee has been paid
    • old fee-based reservations
    • failed attempts to reserve fee-based rooms
  • Build /reserve/new-2-with-fee.tcl so that it generates a reservation_id (for double-click protection) and presents a form asking for a credit card number, the name on the card, expiration date, and billing zip code. The form should target /reserve/new-2-with-fee-2.tcl

  • Build /reserve/new-2-with-fee-2.tcl to:

    1. Open a transaction and take a lock on the reservations table.

    2. Check to see if there is already a row in the reservations table with the same reservation_id; if so, assume double click, abort the transaction (ns_db dml $db “abort transaction”, ns_returnredirect to index.tcl page, and call return to terminate execution of the page.

    3. Insert a row in the reservations table with order_state of “confirmed”. As explained in the “An Extra Layer of Transactions” section of Chapter 14: ecommerce, you want to make sure that you write something into your local database before going out to the credit card system and trying to bill someone’s card.

    4. Release the lock (end the transaction).

    5. Bill credit card via CyberCash. Everyone in the class is using the same CyberCash account. CyberCash requires unique order IDs. So that you won’t conflict with other students, you have to be absolutely sure that your order ID starts with “yourusername-” and then the reservation ID. You have two financial options. You can do an mauthonly with CyberCash, which will result in your credit limit being knocked down by the amount of the charge. No money will be withdrawn from your account. Or you can do the mauthonly and then a postauth to mark the transaction for settlement. In this case you’ll find a charge on your next credit card bill from “ArsDigita, LLC” (we’re using philg’s personal merchant account; your money will be donated to charity or spent on dog treats).

    6. Update the row to reflect “authorized” or “failed”.

See cybercash.com if you really want to understand the CyberCash system. If you just want to learn enough to get by, look at the code in order-2.tcl (from the open-source ArsDigita Shoppe package). Interesting lines in order-2.tcl include the following:

#the auth step
cc_send_to_server_21 “mauthonly” $args $cc_output

# the mark for settlement step; only do this if you
# want the thrill of seeing charges on your bank statement
cc_send_to_server_21 “postauth” $args $cc_output

  • Modify /new.tcl and /new-2.tcl to prevent booking of rooms where there is already a reservation in any order_state except “failed”.
  • Build a financials directory /admin/reserve/money/ where you show lists of paid reservations and any problems, e.g., orders that were confirmed some time ago but didn’t get pushed into “failed” or “authorized”.

Congratulations. You’re now an Official Web Entrepreneur (TM).

Exercise 7 (Extra Credit): Find a Venture Capitalist

Take the code that you wrote for this problem set. Give it a name that includes the neologism “middleware”. Find a venture capitalist to give you $6 million.

The Wide World of Oracle

We’re going to shift gears now into a portion of the problem set designed to teach you more about Oracle and SQL.

Oraexercise 1: Concurrency and Isolation

Connect to Oracle in SQL*Plus and type “describe my_stocks” to make sure that your my_stocks table is still defined from problem set 1.

  • start a second SQL*Plus session, connected as the same user as your first
  • type “delete from my_stocks;” in Session 1
  • type “select * from my_stocks;” in both sessions
  • type “rollback;” in the Session 1

Oraexercise 2: The Importance of Commitment

  • go back into SQL*Plus:

    SQL> create table foobar (
    the_key integer primary key
    );
    SQL> insert into foobar (the_key)
    values (1);

    1 row created.

    Leave this session open and do not type COMMIT

  • build an AOLserver .tcl page that gets a database handle and then executes the same insert into foobar (the_key) values (1) statement

  • go to a Web browser and request your AOLserver .tcl page

  • watch the Web browser hang

  • type “COMMIT;” into SQL*Plus

  • look at the AOLserver error log

Repeat the above sequence but use ROLLBACK at the end in SQL*Plus instead of COMMIT.

The Narrow World of AOLserver

Imagine that you are a working Web developer and a client calls you to ask how much traffic the site you’ve built is getting. At the very least, you need the following skills:

  • Find your AOLserver’s access log and load it into an Emacs buffer (hint: it should be at /home/nsadmin/log/yourservername.log). Generally we configure our AOLservers to roll the log at midnight so you’re only looking at requests since early this morning.
  • Use “m-x count-lines-page” to figure out how many hits there have been so far today.
  • Go to the end of the buffer (“m->”). Grab a page from your server with a Web browser. Type “m-x revert-buffer” and make sure that the new request is logged (this way you know you’re looking at your live access log).

Then the client calls upset because a development site is available for public access. Read the AOLserver administrator’s guide at www.aolserver.com and then restrict access to URLs beginning with /foobar. Actually make a directory “foobar” under your server root and place a file in it. See if you can grab the file from a browser. (Note that AOLserver 3.0 will have you editing the perms database files, and calling Tcl procs, while 2.3 has a fancy GUI interface for editing permissions while 3.0.)

Mostly relevant for AOLserver 2.3: suppose that the client wants access to server admin pages, e.g., so that they can create new users. You don’t want to give the client access to the nsadmin password, which you might share with other programmers for use on many servers. Create a new user called “myclient” in the “system” group so that they can have the same privileges as nsadmin.

Who Wrote This and When

This problem set was written by Philip Greenspun and Hal Abelson in February 1999. It is copyright 1999 by them but may be reused provided credit is given to the original authors.

Reading for This Week

  • Oracle Intermedia Text reference.
  • Oracle Tree Extensions sections in Oracle 8: The Complete Reference.
  • SQL for Web Nerds, tree chapter.

Objectives

Teach students how to build a collaborative authoring environment for structured multi-media data. Introduce students to Oracle’s full-text indexing system.

The Huge Picture
The Big Picture
Privacy
Commentability
Assumptions
Assignment
Optional Assignment

The Huge Picture

Civilization and its discontents have fractured the extended family. You are building a Web service to bring everyone back together.

The Big Picture

All of the members of your family should be able to come together to

  • Collaboratively maintain the family’s geneaology
  • Associate photographs and documents with individuals or subtrees at different ages, e.g., “Joey Smith at Age 3” or “Bill and Margaret Smith and theirs kids in 1983” or “Alice Wong’s English paper, Grade 10”
  • Plan reunions and smaller get-togethers

Privacy

Every item uploaded to the server should have three possible privacy settings:

  • Show to designated users (i.e., if it is something nasty about your Aunt Maude that your brother would appreciate, you want him to see it but not Aunt Maude)
  • Show to registered users (i.e., only people in the extended family)
  • Show to everyone

Why the last category? A family may wish to present a collaboratively developed face to the world.

Commentability

Every item on the server should serve as the focus for comments. If you use the ACS general comments facility, your server will automatically be able to accept photos and documents attached to comments.

Assumptions

You can’t assume that everyone in the family tree will be a user in the users table: some of them might be dead and/or not have email addresses.

You can assume that your server needs to serve only one family at a time. You don’t have to persistently maintain a tree for more than one family.

For the purposes of the problem set, you can assume that your family is trustworthy. You can let any registered user edit anything.

Assignment

Exercise 1: Build the Data Model

Using the file naming and placement conventions, create a SQL data model file. Before you can do this, you have to come up with a name for your module. In order to make life easier for us in looking over your shoulder, please refrain from being creative and call your module “family”.

Here are some guidelines for your data model:

  • create a table called “family_relatives” to store the family members. Here is a skeleton:

    create table family_relatives (
    relative_id integer primary key,
    – optional pointer to users table
    user_id references users,
    spouse references family_relatives,
    mother references family_relatives,
    father references family_relatives,
    – in case they don’t know the exact birthdate
    birthyear integer,
    birthday date,
    – sadly, not everyone is still with us
    deathyear integer,
    first_names varchar(100) not null,
    last_name varchar(100) not null,
    sex char(1) check (sex in (’m’,‘f’)),
    – note the use of multi-column check constraints
    check ( birthyear is not null or birthday is not null)
    );

  • create a table called “family_photos” to store photographs and scanned documents, such as diplomas. We’ll draw some inspiration from the /doc/sql/general-comments.sql file:

    create table family_photos (
    family_photo_id integer primary key,
    photo blob not null,
    – file name including extension but not path
    client_file_name varchar(500),
    file_type varchar(100), – this is a MIME type (e.g., image/jpeg)
    file_extension varchar(50), – e.g., “jpg”
    caption varchar(4000),
    – when was this photo taken
    item_date date,
    item_year integer,
    original_width integer,
    original_height integer,
    access_control varchar(20)
    check (access_control in (‘public’, ‘family’, ‘designated’)),
    check (item_date is not null or item_year is not null)
    );

-- a photo might contain more than one person so we need
– an extra table
create table family_photo_relative_map (
relative_id references family_relatives,
family_photo_id references family_photos,
primary key (relative_id, family_photo_id)
);

create table family_photo_access_control (
family_photo_id references family_photos,
user_id references users,
– note the order we spec the primary key
– (makes it fast to ask “who gets to see this photo”)
primary key (family_photo_id, user_id)
);

  • create a table called “family_stories” to record stories about family members at different ages:

    Date/time arithmetic create table family_stories (
    family_story_id integer primary key,
    story clob not null,
    item_date date,
    item_year integer,
    access_control varchar(20)
    check (access_control in (‘public’, ‘family’, ‘designated’)),
    check (item_date is not null or item_year is not null)
    );

-- a story might be about more than one person
create table family_story_relative_map (
family_story_id references family_stories,
relative_id references family_relatives,
primary key (relative_id, family_story_id)
);

Exercise 2: The Family Tree Page

Build a set of Tcl scripts at /family that will allow users to

  • View the family tree
  • Add rows to the family_relatives table
  • Rremove rows added by mistake

A good thing to remember is that the Macintosh user interface works well. Pick the object first and then the verb. So let the user add a parent, spouse, or child by first selecting an existing relative.

Exercise 3: Uploading Photos

Extend the “/family/one-relative.tcl” page that you built in Exercise 2. so that users can attach photos to relatives. You’ll need to use the ns_ora blob_dml_file API call documented in http://arsdigita.com/free-tools/oracle-driver.html and also ns_queryget from the AOLserver API.

You can find source code models for photo uploading in /comments/upload-attachment.tcl.

Once a photo is in the system and associated with one relative, provide an interface to add an association to one or more additional relatives.

Edit the family tree display page to show an extra little “p” when one or more photos including a relative are available. Once the user clicks on an individual, show the available photos of that person organized by the age the person would be in the photo (based on what your server knows about the date of the photo and the birth date of the relative).

For writing photos back to the Web, you’ll need to return appropriate HTTP and MIME headers, then call ns_ora write_blob to pull the photo direct from Oracle and write it to the Web. Examples of ACS pages that do this include the following:

Exercise 4: Making Photos a Focus for Collaboration

Using the system described at http://photo.net/doc/general-comments.html, extend your photo display pages so that family members can comment on photos and even upload additional photos or documents as attachments.

Exercise 5: Adding Stories

Repeat exercises 3 and 4 but for stories. You’ll need to use the ns_ora clob_dml API call documented in http://arsdigita.com/free-tools/oracle-driver.html

Exercise 6: Event Planning Data Model

Add an event planner, e.g., for reunions, to your system. You’ll need to extend the data model to include at least two extra tables: family_events (what is being planned) and family_events_registration (who is signed up). Each event should have a creator who will be the coordinator.

Exercise 7: Event Planning Pages

Modify the /pvt/home.tcl page to show registered users upcoming events right at the top of the workspace. If they click on an event summary, take them to a page where they can sign up. Upon registering for the event, the creator of the event should be emailed.

Provide pages in a /family/events/ directory that will let people see who is signed up for a particular event, spam the attendees, etc. Link this in with general comments so that users can make public comments on an event.

Exercise 8: Full-text Indexing

Once you’ve got a bunch of stories in the family_stories, it will be nice to provide users with the ability to search through them.

Build a /family/search-stories.tcl page that lets users search through the CLOBs in family_stories. Note that

  1. The SQL LIKE command doesn’t work with CLOBs (thank you, Oracle).
  2. If you had many megabytes of text, it would be very slow to sequentially search all of it
  3. It would be nicer to users if querying for “dogs” matched documents containing “dog” (stemming)

All of these issues can be addressed to some extent by using the Oracle Intermedia text indexing system:

create index family_stories_ctx on family_stories (story) indextype is ctxsys.context;

For examples of how to query and present results, read http://photo.net/doc/site-wide-search.html and the Oracle Intermedia Text reference at http://philip.greenspun.com/sql/ref/intermediatext.

Exercise 9: Access Control

Develop a test suite of URLs and users and make sure that the main /family area presents the right information to the public, to registered users (the family), and to designated users (ones on the access control list for particular items).

Optional Assignment

Read Sigmund Freud’s Civilization and Its Discontents.

Who Wrote This and When

This problem set was written by Philip Greenspun in October 1999. It is copyright 1999 by him but may be reused provided credit is given to the original author.

Course Info

Learning Resource Types
Online Textbook
Exams
Design Assignments