6.830/6.814: Database Systems (Fall 2009)

Units: 3-0-9 (H)
When: TR 1:00 - 2:30
Where: 32-155
Instructors: Samuel Madden (madden AT csail.mit.edu)
Robert Morris (rtm AT csail.mit.edu)

Staff mailing list:
6830-staff AT nms.csail.mit.edu
Instructor office hours: by appointment
TAs: Nizameddin Ordulu (nizameddin AT gmail.com)
Adam Seering (aseering AT mit.edu)
Please submit general questions to the Staff mailing list: 6830-staff AT nms.csail.mit.edu
TA office hours when: Monday 11:00-12:00 (Adam Seering)
Tuesday 16:00-17:00 (Nizameddin Ordulu)
TA office hours where: 32-G9 Lounge (right out of the elevators)

6.830 Announcements

(RSS Feed)
11/17/09: Lab 5 Released

Lab 5 has been posted at http://db.csail.mit.edu/6.830/assignments/lab5.html


11/19/09: Lab 4 Bug Regarding the Cardinality of Key-Foreign Key Joins

There was a bug in our test cases and lab description for estimating join cardinality. Our test cases were too strict about the cardinality estimates that they accepted, and we incorrectly stated that the cardinality of a key-foreign key equality join should be the cardinality of the primary key table (it should be the cardinality of the foreign key table). We updated JoinOptimizerTest.estimateJoinCardinality() to reflect this. You can download a new version of JoinOptimizerTest by clicking here. We have also updated the lab 4 document; click here to see the portion of the lab that has changed.

Note that if you have already gotten your code to pass this test case, you do not need to worry about this (our new test case will allow code that passes the old specification as well.) In fact, if you prefer to work with the old version of the lab/code (and ignore this update altogether), that is absolutely fine; we simply posted this fix to correct the conceptual error in the lab regarding the cardinality of key-foreign key joins.
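
The corrected rule can be illustrated with a small sketch. It is not the lab's JoinOptimizer code: the class name, method name, and parameters are hypothetical, and the two fallback cases (both columns are keys, neither column is a key) are assumptions added only to make the example complete. The key-foreign key case follows from the fact that, with referential integrity, every foreign key tuple matches exactly one primary key tuple.

```java
// Hypothetical sketch; names and signature are NOT the lab's JoinOptimizer API.
final class JoinCardinalitySketch {

    /**
     * Estimates the number of output tuples of an equality join t1.a = t2.b.
     *
     * @param card1   estimated number of tuples in t1
     * @param t1IsKey true if t1.a is a primary key
     * @param card2   estimated number of tuples in t2
     * @param t2IsKey true if t2.b is a primary key
     */
    static int estimateEqualityJoin(int card1, boolean t1IsKey,
                                    int card2, boolean t2IsKey) {
        if (t1IsKey && t2IsKey) {
            // Assumption: each tuple matches at most one tuple on the other
            // side, so the smaller table bounds the output.
            return Math.min(card1, card2);
        }
        if (t1IsKey) {
            // t2 is the foreign key side: one output tuple per t2 tuple.
            return card2;
        }
        if (t2IsKey) {
            // t1 is the foreign key side: one output tuple per t1 tuple.
            return card1;
        }
        // Assumption: with no key information, fall back to a simple heuristic.
        return Math.max(card1, card2);
    }
}
```

For example, joining a 100-tuple primary key table with a 1000-tuple foreign key table is estimated at 1000 tuples, not 100.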


11/17/09: Grade Calculators

We have posted a simple calculator that lets you determine what percentage of the total points available in 6.830/6.814 you have received thus far.

6.830 Calculator

6.814 Calculator


11/14/09: Lab4 Patch: TableStatsTest.java

Some of the assert statements in TableStatsTest.estimateSelectivityTest() are too strict and some of them are too lax. Please download the new TableStatsTest.java or the new supplemental code.


11/05/09: Final project meetings

Profs. Madden and Morris would like to schedule meetings with all of the final project groups from 6.830 at the end of this week and early next week. Please go to http://tinyurl.com/6830meet to select a time slot for your meeting. Select only one slot per group. All meetings are in Prof. Madden's office (32-G938). If none of the available times work for you, email Prof. Madden to schedule an alternate time.


11/04/09: Lab 4 is posted

Lab 4 is posted. This lab includes supplemental code (both test cases and lab files) that you will need. Follow these instructions for installing it into your existing lab 3 solution. If your lab 3 solution does not compile or you need assistance, please contact us.


11/03/09: Problem Set 3 is Posted

Problem Set 3 is posted. It is due December 1, 2009.


10/28/09: Lab 4

We will post Lab 4 early next week. We are working hard on getting it ready, and you will still have more than two weeks to complete it. Our apologies for the delay.


10/27/09: Office Hours on Thursday this week

Because of the lab3 due date, the TA office hours are going to be from 5-8pm on Thursday, Oct 29 this week.


10/20/09: PS2 Due Wednesday, 10/21

We inadvertently posted the due date of PS2 as 10/21 on the assignments page; therefore, we will accept submissions of PS2 until Wednesday, 11:59 PM, without penalty.


10/20/09: Quiz 2 Announcements

Quiz 2 is this Thursday. The quiz is open-book/open-notes (no phones, PDAs, or laptops allowed.) It covers through the lecture on Tuesday 10/20. Prof. Madden will hold office hours in 32-G938 from 10am-noon on Wednesday, 10/21.


10/19/09: Problem Set 1 Solutions posted

Solutions to Problem Set 1 have been posted on the Assignments page.


10/16/09: Lab 3 is posted

Lab 3 is posted. This lab includes supplemental code (both test cases and lab files) that you will need. Follow these instructions for installing it into your existing lab 2 solution. If your lab 2 solution does not compile or you need assistance, please contact us.


10/15/09: 2008 Quiz 1 and Solutions

The 2008 quiz 1 and solutions are posted. Note that we will not have covered ARIES and log-based recovery by the time quiz 1 happens this year, so you are not expected to be able to answer these questions.


10/15/09: Selinger Notes

Slides (in PowerPoint and PDF format) about the Selinger Optimizer are posted.


10/14/09: Lab 2 Clarification about Table Names

In Lab 2, your implementation of "addTable()" has particular constraints when called twice with the same table name. The wording in the lab document is a little bit unclear, so, to clarify: If two tables are added by name, both must be stored within the database; but if you try to access the table by name (rather than by ID), the more-recently-added table will be used.
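
One way to read this clarification is sketched below. This is not the lab's Catalog class: the class and method names other than addTable are hypothetical, and an int ID stands in for whatever the real catalog stores per table. The sketch keeps every table reachable by ID while letting a name resolve to the most recently added table.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical sketch; NOT the lab's Catalog implementation.
final class CatalogSketch {
    // Every added table stays reachable by its ID.
    private final Map<Integer, String> nameById = new HashMap<>();
    // A name maps only to the ID of the most recently added table.
    private final Map<String, Integer> idByName = new HashMap<>();

    /** Registers a table; a repeated name shadows the older table by name only. */
    void addTable(int tableId, String name) {
        nameById.put(tableId, name);
        idByName.put(name, tableId); // overwrites any earlier ID for this name
    }

    /** Looks up a table by name; returns the most recently added match. */
    int getTableId(String name) {
        Integer id = idByName.get(name);
        if (id == null) {
            throw new NoSuchElementException("no table named " + name);
        }
        return id;
    }

    /** Reports whether a table with this ID is still stored. */
    boolean containsId(int tableId) {
        return nameById.containsKey(tableId);
    }
}
```

After adding tables with IDs 1 and 2 under the same name, both IDs remain stored, and a lookup by that name returns 2.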


10/13/09: Lab 2 Due Time -- Correction

A version of Lab 2 was published that indicated that the lab was due before class on Thursday. This is incorrect; the lab is due by the end of the day (11:59pm) on Thursday.


10/13/09: Office Hours on Thursday this week

Because of the lab2 due date, the TA office hours are going to be from 5-8pm on Thursday, Oct 15 this week.


10/13/09: Running the parser

The instructions for running the parser in lab2.html don't work. We accidentally stripped the parser code out of SimpleDb.java before shipping lab1. You can download the fixed SimpleDb.java from here, or patch it as follows:

In SimpleDb.java, after the else-if block:

        } else if (args[0].equals("print")) {
            ...
        }
You should add:
        else if (args[0].equals("parser")) {
            // Strip the first argument and call the parser
            String[] newargs = new String[args.length-1];
            for (int i = 1; i < args.length; ++i) {
                newargs[i-1] = args[i];
            }
            Parser.main(newargs);
        }
This goes just before the code:
        else {
            System.err.println("Unknown command: " + args[0]);
            System.exit(1);
        }

Alternatively, rather than patching or updating SimpleDb.java, you can run the parser directly from the command line by typing (from the top-level simpledb directory):

java -classpath bin/src/:lib/jline-0.9.94.jar:lib/sql4j.jar:lib/zql.jar Parser catalog.txt
where catalog.txt is the name of the catalog file describing the schema of the tables you wish to query (as described in Section 2.7 of lab2.html).
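
For reference, the argument-stripping loop in the patch above can also be written with the standard library's Arrays.copyOfRange. The sketch below is only an equivalent illustration, and the class and method names are hypothetical; either form works for the patch.

```java
import java.util.Arrays;

// Hypothetical helper illustrating the argument stripping done in the patch.
final class ArgStripSketch {

    /**
     * Returns a copy of args without its first element.
     * Assumes args has at least one element, as in the patch, where
     * args[0] has already been read before this point.
     */
    static String[] dropFirst(String[] args) {
        return Arrays.copyOfRange(args, 1, args.length);
    }
}
```

Calling dropFirst on {"parser", "catalog.txt"} yields {"catalog.txt"}, which is what the patch passes to Parser.main.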


10/04/09: Lab 2 is posted

Lab 2 is posted. This lab includes supplemental code (both test cases and lab files) that you will need. Follow these instructions for installing it into your existing lab 1 solution. If your lab 1 solution does not compile or you need assistance, please contact us.


09/30/09: Lab 1 Office Hours Thursday 10/01

The TAs will be available from 6-7pm (Nizam) and 7-8pm (Adam) for people with questions about lab 1. Location (same as OH): Gates building, 9th floor, the lounge across from the elevators.


09/30/09: Lab 1 version inconsistency

Over the past few days, we inadvertently posted links to an old version of lab 1 (from 2008) on this RSS feed. The link in the lab1.html file was correct, and the links from this feed have now been corrected. However, some of you may have downloaded and begun working with the 2008 version of the lab 1 code. Here's what this means for you:

1) If you have not yet started lab 1, please re-download the lab tarball.

2) If you have already started lab 1, you don't need to replace the version of the code you have. However, if you are working with the 2008 version of the code, the 2009 Lab 1 description is slightly incorrect in its description of the HeapFile/HeapPage format (we changed the format of HeapFiles between the two years). To determine if you have the old version of the code, have a look at HeapPage.java. If the constructor for HeapPage has a comment that looks like the code shown below, you have the old version of the lab.

    /**
     * Constructor.
     * Construct the HeapPage from a set of bytes of data read from disk.
     * The format of a HeapPage is a set of 32-bit header words indicating
     * the slots of the page that are in use, plus (BufferPool.PAGE_SIZE/tuple
     * size) tuple slots, where tuple size is the size of tuples in this
     * database table
     * (which can be determined via a call to getTupleDesc in Catalog.)
     *
     * The number of 32-bitheader words is equal to:
     * 
     * (no. tuple slots / 32) + 1
     * 
     * @see Database#getCatalog
     * @see Catalog#getTupleDesc
     * @see BufferPool#PAGE_SIZE
     */
    public HeapPage(HeapPageId id, byte[] data) throws IOException {

In contrast, the 2009 version of the lab has the following comment before the HeapPage constructor:

    /**
     * Create a HeapPage from a set of bytes of data read from disk.
     * The format of a HeapPage is a set of header bytes indicating
     * the slots of the page that are in use, some number of tuple slots.
     *  Specifically, the number of tuples is equal to: 
     *
     *          floor((BufferPool.PAGE_SIZE*8) / (tuple size * 8 + 1))
     *
     *  where tuple size is the size of tuples in this
     * database table, which can be determined via {@link Catalog#getTupleDesc}.
     * The number of 8-bit header words is equal to:
     * 
     *      ceiling(no. tuple slots / 8)
     * 
     * @see Database#getCatalog
     * @see Catalog#getTupleDesc
     * @see BufferPool#PAGE_SIZE
     */
    public HeapPage(HeapPageId id, byte[] data) throws IOException {

These comments illustrate the most significant differences between the HeapFile format in the two versions of the lab.
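
The arithmetic in the two comments can be compared with a short sketch. The class and method names are hypothetical, and the formulas are transcribed from the comments above rather than taken from the lab code. Two assumptions: the 2008 expression "(no. tuple slots / 32) + 1" is read as integer division, and the 4096-byte page in the usage note is an example value, not a statement about BufferPool.PAGE_SIZE.

```java
// Hypothetical sketch of the slot and header arithmetic described in the
// two HeapPage comments; NOT the lab's HeapPage code.
final class HeapPageLayoutSketch {

    /** 2009 format: floor((pageSize * 8) / (tupleSize * 8 + 1)) tuple slots. */
    static int tupleSlots2009(int pageSize, int tupleSize) {
        // Each tuple costs tupleSize * 8 data bits plus 1 header bit.
        return (pageSize * 8) / (tupleSize * 8 + 1);
    }

    /** 2009 format: ceiling(numSlots / 8) header bytes. */
    static int headerBytes2009(int numSlots) {
        return (numSlots + 7) / 8; // integer ceiling division
    }

    /** 2008 format, as the comment states it: pageSize / tupleSize tuple slots. */
    static int tupleSlots2008(int pageSize, int tupleSize) {
        return pageSize / tupleSize;
    }

    /** 2008 format: (numSlots / 32) + 1 header words of 32 bits each. */
    static int headerWords2008(int numSlots) {
        return numSlots / 32 + 1; // assumption: integer division
    }
}
```

With an example page of 4096 bytes and 8-byte tuples, the 2009 formulas give 504 slots and 63 header bytes (504 * 8 + 63 = 4095 bytes, which fits in the page), while the 2008 formulas give 512 slots and 17 header words.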

In the event that you do have the old 2008 version, you should refer to the 2008 Lab 1 description when working through the lab. The only section that is substantially different is 2.5, regarding the format of the HeapFile access method.

If you choose to submit the 2008 lab, we will grade it with the test cases from 2008, and you won't be penalized in any way.

We are sorry for this complication. Please let us know if you are confused or need help addressing this issue.


09/29/09: Minor correction to lab1 build.xml file

It turns out that the build.xml file of the first version of lab1 had a typo. If you downloaded the tarball before Monday, Sep 28 your build.xml might have this issue. Line 84 of the build.xml file should have value="SimpleDb" instead of value="simpledb". Please check your file for this typo. You can either


09/28/09: Lab Review Slides posted

You can download the PowerPoint slides or the PDF.


09/28/09: Lab Review Tonight, 32-G449, 7PM

There will be a review session to help you get started with Lab 1 in 32-G449 at 7PM tonight. Slides will be posted afterwards.


09/28/09: Final Project Description and Suggested Projects

A description of what is expected for the final project (as well as some suggested ideas) is posted. Your project proposal is due 10/15/2009, and the final project presentations are 12/10/2009. Students taking 6.814 are not required to do a final project, but may instead opt to do two additional labs. We will arrange meetings with all of the project groups to discuss possible ideas and directions in the coming weeks.


09/27/09: Change in lab1: HeapPageReadTest.testDirty() removed

You are not responsible for passing HeapPageReadTest.testDirty() for this lab. You can either


09/24/09: Problem Set 2 is Posted

Problem Set 2 is posted. It is due October 21, 2009.


09/23/09: Lab1 Posted

Lab1 is posted. You can find the source code here.


09/22/09: SQL Zoo issues

Problem Set 1 recommends http://sqlzoo.net as a tutorial site for learning SQL. However, this site has been down for at least the past several days. An alternate tutorial can be found at http://w3schools.com/sql/sql_intro.asp.


09/18/09: Submitting PS1

PS1 is due next Tuesday (9/22). You may submit it on paper in class, or by 11:59 PM by emailing it to 6830-submit-2009@nms.csail.mit.edu. We prefer PDF or ASCII text format.


09/17/09: BCNF

PPT slides on BCNF are available.


09/17/09: Q6 in PS1 Changed

There was a bit of an ambiguity in the statement of Q6. We changed it to look like this: "Find the top 10 managers who have given the most number of grants to a single PI." We basically want you to find the top 10 managers, when we list the managers according to the number of grants they have given to a single PI.


09/15/09: Problem Set 1 Posted

Problem Set 1 is posted. Assignments (and their solutions when they become available) are posted on the assignments page.


09/15/09: Lecture 1 Notes Posted

Lecture notes are posted after each lecture on the notes page. Lecture 1 is now available (we won't post news items for future lecture notes.)


09/10/09: Red Book Online

One of the two course textbooks is available online, from books24x7.com, from any MIT IP address. Please follow this link. Both books are also available for purchase from the Coop, or from Amazon.com.


09/10/09: Course Mailing Lists

We've created two mailing lists for the course, 6830-discuss@mit.edu and 6830-announce@mit.edu. We will add you to these lists automatically when you sign up at the first class, but if you miss class you can request to be added manually using the links above (MIT certificates are required to access these lists). Please use the discuss list for any class related discussion (e.g., questions about labs or problem sets); we will post important class messages on the announce list (it will be very low traffic, so please don't unsubscribe!)


09/08/09: TA Office Hours announced

TA Office hours are now available on the web page.


08/27/09: Class web page posted

Course web page set up. The first day of class is Thursday, September 10.


Description

This course relies on primary readings from the database community to introduce graduate students to the foundations of database systems, focusing on basics such as the relational algebra and data model, schema normalization, query optimization, and transactions. It is designed for students who have taken 6.033 (or equivalent); no prior database experience is assumed, though students who have taken an undergraduate course in databases are encouraged to attend.

Classes consist of lectures and discussions based on readings from the database literature. Grades in 6.830 are assigned based on a semester-long project, as well as two exams and 7 assignments -- 3 labs and 4 problem sets -- of varying length. Grades in 6.814 are based on the same quizzes and assignments as 6.830, except that students may opt to do 2 additional labs in place of the final project. For more information about the readings and assignments, use the links at the top of the page.

Last change: 8/27/2009.