6.830: Lecture 4 (09/22/09)
We will continue the discussion of schema normalization from last week. We will then start discussing database system internals, based on the content of two papers. We will continue discussing these papers during the next lecture (so these are all of the readings for the week!). They are:
- Joseph Hellerstein and Michael Stonebraker. The Anatomy of a Database System. In "Readings in Database Systems" (aka "The Red Book"). Focus on Sections 1-4, though you should also read Sections 5.1 and 5.2 and skim Section 6.
- M.M. Astrahan et al. System R: Relational Approach to Database Management. ACM TODS 1(2), 1976. Pages 97-137. [PDF]. Requires an MIT IP for access. Read up to page 122; you may also skip the "Optimizer" section, pages 110 - 114.
The purpose of these readings is to introduce the architecture of a database system at a high level. Our goal in lecture will be to tease apart the main components of most database systems. Once we've identified these components, we will discuss each of them over the next few weeks.
Both of these papers assume a certain degree of familiarity with database 'lingo', some of which will doubtless be unfamiliar to you. As you read, keep track of terms you do not know and come to class prepared to ask questions!
Also, as you read, think about and come to class prepared to answer the following questions:
- What is the purpose of the division between RDS and RSS in System R? Is there something fundamental about this design?
- Why are process models in database systems important? Under what circumstances would I want multiple processes in my database? Are there any circumstances in which a "process per query" model would be preferable to a "thread per query" model?
- What is the iterator model? Why is the iterator model convenient? Can you think of circumstances under which the iterator model is a bad idea?
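As a starting point for the iterator-model question, here is a minimal sketch of one common formulation, in which every query operator exposes the same open/next/close interface and pulls rows from its child one at a time. The class and function names (`Operator`, `Scan`, `Select`, `Project`, `run`) and the sample rows are hypothetical illustrations, not taken from either paper, and real systems differ in many details.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, object]  # hypothetical row representation: column name -> value


class Operator:
    """Common interface assumed for every operator in the plan."""

    def open(self) -> None:
        raise NotImplementedError

    def next(self) -> Optional[Row]:
        """Return the next row, or None when the input is exhausted."""
        raise NotImplementedError

    def close(self) -> None:
        raise NotImplementedError


class Scan(Operator):
    """Leaf operator: reads rows from an in-memory list standing in for a table."""

    def __init__(self, rows: List[Row]) -> None:
        self.rows = rows
        self.pos = 0

    def open(self) -> None:
        self.pos = 0

    def next(self) -> Optional[Row]:
        if self.pos >= len(self.rows):
            return None
        row = self.rows[self.pos]
        self.pos += 1
        return row

    def close(self) -> None:
        pass


class Select(Operator):
    """Passes through only the child rows that satisfy the predicate."""

    def __init__(self, child: Operator, predicate: Callable[[Row], bool]) -> None:
        self.child = child
        self.predicate = predicate

    def open(self) -> None:
        self.child.open()

    def next(self) -> Optional[Row]:
        # Keep pulling from the child until a row qualifies or input runs out.
        while True:
            row = self.child.next()
            if row is None:
                return None
            if self.predicate(row):
                return row

    def close(self) -> None:
        self.child.close()


class Project(Operator):
    """Keeps only the named columns of each child row."""

    def __init__(self, child: Operator, columns: List[str]) -> None:
        self.child = child
        self.columns = columns

    def open(self) -> None:
        self.child.open()

    def next(self) -> Optional[Row]:
        row = self.child.next()
        if row is None:
            return None
        return {c: row[c] for c in self.columns}

    def close(self) -> None:
        self.child.close()


def run(plan: Operator) -> List[Row]:
    """Drive the plan from the root, one row per next() call."""
    plan.open()
    out: List[Row] = []
    while True:
        row = plan.next()
        if row is None:
            break
        out.append(row)
    plan.close()
    return out


employees = [
    {"name": "ann", "dept": "eng"},
    {"name": "bob", "dept": "ops"},
    {"name": "cy", "dept": "eng"},
]
plan = Project(Select(Scan(employees), lambda r: r["dept"] == "eng"), ["name"])
print(run(plan))
```

Because each operator sees only the interface of its child, operators in this sketch can be composed in any order without knowing what sits below them. Note also that every row costs one `next()` call per operator, which may be a useful observation when thinking about the last part of the question.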
Last change: 9/17/09.