- Eric Brewer. Combining Systems and Databases: A Search Engine Retrospective. In Red Book.
- Jeffrey Dean and Sanjay Ghemawat. MapReduce: Simplified Data Processing on Large Clusters. In OSDI, 2004. [PDF]
The first paper discusses how databases relate to search engines and covers the basics of how a search engine works. The second paper describes a specific implementation of a simple query system (called Map-Reduce) on top of the Google cluster.
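To make the Map-Reduce programming model concrete before reading, here is a minimal single-process sketch of a word count, the standard introductory example for this model. It is only an illustration under simplifying assumptions: the names `map_fn`, `reduce_fn`, and `run_mapreduce` are hypothetical, everything runs in one Python process, and nothing here reflects how Google's implementation distributes work or handles failures.

```python
from collections import defaultdict


def map_fn(doc_name, text):
    """Map step: emit an intermediate (word, 1) pair for every word."""
    for word in text.split():
        yield word.lower(), 1


def reduce_fn(word, counts):
    """Reduce step: combine all values emitted for a single key."""
    return sum(counts)


def run_mapreduce(inputs, map_fn, reduce_fn):
    """Hypothetical in-memory driver, not the real distributed runtime."""
    # Group intermediate values by key (a stand-in for the shuffle phase).
    groups = defaultdict(list)
    for key, value in inputs.items():
        for out_key, out_value in map_fn(key, value):
            groups[out_key].append(out_value)
    # Apply the reduce function once per distinct intermediate key.
    return {k: reduce_fn(k, vs) for k, vs in sorted(groups.items())}


# Made-up input documents, for illustration only.
docs = {"a.txt": "the quick fox", "b.txt": "the lazy dog the end"}
word_counts = run_mapreduce(docs, map_fn, reduce_fn)
print(word_counts)
```

The user supplies only the two small functions; grouping by key and scheduling belong to the framework. In the paper, that framework also partitions the work across many machines and re-executes tasks when workers fail, which is worth keeping in mind for the failure-tolerance question below.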
As you read the papers, consider the following questions:
- In the "Search Engine Retrospective", what features of database systems does the author recommend that designers of search engines adopt? Why?
- What does Brewer claim are the primary differences between search engines and databases? What issues do search engine designers not have to worry about that database designers often focus on?
- What is the CAP theorem?
- What kinds of failures can a search engine (or the Map-Reduce system) tolerate? What consistency guarantees are provided in the face of failures?
Questions or comments regarding 6.830/6.814? Send e-mail to the 6.830/6.814 staff at 6.830-staff at mit.edu.