Third Annual SIGMOD Programming Contest
A Durable Main-Memory Index Using Flash

Sponsored by:

NSF      Microsoft

Student teams from degree granting institutions are invited to compete in the annual SIGMOD programming contest. This year, the task is to implement a high-throughput main-memory index which is made durable using a flash-based SSD. The winning team will be awarded a prize of USD $5,000. Submissions will be judged based on their overall performance on a supplied workload. Up to two students from the top submissions will receive travel grants to attend SIGMOD 2011 in Athens, Greece. The details of this contest are listed below.

Updates
top

Please subscribe to our contest mailing list to receive updates and notifications (this will be a low-traffic mailing list.)

Task overview
top

The task for this contest is to implement a high-throughput main-memory index that uses flash-based SSDs for durability. The index will fit entirely in the main-memory, and all updates to it must be recoverable in the face of crashes. The system should use and optimize for modern flash-based SSDs to achieve durability.

The index offers an order-preserving key-value interface. It maintains key-value pairs that are byte strings and supports both random and ordered accesses to those pairs. All single-key operations are atomic. In addition to random and ordered access, the index offers a single-key atomic compare and swap primitive for building larger atomic operations. The index should support highly concurrent accesses. The optimization metric for this contest is overall operation throughput.

Contestants must supply the source code for their entries, and agree to license their code under the BSD or MIT open source license should their system win the contest. The contestants must be full time students at a degree granting post secondary institution. Both undergraduate and graduate students are eligible. Submissions may be written in any language, but it must be capable of linking against our x86-64 "test driver" on Linux. Details of the API and implementation requirements are below.

Important Dates
top
Jan. 28, 2011The detailed specification of the requirements is available on the website.
Mar. 31, 2011Contest ends at 23:59 EST (convert time zone) — see submission details.
Apr. 30, 2011Finalists notified.
(TBA)We will host the finalists at SIGMOD; details to be announced.

Task Details, API and Test Implementation
top

In addition to the requirements above, we list additional constraints and clarifications:

API

The commands which your system must support are specified in the API, which can be downloaded from the following link:

Unit Test and Performance Test

We have written a basic unit test and a performance test that you can use to test the performance and correctness of your implementation. This code might have bugs, so please discuss any issues on the mailing list, which we will also use to announce any updates. This also includes a reference implementation based on Berkley DB. It is not a high performance implementation, but it correctly implements the API. This implementation compiles on modern Linux systems with GCC.

scdb.tar.bz2

Final tests, including durability test: scdb_final.tar.bz2

Submitting Your Implementation, and Our Hardware Configuration

We will run on a 64-bit x86 machine running Ubuntu 10.10 (kernel version 2.6.35-25-server). The data will be stored on a single Intel X25-E SSD, formatted using ext4. The machine has 8 cores (Intel Xeon E5540 2.53 GHz), with hyper-threading disabled. It has 12 GB of main memory, but the performance test will be limited to far less RAM (probably 3 GB).

Submission requirements

You must supply a Linux static library (.a) called libscdb.a, which can be linked with our driver to produce the final executable. You must submit an archive (ideally .tar.gz with the following):

  • The complete source code for your implementation.
  • A Makefile that produces a statically linked library named libscdb.a when run with the command make.
  • A README file listing the names of all team members, the institute where they are studying, and email contact information.

You can upload your file via our submission site. You may upload as many times as you would like before the deadline: March 31st at 23:59 EST (convert time zone).

Benchmark

The benchmark will be a mixture of inserts, updates, deletes, compare and swap, and ordered traversal operations. The workload will begin by generating sufficient data to ensure that the entire SSD is full, and to force the implementation to perform garbage collection. We will then begin measuring the overall operation throughput. The measurement period will be long enough to ensure our measurement includes garbage collection. The overall throughput will be the metric used to compare implementations. We will also test the durability feature using fault-injection. Implementations that fail to durably record data will be disqualified.

While we will be providing a "test workload," the final workload will not be the same. Thus, your implementation should provide good performance on a wide range of workloads, and should not be tuned for one specific benchmark.

More details of the benchmark will be posted as soon as they are available.

Organizers

Please contact us at sigmod11admin@lists.csail.mit.edu if you have any questions or comments.

Sponsorship
The contest is supported by a grant from the NSF (IIS-0848727). Prizes will be donated by Microsoft.