I am the faculty director of the Systems that Learn Initiative and
co-direct the Data Systems and AI Lab and the Data Systems Group.
Personal Information
- Short bio
- Ph.D. (Berkeley) 2003; M.Eng. (MIT) 1999; B.S. (MIT) 1999.
Teaching
- Spring, 2024, 6.S079: Software Systems for Data Science
- Fall, 2023, 6.5830/6.5831 (Formerly 6.830/6.8140): Database Systems
- Spring, 2023, 6.1800 (formerly 6.033), Recitation Instructor
- Fall, 2022, 6.5830/6.5831 (Formerly 6.830/6.8140): Database Systems
- Spring, 2021, 6.S079: Software Systems for Data Science
- Fall, 2021, 6.0001 and 6.0002 - Introduction to Computational Thinking and Data Science
- Spring, 2021, 6.830/6.814
- Fall, 2019, 6.S080: Software Systems for Data Science
- Spring, 2017, 6.S062: Mobile and Sensor Computing
- Fall, 2016, 6.830/6.814
- Spring, 2016, 6.S062: Mobile and Sensor Computing
- Fall, 2015, 6.830/6.814
- Spring, 2015, 6.033 (sections)
- Fall, 2014, 6.830/6.814
- Spring, 2014, 6.033 (lectures, with Dina Katabi)
- Fall, 2013, 6.885: From ASCII to Answers
- Spring, 2013, 6.830/6.814
- Spring, 2012, 6.830/6.814
- Spring, 2010, 6.033 (lectures, with Robert Morris)
- Fall 2009, Database Systems (6.830)
- Spring, 2009, 6.033 (lectures, with Robert Morris)
- Fall 2008, Database Systems (6.830)
- Fall 2007, Database Systems (6.830)
- Spring 2007, 6.033 (sections)
- Fall 2006, Database Systems (6.830)
- Spring 2006, 6.033 (sections)
- Fall 2005, Database Systems (6.830)
- Spring 2005, 6.033 (lectures, with Hari Balakrishnan)
- Fall 2004, Database Systems (6.893)
- Spring 2004, 6.033 (sections)
Research
My primary research focus is on database systems,
including main memory databases, data warehousing/analytics, querying video data, and machine learning for data systems.
Recent and current research projects and interests include:
Past projects include:
- Starling, looking at building database systems on ephemeral computing services like AWS Lambda.
- TGDB, a system for efficient temporal graph analytics.
- Mapster, a set of tools for generating street maps from satellite imagery and GPS traces.
- Datahub, a "github for data" platform that provides hosted database storage, versioning, ingest, search, and visualization.
- BlinkDB, a system for running queries with bounded errors and bounded response times on very large data.
- Silo, a main-memory, high-throughput transaction processing system that uses novel concurrency control algorithms and lock-free data structures to provide scalability without partitioning.
- The CarTel networking and data management system for mobile sensor networks.
- C-Store, a column-oriented high performance database system for warehouse and semantic web applications.
- Relational Cloud, a project looking at building a scalable database service for the cloud.
- H-Store, a next generation high-performance OLTP engine.
- WaveScope, a signal-oriented stream processing system.
- Qurk, a project investigating how people (as accessed via crowdsourcing platforms like Amazon's Mechanical Turk) can be integrated into query processing.
- SciDB, a multi-institution project developing a data management platform for scientific applications, including astronomy and computational biology.
- The MACAQUE project, which focused on identifying ways to make data collected from sensor networks more reliable and useful.
- Developing novel query processing and optimization techniques for sensor networks and other acquisitional systems.
- TinyDB (a part of TinyOS.)
- Aurora/Borealis stream processor.
- TelegraphCQ continuous query processor.
Students
Current Ph.D. students:
Graduated Ph.D. students:
- Daniel Abadi (U. Maryland)
Thesis: Query execution in column-oriented database systems
- Adam Marcus
Thesis: "Optimization Techniques for Human Computation-enabled Data Processing Systems"
- Ryan Newton (U. of Indiana)
Thesis: Language Design for Distributed Stream Processing
- Arvind Thiagarajan (co-supervised with Hari Balakrishnan)
Thesis: Probabilistic Models for Mobile Phone Trajectory Estimation
- Evan Jones
Thesis: Fault-Tolerant Distributed Transactions for Partitioned OLTP Databases
- Eugene Wu (Columbia)
Thesis: Implementation and Applications of High Performance Provenance Systems for Data Analysis
- Yuan Mei (Facebook)
Thesis: The Sprawl Stream Distribution System
- Alvin Cheung (UC Berkeley)
Thesis: Rethinking the Application-Database Interface
- Manasi Vartak (Verta, Inc)
- Yi Lu (Google)
- Anil Shanbhag (Instabase, Inc)
- Favyen Bastani (AllenAI)
- Joana Matos Fonseca de Trindade
- Oscar Moll
- Albert Kim
Thesis: "Optimizing Queries with Disjunctions"
- Matt Perron
Postdoc Alumni:
- Peter Bailis, Sisu Data, Stanford
- Lei Cao, U. Arizona
- Raul Castro-Fernandez, U. Chicago
- Philippe Cudré-Mauroux, Fribourg University (Switzerland)
- Carlo Curino, Microsoft
- Aaron Elmore, U. Chicago
- Jakob Eriksson, U. Chicago
- Stavros Harizopoulos, Google
- Alekh Jindal
- Barzan Mozafari, U. Michigan
- Aditya Parameswaran, UC Berkeley
Graduated M.Eng. students (Partial List, not updated after ~2010):
Awards and Honors
- C. V. Ramamoorthy Distinguished Research Award, UC Berkeley, 2003
- VLDB 2004 Best Paper Award
- NSF CAREER Award (Project Page), 2005
- MIT TR35 Outstanding Researcher Under the Age of 35, 2005
- MobiCom Best Paper Award, 2006
- SenSys Best Demo Award, 2006
- Sloan Fellowship, 2007
- IBM Faculty Development Award, 2007
- VLDB Best Paper Award, 2007
- SenSys Best Paper Award, 2009
- VLDB Best Demo Award, 2011
- CIDR Best Paper Award, 2013
- SIGMOD Test of Time Award ("Acquisitional Query Processing, SIGMOD 2003"), 2013
- VLDB Test of Time Award ("C-Store, VLDB 2005"), 2015
- SIGMOD Contributions Award, 2016
- SIGMOD Test of Time Award ("Fault-tolerance in the Borealis distributed stream processing system, SIGMOD 2005"), 2017
- MIT Burgess (1952) & Elizabeth Jamieson Award for Excellence in Teaching, 2018
- SIGMOBILE Test of Time Award ("CarTel: A Distributed Mobile Sensor Computing System"), 2018
- ACM SenSys Test of Time Award ("VTrack: Accurate, Energy-aware Road Traffic Delay Estimation Using Mobile Phones"), 2019
- ACM Fellow, 2020
- SIGMOD Edgar H. Codd Innovations Award, 2024
Company Involvement
- I am the Chief Scientist of Cambridge Mobile Telematics, a Cambridge, MA-based startup that develops solutions to make roads safer by making drivers better.
- I am a technical advisor to Map-D, a GPU-based database analytics and visualization startup.
- I am a technical advisor to Instabase, a platform to help people solve problems with data.
- I am a technical advisor to B12, a startup focused on orchestrating creative workers to help build technology products.
- I was a co-founder of Vertica, a column-oriented database system. Vertica was acquired by HP in 2011.
Miscellaneous Technical Articles and Software (Out of Date)
Community Activities
- SIGMOD 2004 Demo Committee
- OSDI 2004 Program Committee
- Organizer of the 2004 DMSN Workshop on Data Management in Wireless Sensor Networks held with VLDB 2004.
- CIDR 2005 Program Committee
- ICDE 2005 Program Committee
- SIGMOD 2005 Program Committee
- VLDB 2005 Program Committee
- Organizer of the 2005 DMSN Workshop on Data Management in Wireless Sensor Networks held with VLDB 2005.
- IPSN 2006 Program Committee
- SIGMOD 2006 Demo Committee
- KDD 2006 Program Committee
- SenSys 2006 Program Committee
- SenSys 2006 Publications Chair
- SIGMOD 2007 Program Committee
- CIDR 2007 Program Committee
- DCOSS 2007 Systems Subcommittee Program Chair
- VLDB 2007 Program Committee
- VLDB 2008 Industrial Track Program Committee
- SIGMOD 2008 Program Committee
- IPSN 2008 PC Co-Chair
- ICDE 2009 Program Committee
- PVLDB 2010 Program Committee
- PVLDB 2011 Program Committee
- SIGMOD 2011 Program Committee
- Editorial Board Member, Internet Computing (2006--2014)
- Editorial Board Member, Transactions on Sensor Networks (2005--2010)
- SIGMOD 2016 Program Chair
- Co-organizer North East Database Day (NEDBDay) 2023