MACAQUE - Managing Ambiguity and Complexity in Acquisitional Query Environments

[Overview | Projects | People | Publications]

Overview

The vision of ubiquitous computing promises to spread information technology throughout our lives. Though this vision can be compelling, it also threatens to overwhelm us with a flood of information, much of which is spurious, irrelevant, or misleading. Thus, the challenge of realizing this vision is separating the relevant, timely, and useful information out of this flood of data. The data management community has made significant progress towards achieving this goal by providing tools that load and clean the data, languages and systems that can query the data, and algorithms that mine the data for patterns and relationships of interest.

These efforts have largely been focused on mitigating data complexity once it has been captured and stored inside a traditional computing infrastructure. In contrast, we propose a set of techniques designed to take an active role in managing this wealth of data by controlling when, where, and with what frequency data is acquired from distributed information systems. There are many modern systems where the capability of local nodes to generate data far outstrips the resources available to transmit or store that data. Nodes in a sensor network, for example, typically have processors that run at several megahertz, with data collection hardware capable of collecting many kilosamples per second, but radios that can transmit only kilobytes per second in aggregate across all of the nodes in the network. Worse yet, these nodes are battery powered and, when sampling at maximum rates, only have sufficient energy to last for a few days. In addition to limited resources, data from real-world environments is often noisy, lossy, and hard to interpret. This noise and uncertainty can be misleading, particularly when the user is summarizing and aggregating data using a high-level language like SQL.

In the MACAQUE (for "Management of Ambiguity and Complexity in an Acquisitional QUery Environment") project, we are developing several systems designed to focus the resources of the computer system (e.g., network bandwidth or battery capacity) and the attention of the user on capturing, refining, and interpreting the portions of the data that are most relevant, while de-emphasizing and decreasing the captured resolution of less relevant data. At the same time, we use statistical and probabilistic techniques to identify data that is spurious, incorrect, or unreliable, and to infer missing data values.
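As a rough illustration of the acquisitional idea, the sketch below shows one way a node might sample less often while readings stay predictable, discard readings that look spurious, and substitute a prediction for a lost reading. It is not taken from any MACAQUE system: the class name AdaptiveSampler, its parameters, and the exponentially weighted estimate are all assumptions chosen to keep the example small.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AdaptiveSampler:
    """Hypothetical acquisition policy for illustration; not the MACAQUE code."""

    tolerance: float = 1.0       # error (sensor units) still counted as "predictable"
    outlier_limit: float = 10.0  # errors above this are treated as spurious
    min_interval: int = 1        # shortest gap between samples, in ticks
    max_interval: int = 16       # longest gap between samples, in ticks
    alpha: float = 0.5           # smoothing weight for the running estimate
    estimate: Optional[float] = None
    interval: int = 1            # current gap until the next sample

    def observe(self, reading: Optional[float]) -> float:
        """Fold in one reading (None if it was lost) and return the value to report."""
        if self.estimate is None:
            if reading is None:
                raise ValueError("need one real reading before predicting")
            self.estimate = reading
            return reading

        if reading is None:
            # Missing value: report the prediction and sample again soon.
            self.interval = self.min_interval
            return self.estimate

        error = abs(reading - self.estimate)
        if error > self.outlier_limit:
            # Probably spurious: keep the old estimate and re-sample quickly.
            self.interval = self.min_interval
            return self.estimate

        if error <= self.tolerance:
            # Predictable data: back off to save energy and bandwidth.
            self.interval = min(self.interval * 2, self.max_interval)
        else:
            # Interesting change: return to the fastest sampling rate.
            self.interval = self.min_interval

        self.estimate = self.alpha * reading + (1 - self.alpha) * self.estimate
        return self.estimate
```

In this sketch the sampling interval doubles each time a reading falls within the tolerance, so a quiet sensor quickly drops to the slowest rate. A single threshold for outliers is a deliberate simplification: a genuine, persistent shift larger than outlier_limit would be rejected indefinitely, so a realistic policy would need some way to accept repeated confirmations.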

Projects

Our effort on MACAQUE is divided into several related sub-projects, including:

People

Faculty

Students

Alumni

Publications

  • Philippe Cudre-Mauroux, Eugene Wu, and Samuel Madden. TrajStore: An adaptive storage system for very large trajectory data sets. In Proceedings of ICDE, 2010. [PDF]
  • Tingjian Ge, Stan Zdonik, and Samuel Madden. Top-k Queries on Uncertain Data: On Score Distribution and Typical Answers. In Proceedings of SIGMOD, 2009. [PDF]
  • Arvind Thiagarajan, Lenin Sivalingam, Katrina LaCurts, Hari Balakrishnan, Jakob Eriksson, and Samuel Madden. VTrack: Accurate, Energy-Aware Traffic Delay Estimation Using Mobile Phones. In Proceedings of SenSys, 2009. Best Paper Award. [PDF]
  • Arvind Thiagarajan and Samuel Madden. Representing and Querying Regression Models in a DBMS. In Proceedings of SIGMOD, 2008. [PDF]
  • Jakob Eriksson, Hari Balakrishnan, and Samuel Madden. Cabernet: Vehicular Content Delivery Using WiFi. In Proceedings of MobiCom, 2008. [PDF]
  • Yang Zhang, Bret Hull, Hari Balakrishnan, Samuel Madden. ICEDB: Intermittently Connected Continuous Query Processing. In Proceedings of ICDE, 2007. [PDF]
  • Ivan Stoianov, Lama Nachman, Samuel Madden, and Timur Tokmouline. PIPENET: A Wireless Sensor Network for Pipeline Monitoring. In Proceedings of IPSN, 2007. [PDF]
  • Daniela Tulone, Samuel Madden. An Energy-efficient Querying Framework for Detecting Node Similarities in Sensor Networks. In Proceedings of ACM/IEEE International Symposium on Modeling, Analysis and Simulation in Sensor Networks (MSWiM), 2006. [PDF]
  • Bret Hull, Vladimir Bychkovskiy, Kevin Chen, Michel Goraczko, Eugene Shih, Yang Zhang, Hari Balakrishnan, Samuel Madden. CarTel: A Distributed Mobile Sensor Computing System. In Proceedings of SenSys, 2006. [PDF]
  • Vladimir Bychkovskiy, Bret Hull, Allen Miu, Hari Balakrishnan, Samuel Madden. A Measurement Study of Vehicular Internet Access Using Unplanned 802.11 Networks. In Proceedings of MobiCom, 2006. [PDF]
  • Amol Deshpande, Samuel Madden. MauveDB: Supporting Model-Based User Views in Database Systems. In Proceedings of SIGMOD, 2006. [PDF]
  • Daniela Tulone, Samuel Madden. PAQ: Time series forecasting for approximate query answering in sensor networks. In Proceedings of EWSN, 2006. [PDF]
  • Amol Deshpande, Carlos Guestrin, Samuel Madden, Joseph Hellerstein, Wei Hong. Model Driven Data Acquisition in Sensor Networks. In Proceedings of VLDB, 2004. Best Paper Award. [PDF]

This project is funded by the NSF award IIS-0448124.
