23 Oct 2015

MauveDB: Supporting Model-based User Views in Database Systems

http://db.csail.mit.edu/pubs/sigmod06-mauvedb.pdf

real-world data — especially when generated by distributed measurement … incomplete, imprecise, and erroneous, making it impossible to present it to users or feed it directly into applications. The traditional approach to dealing with this problem is to first process the data using statistical or probabilistic models that can provide more robust interpretations … at best, databases serve as a persistent raw data store.

In this paper we define a new abstraction called model-based views and present the architecture of MauveDB, the system we are building to support such views. … Just as traditional database views provide logical data independence, model-based views provide independence from the details of the underlying data … by using models to present a consistent view to the users … language for defining model-based views … declarative querying over such views using SQL, …

raw data needs to be synthesized (filtered) using models … forcing them to use external tools for this purpose … customized programs that are often quite similar to database queries …

1.1 example: wireless sensor networks

a DBMS is used to capture and store the raw data, but all of the data modeling and analysis is done outside of the database system … uses a model to filter the raw data and to present the application with a consistent “view” of the system

using existing tools to implement this software layer, however, is problematic

1.2 new abstraction: model-based views

a consistent view — over space and time — to the users or the applications

2 model-based views

allowing database views to be defined using statistical models instead of just SQL queries; we call such views model-based views

2.1 models as tables

the contents of a model-based view are the result of the select * query on the view.

an approximation of the attribute space as a relational table
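To make "models as tables" concrete, here is a minimal sketch, not MauveDB's implementation. It assumes a one-dimensional attribute space (a position `x`), a simple least-squares line as the model, and a hand-picked grid; the readings and the names `fit_line` and `materialize` are hypothetical.

```python
from statistics import mean

# Hypothetical raw readings: (position x, temperature). Note the gaps
# and irregular spacing; the view below fills in a regular grid.
raw = [(0.0, 20.1), (1.0, 21.0), (2.5, 22.4), (4.0, 24.1)]


def fit_line(points):
    """Ordinary least squares for temp = a + b * x (assumed model)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def materialize(points, grid):
    """Return the view contents: one (x, predicted temp) row per grid point."""
    a, b = fit_line(points)
    return [(x, a + b * x) for x in grid]


grid = [0.0, 1.0, 2.0, 3.0, 4.0]
view_rows = materialize(raw, grid)
for row in view_rows:
    print(row)
```

The rows printed are what a select * on such a view would return under these assumptions: a finite table that approximates a continuous model.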

3.1 view definition

view definition will necessarily be somewhat model-specific; however, a major goal in devising a language for model-based view definition is to exploit commonalities between different models …
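One way to picture "commonalities between different models" is a shared fit/predict interface behind which the model-specific parts hide. This is only an illustrative sketch: the `Model` protocol, the two classes, and `materialize_view` are hypothetical names, and nothing here reproduces MauveDB's view-definition language.

```python
from bisect import bisect_left
from typing import Protocol, Sequence


class Model(Protocol):
    """Hypothetical common interface shared by all view models."""

    def fit(self, xs: Sequence[float], ys: Sequence[float]) -> None: ...
    def predict(self, x: float) -> float: ...


class LinearRegression:
    def fit(self, xs, ys):
        n = len(xs)
        x_bar, y_bar = sum(xs) / n, sum(ys) / n
        sxx = sum((x - x_bar) ** 2 for x in xs)
        sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
        self.slope = sxy / sxx
        self.intercept = y_bar - self.slope * x_bar

    def predict(self, x):
        return self.intercept + self.slope * x


class LinearInterpolation:
    def fit(self, xs, ys):
        pairs = sorted(zip(xs, ys))
        self.xs = [x for x, _ in pairs]
        self.ys = [y for _, y in pairs]

    def predict(self, x):
        # Clamp to the nearest observation outside the observed range.
        if x <= self.xs[0]:
            return self.ys[0]
        if x >= self.xs[-1]:
            return self.ys[-1]
        i = bisect_left(self.xs, x)
        x0, x1 = self.xs[i - 1], self.xs[i]
        y0, y1 = self.ys[i - 1], self.ys[i]
        return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def materialize_view(model: Model, xs, ys, grid):
    """The model-independent part: fit once, then emit one row per grid point."""
    model.fit(xs, ys)
    return [(x, model.predict(x)) for x in grid]


xs, ys = [0.0, 1.0, 2.0], [0.0, 10.0, 20.0]
for model in (LinearRegression(), LinearInterpolation()):
    print(type(model).__name__, materialize_view(model, xs, ys, [0.5, 1.5]))
```

Only the class passed in changes between the two views; the grid and the materialization step are shared.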

3.3 writing queries over views

from the user’s perspective, model-based views are indistinguishable from normal views. users need not be aware … the views they are querying are in fact derived from a model, though they may see the view definition and query the raw data if they desire. … model-based views make their output visible as a discrete table of results, users can use those outputs in any SQL queries …
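A rough sketch of what querying such a view looks like from the user's side, using Python's built-in sqlite3 module. Storing the model output in an ordinary table named `temp_view` is an assumption made for illustration, as are the table names, the readings, and the least-squares model; how MauveDB actually maintains its views is not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Raw data stays queryable alongside the view.
conn.execute("CREATE TABLE raw_readings (x REAL, temp REAL)")
conn.executemany(
    "INSERT INTO raw_readings VALUES (?, ?)",
    [(0.0, 20.0), (2.0, 22.0), (4.0, 24.0)],
)

# Fit temp = a + b * x by least squares over the raw table (assumed model).
rows = conn.execute("SELECT x, temp FROM raw_readings").fetchall()
n = len(rows)
x_bar = sum(x for x, _ in rows) / n
y_bar = sum(t for _, t in rows) / n
sxx = sum((x - x_bar) ** 2 for x, _ in rows)
sxy = sum((x - x_bar) * (t - y_bar) for x, t in rows)
b = sxy / sxx
a = y_bar - b * x_bar

# Expose the model output as a discrete table on a regular grid.
conn.execute("CREATE TABLE temp_view (x REAL, temp REAL)")
conn.executemany(
    "INSERT INTO temp_view VALUES (?, ?)",
    [(x, a + b * x) for x in (0.0, 1.0, 2.0, 3.0, 4.0)],
)

# From here on it is plain SQL; nothing in the query mentions the model.
hot = conn.execute(
    "SELECT x, temp FROM temp_view WHERE temp > 22.5 ORDER BY x"
).fetchall()
print(hot)
```

The final query also returns rows for grid points (such as x = 3.0) that have no raw reading, which is the point of querying the view rather than the raw table.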

database views: views have been a mainstay of data management systems from the early days of relational systems, and are used to both make it easier for users to access the data, and to restrict what users can access [11]

data mining: … there are also various commercial tools for data mining that sit on top of a database … SAS Analytics tool

probabilistic/incomplete data management

reference