01 Nov 2015

Query containment for data integration systems

Query containment is fundamental to … query optimization, determining independence of queries from updates, and rewriting queries using views … but it is not sufficient in data integration.

relative containment, which formalizes the notion of query containment relative to the sources available to the data-integration system.

we show that relative containment for conjunctive queries is still decidable in this case, even though it is known that finding all answers to such queries may require a recursive datalog program over the sources.

1 introduction

A data-integration system frees its users from having to locate the sources relevant to their queries, interact with each source in isolation, and manually combine the data from the different sources.

(common approaches to specifying source description)

local-as-view (or source-centric) approach
describe data sources as containing answers to views over the mediated schema
global-as-view (or query-centric) approach
the mediated schema is described as containing answers to views over the source relations

The local-as-view approach allows new sources to be added and removed modularly, while the global-as-view approach requires source descriptions to be modified when such changes occur. On the other hand, query answering is straightforward in the global-as-view approach, where the answers can be obtained by simply composing the query with the views, while the local-as-view approach requires a more sophisticated form of query rewriting.
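To make "composing the query with the views" concrete, here is a minimal sketch of global-as-view unfolding for conjunctive queries. It is an illustration, not the paper's algorithm. The relation names (`Movie`, `s_titles`, `s_directors`), the helper functions, and the convention that variables are strings starting with an uppercase letter are all assumptions made for this example. It also assumes one view definition per mediated relation and that generated names such as `I_0` do not clash with query variables.

```python
from itertools import count

def is_var(term):
    # Assumed convention: variables are strings starting with an uppercase letter.
    return isinstance(term, str) and term[:1].isupper()

def unfold(query_body, views):
    """Replace each mediated-schema atom by the body of its GAV view definition."""
    fresh = count()
    result = []
    for rel, args in query_body:
        if rel not in views:
            result.append((rel, args))  # already a source relation
            continue
        head_vars, view_body = views[rel]
        subst = dict(zip(head_vars, args))  # view head variable -> query term
        suffix = next(fresh)
        for vrel, vargs in view_body:
            # Variables that occur only in the view body get a fresh name.
            new_args = tuple(
                subst.get(t, f"{t}_{suffix}") if is_var(t) else t for t in vargs
            )
            result.append((vrel, new_args))
    return result

def evaluate(head, body, db):
    """Naive evaluation of a conjunctive query over db: {relation: set of tuples}."""
    bindings = [{}]
    for rel, args in body:
        extended = []
        for b in bindings:
            for row in db.get(rel, set()):
                nb = dict(b)
                ok = True
                for t, v in zip(args, row):
                    if is_var(t):
                        if nb.setdefault(t, v) != v:  # conflicting binding
                            ok = False
                            break
                    elif t != v:  # constant mismatch
                        ok = False
                        break
                if ok:
                    extended.append(nb)
        bindings = extended
    return {tuple(b[t] if is_var(t) else t for t in head) for b in bindings}

# Hypothetical GAV description: Movie(T, D) :- s_titles(I, T), s_directors(I, D)
views = {"Movie": (("T", "D"), [("s_titles", ("I", "T")), ("s_directors", ("I", "D"))])}

# Hypothetical source instances.
db = {
    "s_titles": {(1, "spartacus"), (2, "alien")},
    "s_directors": {(1, "kubrick"), (2, "scott")},
}

# User query over the mediated schema: q(T) :- Movie(T, "kubrick")
query_body = [("Movie", ("T", "kubrick"))]
unfolded = unfold(query_body, views)
answers = evaluate(("T",), unfolded, db)
print(answers)  # {('spartacus',)}
```

After unfolding, the query mentions only source relations, so it can be evaluated directly. No comparably direct substitution is available under local-as-view, where the sources are themselves defined as views over the mediated schema.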

query containment relative to views
refine the notion of containment
a query Q1 is contained in Q2 relative to the sources if, for any set of instances of the data sources, the certain answers of Q1 are a subset of the certain answers of Q2.
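The definition above can be written out symbolically. The notation is an assumption of these notes rather than necessarily the paper's own: $\mathcal{V}$ is the set of view definitions describing the sources, $I$ is an instance of the source relations, and $\mathrm{certain}(Q, I)$ is the set of certain answers of $Q$ given $I$. The second line assumes the usual open-world reading, in which each source may hold only a subset of the tuples its view would return.

```latex
% Relative containment of Q_1 in Q_2 with respect to the views V
Q_1 \sqsubseteq_{\mathcal{V}} Q_2
  \iff
  \forall I :\; \mathrm{certain}(Q_1, I) \subseteq \mathrm{certain}(Q_2, I)

% Certain answers (assumed open-world reading): a tuple t is certain if it is
% returned on every mediated-schema database D consistent with the sources
\mathrm{certain}(Q, I)
  = \{\, t \mid t \in Q(D) \text{ for every } D \text{ with } I \subseteq \mathcal{V}(D) \,\}
```

Classical containment quantifies over all databases of the mediated schema. The relative version quantifies only over what the sources can expose, so Q1 can be contained in Q2 relative to the sources without being contained in the classical sense.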

reference