10 Oct 2015
ORCHESTRA: Rapid, Collaborative Sharing of Dynamic Data
http://www.cidrdb.org/cidr2005/papers/P09.pdf

bottom-up collaborative data sharing, in which independent researchers
or groups with different goals, schemas, and data can share
information in the absence of global agreement. … peer-to-peer data
sharing, which considers revision, disagreement, authority, and
intermittent partition. … yields to others with greater authority.
the central problem … science evolves in a “bottom-up” fashion,
resulting in a fundamental mismatch with top-down data
integration methods. Scientists make and publish new
discoveries, and others build upon and refine the most
convincing work. Science does not revisit the global models after
every discovery: this is time consuming, requires consensus, and may
not be necessary depending on the long-term significance of the
discovery.
The web evolves rapidly, and it is self-organizing and
self-maintaining.
rapidly contribute new schemas, data, and revisions. …
Orchestra emphasizes managing disagreement, and it supports rapidly
changing membership …
conflicting data and updates: the traditional emphasis in
distributed data sharing has been on providing (at least eventual)
consistency … the goal is to merge update sequences in an ordered
and consistent way, yielding a globally consistent data
instance. … we propose a model that intuitively resembles
that of incomplete information[2]. … each participant has an
internally consistent database instance, formed by the set of tuples
that it accepts. Importantly, no participant is required to modify its
data instance to reach agreement with the others, although it has the
option if it so chooses.
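
A minimal sketch of this model, under assumptions that are mine rather
than the paper's: facts are (key, value) pairs, "internally consistent"
means no key appears with two values, and each participant's acceptance
rule is a plain predicate. The names local_instance,
is_internally_consistent, alice, and bob are hypothetical.

```python
from typing import Callable, FrozenSet, Iterable, Tuple

Fact = Tuple[str, str]  # (key, value); an assumed, simplified tuple shape


def local_instance(published: Iterable[Fact],
                   accepts: Callable[[Fact], bool]) -> FrozenSet[Fact]:
    """A participant's instance is the set of published tuples it accepts."""
    return frozenset(t for t in published if accepts(t))


def is_internally_consistent(instance: Iterable[Fact]) -> bool:
    """Assumed consistency rule: each key carries at most one value."""
    keys = [key for key, _ in instance]
    return len(keys) == len(set(keys))


# Two peers published conflicting values for key "g1".
published = [("g1", "x"), ("g1", "y"), ("g2", "z")]

# Each participant applies its own acceptance rule and keeps its own view.
alice = local_instance(published, lambda t: t != ("g1", "y"))
bob = local_instance(published, lambda t: t != ("g1", "x"))
```

Here alice and bob each hold a consistent instance, yet the two
instances differ and their union is not consistent. Neither is forced
to change its data to agree with the other.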
Orchestra coordinates a set of autonomous participants who make
updates to local relation instances and later publish them for others
to access. The general mode of operation is to operate in
disconnected fashion, then to reconcile.
a participant p reconciles its updates with those made by others
through:
- compute the effects of these updates on all shared relations
… determine which updates would be accepted by p and remove
those that conflict
- propagate to p’s relation those updates that are accepted and non-conflicting
- record the updates originating from p, or accepted by it, in
Orchestra for future reconciliation operations.
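
The reconciliation steps above can be sketched roughly as follows. This
is an illustration under loud assumptions, not the paper's algorithm:
schema mappings are treated as the identity, acceptance is a simple
"trusted origins" set, and two updates conflict when trusted peers
propose different values for the same key. Update, Participant,
trusted, and reconcile are all hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass(frozen=True)
class Update:
    origin: str  # participant that published the update
    key: str     # key of the affected tuple
    value: str   # proposed value for that key


@dataclass
class Participant:
    name: str
    trusted: Set[str]  # assumed acceptance policy: origins p accepts
    instance: Dict[str, str] = field(default_factory=dict)
    log: List[Update] = field(default_factory=list)  # kept for later rounds


def reconcile(p: Participant, published: List[Update]) -> List[Update]:
    # Determine which updates p would accept (here: by trusted origin).
    candidates = [u for u in published if u.origin in p.trusted]

    # Collect the distinct values proposed for each key.
    proposed: Dict[str, Set[str]] = {}
    for u in candidates:
        proposed.setdefault(u.key, set()).add(u.value)

    # Remove updates that conflict: keys with more than one proposed value.
    accepted = [u for u in candidates if len(proposed[u.key]) == 1]

    # Propagate accepted, non-conflicting updates to p's relation.
    for u in accepted:
        p.instance[u.key] = u.value

    # Record what p accepted for future reconciliation operations.
    p.log.extend(accepted)
    return accepted
```

For example, if trusted peers "a" and "b" disagree on key "g1" while
only "a" writes "g2", p applies the "g2" update alone and leaves "g1"
untouched. Updates from an untrusted origin are ignored entirely.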
4 consistency with conflicting data
The fundamental unit of storage and propagation is an atomic
delta over a single relation, representing a minimal encoding
for the insertion, deletion, or replacement of a single tuple.
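
One possible encoding of such an atomic delta, as a sketch only. The
field names (relation, op, old, new), the Op enum, and apply_delta are
assumptions for illustration; the paper's actual representation may
differ.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, Optional, Tuple

Row = Tuple[str, ...]  # one tuple of a relation, as attribute values


class Op(Enum):
    INSERT = "insert"
    DELETE = "delete"
    REPLACE = "replace"


@dataclass(frozen=True)
class Delta:
    relation: str              # the single relation this delta touches
    op: Op
    old: Optional[Row] = None  # tuple removed (DELETE, REPLACE)
    new: Optional[Row] = None  # tuple added (INSERT, REPLACE)

    def __post_init__(self) -> None:
        needs_old = self.op in (Op.DELETE, Op.REPLACE)
        needs_new = self.op in (Op.INSERT, Op.REPLACE)
        if needs_old != (self.old is not None):
            raise ValueError(f"{self.op.value}: wrong use of 'old'")
        if needs_new != (self.new is not None):
            raise ValueError(f"{self.op.value}: wrong use of 'new'")


def apply_delta(instance: FrozenSet[Row], delta: Delta) -> FrozenSet[Row]:
    """Return a new relation instance with the delta applied."""
    rows = set(instance)
    if delta.old is not None:
        rows.discard(delta.old)
    if delta.new is not None:
        rows.add(delta.new)
    return frozenset(rows)
```

A replacement carries both the old and the new tuple, so it stays a
single atomic unit instead of a delete followed by an insert.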
reference
- 2 L. Antova, C. Koch, and D. Olteanu. … worlds and beyond:
Efficient representation and processing of incomplete
information. In ICDE, 2009.