09 Jan 2016

Transactions and consistency in distributed database systems

http://dl.acm.org/citation.cfm?id=319734

The concepts of transaction and of data consistency are defined for a distributed system. The cases of partitioned data, where fragments of a file are stored at multiple nodes, and replicated data, where a file is replicated at several nodes, are discussed. It is argued that the distribution and replication of data should be transparent to the programs which use the data. That is, the programming interface should provide location transparency, replica transparency, concurrency transparency, and failure transparency.

The paper shows that a distributed system can be modeled as a single sequential execution sequence.

To our knowledge, no general-purpose distributed system provides the notion of a “network job,” a coordinated unit of work which operates at several nodes.

We conjecture that the notion of transaction as used in most data management systems generalizes to the network environment. This paper suggests that network systems should provide the notion of a transaction as an abstraction which eases the construction of programs in a distributed system.

The transaction notion is not a panacea. Rather, it is a convenience for a general class of applications.

Data may be replicated, partitioned, or centralized.

A transaction issues requests to manipulate entities. These requests against entities are translated by the system into one or more actions on objects. Actions are the primitives supported by the individual nodes of the network.

A transaction execution is a sequence of requests (actions) that must be viewed as a single logical unit of work.
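To make "a single logical unit of work" concrete, here is a minimal single-process sketch, not the paper's mechanism. The `Transaction` class, the dictionary store, and `funds_transfer` are hypothetical names chosen for illustration. Writes are buffered and become visible all together at commit, or not at all if the body raises. Concurrency and node failures are deliberately ignored.

```python
class Transaction:
    """Buffers writes and applies all of them at commit, or none on abort.

    Hypothetical illustration only: one process, no concurrency, no failures.
    """

    def __init__(self, store):
        self.store = store   # shared dict: entity name -> value
        self.writes = {}     # pending writes, invisible to others until commit

    def read(self, key):
        # A transaction sees its own pending writes first.
        if key in self.writes:
            return self.writes[key]
        return self.store[key]

    def write(self, key, value):
        self.writes[key] = value

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is None:
            # Commit: every buffered write becomes visible.
            self.store.update(self.writes)
        # Abort: the buffered writes are simply dropped.
        return False  # never swallow the exception


def funds_transfer(store, src, dst, amount):
    """Debit src and credit dst as one unit of work."""
    with Transaction(store) as t:
        t.write(src, t.read(src) - amount)
        if t.read(src) < 0:
            raise ValueError("insufficient funds")
        t.write(dst, t.read(dst) + amount)
```

A transfer that fails halfway leaves the store untouched, because the debit was only ever recorded in the transaction's private buffer.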

3. LOCATION TRANSPARENCY AND REPLICA TRANSPARENCY

Data are partitioned among nodes to distribute work, minimize message traffic, and minimize response time.

Partitioning and replication may complicate programming … The complexity of locating each record and issuing the appropriate call to the appropriate node would dwarf the logic of the FundsTransfer program.

Location transparency allows the movement of objects without invalidating application programs that reference the corresponding entities.
A system that supports location transparency would accept the FundsTransfer program in the form presented above and would translate the requests into actions at the appropriate nodes.
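A rough sketch of what location transparency could look like from the program's side. This is not the paper's FundsTransfer program (which these notes do not reproduce). `Node`, `Network`, the directory, and `funds_transfer` are hypothetical names. The point is only that the program names entities, and a directory translates each request into an action at whichever node currently holds the object.

```python
class Node:
    """One node of the network, holding some objects locally."""

    def __init__(self, name):
        self.name = name
        self.objects = {}   # entity name -> value stored at this node


class Network:
    """Routes requests on entities to actions at the node holding the object."""

    def __init__(self, nodes):
        self.nodes = {n.name: n for n in nodes}
        self.directory = {}   # entity name -> name of the node that holds it

    def create(self, entity, value, node_name):
        self.nodes[node_name].objects[entity] = value
        self.directory[entity] = node_name

    def read(self, entity):
        return self.nodes[self.directory[entity]].objects[entity]

    def write(self, entity, value):
        self.nodes[self.directory[entity]].objects[entity] = value

    def move(self, entity, new_node):
        # Relocate the object; programs that name the entity are unaffected.
        old_node = self.directory[entity]
        value = self.nodes[old_node].objects.pop(entity)
        self.nodes[new_node].objects[entity] = value
        self.directory[entity] = new_node


def funds_transfer(net, src, dst, amount):
    # No node names appear here: the program only refers to entities.
    net.write(src, net.read(src) - amount)
    net.write(dst, net.read(dst) + amount)
```

Calling `net.move("acct-1", "n2")` between two transfers changes where the object lives, but `funds_transfer` runs unchanged.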

The argument for replica transparency is similar. We would like the freedom of moving and replicating entities without affecting program logic.
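The same idea for replicas, again as a hedged sketch. `ReplicatedStore` and its read-one/write-all policy are assumptions made for illustration, not the scheme the paper proposes. A program reads and writes an entity by name; how many copies exist, and where, is the system's business.

```python
class ReplicatedStore:
    """Read-one / write-all replication behind an entity-level interface.

    Hypothetical illustration: no failures, no concurrent writers.
    """

    def __init__(self, node_names):
        self.nodes = {name: {} for name in node_names}   # node -> local objects
        self.replicas = {}                               # entity -> node names

    def create(self, entity, value, on):
        self.replicas[entity] = list(on)
        for name in on:
            self.nodes[name][entity] = value

    def read(self, entity):
        # Any replica will do, since writes go to every copy.
        first = self.replicas[entity][0]
        return self.nodes[first][entity]

    def write(self, entity, value):
        # Update every copy so that all replicas stay identical.
        for name in self.replicas[entity]:
            self.nodes[name][entity] = value

    def add_replica(self, entity, node_name):
        # Adding a copy does not change how programs read or write the entity.
        self.nodes[node_name][entity] = self.read(entity)
        self.replicas[entity].append(node_name)
```

Adding a replica with `add_replica` changes only the system's bookkeeping; callers of `read` and `write` are unaffected.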

4. TRANSACTION CONSISTENCY

Transaction execution is not instantaneous; … executed in parallel as an economy which improves resource utilization (hardware and information).

These concurrency anomalies are very difficult to understand and guard against and therefore most transaction management systems hide concurrency by implementing a lock protocol which precludes such anomalies.
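As an illustration of how a lock protocol can hide concurrency, here is a small sketch of a conservative two-phase scheme using Python threads. These notes do not record which protocol the paper uses, so this is an assumed variant. `LockManager`, `run_transaction`, and `deposit` are hypothetical names. Each transaction acquires every lock it needs before touching data and releases them only when it is done, so a read-then-write on the same entity cannot interleave with another transaction's.

```python
import threading


class LockManager:
    """Hands out one lock per entity (hypothetical helper)."""

    def __init__(self):
        self._locks = {}
        self._guard = threading.Lock()   # protects the lock table itself

    def lock_for(self, entity):
        with self._guard:
            return self._locks.setdefault(entity, threading.Lock())


def run_transaction(locks, store, entities, body):
    """Acquire all locks, run the body, then release all locks.

    Locks are taken in sorted entity order so that two transactions
    can never wait on each other in a cycle.
    """
    held = [locks.lock_for(e) for e in sorted(set(entities))]
    for lock in held:          # growing phase
        lock.acquire()
    try:
        body(store)
    finally:
        for lock in reversed(held):   # shrinking phase
            lock.release()


def deposit(entity, amount):
    """Build a transaction body that adds amount to one entity."""
    def body(store):
        balance = store[entity]           # read
        store[entity] = balance + amount  # write based on that read
    return body
```

Without the locks, two deposits could both read the old balance and one update would be lost; with them, the transactions behave as if they had run one after another.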

5. CONSISTENCY IN DISTRIBUTED SYSTEMS

6. CONCURRENCY TRANSPARENCY