Distributed storage

This is the main article of Category:distributed storage. Its purpose is to invent and explain distributed storage.

Goals

Storage of the data must be decentralized, i.e. distributed among many different storage nodes located far apart from each other and not subject to one central power.
Data must be stored redundantly (synchronous replication) in several (say 5 to 10) different places on this planet, ideally far apart from each other, so that neither a government nor an earthquake can harm the availability of the data.

Some definitions

Term	Short definition
Node	A machine running a database that is accessible via a storage server through the server-storage interface
Storage	The store for objects and relations (classes and instances), ultimately distributed across a network
Table join problem	Performance suffers when joining tables that are stored on different nodes

Theory

Distribution of data among nodes

We want to distribute data among several nodes. The following should be taken into account:

limitations of speed and capacity
use of the fundamental k-coordinates
computing limitations, which should be considered already when a class is created

We use one database per node, i.e. network nodes are at the same time database nodes.

node list

Each node must be able to locate every other node in the network. This requires an up-to-date list of all nodes kept on each node. Since these lists cannot be synchronized immediately across the network when a node joins or leaves, we need a robust update mechanism.
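One possible shape for such an update mechanism is a versioned node list that any two nodes can merge whenever they talk to each other. The following is a minimal sketch under stated assumptions, not a design taken from this article: the names NodeEntry, merge_node_lists and live_nodes are hypothetical, each entry carries a version counter that its owner increments on every change, and a node that leaves is kept as a "not alive" entry so that older copies of the list cannot resurrect it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeEntry:
    """One row of the node list (hypothetical layout)."""
    address: str
    version: int   # incremented on every change to this entry
    alive: bool    # False marks a node that has left the network


def merge_node_lists(local, remote):
    """Merge two node lists; for each node id keep the entry with the higher version."""
    merged = dict(local)
    for node_id, entry in remote.items():
        current = merged.get(node_id)
        if current is None or entry.version > current.version:
            merged[node_id] = entry
    return merged


def live_nodes(node_list):
    """Return the ids of all nodes currently believed to be in the network."""
    return sorted(node_id for node_id, entry in node_list.items() if entry.alive)


# Node A has not yet heard that n2 left and that n3 joined.
list_a = {
    "n1": NodeEntry("10.0.0.1", 1, True),
    "n2": NodeEntry("10.0.0.2", 1, True),
}
list_b = {
    "n2": NodeEntry("10.0.0.2", 2, False),
    "n3": NodeEntry("10.0.0.3", 1, True),
}
merged = merge_node_lists(list_a, list_b)
print(live_nodes(merged))
```

Because the merge only ever keeps the higher version per node, the order in which lists are exchanged does not matter, so the lists can converge without any immediate network-wide synchronization. How versions are assigned when two nodes change the same entry concurrently is left open here.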

On which nodes is a k-object stored?

This is tough, especially since tables might become too big to be stored on a single node.
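As a starting point for discussion, one known technique for mapping an object to a fixed number of nodes is rendezvous (highest random weight) hashing. The sketch below assumes that every k-object has a string key and that every node has a string id taken from the node list; the function names and the default of 5 replicas (taken from the goal of 5 to 10 copies) are illustrative assumptions. It does not address geographic spread or tables that are too large for one node.

```python
import hashlib


def _score(node_id, object_key):
    """Deterministic pseudo-random weight of a (node, object) pair."""
    digest = hashlib.sha256(f"{node_id}|{object_key}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def replica_nodes(object_key, node_ids, replicas=5):
    """Pick the `replicas` nodes with the highest weight for this object."""
    ranked = sorted(node_ids, key=lambda n: _score(n, object_key), reverse=True)
    return ranked[:replicas]


nodes = [f"node{i}" for i in range(10)]
chosen = replica_nodes("obj-1", nodes)
print(chosen)
```

Every node that holds the same node list computes the same answer without asking anyone, and when a node that does not hold the object leaves, the placement of that object does not change.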

Networking

See distributed transactions for some general theory.

Possibilities for communication between nodes:

The Spread toolkit
Twisted Spread, which facilitates communication between objects in distinct locations
Twisted Web2
Python libraries for distributed programming

An example stack for inter-node communication

Routing

Protocols

Synchronous replication

Links
