Table join problem


Performance suffers when joining tables that live on different nodes

Due to distributed storage within the network, tables will often reside on different nodes (connected via slow internet links) and therefore cannot be joined easily (i.e., without hurting performance badly).

Wikipedia article: SQL JOIN

The problem

Suppose we have a table t1 on node N1 and a table t2 on node N2, and let c1 be a column of t1 which holds a foreign key referencing a column c2 of t2. A query that involves columns from both tables and identifies rows via the foreign key is usually done with an SQL JOIN. This does not work across nodes, for two reasons:

  • The SQL query involves different (remote) databases.
To work around this, one can run remote queries using e.g. PostgreSQL's dblink extension (see the sketch after this list). This restricts the set of possible queries and, in particular, does not allow for efficient joins.
  • The join requires transferring the data (or indexes) of the columns c1, c2 between the nodes, which causes a huge delay for big tables.
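
As an illustration, here is roughly what a dblink-based join could look like. All schema details are assumptions for this sketch: t1(id, c1, city) on N1, t2(c2, interest) on N2, and a made-up connection string for N2. Even with the filter pushed into the remote query, every matching join key of t2 must cross the network before the join can run:

  -- on N1, with the dblink extension installed (CREATE EXTENSION dblink):
  SELECT p.id, h.c2
  FROM t1 AS p
  JOIN dblink('host=N2 dbname=cosmopool',
              'SELECT c2 FROM t2 WHERE interest = ''S''')
       AS h(c2 bigint)
    ON p.c1 = h.c2
  WHERE p.city = 'C';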

Example

A user is searching for a place to stay in city C and wants to find a host interested in subject S. The places to stay and the hosts live in different tables t1 and t2 on different nodes N1 and N2. There are forward and backward links (foreign keys) between both tables. The problem is to find the linked pairs (place, host) with the attributes place.city=C and host.interest=S in almost no time; the query we would like to run is sketched below.
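
If both tables lived in one database, this would be an ordinary inner join; a minimal sketch, reusing the assumed column names from above:

  SELECT place.*, host.*
  FROM t1 AS place
  JOIN t2 AS host ON place.c1 = host.c2
  WHERE place.city = 'C' AND host.interest = 'S';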

Why is this a problem? Because the usual algorithms for inner joins (nested loop, hash join, sort-merge, ...) would require lots of data to be transferred between the two nodes, and the internet link between them is slow.
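
For a rough sense of scale (with made-up numbers): shipping 10^6 bigint join keys of 8 bytes each already means 8 MB on the wire, which takes on the order of a minute on a 1 Mbit/s uplink, before a single result row is produced; a nested loop that probes the remote table row by row would instead pay a network round trip per probe.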

Solution strategy

This is not a fully satisfying solution; feel free to propose a better one:

  • have links (foreign keys) between the tables in both directions (open question: how do inner joins between remote tables work via relations?)
  • select from t1 the subset t1_matching of rows with matching attributes into a temporary table and let its row count be r1
  • select from t2 the subset t2_matching of rows with matching attributes into a temporary table and let its row count be r2
  • suppose r1 < r2; then transfer the smaller table t1_matching to node N2, e.g. like this (see the PostgreSQL documentation for better ways to do this; a consolidated sketch follows this list):
  • on N1 do
COPY t1_matching TO '/tmp/t1-matching' WITH (FORMAT binary)
  • on N1 transfer the file to N2, e.g. with scp
scp /tmp/t1-matching db@N2:/tmp/t1-matching
  • on N2 do
COPY t1_tmp FROM '/tmp/t1-matching' WITH (FORMAT binary)
  • do a SELECT on N2 joining t2 and t1_tmp
  • deliver the results to the transaction manager and later do some cleanup
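
Putting the steps together, a minimal end-to-end sketch in PostgreSQL; all table and column names are assumed from the example above. Note that COPY to or from a server-side file requires superuser rights (or the pg_write_server_files/pg_read_server_files roles), and that binary COPY requires identical column types on both ends:

  -- on N1, in one session:
  CREATE TEMP TABLE t1_matching AS
      SELECT id, c1 FROM t1 WHERE city = 'C';       -- its row count is r1
  COPY t1_matching TO '/tmp/t1-matching' WITH (FORMAT binary);

  -- on N1, in a shell: ship the file to N2
  -- scp /tmp/t1-matching db@N2:/tmp/t1-matching

  -- on N2, in one session (TEMP tables vanish when the session ends):
  CREATE TEMP TABLE t1_tmp (id bigint, c1 bigint);  -- must match the binary layout
  COPY t1_tmp FROM '/tmp/t1-matching' WITH (FORMAT binary);
  SELECT m.id AS place_id, h.c2 AS host_id
  FROM t2 AS h
  JOIN t1_tmp AS m ON m.c1 = h.c2
  WHERE h.interest = 'S';
  -- hand the (place_id, host_id) pairs to the transaction manager,
  -- then drop the temporary tables and delete the files.

In effect this ships the smaller, pre-filtered side of the join to the other node, a simple form of semi-join reduction.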

Notes:

  • Actually we do not need to transfer whole rows; the row ids are enough. Once both tables are joined, the transaction manager holds the list of linked ids (to be cached) and queries both nodes again for the full rows it currently needs.
  • If we transfer only bigint row ids, then in theory we reduce the amount of data to be transferred to 8*r1 bytes, which is fine for a few thousand rows. If there are more matches on both sides (like 100000), then we can ask the user either to refine the search or to wait, or we can partition the results by transferring the first 1000 row ids, then the next 1000, and so on; this gets complicated once sorting is involved (see the chunked sketch below).
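
A chunked transfer of the join keys alone might look as follows; :last_seen is a placeholder (here a psql variable, purely illustrative) for the largest key of the previous chunk. At 8 bytes per bigint, each chunk of 1000 keys is only 8 kB:

  -- on N1: ship only the join keys of matching rows, 1000 at a time;
  -- keyset pagination (c1 > :last_seen) keeps each chunk small and resumable
  COPY (
      SELECT c1 FROM t1
      WHERE city = 'C' AND c1 > :last_seen
      ORDER BY c1
      LIMIT 1000
  ) TO '/tmp/t1-keys-chunk' WITH (FORMAT binary);
  -- repeat with :last_seen set to the last key of the chunk just shipped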

