Access to data

From cosmopool meta

Goals

  1. We have users.
  2. Users can define groups.
  3. Users can assign rights (read, write, change_access) to a set of users (list of groups and/or users) for either a self-created object class as a whole, or for single instances of an object.
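As a rough illustration of goal 3, a single grant could be modelled as below. This is only a sketch: the names `Right`, `Grant` and `applies_to` are hypothetical, and representing "the whole class" by an empty instance id is an assumption, not a decision made on this page.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, Optional


class Right(Enum):
    READ = "read"
    WRITE = "write"
    CHANGE_ACCESS = "change_access"


@dataclass(frozen=True)
class Grant:
    """One right given to a set of principals (user ids and/or group ids)."""
    right: Right
    object_class: str
    instance_id: Optional[str]   # None means the grant covers the whole class
    principals: FrozenSet[str]


def applies_to(grant: Grant, object_class: str, instance_id: str) -> bool:
    """True if the grant covers the given instance, directly or via its class."""
    if grant.object_class != object_class:
        return False
    return grant.instance_id is None or grant.instance_id == instance_id
```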

Problem

The distributed nature of our database makes even the simplest things difficult, such as determining who has access to an object.

Suppose we have an instance I of an object O, and user U1 wants to grant access to it (read, say) to users U2 and U3 and to groups G1 and G2.

The easiest thing would be to add a column to the table of object O that holds the read-access information as an array containing all the users. This means the groups G1 and G2 would have to be resolved to their users. The problem is that groups can become rather big; after all, handling large groups is exactly what we want computers to simplify. Storing several thousand user ids in each row of a table does not make sense. So we have to stay with groups and store group ids.

Groups are defined implicitly, i.e. not through an actual list of users, but through unions and intersections of other groups and single users.

Suppose now that user U4 wants to see all instances of object O. Of course, he/she will only be shown those instances to which he/she has read access. But how do we determine this? The instance (row) tells us only which groups have access, so we must know whether the user is in the respective groups. To evaluate membership we cannot walk the tree of implicit group definitions each time (that is, for each row!). We must know the membership of a user in a group in advance.
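To make the cost of walking the definition tree concrete, here is a minimal sketch of how an implicit group definition could be evaluated to a set of users. The classes `UserRef`, `UnionOf` and `IntersectionOf` and the function `resolve` are hypothetical names chosen for illustration; the page does not prescribe a representation.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class UserRef:
    """A single user in a group definition."""
    user_id: str


@dataclass(frozen=True)
class UnionOf:
    """Users that belong to at least one of the parts."""
    parts: Tuple


@dataclass(frozen=True)
class IntersectionOf:
    """Users that belong to all of the parts."""
    parts: Tuple


def resolve(definition) -> FrozenSet[str]:
    """Recursively evaluate a group definition to the set of user ids."""
    if isinstance(definition, UserRef):
        return frozenset({definition.user_id})
    member_sets = [resolve(part) for part in definition.parts]
    if isinstance(definition, UnionOf):
        return frozenset().union(*member_sets)
    if isinstance(definition, IntersectionOf):
        # An empty intersection is treated as the empty group (an assumption).
        return frozenset.intersection(*member_sets) if member_sets else frozenset()
    raise TypeError(f"unknown definition node: {definition!r}")
```

Every call visits the whole tree below the group, which is why doing this once per row of a query result is not an option.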

Solution

To accomplish this we use the following strategy:

  • When a user defines a group, store this definition on her/his home node.
  • Also evaluate this definition to a list of users. (This may take time, but you don't define a group that often.)
  • Store the list of users together with the group definition on the home node.
  • Add the group to the membership table of each user; this requires connections to the home node of each user and might really take a long time (it's definitely an asynchronous job!).
  • When this is finished, the group is usable.
  • The group id contains the node id of the home node of its creator.
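The steps above can be sketched with a small in-memory model. Everything here is an assumption made for illustration: the `Node` class, the `node_id:counter` format of the group id, and the function names. In the real system `propagate` would run as an asynchronous job that contacts remote nodes; here it is an ordinary function call.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Set


@dataclass
class Node:
    node_id: str
    # group_id -> {"definition": ..., "members": frozenset, "usable": bool}
    groups: Dict[str, dict] = field(default_factory=dict)
    # user_id -> ids of all groups the user belongs to (the membership table)
    memberships: Dict[str, Set[str]] = field(default_factory=dict)
    next_local_id: int = 1


def define_group(home: Node, definition, members: Iterable[str]) -> str:
    """Store definition and resolved user list on the creator's home node."""
    group_id = f"{home.node_id}:{home.next_local_id}"  # id embeds the home node id
    home.next_local_id += 1
    home.groups[group_id] = {
        "definition": definition,
        "members": frozenset(members),
        "usable": False,  # not usable until every membership table is updated
    }
    return group_id


def propagate(home: Node, group_id: str, home_node_of: Dict[str, Node]) -> None:
    """Add the group to each member's membership table, then mark it usable."""
    for user_id in home.groups[group_id]["members"]:
        node = home_node_of[user_id]  # stands in for a connection to that node
        node.memberships.setdefault(user_id, set()).add(group_id)
    home.groups[group_id]["usable"] = True
```

Because the group id carries the node id of the creator's home node, any node that sees the id knows where the definition and the user list are stored.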

Now when a user U4 requests all instances of O, we first look up all groups in her/his membership table and join them with the read access for O (see table join problem). Open question: should O have a separate group access table, or can we use the PostgreSQL array data type?
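The lookup can be sketched in memory as follows, assuming each row of O carries a `read_access` collection of user and group ids. The row layout and the function name are hypothetical. This corresponds to the array variant of the open question, where PostgreSQL's array overlap operator `&&` would play the role of the set intersection; a separate access table would use a join instead.

```python
from typing import Iterable, List


def visible_instances(rows: List[dict], user_id: str,
                      user_groups: Iterable[str]) -> List[dict]:
    """Return the rows whose read access names the user or one of her/his groups."""
    principals = set(user_groups) | {user_id}
    return [row for row in rows if principals & set(row["read_access"])]


# Hypothetical rows of object O; group ids carry the home node id as a prefix.
rows_of_O = [
    {"id": 1, "read_access": ["U2", "U3", "A:1", "A:2"]},
    {"id": 2, "read_access": ["B:7"]},
    {"id": 3, "read_access": ["U4"]},
]
u4_groups = ["A:2", "C:5"]  # taken from U4's membership table
shown = visible_instances(rows_of_O, "U4", u4_groups)
```

Note that the membership table is read once per query, not once per row.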

The membership table must be limited in size, because we have to transfer membership information nearly every time we want to query an object. Let's say we have a limit of 10000 groups in this table. Then it could happen that other users add U4 to their groups until all 10000 entries are used, and after that U4 will not be added to any further group (10000 groups should really be enough). But then, if U4 wants to define her/his own group including herself/himself, she/he cannot become a member of her/his own group. So we have to ensure that some entries in the membership table, 1000 say, are reserved for the user herself/himself.
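A minimal sketch of such a size-limited table with a reserved share is given below. The class name, the `add` method, and the rule that groups created by others may fill at most `limit - reserved` entries are assumptions; the numbers 10000 and 1000 are the ones from the text.

```python
class MembershipTable:
    """Group memberships of one user, capped in size with a reserved share."""

    def __init__(self, owner_id: str, limit: int = 10000, reserved: int = 1000):
        self.owner_id = owner_id
        self.limit = limit
        self.reserved = reserved
        self.own_groups = set()      # groups created by the owner
        self.foreign_groups = set()  # groups created by other users

    def add(self, group_id: str, creator_id: str) -> bool:
        """Try to record a membership; return False if there is no room."""
        if group_id in self.own_groups or group_id in self.foreign_groups:
            return True  # already a member, nothing to do
        if len(self.own_groups) + len(self.foreign_groups) >= self.limit:
            return False  # table is completely full
        if creator_id == self.owner_id:
            self.own_groups.add(group_id)
            return True
        # Groups of other users must leave the reserved entries free.
        if len(self.foreign_groups) >= self.limit - self.reserved:
            return False
        self.foreign_groups.add(group_id)
        return True
```

With the defaults, other users can occupy at most 9000 entries of U4's table, so U4 can always join at least 1000 groups of her/his own.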

Another twist: it is often desirable to give access to a collective operating a node, e.g. because it is in the neighborhood, or because the collective shares some interests. The collectives are, so to speak, natural groups in this system, which don't have to be defined once more when a user wants to grant access to them. To do: work out the access data structures more clearly.