Resource Access

Previously, I wrote about how to perform task delegation with a distributed authorization scheme. While that article managed to outline the principle well enough, it left a few things uncovered. The summary was that if we find a generic scheme for treating a resource as a series of changes, then we can encrypt and authenticate each change separately, leading to a kind of distributed authorship of resources – which is what we really want in a distributed system.

Figure: “Looking up from the bottom of Bertha’s access pit” by WSDOT is licensed under CC BY-NC-ND 2.0

In this article, I’ll explore the meaning of key concepts of access control in a distributed system, as well as outline some high-level requirements deriving from that. Following articles can then dive deeply into how these requirements can be fulfilled in a more specific scheme.

Key Concepts

A resource in this system is, as I mentioned above, a series of changes starting at some initial state which might be a null state. What this system does not define is what this resource represents – it could be a document, but it could just as well be a chat, or a directory of files, or an API endpoint perhaps.

However, resources don’t spring into existence from nothing. They have an originator; some kind of human or machine actor. In the distributed authorization scheme, we’ve called this the owner of the resource. We’ll have to revisit ownership a little, but for now this is a good enough approximation.

There is a parallel to this in local systems: file system entries. They are also resources of variable kind (files, directories, named pipes, special device files, etc.), and they also have an owner.

On UNIX-like systems, file system entries have a number of attributes, and these attributes are stored for the owner, a group, and “others” – i.e. the public. Each of these three categories might have different values stored for the attributes.

We’ll get to the specific attributes in a moment; let’s first focus on the group. This idea of groups – often representing system roles – exists to give privileged access to a file to people who are neither the owner, nor the general public. And for many use cases, this is sufficient. However, as the introduction of Access Control Lists makes clear, it is sometimes convenient to define attributes for multiple groups, or even multiple individuals.

In principle, we have this covered by our distributed authorization scheme. A subject in a token is some kind of identifier, which can identify a single user or a group of users. What we have left to explore is how to model group membership in a distributed system.

Finally, there is the question of the attributes themselves. At minimum, typical file attributes signal whether a file is readable by the subject, or writeable. Other attributes include whether the file is executable or hidden. Some attributes, such as the setuid and setgid bits, need to be interpreted by the local operating system, and do not have any particular distributed meanings.

We’ll consider two classes of attributes here. On the one hand, there are attributes that need to be recorded, but have no distributed meaning. We’ll call them supplementary attributes, and otherwise don’t really need to discuss them here. We’ll just assume that a change might include an update to supplementary attributes, and leave it at that. Supplementary attributes include setuid, setgid, and executable bits, but can really be application defined.

The attributes we’re most focused on here are the read and write attributes. These are core attributes, which the distributed system itself needs to interpret somehow. It’s arguable that other attributes, such as the sticky bit, are core attributes as well. For now, we’ll restrict ourselves to reading and writing, though. The sticky bit only poses additional limitations on writing – that is, it does not introduce a new method of interacting with the resources, but modifies an existing one.
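To make the split concrete, here is a minimal sketch that sorts the bits of a UNIX file mode into the two classes described above. The function name `classify_mode` and the dictionary layout are made up for illustration; only the `stat` constants come from Python's standard library. The sticky bit is deliberately left out, since its classification is deferred above.

```python
import stat

# Core attributes: read and write, per category (owner, group, others).
CORE_BITS = {
    "owner": {"read": stat.S_IRUSR, "write": stat.S_IWUSR},
    "group": {"read": stat.S_IRGRP, "write": stat.S_IWGRP},
    "others": {"read": stat.S_IROTH, "write": stat.S_IWOTH},
}

# Supplementary attributes: recorded, but only meaningful to a local OS.
SUPPLEMENTARY_BITS = {
    "setuid": stat.S_ISUID,
    "setgid": stat.S_ISGID,
    "owner_exec": stat.S_IXUSR,
    "group_exec": stat.S_IXGRP,
    "others_exec": stat.S_IXOTH,
}


def classify_mode(mode: int) -> dict:
    """Split a numeric file mode into core and supplementary attributes."""
    core = {
        who: {name for name, bit in bits.items() if mode & bit}
        for who, bits in CORE_BITS.items()
    }
    supplementary = {
        name for name, bit in SUPPLEMENTARY_BITS.items() if mode & bit
    }
    return {"core": core, "supplementary": supplementary}


# setuid, owner rwx, group r-x, others r--
result = classify_mode(0o4754)
```

Only the `core` part would need to be understood by every node; the `supplementary` part can travel along as opaque metadata.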

Reading and Writing

Reading from and writing to a resource can be moderated at three different points in a distributed system.

The local machine. If the local user has no access to a resource, the local machine can and must deny access, even if there is a copy of the resource cached locally.
The remote machine. Really, this is the exact same case as above, except a second machine to which the resource has been copied. The remote user is a local user on the remote machine.
Any transit machines. A distributed authorization scheme permits transit machines to deny transit if read or write operations are not permitted. In the previous article, the example here was the data server Dave.

There’s a reason I’m being a stickler here and treating the local and remote machines as different: even though each operates on local information only, both the authorization tokens and the resource data itself have to exist on both ends in order for each node to authorize any actions requested by the node-local user.

For transit machines, it turns out there are two classes. If the transit machine holds resource data like Dave does, the same applies here as above. If the transit machine does not hold resource data, it is akin to Ted in the previous article’s example, and merely passes authorization tokens and commands around.

If most nodes pass both resource data and authorization tokens for a resource around, it raises the question whether these tokens can be considered part of the resource in the same way that supplementary attributes are, or an entirely distinct thing.

In the discussion about tokens themselves, I already pointed out that the order in which tokens are received is somewhat important. By adding sequence numbers, we enforce the order in which they are to be applied. They are, in that sense, a series of changes – starting from some null state in which only the originator role had any kind of access to the resource.

It turns out that a similar need for sequencing applies to supplementary attributes: knowing only the number of set and clear operations on a bit, and not their sequence, it is not possible to determine the final state of the bit. It may be that adding sequence numbers is not the ideal choice here – maybe here, as well as for the resource data, conflict-free replicated data types are the optimal choice.
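A tiny sketch shows why counts alone are not enough. The `(sequence_number, operation)` tuple format and the function name are assumptions for illustration, not part of any defined wire format.

```python
def apply_in_order(changes):
    """Replay (sequence_number, operation) pairs on a single bit."""
    bit = False  # null state: the bit is clear
    for _, operation in sorted(changes):
        bit = (operation == "set")
    return bit


# Both histories contain two "set" and one "clear" operation.
history_a = [(1, "set"), (2, "clear"), (3, "set")]
history_b = [(1, "set"), (2, "set"), (3, "clear")]

final_a = apply_in_order(history_a)
final_b = apply_in_order(history_b)

# Arrival order does not matter as long as sequence numbers are known.
final_a_shuffled = apply_in_order([(3, "set"), (1, "set"), (2, "clear")])
```

The two histories end in different states despite identical operation counts, while the shuffled copy of the first history converges on the same state as the original.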

Sub-Resources

The conclusion must be that a resource is really not one series of changes, but three interleaving ones:

The resource data series contains changes to the, well, the resource data.
The metadata series contains changes to metadata, such as supplementary attributes. Other metadata such as a resource name, etc. can also be included here.
The authorization series contains changes to authorization, that is, authorization tokens.

The main difference between these three series is that the first two must be considered part of the resource, and probably therefore should be interleaved in any data stream transmitting the entire resource. The third series can be modified outside of this communication, as in the previous example where an authorization token passes through Ted and Prilidiano to Dave. But it is (or should be) just as possible to interleave these tokens with the other series in the data stream.
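As a rough sketch of what such an interleaved stream could look like, each change can carry a tag naming its series. The `Series`, `Change` and `split_stream` names are hypothetical, and the payloads are placeholders rather than real tokens.

```python
from dataclasses import dataclass
from enum import Enum


class Series(Enum):
    DATA = "data"
    METADATA = "metadata"
    AUTHORIZATION = "authorization"


@dataclass(frozen=True)
class Change:
    series: Series
    sequence: int   # position within its own series
    payload: bytes


def split_stream(stream):
    """Separate an interleaved stream into per-series, ordered change lists."""
    result = {series: [] for series in Series}
    for change in stream:
        result[change.series].append(change)
    for changes in result.values():
        changes.sort(key=lambda change: change.sequence)
    return result


stream = [
    Change(Series.AUTHORIZATION, 1, b"initial write token"),
    Change(Series.METADATA, 1, b"name=notes.txt"),
    Change(Series.DATA, 2, b" world"),
    Change(Series.DATA, 1, b"hello"),
]
by_series = split_stream(stream)
data_payloads = [change.payload for change in by_series[Series.DATA]]
```

Because every change names its series, the authorization changes can just as well be sent on their own, without the other two.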

There are interesting implications for storing resources deriving from this, such as how each series may deal differently with truncating changes that are superseded by newer ones, which may also depend on the resource contents and local policies.

That discussion is going too far off track, however. The key point to take away is that read and write authorization tokens should be modelled in such a way that they can be interleaved in data streams with the resource data and metadata series, but also transmitted out-of-band by separate routes. The latter is the key to distributing authorization, but the former is important for archiving, accounting, and similar tasks that may be necessary in a complete system.

The implication of this is that each series of changes must be identifiable as such by transit nodes – at least in principle. That is, for a transit node to enforce access control, it must know which other nodes are authorized and which are not. It does not, however, require access to any of the other change series in order to do this.

This means that “read” is not an attribute that applies universally to the entire resource, but rather one that applies to each series individually. Transit nodes must have read access to the authorization series (which is modelled via an authorization token, which can in turn be part of that same series).

If reading is an attribute that applies to each series individually, then so is writing. While, as file systems teach us, it is possible to write to a resource without being able to read as well, it’s also the case that being able to add data to a resource does not imply being able to e.g. rename it or modify authorization data.

This means we essentially do not deal with single resources, but rather that resources have at least three sub-resources we need to handle in parallel. This probably means our resource identifier should contain some kind of namespacing structure (which I’ll revisit later). The main point is that we can treat each sub-resource as an individual resource for authorization purposes, and managing a resource means orchestrating authorization between two or more of them: the authorization sub-resource on the one hand, and “anything else” on the other.
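In code, this could be as simple as keying rights by subject and sub-resource rather than by resource. The `Grants` class and the `res1/data` style of identifier are illustrative assumptions only; the actual identifier structure is left open here.

```python
from collections import defaultdict


class Grants:
    """Rights are tracked per (subject, sub-resource), never per whole resource."""

    def __init__(self):
        self._rights = defaultdict(set)

    def grant(self, subject, sub_resource, right):
        self._rights[(subject, sub_resource)].add(right)

    def allows(self, subject, sub_resource, right):
        return right in self._rights.get((subject, sub_resource), set())


grants = Grants()
# A transit node only needs to read the authorization series.
grants.grant("ted", "res1/authorization", "read")
# A contributor may add data without being able to read or rename.
grants.grant("contributor", "res1/data", "write")

ted_reads_auth = grants.allows("ted", "res1/authorization", "read")
ted_reads_data = grants.allows("ted", "res1/data", "read")
contributor_renames = grants.allows("contributor", "res1/metadata", "write")
```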

Transit vs. End Nodes

Treating a resource as consisting of several sub-resources provides a neat way for dealing with different types of nodes in a system, i.e. transit nodes and end nodes (local or remote). Transit nodes may require access to the authorization sub-resource to do their job, while end nodes are typically more concerned with the other sub-resources, and largely interested in individual authorization tokens.

I’ve previously discussed that authorization tokens can be revoked. There is a window between receiving and verifying an individual authorization token, and receiving a revocation for it, in which a node may be convinced to grant access to something it shouldn’t.

Identifying a sub-resource in an access token actually helps solve that issue as well. Let’s say that the transit node Ted receives an authorization token for a resource. If Ted can determine the resource identifier for the authorization sub-resource, he can try and synchronize this series before deciding how to handle the authorization token (or as an optimization, permit some kind of reduced preliminary access, like accepting writes but not committing them to the resource until he’s sure that is correct).
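The "reduced preliminary access" optimization might look roughly like the following sketch. The class and method names are hypothetical, and how the set of revoked token identifiers is obtained is left out entirely.

```python
class PreliminaryWrites:
    """Accept writes right away, but commit them only after a revocation check."""

    def __init__(self):
        self.pending = []    # (token_id, change) pairs awaiting a decision
        self.committed = []  # changes that are now part of the resource

    def accept(self, token_id, change):
        self.pending.append((token_id, change))

    def resolve(self, revoked_token_ids):
        """Call once the authorization series has been synchronized."""
        for token_id, change in self.pending:
            if token_id not in revoked_token_ids:
                self.committed.append(change)
        self.pending = []


writes = PreliminaryWrites()
writes.accept("token-1", b"first change")
writes.accept("token-2", b"second change")

# After synchronizing, it turns out token-2 had already been revoked.
writes.resolve(revoked_token_ids={"token-2"})
```

Writes made under a revoked token are silently dropped in this sketch; a real node would probably want to report that back to the sender.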

While this is very similar in principle to the idea of a certificate revocation list (also discussed previously), the key difference is that the method by which to retrieve the list is not specified via a location of sorts – it is based solely on the identity of the resource in question, which is beneficial to caching in a distributed system.

The rule of thumb should then be the following:

If an authorization token names the receiving node as the subject, the node can treat the token as-is, i.e. try to perform an action which should be granted by the token.
If an authorization token names a different node than the receiving node as the subject, the resource identifier within may be used to determine the authorization sub-resource for checking for revocations.
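The rule of thumb above can be sketched as a small decision function. Everything named here is an assumption for illustration: the `Token` fields, the `/authorization` suffix, the `fetch_revocations` callback, and the choice to forward a non-revoked token are not prescribed by the scheme.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    token_id: str
    subject: str      # who the token grants access to
    resource_id: str  # main resource identifier


def authorization_sub_resource(resource_id):
    # Hypothetical naming convention for the authorization sub-resource.
    return f"{resource_id}/authorization"


def handle_token(node_id, token, fetch_revocations):
    """Decide what a receiving node does with a token."""
    if token.subject == node_id:
        # The token is for us: treat it as-is and try the action.
        return "use"
    # The token is for someone else: check for revocations first.
    revoked = fetch_revocations(authorization_sub_resource(token.resource_id))
    return "reject" if token.token_id in revoked else "forward"


known_revocations = {"abc/authorization": {"t2"}}


def fetch(sub_resource_id):
    return known_revocations.get(sub_resource_id, set())


for_ted = handle_token("ted", Token("t1", "ted", "abc"), fetch)
for_dave = handle_token("ted", Token("t3", "dave", "abc"), fetch)
revoked = handle_token("ted", Token("t2", "dave", "abc"), fetch)
```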

Note that there is a potential optimization here for having a distinct sub-sub-resource solely for the purpose of revocations. Let’s note this, but otherwise skip it as something to deal with if/when it becomes necessary. All this does is open up the possibility that a resource consists of more than three series; the rest remains the same.

Read and Write Summarized

The summary of these preceding points is this:

Read and write access is modelled via authorization tokens (the exact mechanism here is discussed in the next articles, while the previous article already outlines a solution idea).
A stream of authorization tokens (and revocation tokens) is modelled as one of several sub-resources of a resource; this sub-resource may be transmitted separately from or interleaved with the rest of the sub-resources.
There is, at minimum, need for a data and for a metadata sub-resource for reasons that have little to do with authorization.
Transit nodes may query the authorization sub-resource to determine whether an authorization token is not yet revoked.
End nodes are typically interested in individual authorization tokens in order to access an entire resource (or a significant sub-set of the sub-resources).

If this is how read and write are modelled in the abstract, an interesting thing falls out of it as a result: if you imagine each node in the distributed system as a local service exposing an API both to other local applications and to the network, then the API/service can actually handle remote and local requests for (sub-)resources in much the same way. An application must authenticate, and possess an authorization token to access local data. This token should likely be issued by the user, i.e. owner of the data.

If this is somewhat reminiscent of how apps on mobile operating systems ask the user for permission to access certain hardware or the file system, that is entirely on purpose.

As a side note, just like making the sticky bit part of the core attributes, it is not unlikely that a specific append-only write token could be useful. The number of core attributes should remain small, but can be extended for such uses. The main distinction is that they are part of the authorization sub-resource, not the metadata sub-resource.

Individuals vs. Groups

Having sorted how modelling read and write will work in the abstract, it’s worth exploring how this can be applied to individuals versus groups.

In operating systems, group membership is a matter of record. Some local database contains a list of all members of a group. This same model is transferred to centralized AAA systems, where the database is no longer local, but remote – but in a central location, such as an LDAP server.

It is possible to simply distribute the database. That is the underlying principle of blockchain, of course. Amongst the many downsides of blockchain, however, is an in-built need to either synchronize the entire chain contents, or partially centralize the solution again by making a distinction between full nodes, which have synchronized the chain, and light nodes, which are merely clients of those full nodes.

Another downside is that by making everything public, blockchain works to directly oppose privacy, which is not in our interest. That is, once on the chain, group memberships are known to everyone. Encrypting ledger entries can help here, but it is at best a workaround to an existing problem rather than a solution, and raises the question why a public ledger is a reasonable approach in the first place.

I think we can do better than that.

Recall that in the last article, reading and writing (to some extent) could be modelled simply by possessing an encryption key? Well, the same applies to group membership.

The key (haha, pun intended) realization is that when presenting an authorization token to a party, a node also needs to authenticate. Otherwise, the remote end cannot decide whether to grant access. Authentication usually happens via some challenge/response mechanism, where a node needs to prove that they possess a secret key – combined with distributed authorization, this key needs to be cryptographically linked to the subject in the authorization token.

It is in no way required that this secret key is in the possession of only a single entity. Any group member may possess it, and as such, authenticate as “the group” – which then implies that authorization tokens for this group can be applied.
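Here is a rough sketch of that idea using only Python's standard library. Since the standard library has no asymmetric signatures, I'm substituting Lamport one-time signatures built from SHA-256; this is a stand-in chosen purely for self-containment, not the mechanism this scheme will use, and a Lamport key must never answer more than one challenge. All function names are hypothetical.

```python
import hashlib
import secrets


def digest(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def generate_keypair():
    # One pair of random values per bit of a SHA-256 digest.
    secret = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    public = [(digest(zero), digest(one)) for zero, one in secret]
    return secret, public


def subject_of(public) -> str:
    # The subject named in a token is a hash of the public key.
    return hashlib.sha256(b"".join(zero + one for zero, one in public)).hexdigest()


def challenge_bits(challenge: bytes):
    return [(byte >> shift) & 1 for byte in digest(challenge) for shift in range(8)]


def respond(secret, challenge: bytes):
    # Reveal one secret value per challenge bit (one-time use only).
    return [secret[i][bit] for i, bit in enumerate(challenge_bits(challenge))]


def verify(public, challenge: bytes, response) -> bool:
    bits = challenge_bits(challenge)
    if len(response) != len(bits):
        return False
    return all(
        digest(part) == public[i][bit]
        for i, (bit, part) in enumerate(zip(bits, response))
    )


group_secret, group_public = generate_keypair()
outsider_secret, _ = generate_keypair()

challenge = secrets.token_bytes(32)
member_ok = verify(group_public, challenge, respond(group_secret, challenge))
outsider_ok = verify(group_public, challenge, respond(outsider_secret, challenge))
```

The verifier only ever sees `group_public` and a response. Any holder of `group_secret` produces exactly the same response, so the verifier cannot tell which member answered, or whether the subject is a group at all.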

This scheme does not leak group membership. It does not even leak whether a subject named in a token is an individual or a group¹.

The upside of this scheme is that adding members to a group is as simple as sharing the secret group key with them. The downside is that group members leaving – or being expelled – is harder to model. That’s a challenge for a later article.

Ownership

As discussed at the beginning of this article, there needs to be an owner, an originator of a resource. It turns out, the previous section on individual vs. group access has already provided insight into how to model ownership.

What does ownership mean, after all? In filesystems, it’s a record associated with a file that can be changed – but no file exists without such a record. That is, file systems make creating a file and recording ownership an atomic operation, where the owner defaults to the user creating the file. The operation also sets initial access rights.

While it’s possible to create a file without doing more with it, typically, files are opened either for reading or writing. Opening a file with the open(2) system call (Linux man page) explicitly permits an O_CREAT flag to be passed, which tells the system to create the file if it doesn’t exist.

Apparently, opening and creating a file go together often enough that introducing a special, atomic case here was historically sensible – and it is still useful today. In practice, it seems to be most common to specify this flag when trying to write.
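For reference, this is what the combined create-and-open looks like through Python's thin wrapper around the system call. The file name and the `0o600` mode are arbitrary choices for the example, and the ownership check assumes a UNIX-like system with an ordinary local file system.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "notes.txt")
    existed_before = os.path.exists(path)

    # Create-if-missing and open-for-writing happen in a single call.
    # The third argument sets the initial access rights of a new file.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        os.write(fd, b"hello")
    finally:
        os.close(fd)

    exists_after = os.path.exists(path)
    # The owner is recorded as part of creation, defaulting to the creator.
    owner_is_creator = os.stat(path).st_uid == os.geteuid()
```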

If we transfer this to our own discussion so far, then it may make sense to model resource ownership as corresponding to whatever authorization token is passed during resource creation. More specifically, the very first authorization token that is required before any sub-resource can be written to is a write authorization for the authorization sub-resource.

Resource creation can then be understood as writing this initial write authorization token to an authorization sub-resource of a resource not yet in existence. The subject of this authorization token is automatically the owner of the resource, simply because no other data about the resource exists.
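A minimal sketch of creation-by-first-token follows. The dictionary-shaped tokens and the `ResourceStore` class are assumptions for illustration, and signature verification as well as authorization checks for later tokens are deliberately left out.

```python
class ResourceStore:
    def __init__(self):
        self._auth_series = {}  # resource_id -> list of authorization tokens

    def write_token(self, resource_id, token):
        series = self._auth_series.get(resource_id)
        if series is None:
            # The resource does not exist yet. Only a write authorization
            # for the authorization sub-resource can bring it into being.
            if token["right"] != "write" or token["target"] != "authorization":
                raise PermissionError(
                    "first token must grant write on the authorization sub-resource"
                )
            self._auth_series[resource_id] = [token]
            return
        # Checks for subsequent tokens are omitted in this sketch.
        series.append(token)

    def owner(self, resource_id):
        # Ownership is not stored separately: it is the subject of the first token.
        return self._auth_series[resource_id][0]["subject"]


store = ResourceStore()
store.write_token(
    "res1", {"subject": "key-1", "right": "write", "target": "authorization"}
)
owner = store.owner("res1")
```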

So how about shared ownership? How about transfer of ownership?

Turns out… it’s all the same.

Recall how groups are handled by simply sharing a secret group key? What if the subject of this initial authorization token was not a user identity, but a newly generated key? We could share this key in order to share effective ownership, forming a kind of implicit group. We could transfer the ownership by sharing the key, and then discarding it ourselves (though in practice, like leaving a group, this process needs more detail).

It’s (almost) that simple.

Resource Identifiers

The above neatly slots into something else we’ve glossed over for the time being, and that is how resources are identified. In a sense the resource identifier format is not particularly relevant to the authorization scheme. But by introducing sub-resources, we do need some kind of structure for the identifier – something like a main identifier which acts as a namespace for sub-identifiers.

How about we further restrict the above to state that a main resource identifier must be identical to the subject (i.e. the (hash of the) public key) named in the initial write authorization token written to the authorization sub-resource? The initial, implicit group is the same as the resource; it “owns” itself.
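As a sketch, deriving identifiers this way could look as follows. The choice of SHA-256, the hex encoding, and the `/` separator are all assumptions on my part; the identifier format is not fixed here, and the key bytes are a placeholder rather than a real public key.

```python
import hashlib

SUB_RESOURCES = ("data", "metadata", "authorization")


def resource_id_from_public_key(public_key: bytes) -> str:
    """The main identifier is the hash of the freshly generated public key."""
    return hashlib.sha256(public_key).hexdigest()


def sub_resource_id(resource_id: str, name: str) -> str:
    """The main identifier acts as a namespace for its sub-resources."""
    if name not in SUB_RESOURCES:
        raise ValueError(f"unknown sub-resource: {name}")
    return f"{resource_id}/{name}"


placeholder_key = b"placeholder public key bytes"
resource_id = resource_id_from_public_key(placeholder_key)
auth_id = sub_resource_id(resource_id, "authorization")
```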

Note that this scheme also helps avoid collisions of resource identifiers in a global namespace. Collisions during the generation of key pairs are a fundamental issue in cryptography, and can be considered solved inasmuch as cryptography is “solved” – that is, enough so that we don’t have to worry about it here.

Privacy

The idea that a newly created resource is always associated with a newly generated key pair also helps create privacy. In this way, the identity of a user is never leaked when dealing with ownership of resources. The user never presents themselves as a particular identity, but instead presents a token for a particular resource.

This principle applies to almost all operations in the distributed system. The only time this is broken is when the user must receive a secret key (for being added to a group). We’ll see how this works in a later article.

Lost Keys

A major concern with tying cryptography so deeply into something as fundamental as writing a file (resource) is of course what to do when keys are lost. The answer to that is both very simple and, unfortunately, out of scope: key escrow can solve it.

The quick summary here is that one can give the key(s) to a trusted third party, and recover it from them. The details of such a scheme, well, we can come back to them. But not now.

Revocations

Revocations are a recurring theme, of course, and partially addressed via revocation tokens, as well as an authorization sub-resource that can collect them. But if the initial key with which a resource is created is also a group key to be shared, then revoking the group key, for any reason whatsoever, essentially removes all access to the resource, by any party, forever.

That seems a little brittle. Then again, it is also a fine way to effectively delete a resource.

Can we make it less brittle? Yes, absolutely. The brittleness stems from the fact that shared ownership of a resource is more of a side-effect of sharing the initial secret key in an implicit group. In practice, it is much more likely, and definitely more convenient, to model shared ownership by creating an explicit group. This group needs to have fundamental access to everything – that is, predominantly write access to the authorization sub-resource – and then its key can be shared just as freely as before.

The additional benefit here is that this explicit group does not have to be created per resource; instead, a group key that has access to multiple resources can be used, thus modelling file sharing behaviour even more closely than before.

Sharing the initial key should then be restricted to transferring ownership, and coupled to discarding the key locally. The current holder of the key can then use the revocation of itself as a delete operation. Transit nodes receiving such a revocation can feel reasonably safe in discarding data for the resource, perhaps after some grace period governed by local policy (aka garbage collection interval).
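On a transit node, revocation-as-delete with a grace period might be sketched like this. It relies on the proposal above that the main resource identifier equals the subject of the initial token; the `TransitCache` class is hypothetical, and time is passed in explicitly to keep the example deterministic.

```python
class TransitCache:
    def __init__(self, grace_period):
        self.grace_period = grace_period
        self.data = {}          # resource_id -> cached resource data
        self.delete_after = {}  # resource_id -> earliest time data may be dropped

    def on_revocation(self, resource_id, revoked_subject, now):
        # Revoking the initial key (subject == resource identifier) is
        # interpreted as a request to delete the resource.
        if revoked_subject == resource_id and resource_id in self.data:
            self.delete_after.setdefault(resource_id, now + self.grace_period)

    def collect_garbage(self, now):
        for resource_id, deadline in list(self.delete_after.items()):
            if now >= deadline:
                self.data.pop(resource_id, None)
                del self.delete_after[resource_id]


cache = TransitCache(grace_period=60)
cache.data["abc"] = b"cached resource data"

cache.on_revocation("abc", revoked_subject="abc", now=1000)
cache.collect_garbage(now=1030)
kept_during_grace = "abc" in cache.data

cache.collect_garbage(now=1060)
dropped_after_grace = "abc" not in cache.data
```

Revocations of any other subject leave the cached data alone; only the self-revocation of the initial key schedules a delete.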

Summary

This article was a lot more abstract than the last ones. I explored high level requirements on distributed access control for shared resources.

We treat a resource as consisting of several sub-resources, one of which contains a sequence of authorization (and revocation) tokens.
The first write authorization token signals creation of a resource, and is self-signed; subject, owner and resource identifier are identical here.
Group membership is modelled via sharing of secret keys; this means it is a fully distributed operation, and there is no leakage of group membership information.

The entire thing hangs on safely sharing secret keys. We know we can use some variation of a Diffie-Hellman scheme for this. We can also, as stated in the previous article, take inspiration from Signal’s key exchange schemes. The main point to stress, though, is that our scheme must additionally be safe to embed into the authorization sub-resource.

Which means in the next article(s), I will dive into such a key exchange scheme, covering the requirements above.

¹ There is a side-channel to this, in which Eve the eavesdropper might observe all communications between two nodes and conclude that all subjects named in all tokens may map to the same person. Transport encryption should typically take care of this.

Which implies, I should add, that the machine identity for transport encryption should not necessarily be the same as the identity of the user or groups making resource requests. There are other reasons why this separation is a good idea, but this one rears its head now.