vessel

Introduction

Vessel is a container format for information-centric networking (ICN) resources that is also suitable for streaming applications. It provides optional confidentiality via encryption, and permits multiple authors to contribute to the same resource in parallel.

Being a container format, it encapsulates arbitrary data. In order to support multiple authors, it provides an extensible mechanism by which the owner of the resource can specify which authors are permitted [1]. As a side effect, vessel distinguishes between separate multiplexed data streams within a single container, which can also be used by applications to distribute related sub-resources.

Motivation

In ICN, resources are addressed via an identifier – but resources may also be modified over time. The typical ICN approach is to subdivide a resource into content blocks (called extents in vessel), each of which can be individually addressed. The simplest and most elegant way to address such blocks is to use a cryptographic hash of the block’s contents as a unique identifier [2].
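Content-derived addressing can be illustrated with a minimal sketch. The function name `content_id` and the choice of SHA-256 are assumptions made for illustration; the text above does not prescribe a particular hash function.

```python
import hashlib


def content_id(block: bytes) -> str:
    """Return a content-derived identifier (hypothetical helper).

    SHA-256 is an assumption here; any cryptographic hash would do.
    """
    return hashlib.sha256(block).hexdigest()


original = b"hello, ICN"
modified = b"hello, ICN!"

# Identical content always maps to the same identifier ...
assert content_id(original) == content_id(b"hello, ICN")
# ... while any modification yields a different identifier.
assert content_id(original) != content_id(modified)
```

The second assertion is the property that matters for the rest of this section: changing a block changes its identifier.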

The blocks are then collected into a manifest. Approaches here vary. Some provide explicit manifests, which are themselves resources and may reference additional manifest blocks. Others calculate a Merkle trie of all content blocks in the resource’s preferred order, and use the resulting hash as an overall resource identifier.

The downside to these approaches is that with every modification of a resource, a new resource identifier is generated, which needs to be advertised and/or transported to consuming nodes. This creates overhead which in the worst case may increase with the number of modifications, and is thus not ideally suited for streaming applications.

At the same time, an implicit Merkle trie approach to a resource manifest does not support multiple authors out of the box, whereas explicit manifests can contain new authorized keys.

The motivation of vessel is to solve both issues at once.

Approach

Vessel is similar to other approaches in subdividing a resource into extents. Extents are somewhat flexibly sized. Each extent has only a single author; this helps avoid conflicts between multiple authors’ changes. Each extent is still individually addressable, such that distribution and caching mechanisms typical for ICN approaches still work.

The main difference lies in how each extent’s identifier is generated. Identifiers do not depend on the extent’s content, but remain stable. The origin extent identifier is randomly generated, and may be used to identify the resource as a whole. The author of the origin extent automatically becomes the resource owner.

Any subsequent extent identifiers are created by hashing the preceding extent’s identifier with the author identifier of the current extent. This restricts a single author to only creating a single extent at a time, but permits multiple authors to create extents in parallel. A deterministic algorithm is then used to order parallel extents. Furthermore, a similarly deterministic algorithm is used to pick the best candidate for a preceding extent at any given time.
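The identifier derivation described above might look like the following sketch. Everything concrete in it is an assumption made for illustration: the helper names, the use of SHA-256, the 32-byte identifier size, plain concatenation of the two inputs, and in particular the ordering rule (sorting by identifier bytes), which merely stands in for whatever deterministic algorithm vessel actually uses.

```python
import hashlib
import os

ID_SIZE = 32  # assumed identifier size in bytes


def origin_extent_id() -> bytes:
    """Randomly generate the origin extent identifier (hypothetical helper)."""
    return os.urandom(ID_SIZE)


def next_extent_id(previous_id: bytes, author_id: bytes) -> bytes:
    """Derive an extent identifier from its predecessor and its author.

    previous_id has a fixed length, so plain concatenation is unambiguous.
    """
    return hashlib.sha256(previous_id + author_id).digest()


origin = origin_extent_id()
alice, bob = b"author:alice", b"author:bob"  # placeholder author identifiers

# Two authors extending the same predecessor get distinct identifiers,
# so they can write in parallel without colliding.
a1 = next_extent_id(origin, alice)
b1 = next_extent_id(origin, bob)
assert a1 != b1

# The same author extending the same predecessor always gets the same
# identifier, i.e. one extent per author at a time.
assert next_extent_id(origin, alice) == a1

# Placeholder ordering rule (an assumption): sort by identifier bytes.
ordered = sorted([a1, b1])
```

Note that no extent content enters the derivation, which is why an identifier stays stable when the extent is modified.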

This permits the extent stream to eventually become consistent as nodes synchronize.

It also permits modifying an extent; the extent identifier remains stable with modifications. A side effect is that each extent must also contain a kind of content version.
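Because the identifier no longer changes with the content, a receiver needs some other way to tell a newer copy of an extent from an older one. The sketch below assumes the content version is a monotonically increasing integer and that the highest version wins; the `Extent` type, the `accept` function, and that rule are all hypothetical and not taken from the vessel specification.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Extent:
    extent_id: bytes  # stable across modifications
    version: int      # assumed: increases with every modification
    payload: bytes


def accept(store: Dict[bytes, Extent], incoming: Extent) -> bool:
    """Store the incoming extent unless an equal or newer copy is known."""
    current = store.get(incoming.extent_id)
    if current is not None and current.version >= incoming.version:
        return False  # stale or duplicate copy, keep what we have
    store[incoming.extent_id] = incoming
    return True


store: Dict[bytes, Extent] = {}
first = Extent(b"extent-1", 1, b"draft")
second = Extent(b"extent-1", 2, b"final")

assert accept(store, first)
assert accept(store, second)     # newer version replaces the older one
assert not accept(store, first)  # an older copy arriving late is ignored
```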

Caveat

This construction does not, by itself, let multiple authors create e.g. a shared document without further work. It merely provides the groundwork: resources can be fully or partially synchronized, authorization can be encapsulated into the resource itself, and a processing order for extents can be provided.

This is not sufficient for an absolute ordering of data written to the extents. For that, applications must use, for example, a conflict-free replicated data type (CRDT) and encapsulate it in a vessel resource. This is the approach taken by wyrd.

However, for single author scenarios, such as typically supported in ICN systems, vessel provides everything to prepare a resource for secure streaming.


  1. This is an authorization topic; vessel is itself agnostic to how precisely authorization is handled, and can encapsulate arbitrary authorization data. However, using CAProck is highly recommended here.

  2. Cryptographic hash functions are designed to have a low probability of collisions, which makes this approach feasible.