Differential Reference Counting

Differential reference counting is a reference counting with strong thread-safety. So what is that strong thread safety and how does it differ from plain old atomic reference counting?
Strongly thread-safe object permits concurrent read and write accesses, while basicly thread-safe object permits either concurrent read accesses or exclusive write accesses (basically, "as safe as int"). Consider the following example:
some_ptr<T> g_ptr;

// thread 1
T* ptr = g_ptr.acquire(); // read access
...
ptr->release();

// thread 2
T* ptr = new T
T* prev = g_ptr.exchange(ptr); // write access
prev->release();
If some_ptr is only a basicly thread-safe pointer (for example, boost::shared_ptr, which is "safe as int"), then the usage is illegal - the program can crash or whatever. But if some_ptr is a strongly thread-safe pointer, then the usage is perfectly legal.

To understand the problem consider the standard implementation of usual reference counting:
T* acquire(T** pp)
{
    T* p = *pp; // dereference
    //(***)
    std::atomic_fetch_add(&p->ref_count, 1, std::memory_model_relaxed);
    return p;
}
The dereference and the increment are not atomic. That is, the increment is an action on an object reference for which we don't yet own, and that's illegal - to do any action on an object we need to own a reference for it first. Kind of cyclic dependency. If a thread is preempted in (***), and another thread will free the object meanwhile (remember the first thread is not yet incremented the counter), then the first thread will do the increment on the already freed object (which can cause all sorts of bad things).

So what we need is an atomic load of the pointer and increment of the counter. And here differential reference counting kicks in. Don't worry, soon you will understand why it's called 'differential'. Here is kind of structural diagram of the thing:


Contrast it to plain old reference counting:



I should say at once that differential reference counting in it's straightforward form requires double-word atomic RMW operations (either XADD and XCHG or CAS). However there are specialized implementations that does not require that, they usually combine the counter and the pointer (or a substitute) into a single word, for example, Wait-free Object Storage with Single-word Atomic Operations.

An object can be deleted when both inner counters are equal to 0 (strong_counter == basic_counter == 0). And the outer counter is used as a transient cache, it collects all "+1", while inner counter collects "-1", thus the name - differential. When a pointer is updated with a new value, outer counter value is transferred to the inner counter.

Differential reference counting features 2 types of pointers: there is no established terminology so let's call them a 'strong' pointer - one that supports arbitrary concurrent accesses (that is, one thread acquires/copies a pointer, and another thread updates the pointer), and a 'basic' pointer - one that is thread-safe as int (that is, supports either concurrent acquire/copy or exclusive updates).

Now let's decide on operations that we want to support.
  • We need an operation that allows to obtain a basic pointer from a strong pointer:
    acquire(strong_ptr) -> basic_ptr
  • We need a set of operations that allow to update a strong pointer:
    update(strong_ptr, basic_ptr)
    exchange(strong_ptr, basic_ptr) -> basic_ptr
    cas(strong_ptr, basic_ptr, basic_ptr) -> basic_ptr
  • And we need standard operations for basic pointers:
    acquire(basic_ptr)
    release(basic_ptr)

Now let's try to implement the whole thing.

Comments