Improved Lockfree SeqLock

Assume we have data structure which we want to protect with the SeqLock. Data structure is relatively big. Write frequency is low, but not as low as we want. So sometimes readers have to wait for writer, and sometimes readers have to retry, and retries are expensive as data structure is big.

I am going to trade some space for speed. X times more space we use, X times lower probability of retries. Quite fair deal. Blocking of readers is completely eliminated. Readers are lockfree.

Here is the algorithm sketch:

struct data_t

{

int seq; // sequence number

int data [1024]; // user data

};

struct seqlock_t

{

data_t* current;

data_t pool [16]; // 16 user objects are held

};

seqlock_t sl;

mutex_t mtx; // mutex to coordinate writers

void read()

{

for (;;)

{

data_t* d = sl.current; // load-consume

int seq1 = d->seq;

// load-load fence

if (seq1 % 2)

continue;

process(d); // user processing

// load-load fence

int seq2 = d->seq;

if (seq1 == seq2)

return;

}

}

write()

{

mtx.loсk();

data_t* d0 = sl.current; // load-consume

int idx = (sl.current - sl.pool + 1) % 16;

data_t* d = &sl.pool[idx];

d->seq += 1; // start writing new object

// store-store fence

modify(d0, d); // user processing

// store-store fence

d->seq += 1; // end writing new object

sl.current = d; // store-release

mtx.unloсk();

}

Note that object, which is now current, will be modified only after 15 previous objects will be reused. So this trick can greatly reduce probability of reader retries. The cost is increased memory consumption (note that still no dynamic memory allocation involved), and writers have to copy object in process of modification, instead of modification in-place (like in classical SeqLock). In some situations object is rewritten completely, so no copying will be involved. In classical SeqLock, because we have only one object, if object is loсked reader has to block (usually passive spin is used, i.e. yield). In this optimized SeqLock, because we have several objects, if objects is loсked then this means that reader just works with substantially outdated object, so it just have to reload pointer to current object, no blocking/spinning is required.