Improved Lockfree SeqLock
Assume we have data structure which we want to protect with the SeqLock. Data structure is relatively big. Write frequency is low, but not as low as we want. So sometimes readers have to wait for writer, and sometimes readers have to retry, and retries are expensive as data structure is big.
I am going to trade some space for speed. X times more space we use, X times lower probability of retries. Quite fair deal. Blocking of readers is completely eliminated. Readers are lockfree.
Here is the algorithm sketch:
struct data_t
{
int seq; // sequence number
int data [1024]; // user data
};
struct seqlock_t
{
data_t* current;
data_t pool [16]; // 16 user objects are held
};
seqlock_t sl;
mutex_t mtx; // mutex to coordinate writers
void read()
{
for (;;)
{
data_t* d = sl.current; // load-consume
int seq1 = d->seq;
// load-load fence
if (seq1 % 2)
continue;
process(d); // user processing
// load-load fence
int seq2 = d->seq;
if (seq1 == seq2)
return;
}
}
write()
{
mtx.loсk();
data_t* d0 = sl.current; // load-consume
int idx = (sl.current - sl.pool + 1) % 16;
data_t* d = &sl.pool[idx];
d->seq += 1; // start writing new object
// store-store fence
modify(d0, d); // user processing
// store-store fence
d->seq += 1; // end writing new object
sl.current = d; // store-release
mtx.unloсk();
}
Note that object, which is now current, will be modified only after 15 previous objects will be reused. So this trick can greatly reduce probability of reader retries. The cost is increased memory consumption (note that still no dynamic memory allocation involved), and writers have to copy object in process of modification, instead of modification in-place (like in classical SeqLock). In some situations object is rewritten completely, so no copying will be involved. In classical SeqLock, because we have only one object, if object is loсked reader has to block (usually passive spin is used, i.e. yield). In this optimized SeqLock, because we have several objects, if objects is loсked then this means that reader just works with substantially outdated object, so it just have to reload pointer to current object, no blocking/spinning is required.