Eventcounts

An eventcount is a condition variable for lockfree algorithms. That is, it permits a thread to efficiently wait for an arbitrary condition to occur, but unlike condition variables, an eventcount does not require a mutex to protect the state (it's kind of stupid to surround a lockfree data structure with a mutex to permit conditional waiting).

An eventcount can be implemented as a fine-grained, fairly efficient and reusable component. Fine-grained in a sense that it supports notification of a set of interested threads rather than single/all threads (PTHREAD FUNCTIONS). Fairly efficient in a sense that producer (notifying thread) overhead is a single load and a single conditional branching on fast-path, and no consumer (blocking thread) overhead on fast-path. Reusable in a sense that it supports waiting on an arbitrary user-supplied predicate.

Eventcounts allow to separate a lockfree data structure and blocking/signaling logic, so that there is generally no need to reimplement and inject it into each and every lockfree algorithm. For example, some people tend to implement so called blocking producer-consumer queues (instead of returning 'false' it blocks until new elements available), that's not only complicates the implementation, it also does not permit to poll, for example, several producer-consumer queues.

Let's consider an example. Assume that we have a set of non-blocking producer-consumer queues and deques, that we want to poll from. So the predicate that we want to wait for is "at least one of the containers is not empty". It's very easy to implement with an eventcount:

eventcount ec;

// non-blocking consume (basically, predicate)

void* consume_impl()

{

// poll a set of non-blocking containers

return high_prio_queue.dequeue()

|| normal_prio_queue.dequeue()

|| work_stealing_dequeue.pop()

|| low_prio_queue.dequeue()

|| global_root_task_queue.dequeue();

}

// let's turn it into a blocking consume

void* consume()

{

void* data = 0;

if (data = consume_impl())

// it's the fast-path

// as you see, the eventcount adds nothing here

return data;


// so all containers as if empty, so it's slow-path

for (;;)

{

// raise the cock of the eventcount

ec.prepare_wait();

// check the predicate

if (data = consume_impl())

{

ec.cancel_wait();

return data;

}

// the predicate is still not satisfied,

// so we block

ec.commit_wait();

if (data = consume_impl())

return data;

}

}

// and here is a signalling producing function

void produce(void* data)

{

some_queue_or_deque.enqueue(data);

// just as with a condition variable

// (but without a mutex)

ec.notify();

}

class condition_variable

{

eventcount ec_;

public:

void wait(mutex& mtx)

{

ec_.prepare_wait();

mtx.unlock();

ec_.wait();

mtx.lock();

}

void signal()

{

ec_.notify_one();

}

void broadcast()

{

ec_.notify_all();

}

};

http://software.intel.com/en-us/forums/showthread.php?t=62364