Producer-consumer queues are one of the most fundamental components in concurrent systems, they represent means to transfer data/messages/tasks/transactions between threads/stages/agents. If you hope that there is a single magical "one-size-fits-all" concurrent queue (MS PPL and Intel TTB fallacy), sorry, there is no. So what flavours of queues are there?
Depending on allowed number of producer and consumer threads:
- Multi-producer/multi-consumer queues (MPMC)
- Single-producer/multi-consumer queues (SPMC)
- Multi-producer/single-consumer queues (MPSC)
- Single-producer/single-consumer queues (SPSC)
I hope this aspect is clear - for example, if you have only 1 producer and 1 consumer thread, you can use SPSC queue instead of more general MPMC queue, and as you may guess it will be significantly faster.
Depending on underlying data structure:
Array-based queues are generally faster, however they are usually not strictly lockfree. The drawback is that they need to preallocate memory for the worst case. Linked-list queues grow dynamically, thus no need to preallocate any memory up-front. And hybrid queues (linked-list of small fixed-size arrays) try to combine advantages of both.
For linked-list based queues depending on intrusiveness:
Intrusive queues are generally better performance-wise if you need to transfer already dynamically allocated data, because there is not need for additional node memory management. However they are inapplicable if your data is not dynamically allocated and/or you need to put the same message to unknown number of queues simultaneously.
For linked-list based queues depending on maximum size:
An unbounded queue can hold infinite number of messages, while bounded - up to some predefined limit. If the limit is reached further enqueue operations fail. Note that array-based queue are always bounded. On first sight unbounded queues are more attractive (because they allow you to not care). But they are not. They are dangerous. What will happen if your queue will grow up to 10^6 messages? 10^7? 10^8? 10^9? What? It should not happen? So why you put an unbounded queue in the first place? In 95% of cases you need a bounded queue, because it will enforce what you think should happen, and will save you from bad things.
For bounded queues depending on overflow behavior:
- Fails on overflow
- Overwrites the oldest item on overflow
Most queues fall into the former category, and the latter is quite specific. Consider that a producer submits some real-time data, and if a consumer can't catch up it's better to lose the oldest data rather than the newest.
Depending on requirements for Garbage Collection (object life-time management):
- Requires GC
- Does not requires GC
There are a lot of lockfree algorithms (not only queues) that require GC - they don't know when it's safe to free memory. For garbage collected environments like Java/.NET it's not a problem. However for non-garbage collected environments like C/C++ it may be a serious problem (if you don't have an object life-time management scheme yet, of course).
Depending on support for priorities:
- without support for message priorities
Priority queues reorder messages, so that consumers always dequeue an element with the maximum priority. Priority queues are generally significantly slower and scale worse.
Depending on ordering guarantees:
- provides causal FIFO/LIFO (strongest)
- provides per-producer FIFO/LIFO
- provides best-effort FIFO/LIFO (weakest)
- no ordering guarantees
The differences are actually quite subtle (not counting the last category), and they have to do with corner use-cases - namely in what use cases you expect particular ordering of messages. Best-effort FIFO can be significantly faster and scale better than causal FIFO.
Depending on forward progress guarantees for producers:
- Waitfree producers
- Lockfree producers
- Blocking producers
Depending on forward progress guarantees for consumers:
- Waitfree consumers
- Lockfree consumers
- Blocking consumers
If you are implementing a hard-real time system, then you need no less than wait-free producers and consumers. If a producer works in a context of signal or interrupt handler then it must be at least lockfree. However in most cases you are Ok with whatever guarantee. Generally, blocking is faster than lockfree, and lockfree is faster than waitfree.
Depending on expected usage:
- A queue usually contains very few or zero messages
- A queue usually contains substantial amount of messages
This aspect can also affect queue's design. One queue can be faster in former scenario, while another in the latter.
There is also an aspect related to behavior on failure:
- When a queue is empty/full consumers/producers get blocked
- When a queue is empty/full consumers/producers instantly get 'false'
While a lot of queues incorporate blocking, I believe that it's not a queue's concern. It should be solved externally with eventcounts. What if you want to block on one instance, and get 'false' from another? What if you want to block in one context, and get 'false' in another? What if you want to block on several queues?
So how much combinations can you count? I count more than 10'000 combinations! Huh!
Of course, not all combinations have optimal implementations (that is, they would be implemented the same way as some stronger combination), and, I think, some combinations just make no sense. But I hope you get some realization of the design space :)
Move on to the Queue Catalog