A Tale of Two Queues

antirez · on Feb 23, 2013

A high-quality benchmark for once! Thanks.

A few remarks:

1) It's worth trying Redis 2.6 against this. It may perform better or worse; I'm not sure, but more probably better.

2) Believe it or not, Redis Pub/Sub has never been tuned for speed so far, nor profiled or optimized, because as far as I can tell nobody asked for more performance: at the order of magnitude we see with both Redis and ZMQ, it is pretty hard to hit the wall. However, there are demanding applications, so it's probably worth doing.

3) Maybe ZMQ only uses one core as well; otherwise, for an absolutely fair comparison, N Redis nodes should be used simultaneously. Pub/Sub is the kind of application where sharding is sometimes really easy: just shard by channel. In general, with Redis you have three options to go distributed with Pub/Sub.

Option A) Have N nodes and shard by channel.

Option B) Use replication, as it also does PUBLISH of messages on slaves.

Option C) Use Redis Cluster, though it is currently in alpha. It already propagates messages across the whole cluster, so it is very easy to implement a reliable HA Pub/Sub system with it. The propagation is not smart yet: every message is propagated to every node. But in Redis the cost of Pub/Sub is proportional to the number of receivers, so this is usually not a big issue, and we'll improve this aspect in the future anyway.
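
To make Option A concrete, here is a minimal Python sketch of sharding by channel. The node addresses and the `node_for_channel` helper are hypothetical, made up for illustration; the only idea taken from the comment above is that publishers and subscribers both derive the node from the channel name, so they always meet on the same instance.

```python
import zlib

# Hypothetical node addresses; any list of N Redis instances would do.
NODES = ["redis-0:6379", "redis-1:6379", "redis-2:6379"]


def node_for_channel(channel, nodes=NODES):
    """Pick the node that owns a channel, using a stable hash of its name."""
    # zlib.crc32 gives the same value in every process, unlike the
    # built-in hash(), which is randomized per interpreter run for strings.
    index = zlib.crc32(channel.encode("utf-8")) % len(nodes)
    return nodes[index]


# Publisher and subscriber call the same function with the same channel
# name, so both connect to the same node.
publisher_node = node_for_channel("news.sports")
subscriber_node = node_for_channel("news.sports")
```

Note that a plain modulo like this remaps most channels when the number of nodes changes, so it only fits a fixed set of nodes.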

spamizbad · on Feb 23, 2013

Question: Was gevent, used by hiredis, using select or kqueue? IIRC gevent will default to select on OSX. kqueue could be faster.

Also, both kqueue and select were slightly buggy in OSX: http://pod.tst.eu/http://cvs.schmorp.de/libev/ev.pod#OS_X_AN...

Not sure if that's still the case.

Would be interesting to see how these benchmark on a Linux or FreeBSD machine.

loeg · on Feb 23, 2013

Zed Shaw benchmarked epoll vs poll a while back[0]. It looks like it really depends on the proportion of "active" clients to the total number. I would expect similar results for kqueue vs select (and as you point out, kqueue was horribly broken in OS X for a while).

[0]: http://web.archive.org/web/20120225022154/http://sheddingbik...

stephen_mcd · on Feb 24, 2013

gevent wasn't used in the benchmarks

ajays · on Feb 23, 2013

I'll wait for AntiRez to chime in. It is possible that ZMQ has better performance than Redis, because the Redis server has to parse the request and then act on it, and then there's the marshalling on the client side; with ZMQ, you're just sending a command. I'm not sure if Redis has an efficient binary protocol, but having that may eliminate some of the bottleneck?
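
For reference on the protocol question: the Redis wire protocol frames a command as an array of length-prefixed strings, so payloads are binary safe even though the framing itself is text. The sketch below shows what a PUBLISH looks like on the wire; `encode_command` is a hypothetical helper written for this example, not the code of any particular client library.

```python
def encode_command(*args):
    """Encode a command as a Redis protocol array of bulk strings."""
    # "*<count>" announces how many arguments follow.
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        # Each argument is "$<byte length>", then the raw bytes.
        parts.append(b"$%d\r\n%s\r\n" % (len(arg), arg))
    return b"".join(parts)


wire = encode_command(b"PUBLISH", b"chan", b"hi")
# wire == b"*3\r\n$7\r\nPUBLISH\r\n$4\r\nchan\r\n$2\r\nhi\r\n"
```

Because every argument carries its length, the server reads a fixed number of bytes per argument and never has to scan the payload.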

reinhardt · on Feb 23, 2013

From the graphs it doesn't look like ZMQ has generally better performance; it depends on the number of clients and whether they run on Python or Go. It's pretty interesting:

- For up to 4 clients, (buffered) redis is better than 0MQ in Python but worse in Go.

- For more than 4 clients it's exactly the opposite: redis is worse in Python but better in Go.

I'd be interested to hear an explanation for this, even if it turns out that a graph line was mislabeled :)

diroussel · on Feb 23, 2013

Seems that latency was not measured. I'd expect it to be much lower in 0mq. First there is no intermediate server, and secondly you don't have to wait for the flush thread to send a message.

zokier · on Feb 23, 2013

I think there must be something seriously wrong with zmq golang bindings if the performance plummets like that. I mean, it's significantly slower than the python version.

SethMurphy · on Feb 23, 2013

This seems to be the biggest one: "GoZMQ does not support zero-copy."

There still seems to be some work to do to improve the Golang bindings: https://github.com/alecthomas/gozmq#caveats

SethMurphy · on Feb 23, 2013

If all things are equal (as they appear to be here), I would go with 0MQ, if only to avoid the hassle of adding another server to manage. That said, 0MQ is quite the black box, and that has its own disadvantages.

I would be curious to see what would happen if gevent were added to the Python code.

njharman · on Feb 23, 2013

You're still adding a server. But you have to write it first.

SethMurphy · on Feb 23, 2013

A Redis database would be a server in addition to the server you write in either case. 0MQ uses no server, it is just a library.

stephen_mcd · on Feb 24, 2013

In the Python flavour of these tests there's no concurrency within a single OS process and all the work is CPU bound, so adding gevent wouldn't achieve anything.

Also as njharman mentioned, you're still running a separate broker with zmq, so the number of components doesn't change.

daneturner · on Feb 23, 2013

This is a fun read, and the external links alone are worth it for me (msgpack, 0mq guide, python multiprocessing). But frankly, the conclusion says it all: "Conclusion. What can we take away from all of this? To be brutally honest, not much... ."

stefantalpalaru · on Feb 23, 2013

Is there a way to ensure that both redis and zeromq have the same buffer size / flushing interval?

stephen_mcd · on Feb 24, 2013

This is a really good question. I don't know the details of how zmq approaches buffering, but I suspect it's much smarter than the approach used by the buffered Redis client in these tests (flush every 1000 messages and every 200ms).
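
For anyone who wants to experiment with the buffer size and flush interval, here is a rough Python sketch of the "flush every 1000 messages and every 200ms" policy. It is not the client used in the benchmark: the `BufferedPublisher` class, its parameters and the `send_batch` callback are all assumptions made for illustration, and the time check runs only when a message is published rather than from a separate flush thread.

```python
import time


class BufferedPublisher:
    """Collect messages and hand them off in batches (illustrative only)."""

    def __init__(self, send_batch, max_messages=1000, max_delay=0.2,
                 clock=time.monotonic):
        self.send_batch = send_batch      # callable that receives a list of messages
        self.max_messages = max_messages  # flush once this many are buffered
        self.max_delay = max_delay        # or once this many seconds have passed
        self.clock = clock                # injectable so the policy can be tested
        self.buffer = []
        self.last_flush = clock()

    def publish(self, channel, message):
        self.buffer.append((channel, message))
        too_many = len(self.buffer) >= self.max_messages
        too_old = self.clock() - self.last_flush >= self.max_delay
        if too_many or too_old:
            self.flush()

    def flush(self):
        if self.buffer:
            self.send_batch(self.buffer)
            self.buffer = []              # start a fresh list; the batch is handed off
        self.last_flush = self.clock()


# Usage with a stand-in transport that just records each batch.
batches = []
publisher = BufferedPublisher(batches.append, max_messages=3)
for i in range(7):
    publisher.publish("chan", str(i))
publisher.flush()  # push out whatever is left
```

Making both limits constructor arguments is the point of the sketch: the same two knobs could then be varied to see how much of a benchmark difference comes from batching alone.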