I went through a lengthy evaluation process of Riak recently, and came away with ...

siculars · on March 7, 2011

I've given a few talks about NOSQL in general and Riak in particular and blogged[0] about both. I agree with all of your comments and will try to add a bit. One of the major driving design considerations of Riak is predictable scalability, meaning that each unit of hardware you add to the system returns a predictable increase in performance. You see this throughout the design of the system. Homogeneity amongst the nodes - meaning that all nodes are the same and there are no special nodes - is a big win. So is tunable CAP, as you already mentioned. Another major win from my perspective is the modular nature of the code base. I think it is one of the best layouts of a large system that I have seen yet. If you look at the main Riak repo you will see that it is composed of a few submodules. Most of the system is compartmentalized in this fashion. The major advantage is that development of components can proceed at their own pace and components can be reused/mixed more easily.

Re. Key limitations when using the default bitcask backend:

There is a spreadsheet[1] that outlines the number of keys you can have in your system based on a few variables. Definitely worth checking out. Two things to note here. There is the hard limit of keys per machine - due to each machine's max memory - and there is the larger limit of keys per cluster based on the max memory of the cluster. Note that all this applies specifically when using the bitcask backend. There has been lots of talk about how to change this going forward and I know the Basho team is looking into it. Riak is quite interesting in that it can support a number of different backends - at the same time even. So you could have a bitcask backend for some data and a memory backend for other data. Since Riak is distributed, the implication is that beyond thinking about a single machine's resources you should also think about the cluster's total resources. Particularly, cluster total memory, total CPU cores and total disk spindles, that last one quite important but under-considered.

Re. max nodes in production:

One of the major considerations when dealing with a Dynamo-derived system like Riak is cross-node chatter. Riak does a lot of its magic by way of a gossip channel that is sending around all kinds of data. As the number of nodes in the system increases, the level of chatter increases. I think there needs to be more work in optimizing that chatter for larger cluster sizes, and I think that has been happening if you follow the change logs.

If you follow the mailing list or IRC you will notice that the primary concern amongst people new to Riak is querying. Riak has no secondary indexes, outside of Riak search, which is a separate download that is built on top of Riak (see code modularization). Riak has no native ordering. All of those things need to happen in the m/r phase. As it stands I think this is one of the major friction points to further adoption and why I have consistently recommended pairing Riak with Redis whenever possible.

These limitations aside, Riak represents, IMHO, the best mix of distribution, ease of use, raw power and growth potential of the currently in production NOSQL persisted datastore offerings. Philosophically I am very much attuned to the Dynamo world view and as a strict adherent think that Riak is the best representative from that perspective. Take that for what you will.

Disclaimer - Riak and Redis are my nosql databases of choice.

[0] http://siculars.posterous.com

[1] https://spreadsheets.google.com/ccc?key=0Ak4OBkABJPsxdEowYXc...

ot · on March 7, 2011

> (and each key has, on the order of, 20 bits of overhead, IIRC)

20 bits? Really? Less than an integer? Or did you mean bytes? (Not nitpicking here, I'm just curious)

aphyr · on March 7, 2011

I believe the limit using the current bitcask backend is (number of keys) * (40 bytes + average key size) * (replication factor) / (number of cluster nodes * memory capacity of the smallest node). If that factor grows above 1, you can't store any more.

IIRC a 64-node cluster with 24 GB of RAM per node will handle a few billion 32-byte keys, replicated to three nodes. For larger keyspaces, the current recommendation is to use innodb, which doesn't need to keep keys in memory.
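
A rough sketch of that arithmetic, solving the expression above for the key count at which the ratio reaches 1. The function and parameter names are made up for illustration. The 40-byte overhead is the figure recalled above, not a verified constant, and the sketch optimistically assumes all of each node's RAM is available for the key index:

```python
def max_bitcask_keys(nodes, smallest_node_ram_bytes, avg_key_bytes,
                     replication_factor=3, overhead_bytes=40):
    """Estimate how many distinct keys fit before the in-memory key index is full.

    Hypothetical helper: it simply inverts the back-of-the-envelope formula
    from the comment above and ignores memory used by the OS or by Riak itself.
    """
    # Usable memory is bounded by the smallest node, times the node count.
    cluster_ram = nodes * smallest_node_ram_bytes
    # Each key is held in memory once per replica.
    bytes_per_key = (overhead_bytes + avg_key_bytes) * replication_factor
    return cluster_ram // bytes_per_key


GIB = 1024 ** 3
keys = max_bitcask_keys(nodes=64, smallest_node_ram_bytes=24 * GIB,
                        avg_key_bytes=32)
print(f"{keys:,}")  # about 7.6 billion keys
```

That lands in the "few billion" range for the 64-node example. A real deployment would hit the ceiling sooner, since some RAM goes to everything other than the key index.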

grourk · on March 7, 2011

Given that Riak seems to implement buckets "for free" by essentially making them key prefixes under the hood, does the bucket name size need to be considered as part of the key size? E.g., if my bucket name is 32 chars and my keys in that bucket are 32 chars, should I be using 64 bytes for average key size?

This is a question I've gotten conflicting answers to.

SoftwareMaven · on March 7, 2011

I think 20 bits is too small. I've been looking for a source without success; the closest I've found[0] is 32-75 bytes per key which is something like 11 million keys per GB of RAM.

[0] http://www.quora.com/Matt-Heitzenroder

pkulak · on March 7, 2011

20 bits can support a million records. As an average, per node or cluster, it seems reasonable to me.

Devilboy · on March 7, 2011

Only a million records? That's waaaay too small.