"Two years in, Go has never been our bottleneck, it has always been the database...

jerf · on Aug 27, 2013

Profile your code. You might be surprised.

You might not.

Or, you might be surprised when you find something like "I'm making 5000 DB queries?" instead of your language being slow.

But certainly if "enough" people take my advice here who have not profiled one of their web pages before, there's a non-empty set of them that are going to go "Oh crap, I didn't realize that's what was so slow, I just assumed it was the database!" Not every web app is a glorified select statement.

And there'll also be quite a few people who discover that their page isn't "slow" or anything, but that the CPU vs. IO split is closer to 50/50 than they realized.
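As a rough illustration of the kind of measurement described above, here is a minimal sketch that counts queries and separates database time from total request time. It assumes an in-memory SQLite database as a stand-in for a real one; `CountingConnection`, `render_page`, and `profile_request` are hypothetical names invented for this example, not part of any framework.

```python
import sqlite3
import time


class CountingConnection:
    """Wraps a sqlite3 connection; records how many queries ran and their total time."""

    def __init__(self, conn):
        self.conn = conn
        self.query_count = 0
        self.db_seconds = 0.0

    def execute(self, sql, params=()):
        start = time.perf_counter()
        try:
            return self.conn.execute(sql, params).fetchall()
        finally:
            self.query_count += 1
            self.db_seconds += time.perf_counter() - start


def render_page(db, user_ids):
    # Hypothetical handler that issues one query per user (easy to miss without profiling).
    names = []
    for uid in user_ids:
        rows = db.execute("SELECT name FROM users WHERE id = ?", (uid,))
        names.append(rows[0][0])
    return ", ".join(names)


def profile_request(n_users=50):
    """Returns (query_count, seconds spent in the database, total seconds)."""
    raw = sqlite3.connect(":memory:")
    raw.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    raw.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [(i, f"user{i}") for i in range(n_users)],
    )
    db = CountingConnection(raw)
    start = time.perf_counter()
    render_page(db, range(n_users))
    total = time.perf_counter() - start
    return db.query_count, db.db_seconds, total
```

Calling `profile_request(50)` reports 50 queries for one page, plus the two timings needed to see how the request splits between database and application work. The actual split depends entirely on the machine and workload.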

bhauer · on Aug 27, 2013

In my experience, with slower platforms/languages, while it may be conventional wisdom that the database is the bottleneck, that's not actually the case in many circumstances.

Certainly in circumstances where you're doing a complex query involving fields that are not indexed or several joins, you're going to be waiting on the database.

But if you're just fetching rows by ID or indexed fields, slower platforms and languages end up being a bigger bottleneck than modern databases. Sometimes this is masked somewhat by the fact that the database drivers and/or ORM are slow, so from the application's perspective, the "database" is the bottleneck. But one should not confuse the drivers and ORM for the database.

jd007 · on Aug 27, 2013

If you are talking about a breakdown of time consumed during a single request's processing, then yes, on slower platforms/languages/frameworks, the database access portion may not be the most significant percentage of time used. But this is not that relevant, as even the slower platforms can usually handle a single request reasonably fast.

What I was talking about was more the scaling of a system, i.e. what happens when your architecture needs to handle lots of requests. In this case, it is very rare for the application server part of your architecture to be a bottleneck in scaling because it is generally stateless (for normal web apps at least) and hence very easily horizontally scaled out. Of course a faster platform will allow you to use fewer servers, but 15 servers on Go vs. 20 servers on Python is not that big of an issue.

bhauer · on Aug 27, 2013

"Reasonably fast" is in the eye of the beholder.

In my opinion, many popular platforms and frameworks are not reasonably fast at providing a response in real-world applications. As a result, many web sites are frustratingly slow (for example, a popular site used for hosting source code repositories). If those sites were to capture and share their profiling data (including time spent in drivers and the ORM), I would guess the database proper would not be as great a bottleneck as conventional wisdom says.

Perceived slowness is latency, and horizontal scaling doesn't necessarily address latency. Horizontal scaling may help alleviate an over-taxed CPU dealing with too many concurrent requests, but if a single request in isolation runs in 300ms, it will not run quicker than 300ms. It may run worse when contending for CPU capacity versus concurrent requests, but not better, unless a faster CPU is dropped in.
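The distinction can be put in numbers with a simple capacity model. This is only a sketch under stated assumptions: each worker handles one request at a time, there is no queueing, and `workers_per_server` is a hypothetical parameter chosen for illustration.

```python
def capacity_rps(servers, workers_per_server, service_time_ms):
    """Upper bound on requests/second, assuming one request per worker at a time."""
    return servers * workers_per_server * 1000 / service_time_ms


def servers_needed(load_rps, workers_per_server, service_time_ms):
    """Smallest server count whose capacity covers load_rps (integer ceiling division)."""
    return -(-(load_rps * service_time_ms) // (workers_per_server * 1000))


def best_case_latency_ms(servers, service_time_ms):
    """Latency of one uncontended request; the server count does not appear in the result."""
    return service_time_ms
```

In this model, doubling the servers doubles `capacity_rps` while `best_case_latency_ms` stays at 300 for a 300ms request. Halving the service time is the only change here that improves both: it halves the latency and roughly halves `servers_needed`.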

Performance matters, even in the world of horizontal scalability. Performance brings reduced latency (user experience) and reduced cost (size of cluster). If we can get that paired with an efficient, enjoyable developer experience, then yay for us.

Finally, "15 Go servers == 20 Python servers" seems a little unfair to Go.

jacques_chester · on Aug 28, 2013

I think the confusion arises because jd007 is discussing throughput and you are discussing latency.

Latency and throughput can be inversely related depending on the precise architecture of a system (queueing is the classic mechanism that trades them off).

But in terms of "horizontal scaling", the goal really is to improve total throughput, often at the price of a tax on latency due to coordination costs.

bhauer · on Aug 28, 2013

True enough. This thread of conversation was kicked off by the premise that most applications are bottlenecked on their database server. When I said, "not necessarily," I meant as far as latency is concerned--for any given request on a slower platform, it is likely that the database's contribution is actually a minority.

That is the conventional wisdom I find in need of disruption.

But you are correct. Since a conventional database server can be more difficult to scale horizontally in its own right, even that small latency contribution, when multiplied by the number of queries being run by a wide array of application instances, may ultimately mean the database server is the first observed bottleneck. By which I mean, the first device to reach 100% CPU without the simple recourse to just throw more money at Amazon and spin up another instance to solve the problem.

So I buy that.

But when I see slow web applications--when I criticize a site for being slow--I am always talking about latency. When we look under the hood, assuming the application is not being silly with its queries or doing a fundamentally challenging work-load, user experience slowness (latency again) usually originates from the application's own code or slow platform.

For example, I've observed applications that require 200ms of server-side time to render a login page. Behind the scenes, I may observe that they are badly designed and include one or two trivial (but utterly unnecessary) queries. Still, those two queries can be fulfilled by modern hardware in ~5 to 10ms. The remaining ~190ms of server processing is on the application. To my mind, that is unacceptable. A login page should be delivered in ~3ms of server time (under load!) on modern hardware.
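The arithmetic behind that example, written out as a small sketch. The figures are the approximate ones quoted in the comment (200ms total, 5 to 10ms of queries); `latency_breakdown` is a hypothetical helper name.

```python
def latency_breakdown(total_ms, db_ms):
    """Split a request's server-side time into database and application parts."""
    return {
        "app_ms": total_ms - db_ms,
        "db_ms": db_ms,
        "db_share_pct": 100 * db_ms / total_ms,
    }


# The two ends of the quoted range for query time.
worst_case = latency_breakdown(200, 10)  # db_share_pct is 5.0
best_case = latency_breakdown(200, 5)    # db_share_pct is 2.5
```

On those numbers the database accounts for roughly 2.5% to 5% of the response time, so eliminating both queries entirely would still leave about 190ms of application time.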

And back to the OP, Go is a platform that brings JVM-class speed (the capability to return a login page in ~3ms) to those who can't stomach Java. Bravo to Go!

encoderer · on Aug 28, 2013

I don't entirely disagree with you but I do think you're not doing anybody any favors by "disrupting this conventional wisdom."

When I was an engineer at Formspring I profiled our social graph service, which was basically a PHP client hitting a Python service that queried Cassandra. Thrift was used to communicate between PHP and the Graph Service and, being Cassandra, Thrift was used between the Python Cassandra client and the Cassandra server. So, two-way serialization, twice.

In the end, this isn't a bad design. I didn't write it then but if I were re-writing it now I'd probably use protobuffs but aside from that, it was a clean separation of concerns and fit in nicely with our larger SOA.

Point being, though, that serialization is CPU expensive. Reading from Cassandra was blazingly fast compared to the work Thrift was doing.
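A minimal way to see this kind of cost is to time the serialization step on its own. The sketch below uses the standard library's `json` purely as a stand-in for Thrift (an assumption made for illustration; the two have different performance characteristics), and the `friends` payload and function names are hypothetical.

```python
import json
import time


def round_trip(payload):
    """Serialize then deserialize, as one hop between two services would."""
    return json.loads(json.dumps(payload))


def two_hops(payload):
    # Models the payload crossing two serialized hops:
    # client <-> graph service, and graph service <-> datastore.
    return round_trip(round_trip(payload))


def time_call(fn, arg, repeats=200):
    """Average wall-clock seconds per call of fn(arg)."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn(arg)
    return (time.perf_counter() - start) / repeats


# Hypothetical payload: one user's friend list.
friends = {"user_id": 42, "friend_ids": list(range(5000))}
```

Comparing `time_call(two_hops, friends)` against the time of the underlying read shows how much of a request is spent encoding and decoding rather than fetching data. The result varies by machine, payload, and serialization format.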

All that said, I think noob engineers should be taught that network operations (db queries included) are at least an order of magnitude slower than, say, opening and writing to a local socket. Experienced engineers can see the balanced view of things and agree with the point you're making, so advocating it, IMO, will only serve to make you look smart and confuse noobs.

bhauer · on Aug 28, 2013

I appreciate your point of view. It's not my intent to confuse noobs.

You're right. With that in mind, the conventional wisdom is worth retaining so as to instill the proper fear of treating database queries as trivial. Thinking of queries as cheap--or not thinking about queries at all--is what gets applications into a state where a single request runs dozens if not hundreds of queries to deliver what is effectively static content! :)
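For readers who have not run into the pattern, here is a small sketch of a request that issues one query per item, next to a batched version that fetches the same rows in a single query. It assumes an in-memory SQLite database and a hypothetical `users` table; the function names are invented for this example.

```python
import sqlite3


def make_db(n_users=50):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [(i, f"user{i}") for i in range(n_users)],
    )
    return conn


def names_one_by_one(conn, ids):
    """One query per id. Returns (names, number of queries issued)."""
    names = []
    for uid in ids:
        row = conn.execute("SELECT name FROM users WHERE id = ?", (uid,)).fetchone()
        names.append(row[0])
    return names, len(ids)


def names_batched(conn, ids):
    """A single IN (...) query for all ids. Returns (names, number of queries issued)."""
    placeholders = ",".join("?" * len(ids))
    rows = conn.execute(
        f"SELECT id, name FROM users WHERE id IN ({placeholders})", ids
    ).fetchall()
    by_id = dict(rows)
    # Preserve the caller's ordering, since IN (...) does not guarantee it.
    return [by_id[uid] for uid in ids], 1
```

Both functions return the same names for the same ids, but the first issues 50 queries for 50 users and the second issues one. Databases cap the number of bound parameters, so very large id lists would need to be chunked.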

That said, I've met more than a few senior folks who continue to fiercely stick with the premise that database queries trump all. As someone else in this thread said, profiling can be illuminating, even for senior developers. But it seems on that point, we're all in agreement.

paddyforan · on Aug 27, 2013

I think one of my favourite things about Go is that it makes it easy and obvious to code well. Generally the first thing that comes to my mind is going to be performant and extendable.

The reason this is the case, I think, is that Go strives to make algorithmic complexity clear: you know when you're allocating memory, but you don't need to jump through hoops to do it. You know the rough performance costs of what you're doing because the core team works hard to make it obvious. For example, some regular expression features (lookahead, if I remember correctly) aren't implemented in the Go standard library's regular expression package, because they're impossible to achieve in O(n) time. https://groups.google.com/forum/#!topic/golang-nuts/7qgSDWPI...

This level of care in making simple, clearly-defined tools with known properties makes it easy to code well. Ruby, Python, PHP, NodeJS... you can shoot yourself in the foot and not even notice.

encoderer · on Aug 28, 2013

Ah yes the old "engineers are more expensive than machines" adage.

Thing is, that's true for some small number of machines. But when you build scalable systems and actually scale them you get to a point where the balance tips. And it just so happens that building those systems is exactly what the Golang team has in mind.

noelherrick · on Aug 27, 2013

Horizontal scaling does not take away latency. Also, iron.io saved a bunch of money by not using extra instances.

nexneo · on Aug 27, 2013

That's not true for Ruby, or at least it needs lots of fine-tuning to make this statement true.

asdasf · on Aug 27, 2013

Throwing servers (money) at the problem is often incredibly wasteful, and exacerbates the other problem you mentioned (the database). Having 500 servers all connecting to the database is not so awesome. Having 10 is a lot nicer.