How does functional programming lead to generally fewer objects allocated? I would guess that it leads to more, because functional data structures require allocation to make changes where imperative programming can make destructive updates without allocating.
Many functional languages allow imperative parts of course, but even in that case how is it less than the object oriented approach rather than just the same?
It is not only about the absolute number of objects created, of course. It is also about how long they are kept and how often they change.
Immutability simplifies everything, including incremental and parallel GC, and hence reduces GC pauses.
This is the big idea behind Clojure and Datomic - immutability makes everything simpler (less complex runtimes) and more scalable (shared-nothing means no locks, mutexes, or busy waiting), and in the case of Erlang, fault tolerant (strong process isolation: no threads with a shared stack or any shared writable data).
What is funny is that everything was well researched and the right principles formulated by the Erlang team decades ago, and the resulting product just works for telecoms.
The two fundamental principles they described are: 1) POSIX threads and fault tolerance do not mix. 2) Imperative code and GC do not mix. While Golang, Lua, and PyPy are trying really hard to disprove the second principle, Java is an existence proof of it.
That's kind of a digression, but the Lua GC is pretty simple (which is good) and not so efficient. The LuaJIT GC has some interesting features but sadly Mike Pall did not complete all of his plans for it.
What makes LuaJIT so efficient regarding the GC is that you can allocate memory that is not managed by it (using the FFI). You have to anyway if you need to use a lot of memory, since, depending on the architecture, GC'd memory can be limited to as little as 1 GB.
The majority of functional languages support value types (in the stack/global sense), which many OO languages that follow Smalltalk don't. Also escape analysis tends to be relatively weak in most JVM implementations (maybe Graal is the exception here).
Immutable data makes it easier to collect garbage when moving live data out of nurseries, thus leading to shorter pauses than in imperative languages.
I think as you say immutability is probably the real benefit of functional for GCs as it simplifies a lot. But that doesn't reduce the number of objects allocated, which was the claim made.
The directed graph of immutable objects and links between them is by necessity a DAG. Only mutation can introduce cycles. If 90-95% of your objects are immutable, your GC algorithm can be specifically optimized for them. For instance, the mark and compact phases could run concurrently (interleaved) and in-place.
(Yes, this means that, from a GC implementor's POV, a lazy language like Haskell mutates like crazy.)
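To make the DAG point concrete, here is a minimal sketch (in Rust, since it appears elsewhere in this thread) of a persistent list with structural sharing. Every "update" allocates a new cell that points at older, already-existing data, so without mutation the reference graph can never form a cycle:

```rust
use std::rc::Rc;

// A persistent (immutable) singly linked list. Each "update" allocates a
// new head cell and shares the rest of the list; nothing is ever mutated,
// so new objects can only point at older ones - the graph stays a DAG.
enum List {
    Nil,
    Cons(i32, Rc<List>),
}

fn prepend(tail: Rc<List>, value: i32) -> Rc<List> {
    Rc::new(List::Cons(value, tail))
}

fn sum(list: &List) -> i32 {
    match list {
        List::Nil => 0,
        List::Cons(v, rest) => v + sum(rest),
    }
}

fn main() {
    let base = prepend(Rc::new(List::Nil), 1); // [1]
    let a = prepend(base.clone(), 2);          // [2, 1] - shares [1]
    let b = prepend(base, 3);                  // [3, 1] - also shares [1]
    println!("{} {}", sum(&a), sum(&b));
}
```

A GC specialized for this shape can rely on the "old objects never point at new ones" invariant that mutation would otherwise break.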
It seems to me that objects are being copied no matter what. In one world, mutation is common and the copying occurs within the GC subsystem. In the other world, mutation is rare and the programmer has to manage the copies manually. Pretending that the copies are wholly new objects which owe nothing to their predecessors strikes me as disingenuous. For all practical purposes, creating a new object that differs from its predecessor in one field is equivalent to mutating that field (except that it's far worse for performance).
Is moving a burden from the runtime to the programmer a good idea? Generally, no. There might be something about this particular case that makes it an exception to the general rule, but the point seems very far from proven.
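For what it's worth, the "new object differing in one field" pattern is explicit syntax in some languages. A small Rust sketch of the two styles side by side (`Point` is a made-up example type):

```rust
// A hypothetical three-field record, just for illustration.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Point {
    x: i32,
    y: i32,
    z: i32,
}

fn main() {
    let p1 = Point { x: 1, y: 2, z: 3 };

    // Functional-style update: a whole new value differing in one field.
    let p2 = Point { y: 20, ..p1 };

    // Imperative-style update: mutate the field in place.
    let mut p3 = p1;
    p3.y = 20;

    // Observationally the results are the same.
    assert_eq!(p2, p3);
}
```

Whether the functional version actually costs an allocation depends on the language and optimizer; for a small value type like this one it is typically just a few register moves.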
As I've been teaching myself Rust, this point has been hammered home hard. It's really eye opening to see all the copies plainly represented in code, and the lifetime checker is really good about highlighting where you should be making a copy vs. sharing a reference.
I haven't reached the stage where I can write performant Rust yet, since I've only recently started to truly grok the borrow checker, but I can see how in the longer term it will be easier to figure out how to make my code performant, since all the information required is front and center. It will be interesting to see whether that promise holds for Rust code as the community matures.
> It's really eye opening to see all the copies plainly represented in code and the lifetime checker is really good about highlighting where you should be making a copy vs sharing a reference.
From a Rust POV (a useful one, even if you're using a higher-level language), when you use a garbage collector, technically everything is owned by the garbage collector, and merely borrowed by you. That's why in most languages you can get away with making shallow copies everywhere, even when Rust would tell you that a `.clone()` is in order.
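A small sketch of that distinction, with made-up helper functions. In a GC'd language all three calls below would look the same; in Rust the borrow, the move, and the explicit deep copy are each spelled out:

```rust
// Sharing a reference vs. making an explicit copy. With a GC, the runtime
// owns everything and handing around references is free; in Rust the
// difference is visible at every call site.
fn total_len(words: &[String]) -> usize {
    // Borrow: no copy is made, the caller keeps ownership.
    words.iter().map(|w| w.len()).sum()
}

fn consume(words: Vec<String>) -> usize {
    // Move: the caller gave up the Vec entirely, so no clone is needed.
    words.into_iter().map(|w| w.len()).sum()
}

fn main() {
    let words = vec!["garbage".to_string(), "collector".to_string()];
    let n1 = total_len(&words);        // borrow: `words` is still usable
    let n2 = consume(words.clone());   // explicit deep copy via .clone()
    let n3 = consume(words);           // move: `words` is gone after this
    assert_eq!(n1, n2);
    assert_eq!(n2, n3);
}
```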
> For all practical purposes, creating a new object that differs from its predecessor in one field is equivalent to mutating that field (except that it's far worse for performance).
(0) If the object has, say, just two or three fields (which isn't uncommon in functional languages, since they really tend to decompose everything in a piecemeal fashion), this is no big deal.
(1) If a functional language has substructural types (which admittedly most don't), the compiler can even tell, purely from type information, when an object won't be used anymore and can be overwritten anyway, as though you had used imperative assignment all along.
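Rust's affine ownership is one such substructural discipline, so point (1) can be sketched there. When a function takes a value by move, the compiler knows no one else can still observe it, so "returning a modified copy" may safely reuse the original storage in place:

```rust
// Affine types in action: `v` is moved into the function, so the compiler
// knows it is uniquely owned here. Updating it in place is unobservable
// to the caller - semantically identical to building a fresh vector.
fn bump_all(mut v: Vec<i32>) -> Vec<i32> {
    for x in v.iter_mut() {
        *x += 1;
    }
    v
}

fn main() {
    let v = vec![1, 2, 3];
    // After this call the old `v` binding is moved and unusable, which is
    // exactly what licenses the destructive update inside `bump_all`.
    let v = bump_all(v);
    assert_eq!(v, vec![2, 3, 4]);
}
```

A functional language with uniqueness or linear types (e.g. Clean, or linear Haskell) can apply the same reasoning without exposing mutation in the surface language.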
So, best case, with proper type annotation and a really good optimizer, you get the performance you would have had with plain old mutability? Pardon me if I remain unconvinced that immutability is solving the problem adequately.
> you get the performance you would have had with plain old mutability
And muuuch better type safety. You can only mutate data in-place if either no one else has access to it, or it's wrapped in a mutex and you've taken a lock.
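Both halves of that condition show up directly in Rust: `&mut` proves exclusive access statically, and `Mutex` establishes it at runtime. A minimal sketch of the locked case:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    // Shared, mutable state: mutation is only possible while holding
    // the lock, which guarantees no one else has access at that moment.
    let counter = Arc::new(Mutex::new(0));

    let mut handles = Vec::new();
    for _ in 0..4 {
        let counter = Arc::clone(&counter);
        handles.push(thread::spawn(move || {
            // Lock taken: we now have exclusive access and may mutate.
            *counter.lock().unwrap() += 10;
        }));
    }
    for h in handles {
        h.join().unwrap();
    }

    assert_eq!(*counter.lock().unwrap(), 40);
}
```

Without the `Mutex`, the compiler simply refuses to let multiple threads mutate the shared counter, which is the type-safety point above.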
To be honest, when I said that I was thinking about how much less boxing occurs in (statically typed) functional languages. But as you and others said in the thread, immutability can actually result in the creation of more objects; they just have different life cycles and properties that make the work of the GC easier.