I forgot to mention it (and I guess I'll need a part 2 to fully explain some of the subtler points), but it's important to note that ANY class can be extended by these extension methods. Even java.lang.String and its ilk.
That wasn't exactly clear from the post, but it was already getting too long as it is.
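To make the "ANY class" point concrete, here is a minimal sketch. The protocol name `BloomFilterable` and the function `make-hash` come up elsewhere in this thread, but the definition below is my own guess at its shape, not the code from the post.

```clojure
;; Assumed protocol shape; the post's actual definition may differ.
(defprotocol BloomFilterable
  (make-hash [this] "Return an integer hash for this value."))

;; Extend classes we do not own, including java.lang.String.
(extend-protocol BloomFilterable
  String
  (make-hash [s] (.hashCode s))
  Long
  (make-hash [n] (.hashCode n)))

(make-hash "abc") ; dispatches to the String implementation
(make-hash 42)    ; dispatches to the Long implementation
```

Neither `String` nor `Long` is modified here; the protocol just learns how to handle them.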
One of the more legitimate uses I've found for monkeypatching is fixing bugs in libraries you don't control when the maintainer is for whatever reason unresponsive to your pleas. I.e. replacing someone's broken method despite lack of a pre-existing contract. Is that supported?
Every function in a protocol is namespaced. There's nothing preventing you from defining a new function with the same name in a different namespace. This doesn't actually replace anything, though, and any references to the old function will have to be changed to point to the new one.
It is also possible to extend a protocol a second time for a given type/record, which will overwrite the previous implementation. This can be occasionally useful (mostly for incremental development), but for the most part runs counter to the philosophy of Clojure. Seeing it used in production Clojure code would raise a lot of alarm bells.
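A small sketch of that second point, using a made-up `Greeter` protocol (hypothetical, purely for illustration). Extending the protocol again for the same type replaces the earlier implementation:

```clojure
;; Hypothetical protocol for illustration only.
(defprotocol Greeter
  (greet [this]))

(extend-protocol Greeter
  String
  (greet [s] (str "Hello, " s)))

(def before (greet "world")) ; uses the first implementation

;; Extending again for the same type overwrites the previous implementation.
(extend-protocol Greeter
  String
  (greet [s] (str "Hi, " s "!")))

(def after (greet "world")) ; uses the replacement
```

Note that this only helps when the broken behavior is a protocol function to begin with; it does not reach into arbitrary methods.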
So anything that requires changing all the references to the new one is a non-starter. I want something that lets me fix it in-place without going and hacking their source directly.
If you could do something like "monkey patch this, but only for calling code inside my private namespace" that would probably get you 90% of the way there.
No, I understand. I used monkeypatching to work around some problems with numerics libraries in Ruby too. But let's face it, it's not as good a technique as this. And I didn't even touch on the ability to fold a closure around a class instance using (reify ...) or the differences between declaring methods in deftype or using extend-type.
There is a lot of power in this relatively simple concept without going into the gnarly tilt-the-whole-world problems that monkeypatching introduces.
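For anyone curious about the three things mentioned above, here is a rough sketch side by side. The `Describable` protocol and the `Point`, `Circle`, and `counter-describer` names are all invented for this example.

```clojure
;; Hypothetical protocol for illustration only.
(defprotocol Describable
  (describe [this]))

;; 1. Methods declared inline in deftype: fields are directly in scope.
(deftype Point [x y]
  Describable
  (describe [_] (str "Point(" x ", " y ")")))

;; 2. Methods attached after the fact with extend-type:
;;    fields are reached through the instance.
(deftype Circle [r])

(extend-type Circle
  Describable
  (describe [c] (str "Circle(r=" (.r c) ")")))

;; 3. reify: an anonymous instance that closes over local state.
(defn counter-describer []
  (let [calls (atom 0)]
    (reify Describable
      (describe [_] (str "called " (swap! calls inc) " time(s)")))))
```

The inline form needs you to own the type; `extend-type` works on types defined elsewhere; `reify` is handy when you want a one-off object wrapped around a closure.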
I'm not exactly defending monkeypatching. It's better than nothing, and if used judiciously it doesn't have to be especially dangerous. All the rope you need and all that...
And, what you're describing is very cool on its own merits.
What I want to know is whether this will get you out of the same messes that monkeypatching can. If it will, but in a more controlled way, then that's really awesome.
`defprotocol` and `extend-protocol` remind me of type classes and type class instances in Haskell. The latter provide the same kind of ad-hoc polymorphism - i.e. you can tell the compiler that your custom datatype implements a predefined protocol, and then code that talks to that protocol automatically works with your new datatype, in a statically type-safe and (I believe) efficient way.
For example, this (built-in) type class is similar to Java's `toString()`:
-- simplified; the real Show class is already defined in the Prelude
class Show a where
    show :: a -> String
-- works with any instance of Show
showTwice :: (Show a) => a -> String
showTwice s = show s ++ show s
data Fruit = Orange | Apple
instance Show Fruit where
    show Orange = "orange"
    show Apple = "apple"
x = Orange
showTwice x -- => "orangeorange"
Can anyone more familiar with Clojure (or Haskell for that matter) compare and contrast?
A popular meme that's going around the Clojure community right now is that Clojure's types and protocols are "dynamic type classes". Clojure's offering does not provide all of the same things that Haskell's type system gives -- for example, method selection based on return type.
And the benefit here is that you aren't doing "my_int_instance.make_hash()" so everyone can have their own definition of make-hash, right?
What's the difference between this and defining a function in Python/Ruby (rather than monkeypatching)?
# Python
def make_hash(val):
    if not isinstance(val, MyIntHolder): raise TypeError
    return val.data
That's a namespaced function that checks it was passed a MyIntHolder and returns the data[1]. Each library/module can define its own version.
I mean this as an honest (and stupid) question. I'm fascinated with Clojure and am working on learning more, but none of the Protocol examples I've seen have given me that "Ah ha!" moment I've had with so many other parts of Clojure. Can you help me understand?
[1] Side question, I didn't understand "(.x this)" in your example, what is .x?
> What's the difference between this and defining a function in Python/Ruby (rather than monkeypatching)?
Your function can't handle new types in the future that you're not aware of. How, exactly, would I as a BloomFilter writer know about MyIntHolder in order to put it in that conditional? How could a library writer extend it? Your response might be to make a hash table, in which case you have re-implemented a slow, expensive, error-prone version of what Clojure is doing under the covers.
> [1] Side question, I didn't understand "(.x this)" in your example, what is .x?
It's like saying "this.x", which is to say "access the public x field." Clojure is in the camp that enforced visibility on data members is generally a bad idea, and it expresses that opinion with deftype.
I understand that Clojure does this under the covers with great speed, but is that the only benefit? I want to reiterate that I'm merely confused here and not trying to argue my silly function with type check is good code. I still feel like something about types/protocols isn't clicking in my brain.
What's the difference between extending a protocol including make-hash and human-readable-hash for MyIntHolder and doing two defmethods that define those and are dispatched on an instance of MyIntHolder? (Sorry if my Clojure lingo is off there, I hope this makes sense)
> How, exactly, would I as a BloomFilter writer know about MyIntHolder in order to put it in that conditional?
Didn't the person writing this need to know about MyIntHolder?
(extend-protocol BloomFilterable
  MyIntHolder
  ...
> It's like saying "this.x"
But what is "x" there? I'm probably reading it too literally, but I see that MyIntHolder takes 1 value (of type int). I don't see any name associated with it, is the "x" just an unrelated example?
Protocols are just a faster, less general version of multimethods. They dispatch only on the type of the first parameter, and they do it quickly.
This speed is important because it allows the Java interfaces that define the core constructs of Clojure (such as clojure.lang.IFn, clojure.lang.Seqable, etc.) to be defined purely in terms of protocols, thus bypassing the need for Java altogether. Using multimethods would have been untenably slow.
Protocols can also be useful in that they recognize that certain behavior is defined by a set of functions, rather than just one. Consider the Cantor library (http://github.com/ztellman/cantor), which does simple floating point math. Using only multimethods, it's possible that you could define addition for a type, but not subtraction. With protocols, it's much more difficult to make that mistake.
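A rough side-by-side of the two mechanisms, as a sketch. `make-hash` and `human-readable-hash` are the function names from the question above; the `-mm` multimethod, the `size-class` multimethod, and the implementations are made up here.

```clojure
;; Multimethod version: one function at a time, dispatching on whatever
;; the dispatch function returns (here, the class of the argument).
(defmulti make-hash-mm class)
(defmethod make-hash-mm String [s] (.hashCode s))

;; Multimethods are more general: dispatch can be on any computed value.
(defmulti size-class (fn [s] (if (> (count s) 3) :long :short)))
(defmethod size-class :long [_] "long string")
(defmethod size-class :short [_] "short string")

;; Protocol version: a named set of functions, dispatching only on the
;; type of the first argument.
(defprotocol BloomFilterable
  (make-hash [this])
  (human-readable-hash [this]))

(extend-protocol BloomFilterable
  String
  (make-hash [s] (.hashCode s))
  (human-readable-hash [s] (str "hash:" (.hashCode s))))
```

With the multimethod approach nothing ties `make-hash-mm` to a sibling function, whereas the protocol groups both functions under one name that a type is extended to.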
> Didn't the person writing this need to know about MyIntHolder?
Yes, but they wrote it much later. The library writer only wrote the protocol and coded to the protocol. Your function-based approach is not extensible unless you totally re-implement this mechanism (effectively re-implementing single dispatch). If your question is, "Why can't I get this effect in Python?" then of course you can, but you're basically implementing this feature, and all that entails.
The reason I mentioned monkey patching is to point out how powerful and seductive mechanisms that allow for unforeseen modifications to existing behaviors can be. Clojure provides a fast, well-designed system right out of the box for handling these situations succinctly. Unlike existing methods, which are error-prone or tedious (monkey patching and retooling dispatch every time, as in your approach), this is elegant and trivial.
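Roughly, the "written much later" ordering looks like the sketch below. `bucket-index` is a hypothetical stand-in for whatever the library does with the hash, and `MyIntHolder` is left unhinted here for simplicity (the original holds an int).

```clojure
;; --- "Library" code, written first. It knows nothing about MyIntHolder. ---
(defprotocol BloomFilterable
  (make-hash [this]))

(defn bucket-index
  "Hypothetical helper: maps any BloomFilterable value to one of n buckets."
  [n value]
  (mod (make-hash value) n))

;; --- "User" code, written later, possibly in another project. ---
(deftype MyIntHolder [data])

(extend-protocol BloomFilterable
  MyIntHolder
  (make-hash [this] (.data this))) ; like this.data: reads the public field

(bucket-index 8 (MyIntHolder. 13)) ; 13 mod 8
```

The library function never changes and never mentions `MyIntHolder`, which is the part the `isinstance` version can't give you.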
> I understand that Clojure does this under the covers with great speed, but is that the only benefit?
Clojure provides one mechanism that can work two ways. Firstly it can provide flexible specifications where code can interface in a functional way, even across library and time boundaries. Secondly it can induct pre-existing types into a behavior pattern without causing a (potentially destructive) ripple effect in existing code.
> But what is "x" there? I'm probably reading it too literally, but I see that MyIntHolder takes 1 value (of type int). I don't see any name associated with it, is the "x" just an unrelated example?
Ahh, it was meant to be ".data". Sorry, there was a version mismatch between two iterations of the code when I was posting. Thank you for catching it, it has been corrected.