This post is slightly out of date and I’m planning on rewriting it soon.
In the meantime…
Let’s talk about the Y-Combinator. If you’re not hip to what that is, here is a challenge for you:
Try to write an anonymous recursive function.
Because “anonymous AND recursive” might not be immediately clear, let’s go over both one at a time so that there is no confusion.
The first requirement is ‘anonymous’. Here are some anonymous functions (henceforth to be called lambdas
) in a couple
popular programming languages:
Fine. That’s easy enough. But what about the recursive part? Let’s start with a “regularly defined” recursive function as a point of reference. Factorial, which is almost always used in these types of tutorials, will do the trick.
Great. Easy enough if you have some experience with recursion. If these examples don’t seem easy, then stop this tutorial and work through The Little Schemer. It helped me understand recursion much better. It’s a great book.
Still here? OK - let’s carry on. The final step is to make a lambda version of factorial, that’s also recursive.
Most people will try to do something like this:
It’s a nice try, but it’s a little bit cheap because it’s not 100% anonymous. And the Erlang version doesn’t even work.
Bummer. Back to the drawing board. Hmm.
What do we need? Well, at the point of recursion, factorial needs to be able to call itself. We know that much. Would it be possible to pass a copy of factorial into itself? What would that even look like? Would this work?
Before you get too weireded out here, we’re not really doing anything too different except for 2 things.
One, we’ve wrapped our old example in another lambda
( lambda do |f|
in ruby and fun(F)
in erlang). Two, we’re calling Fac(Fac)
(in erlang)
and fac.(fac)
(in Ruby).
Initially, seeing something like Fac(Fac)
or fac.(fac)
or any other example of a function
taking itself as an argument is a bit of a mind stretcher. It’s too meta! Your mind will melt
as your inner mental-stack grows out of control…
It’s actually not that hard. Let’s think if it like this… You see something like G(H), where you know that both G and H are functions. You could read it like this:
“I’m a function and my name is G. When I get to the point that I need help, I call the function H.”
How would you read G(K)?
“I’m a function and my name is G. When I get to the point that I need help, I call the function K.”
How would you read Fac(Z)?
“I’m a function and my name is Fac. When I get to the point that I need help, I call the function Z.”
So, finally, what is Fac(Fac) ?
“I’m a function and my name is Fac. When I get to the point that I need help, I call the function Fac.”
Which is basically another way of saying, “I’m a recursive function”.
When you see Fac(Fac) or F(F), etc etc. Repeat the magic sentence and then think: we’re just building a recursive function.
You could see F(F) and say: this isn’t a big deal. This guy is just building a recursive function here. F(K) or F(G) or F(Z) might have been some other type of function. But we want to build a recursive one, so F(F) will build what we want.
So…back to our examples. Do they work? Walk through them (the Ruby if you’re a Rubyist, the Erlang if you’re more comfortable with Erlang). Try using both 0 and 1 for values of n.
For example, what is fac.(fac).(0)
and fac.(fac).(1)
?
Or in Erlang, what is ((Fac(Fac))(0).
and ((Fac(Fac))(0).
?
Do they work? No.
0 will work. But 1 won’t work. We need to make an adjustment.
What did we do? We needed to change f.(n-1)
to f.(f).(n-1)
in Ruby and
F(N-1)
to (F(F))(N-1)
in Erlang. Why? Think back to this:
“I’m a function and my name is F. When I get to the point that I need help, I call the function F”. F isn’t ready by itself to take N (an integer). F(N) won’t work. F forever and always needs a helper function, and because we want it to be recursive, that will take the form of F(F).
So we need to change F(N) to be (F(F))(N). F(F) is our recursive function builder, and we were hoping to call a recursive function.
If you understand this, you literally have everything you need to understand the Y-combinator. The only thing you need to get the Y-combinator is to refactor the example above. Literally, and to re-iterate: the Y-combinator is just a refactoring of these final two examples.
So If you understand this, then please proceed. If you don’t, bookmark this page, stop here, open up irb or erl and play with the examples until they make sense. It may take a while. That’s ok. This may very will be very different than what you’re used to. I spent a few months with The Little Schemer to get here.
So we’re refactoring now. The main impetus in each step will be, “I don’t like how this looks. Let’s make it look nicer”. Nothing new in functionality. Just changing the looks.
Starting points:
What don’t I like? Making the user write (Fac(Fac)(10)
or fac.(fac).(10)
to
get the factorial of 10 (in this example), isn’t very user friendly. Something like
G(10)
would really be a nicer looking api. So let’s make that happen:
If the above looks intimidating: all we are doing is passing our old function into
a new function of the form lambda { |h| h(h) }
or fun(H) -> H(H) end
. All that’s
doing is making it so that the user doesn’t have to do Fac(Fac) anymore.
What else don’t I like? Well, the user no longer has to do Fac(Fac), which is nice, but it still lives in the function body which is kind of ugly. We need to get rid of it, which we’ll do in 3 steps so you can see exactly how to do this type of refactoring.
So we wrapped the ugliness with a lambda.
f.(f).(x)
became lambda { |x| f.(f).(x) }.(n-1)
. Convince yourself
that these are identical statements.
It’s no different than this:
changing 1 + 2
to lambda { |x| 1 + x }.(2)
.
So we’ve made ugly code even uglier. Great job! Heh. Remember, this is just step 1. We’re going slow so you can see. “Wrap offending code with an inline lambda”. We’ve done that. On to step 2.
Looks pretty crazy, huh? Remember how we got here. We’ve just been refactoring.
If you’re still stuck, stare at the following examples and convince yourself that they are all the same. Open up erl or irb and enter each example and see that the value doesn’t change.
So if you understand that, then you should be able to see what we did in the crazy
examples above. Just replace 1 + 2 with fac.(fac).(n)
in ruby or (Fac(Fac))(N)
in Erang.
Congrats. You’ve derived Y. Here they are, written generically, in all their glory.
As you’ve already seen, Y takes an argument which is a higher-order function of the form of:
Where f is a function that needs to recur. Don’t forget F(F) ->
“I’m a function and my name is F. When I get to the point that I need help, I call the function F”.
The Y-combinator has the distinction of being easier to derive than to read. Just remember what you’ve learned: when you see h(h) or f(f) say: he’s building a recursive function here. If you see f.(f).(x) say: he’s building a recursive function and then passing it an argument. That will get you through the trickier parts of reading it.
A Y-combinator helps anonymous, higher order functions recur that couldn’t otherwise recur on their own. Paul Graham’s incubator is appropriately named, I think.
Erlang developers have more use for this than Rubyists, but there is a nicer way now for Erlang devs (see section on named funs).
Challenge: Can you derive a Y-Combinator that helps functions of 2, 3 or 4 arguments recur? It’s easier than you think if you were able to follow the refactorings above.
If I’ve made some mistakes (and I no doubt have), please send me an email at jimtron9000@gmail.com and I’ll make the necessary corrections.
Thanks - and be sure to check out the Little Schemer. It’s the best computer science book I’ve ever read.
– Jim