But before that...
On Ruby blocks
Blocks in Ruby provide a way of writing methods that act on a chunk of code supplied later, at the call site.
An example of blocks in Ruby is the following:
- array = [1, 2, 3, 4]
- array.each { |n| puts n ** 2 }
- 1
- 4
- 9
- 16
According to their site: "A block is like an anonymous function or lambda [and] the variable between pipe characters is the parameter for this block". What's missing from this description is that Ruby also provides syntactic sugar for writing methods that receive blocks, using the "yield" statement. In other words, it is a way of creating closures and attaching them to other methods. Using closures in Ruby comes very easily, even to people who don't know what a closure is. If we were to implement the previous behaviour ourselves we would do something like this:
- class Array
-   def each2
-     for i in self
-       yield(i)
-     end
-   end
- end
- array = [1, 2, 3, 4]
- array.each2 { |n| puts n ** 2 }
- 1
- 4
- 9
- 16
I won't go into details because there is a lot of documentation on Ruby blocks out there.
So, onto Python...
The Python built-in that most resembles blocks is the with statement from PEP 343, but I wanted something that imitated the Ruby syntax as much as possible. The with statement is nice, but it doesn't cover all the cases.
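To make that limitation concrete, here is a minimal sketch of what the with statement offers. It assumes Python 3 (the rest of this post uses Python 2), and the names announced and log are made up for illustration.

```python
from contextlib import contextmanager

@contextmanager
def announced(label, log):
    # Code before the yield runs on entry, code after it runs on exit.
    log.append("enter " + label)
    try:
        yield label.upper()  # the value bound by "as"
    finally:
        log.append("exit " + label)

log = []
with announced("demo", log) as name:
    log.append("body saw " + name)

print(log)
```

The body of a with statement runs exactly once, and the only thing the manager can hand to it is the single value bound by "as". A manager therefore cannot call the body once per element, the way Ruby's each calls its block.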
So I decided to use a decorator to convert the function that uses the "block" into something that receives the block, inserts it into the namespace, and executes the original function with the block available to it as a callable.
The idea was something like this:
- @receive_block
- def simple_iterate():
-     for i in [1,2,3]:
-         print block()
- @simple_iterate
- def _():
-     return "a"
This copies the Ruby syntax except for the receive_block decorator, but I considered it a reasonable sacrifice.
Using "def _()" leaves the function effectively anonymous and allows you to specify parameters for the block.
Implementing blocks in Python
So, to implement the syntax I just need to write the receive_block decorator.
The objective of this decorator is to convert the block receiving function, A, into another decorator that receives a function, B, introduces B into A's scope and subsequently calls A.
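The key step is to add the block function to the scope. To do this we use the types module from Python's standard library. It includes FunctionType, the type of function objects, which can be called to create a new function object from a code object and a globals dictionary.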
- types.FunctionType(func.__code__, scope)
There is more to FunctionType than what we use here, but I won't go into details since we only need its simplest use.
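As a small illustration of that call, here is a sketch assuming Python 3. The function greet and the dictionary scope are hypothetical names used only for this example.

```python
import types

def greet():
    # "block" is not defined in this function; it is looked up in the
    # function's globals at the moment of the call.
    return block()

# Hypothetical scope: a copy of the module globals plus a "block" entry.
scope = dict(greet.__globals__)
scope["block"] = lambda: "hello from the block"

# Same code object, different globals dictionary.
rebound = types.FunctionType(greet.__code__, scope)
result = rebound()
print(result)
```

The new function shares its bytecode with greet. Only the dictionary used to resolve global names differs.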
Once we know this, the decorator is pretty simple:
- import types
- def receive_block(func):
-     def decorator(block):
-         # Add block to globals
-         scope = func.func_globals  # globals()
-         scope.update({'block': block})
-         # Create the function with the new scope
-         new_func = types.FunctionType(func.__code__, scope)
-         return new_func()
-     return decorator
Let's see how it works:
- @receive_block
- def external():
-     for i in [1,2,3]:
-         print block()
- print "External"
- @external
- def _():
-     return "a"
This will print
- External
- a
- a
- a
And if we add a parameter:
- @receive_block
- def param_external():
-     for i in [1,2,3]:
-         print block(i)
- print "External with param"
- @param_external
- def _(i):
-     return "a " + unicode(i)
It, as expected, prints:
- External with param
- a 1
- a 2
- a 3
But what if we wanted to implement something like Ruby's Array class? Let's create an Array class that extends the built-in list type and see what happens.
- class Array(list):
-     @receive_block
-     def each(self):
-         for i in self:
-             print block(i)
When calling
- a = Array([1,2,3,4])
- print "Each Square"
- @a.each
- def _(x):
-     return x**2
we get the exception:
decorator() takes exactly 1 argument (2 given)
because each is looked up on an instance, so Python passes the instance (self) as the first argument, ahead of the block.
So we modify our initial decorator to work with instance methods:
- def receive_block(func):
-     def decorator(*args):
-         if len(args) == 1:
-             block, = args
-             instance = None
-         elif len(args) == 2:
-             instance, block = args
-         else:
-             raise TypeError("expected a block, optionally preceded by an instance")
-         # Add block to globals
-         scope = func.func_globals  # globals()
-         scope.update({'block': block})
-         # Create the function with the new scope
-         new_func = types.FunctionType(func.__code__, scope)
-         # Compare against None: an empty Array is falsy but is still an instance
-         if instance is not None:
-             return new_func(instance)
-         else:
-             return new_func()
-     return decorator
This modification is pretty straightforward, so I won't explain it further.
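A slightly more general variation, sketched here for Python 3, treats the block as the last positional argument and forwards everything before it. This is my own simplification rather than the code above, and the block names squares and untouched are hypothetical.

```python
import types

def receive_block(func):
    def decorator(*args):
        # The block is the last positional argument; anything before it
        # (such as self) is forwarded unchanged.
        *leading, block = args
        scope = func.__globals__
        scope["block"] = block
        new_func = types.FunctionType(func.__code__, scope, func.__name__)
        return new_func(*leading)
    return decorator

class Array(list):
    @receive_block
    def collect(self):
        for i, value in enumerate(self):
            self[i] = block(value)
        return self

a = Array([1, 2, 3, 4])

@a.collect
def squares(x):
    return x ** 2

empty = Array()

@empty.collect
def untouched(x):
    return x

print(squares, untouched)
```

Because the instance is never tested for truthiness, an empty Array is handled the same way as a populated one.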
Let's write some functions:
- class Array(list):
-     @receive_block
-     def each(self):
-         for i in self:
-             print block(i)
-     @receive_block
-     def collect(self):
-         for (i, value) in enumerate(self):
-             self[i] = block(value)
-     @receive_block
-     def handled(self):
-         for i in self:
-             try:
-                 block(i)
-             except:
-                 print "This raised an exception"
and pass them some blocks:
- a = Array([1,2,3,4])
- print "Each Square"
- @a.each
- def _(x):
-     return x**2
- print "Each"
- @a.each
- def _(x):
-     return x
- print "Collect"
- @a.collect
- def _(x):
-     return x**2
- print a # a is changed
- print "Handled"
- @a.handled
- def _(x):
-     if x != 9:
-         raise Exception("this won't work")
-     else:
-         print "this works"
We then obtain the desired output (excerpt):
Each Square
[1, 4, 9, 16]
This raised an exception
This raised an exception
this works
This raised an exception
It works. =)
Another interesting way to do this would have been to add the block variable as a free variable of the function and have the code object reference it. In Python, when a closure is created, the names of the free variables are stored in an attribute of the function's code object and their values are stored in the function itself using the cell type. Take this closure as an example:
- def test():
-     a = 10
-     b = lambda x: x+2
-     def inner():
-         print a, b(a)
-     return inner
- >>> i = test()
- >>> i.func_code.co_freevars
- ('a', 'b')
- >>> i.func_closure
- (< cell int object at 0x802400 >,
- < cell function object at 0xf68848 >)
Adding the closure to the new function should be easy, since FunctionType accepts a closure keyword argument to do so. Unfortunately, the code's co_freevars attribute is read only:
- >>> i.func_code.co_freevars += ('block',)
- TypeError: readonly attribute
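For reference, the same inspection can be sketched in Python 3, where func_code and func_closure are spelled __code__ and __closure__. I am assuming the failed assignment surfaces as an AttributeError there rather than the TypeError shown above, so the sketch catches both.

```python
def test():
    a = 10
    b = lambda x: x + 2
    def inner():
        return a, b(a)
    return inner

i = test()

# Names of the free variables live on the code object ...
names = i.__code__.co_freevars
# ... and their current values live in cells attached to the function.
values = [cell.cell_contents for cell in i.__closure__]

try:
    i.__code__.co_freevars += ("block",)
    outcome = "assignment succeeded"
except (AttributeError, TypeError) as exc:
    # Code object attributes cannot be assigned to.
    outcome = type(exc).__name__

print(names, values[0], outcome)
```

Even if the name could be added, the compiled bytecode would still look block up as a global, so a closure-based version would also need to rewrite the bytecode.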
If anyone who knows better than I do cares to provide an implementation using closures, I'd love to hear your solution.
So this is how we implement Ruby-style blocks in Python using decorators. Hope you enjoyed it.
This is by no means meant to be used in a production environment. Not even in a semi-serious environment. It is just a hack to demonstrate how this could be done, and it hasn't been tested.
Other notes
* The code for this project is hosted at http://github.com/nicolaslara/blocks/tree/master
* I wanted to implement some of the built-in Ruby functions that use blocks but I didn't have the time. If somebody is up to the task I'd love to see what interesting things could be done with this.
* Also, if somebody is willing to improve the code you are more than welcome.
* Some of the problems of this code:
  - It clutters the global namespace.
  - The word "block" is introduced as a global reserved word.
* For some reason blogger doesn't let me edit this site's template in 2009, so I couldn't add code syntax highlighting.
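One possible way to soften the global namespace problem listed above is to inject the block into a copy of the globals instead of the real dictionary. The following is a sketch assuming Python 3, not the code from the repository, and the names twice, run_twice and demo are hypothetical.

```python
import types

def receive_block(func):
    def decorator(*args):
        *leading, block = args
        # Work on a copy so the real module globals never gain a "block" name.
        scope = dict(func.__globals__)
        scope["block"] = block
        new_func = types.FunctionType(
            func.__code__, scope, func.__name__,
            func.__defaults__, func.__closure__,
        )
        return new_func(*leading)
    return decorator

def twice():
    return [block(), block()]

run_twice = receive_block(twice)

def demo():
    # Compare the real globals before and after running with a block.
    before = set(twice.__globals__)
    result = run_twice(lambda: "a")
    leaked = set(twice.__globals__) - before
    return result, leaked

result, leaked = demo()
print(result, leaked)
```

The trade-off is that the function runs against a snapshot: any global it assigns through a global statement lands in the copy and is lost when the call returns. The name "block" is also still special inside block-receiving functions.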
Nicolas: FYI the indentation in some of your code blocks is a bit messed up, this is caused by using <code> instead of <pre> (or vice versa, I forget which :P ).
Thanks Alex! I was using <pre> because that's what my syntax highlighter uses but I couldn't add the highlighter 'cuz blogger wouldn't let me edit the template.
It's fixed now thanks to a cool widget by FaziBear
Beautiful post!
Did you see this: http://www.voidspace.org.uk/python/articles/code_blocks.shtml ?
I've combined your way to define anonymous code blocks and technique from this article to inject `block' variable to closures.
Result is little bit ugly, "but it works" :)
Nice implementation. I found it quite clean, actually. I don't love the dependence on byteplay or the @@ syntax, but it definitely gets the job done. Nice work!
Just a thought - using _ as a function name can interfere with gettext.
Why not just?:
def simple_iterate(block):
    for i in [1,2,3]:
        print block()
def _():
    return "a"
No particular reason. I just wanted to avoid passing the argument explicitly. Note also that this breaks when using other parameters in the block-handling function. We would need some use of partial to evaluate the params.
I was playing with a similar idea the other day
..def _(e):
..map(_, [1,2,3,4,5])
It's a bit backwards, but once you accept that def _ will be followed by a mapping function or some such, it would be quite readable methinks.