In Lisp and its dialects everything is a first-class language construct; that is, it can be evaluated and changed from within the language. This gives a few very powerful abstractions, which help in constructing short but still readable programs. A few very powerful patterns come from this single concept.
A function is a first class variable. It can be assigned to.
This is directly applicable in PHP. Since the PHP symbol table is (roughly) a big hash, you can access a function by its name, for example:
$func = 'a_function';
$var = $func();
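A slightly fuller sketch of the same idea, runnable on its own. The function names a_function and shout are made up for illustration; is_callable() is a standard PHP function.

```php
<?php
// Hypothetical functions, used only for illustration.
function a_function() {
    return 'called a_function';
}
function shout($text) {
    return strtoupper($text) . '!';
}

// A function name stored in a variable can be called directly...
$func = 'a_function';
$var = $func();

// ...and the variable can be reassigned to point at a different function.
$func = 'shout';
$loud = $func('hello');

// is_callable() guards against names that do not resolve to a function.
$missing = is_callable('no_such_function');
```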
A program is a data structure, a list, so you can manipulate it using the language
That is a bit difficult in PHP. eval and create_function() go some way towards this goal, but their major limitation is that they work with strings, which are awkward to manipulate. The tokenizer does another bit: it returns the string's PHP tokens as an array, but that doesn't go far enough either, since you don't have an immediate or easy way to execute the array or to turn it back into a program. So our obvious options are limited to manipulating strings, which is awkward and usually unreadable. With some of the new PECL extensions like runkit, we can get access to the generated program opcodes, but it is really awkward to work with them at this stage. I wish there were an interface to manipulate the language constructs directly and in a readable way. In theory it should be possible, but in practice it is black magic.
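To make the tokenizer point concrete, here is a small sketch of what it hands back. It assumes the tokenizer extension is available (it normally is); the one-line program being tokenized is just an example.

```php
<?php
// token_get_all() comes from PHP's tokenizer extension.
$tokens = token_get_all('<?php echo 1 + 2;');

$names = array();
foreach ($tokens as $token) {
    // A token is either array(id, text, line) or a single-character string.
    $names[] = is_array($token) ? token_name($token[0]) : $token;
}
// $names now holds token names and punctuation, but nothing here can be
// executed directly: it is a flat list, not a program structure.
```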
A Lisp function is a first class object
I admit, this is an odd one. Due to the way Lisp treats variable scoping and lifetime, you can have access to variables defined within a function from outside of it. It is a kind of black-magic thing, but it gives you the possibility to treat a function like you would treat an object in OO languages, which is kind of cool. It allows immediate and transparent implementations of various architectural patterns like the factory pattern. Apropos, you have the same closure concept in JavaScript. When I stumbled across that I was really surprised at the time. You wouldn't want to be around me then.
In PHP that is not available immediately. You could define a class (or an object) with a default execute method, this way treating the object as a function. Some call that a functor. This is the closest I can think of at the moment. If PHP were extended to allow the definition of a default method for a class/object, then we could have an implementation of functions as first-class objects. Vote for a __default() method everyone.
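A minimal sketch of the functor idea. The class name Multiplier and its execute() method are made up for illustration. Note that PHP 5.3 and later provide an __invoke() magic method, which comes close to the __default() wished for here; the second half of the sketch assumes such a version.

```php
<?php
// Hypothetical functor: an object that carries state and is used like a function.
class Multiplier {
    private $factor;

    public function __construct($factor) {
        $this->factor = $factor;
    }

    // The conventional "default" method described in the text.
    public function execute($value) {
        return $value * $this->factor;
    }

    // PHP 5.3 and later: __invoke() lets the object itself be called.
    public function __invoke($value) {
        return $this->execute($value);
    }
}

$double = new Multiplier(2);
$a = $double->execute(21);                 // explicit default method
$b = $double(21);                          // through __invoke()
$c = array_map($double, array(1, 2, 3));   // the object passed as a callback
```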
A program is a data structure, revisited
Let's have a look at how to get closer to this goal. We need access to a language structure that is convenient to manipulate and can be used as a program structure. The simplest program is a function call. Easy: $func(). Take a sequence of calls,
$func1();$func2();$func3();
put the function names into an array and execute them in a loop:
$commands = array('func1','func2','func3');
foreach($commands as $func) $func();
Now we need to be able to add argument passing and branching. It is very simple: just use multidimensional arrays, for example, and a specialized iterator procedure. First let's define functions with arguments.
$func=array("function_name",array($arg1,$arg2,...));
The previous example is changed to:
$commands = array(
    array("func1",array($f1_arg1,$f1_arg2,...)),
    array("func2",array($f2_arg1,$f2_arg2,...)));
foreach($commands as $func)
    call_user_func_array($func[0],$func[1]);
A bit of an awkward syntax, but it does the job. The commands array can now be manipulated as a normal PHP array: filtering, on-demand construction, etc. For branching code there are different methods; the most straightforward one is to implement a branch function, similar to if, taking a guard, a left and a right commands array as arguments. The implementation is trivial, so I won't write it down. I'll leave it as an exercise for the reader, as they say in the thick books.
This doesn't mean that every array can be treated as a program, no, but you can generate, manipulate and evaluate syntactically correct programs this way without sacrificing too much efficiency and without resorting to the dreaded eval() call. You might have noticed that the code above implements the spirit (for the OO fundamentalists) of the command and interpreter patterns in a couple of lines of code, without resorting to an overcomplicated design. We could make things more readable by using string indexes in the func arrays, or by using objects instead of arrays. The latter would have been far more readable if we had functions as first-class objects. But never mind. Vote for a __default() method everyone.
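For readers who would rather not do the exercise, here is one possible sketch of such a branch function. The helper run_commands and the convention that a commands array evaluates to the result of its last command are assumptions made for this example, not something fixed by the text.

```php
<?php
// Run a list of array(function_name, arguments) commands; return the last result.
function run_commands($commands) {
    $result = null;
    foreach ($commands as $command) {
        $result = call_user_func_array($command[0], $command[1]);
    }
    return $result;
}

// branch(guard, left, right): run the guard commands, then the left commands
// if the guard result is true, otherwise the right commands.
function branch($guard, $left, $right) {
    return run_commands(run_commands($guard) ? $left : $right);
}

$commands = array(
    array('branch', array(
        array(array('is_numeric', array('42'))),      // guard
        array(array('strtoupper', array('number'))),  // left
        array(array('strtoupper', array('other'))),   // right
    )),
);
$result = run_commands($commands);
```

Because branch is itself just another command, branches can be nested or filtered like any other entry in the array.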
Emulating quote
In Lisp the quote form suppresses the evaluation of an expression. It is usually used when you want to get access to the underlying program. For all practical purposes, the $commands array from above can be treated the same way as a quoted Lisp form.
Emulating Lisp macros
Lisp macros are used to transform a language construct or expression. The equivalent of a Lisp macro would be a function that transforms, rather than evaluates, the $commands array.
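A rough sketch of what such a transforming function could look like. The name replace_call and the rewrite it performs (swapping one function name for another) are invented for this example; a real macro would do whatever rewriting the problem calls for.

```php
<?php
// Hypothetical "macro": rewrites a $commands array without running it.
// Every call to $from is replaced by a call to $to.
function replace_call($commands, $from, $to) {
    $out = array();
    foreach ($commands as $command) {
        if ($command[0] === $from) {
            $command[0] = $to;
        }
        $out[] = $command;
    }
    return $out;
}

$commands = array(
    array('strtolower', array('Hello')),
    array('strrev', array('Hello')),
);

// Transform first, evaluate later.
$expanded = replace_call($commands, 'strtolower', 'strtoupper');

$results = array();
foreach ($expanded as $func) {
    $results[] = call_user_func_array($func[0], $func[1]);
}
```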
Emulating Lisp closures
This is a difficult one for me. I'm not that intimate with Lisp. But it seems only natural to use PHP objects to emulate closures. This way we can have both similarly scoped data and "member functions", which can be viewed as data and manipulated as such. This definitely needs to be investigated further. It can be very useful in clean but minimalistic implementations of things like the state pattern. So, vote for a __default() method everyone.
Continuations?
Continuations are an interesting idea: you should be able to suspend a program, or a fragment of a program, and continue later from where you left off. In order to have proper continuations, they need to be automatic; the code must look as though we have a linear, non-interruptible program. In Python you have the yield statement, which does much the same thing in a somewhat limited form, since one explicitly marks the break points in the code. How can we code continuations in PHP? Again, the above interpreter pattern code should do most of the job. The most important part is that we have access to every step of the program (the $commands array) and that we can use some marshaling technique to save the state of the program. The beauty is that the code can be relatively brief.
The approach for implementing continuations presented here is not strict, since continuations are usually defined as the future [state] of computation, which is a strict functional form. It could be done in PHP, no problem, but the imperative programming style is more natural to PHP, so I'm looking into the spirit of continuations, rather than every single letter of the definitions.
Mini languages, or Domain specific languages
One of the coolest and best things about Lisp and its relatives is the ease with which one can define mini languages, that is, languages designed for a particular problem domain. Very often the definition of the language looks like a BNF form. I won't try to go that far. What I want to achieve is to find simple answers to the question: how can I define a human-readable and legible domain specific language, with support for continuations, etc., in PHP?
There are several possible approaches. Generally a domain language could be simply a library (a set of files), with one added constraint: to solve a problem in that domain you shouldn't have to look for functions beyond the domain language. This way we ensure good readability and no real crossover with the host language semantics. Write what you mean, hacks not allowed. The new (4.7) Drupal form API is an example of a domain specific language.
I will demonstrate a more complicated example of building a domain language, since I want to evolve it later to handle continuations. For simpler situations a set of functions and constructs will suffice. The first part is nearly there: the earlier interpreter pattern. Since we are using PHP, the syntax of the mini language will be derived from the PHP syntax. The performance trade-off resulting from this language definition should be kept as small as possible. This way optimizations can be kept to a minimum, leading to cleaner code.
An example toy language
Here is an example toy language interpreter. It shows some of the traits of a 'proper' programming language.
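class language {
    // Variables defined by let(), kept apart from the interpreter's own state
    public $vars = array();
    public $counter = 0;

    // let(name, value): bind a value to a variable
    function let($args) {
        return $this->vars[$args[0]] = $args[1];
    }
    function is_var($name) {
        return is_string($name) && isset($this->vars[$name]);
    }
    // cond(guard, then, else): the guard is a variable name or a block
    function cond($args) {
        if (($this->is_var($args[0]) && $this->vars[$args[0]])
            || (is_array($args[0]) && $this->fun($args[0])))
            return $this->fun($args[1]);
        else
            return $this->fun($args[2]);
    }
    // fun(block): run a block of instructions in a fresh interpreter
    function fun($args) {
        $block = new language();
        return $block->interpreter($args);
    }
    function add($args) {
        return $args[0] + $args[1];
    }
    function mul($args) {
        return $args[0] * $args[1];
    }
    function interpreter($code) {
        // Iterate through the program executing each instruction;
        // the value of the last instruction is returned
        $result = null;
        $this->counter = 0;
        while (isset($code[$this->counter])) {
            $instruction = $code[$this->counter];
            $result = $this->{$instruction[0]}($instruction[1]);
            ++$this->counter;
        }
        return $result;
    }
}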
A program in this toy language might look like:
//a subroutine
$block = array(
array('add',array(1,2)),
array('mul',array(4,2)),
array('let',array('test',4))
);
$code = array(
array('add',array(1,2)),
array('mul',array(4,2)),
array('fun', $block)
);
//make an interpreter
$l1 = new language();
//now execute the program
$l1->interpreter($code);
The array syntax of the program is sufficient but ugly. Let's make this a bit better. A parser for our toy language can look like:
//parser class
class parser {
private $code = array();
function code() {
return $this->code;
}
public function __call($f,$args){
//we could add some checks here
$this->code[]=array($f,$args);
}
}
//instantiate the parser
$prog = new parser();
//make an interpreter
$l2 = new language();
//a program in our toy language
$prog->add(1,2);
$prog->mul(4,2);
$prog->let('test',4);
//now execute the program
$l2->interpreter($prog->code());
The object syntax is definitely better than the array one: more readable and intuitive for someone familiar with PHP. Using the overload pecl extension you can overload operators, so you could in theory make it look like proper PHP. Well, it is proper PHP, actually. The so-called parser uses PHP itself; the biggest benefit is that we get an execution environment usable for continuation-style programming. We can serialize() the interpreter object at any execution step, dump it to disk, a database or the network, and suspend it. Then just restore it and continue the evaluation from where we left off.
Continuations in the toy language
Let's add continuations. The first approach will be to add an explicit suspend function, yield, and its handling in the interpreter. The changes to the original code are minimal: the function yield, a change in the while condition, and putting the $counter in the object scope.
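class language {
    public $counter = 0;
    public $run = true;
    //..variables and methods come here
    // yield: ask the interpreter to stop after the current instruction
    // (as a method name, yield needs PHP 7 or later, or a PHP older than 5.5)
    function yield() {
        $this->run = false;
    }
    function interpreter($code) {
        $this->run = true;
        // Iterate through the program executing each instruction,
        // starting from wherever the previous run stopped
        while (isset($code[$this->counter]) && $this->run) {
            $instruction = $code[$this->counter];
            $this->{$instruction[0]}($instruction[1]);
            ++$this->counter;
        }
    }
}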
Now whenever yield is encountered, the interpreter stops, the program does whatever it needs to do in between, and then we can continue. We have continuations. Notice that yield can be called from within any method. A reasonable scenario for the web would be to call yield after you have sent all content to the browser, serialize the 'interpreter' into the session, and continue on the next request. It is a very natural way to define wizards: step one, step two, step three, and so on. If you use nested functions, it is better to use exceptions to handle the interrupted code.
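The suspend and resume cycle can be sketched end to end as follows. The class mini_language, its say instruction and the $wizard program are invented for this example, and a plain string stands in for the session storage. It assumes PHP 7 or later, where yield is allowed as a method name.

```php
<?php
// Condensed, hypothetical version of the interpreter, small enough to run on its own.
class mini_language {
    public $counter = 0;
    public $run = true;
    public $log = array();

    // say(text): record a step; stands in for sending output to the browser
    function say($args) { $this->log[] = $args[0]; }
    function yield() { $this->run = false; }

    function interpreter($code) {
        $this->run = true;
        while (isset($code[$this->counter]) && $this->run) {
            $instruction = $code[$this->counter];
            $this->{$instruction[0]}($instruction[1]);
            ++$this->counter;
        }
    }
}

$wizard = array(
    array('say', array('step one')),
    array('yield', array()),
    array('say', array('step two')),
);

// First request: run until yield, then store the interpreter.
$l = new mini_language();
$l->interpreter($wizard);
$first_pass = $l->log;
$saved = serialize($l);   // in a web application this could go into the session

// Next request: restore the interpreter and continue after the yield.
$l = unserialize($saved);
$l->interpreter($wizard);
```

Only the interpreter object is stored; the program array itself has to be available again on the next request.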
What next?
After spell-checking, this will be de-drafted. I'm planning another write-up on a continuations-based web controller in PHP, adding some more example pattern implementations. I have a couple of mini-languages in the works. I'll describe them in a series of write-ups.
At what point do you give up and build a R5RS scheme interpreter inside of PHP? ;)
At any rate, I really enjoyed this article, and the other two on closures. Well done.