programming

Micro-benchmarking php streams includes from database vs standard includes

During the drupal plugin/update manager discussions I had an aha moment. One of those weird and wonderful ideas came back to me. What if most of the code lived in the db? One would be able to arrange the co-habitation of several concurrent versions of the same website relatively easy. Backups would mean database backup.

Funnily enough, this can help two opposite (scale-wise) types of users - the bottom end, cheapest or free hosting ones and the load balanced crowd.

Why "back"? Well... I had this idea ever since the user streams appeared in php, version 4.3 or there abouts, but it just nestled cosily in the back of my mind, waiting for love, the shy little thing.

He Sells Shell Scripts Not Just to Intersect Sets

He Sells Shell Scripts to Intersect Sets is a good read. Set and multiset operations from the command line or in .sh. Rather clean, straightforward shell magic.

Science paper of the month, IMHO

In this paper we will propose the first — to our knowledge — method that finds the global optimum of the TSP by reducing the NP complexity of the problem to N2 complexity. We achieve this by using white light interferometry
(WLI)[14]. From the mathematical point of view we unfortunately haven’t disapproved the commonly believed conjecture that NP 6= P (the Clay mathematics institute set out a price of 1.000.000 US-$ for doing that) but we apparently reduced the complexity by a trick, namely replacing operations by photons (and the physics of interference).

In section 2 we introduce the method as a gedankenexperiment[thought experiment in friendly german]....
http://www.opticsexpress.org/abstract.cfm?id=140598

Just cool. It reminds me of the often forgotten analogue computers. They are very rarely used in practice these days, although they could be used to solve predefined probles fast, furious and sparkling.

With the introduction of uncertainty and loops you could get some very interesting results. Just think about the offspring of a theremin and a miniMoog on acid. Raving BEAM bots? Pacifist fighter drones?

An exercise with continuations, prompts and scheme macros

control resumed inspired me to take a small exercise.

"Code a minimal approximation of the python yield operator based generators"
Here follow the results I managed to get with help from Matthias Felleisen, via the plt mailing list. You can see the non-edited exchange via the mailing list archives and control and macors. All control&prompt are forms specific to plt scheme's control.ss module. The first attempt looks something like this:

(require (lib "control.ss"))

(define (make-step)
  (define (control-state)
    (yield 1)
    (yield 2)
    (yield 3)
    'finished)
  (define (yield value) (control resume-here (set! control-state resume-here) value))
  (define (generator) (prompt (control-state)))
  generator)

(define step (make-step))

(step) (step) 

It does the job. Kind of. For me the interesting bit is how the interplay of prompt and control achieve maintaining the current state of execution before the return and allowing to continue from that interruption. It can be done with call/cc but it looks a helluva a lot more complicated and is far more unreadable.

There is a problem with the code above though. As Matthias notes in his email:

LESSON: never 'test' by running things at the top-level prompt. It IS a prompt. -- Matthias

So comes version 2:

(require (lib "control.ss"))

(define (make-step)
  (define (control-state)
    (yield 1)
    (yield 2)
    (yield 3)
    'finished)
  (define (yield value) 
    (control resume-here 
             (set! control-state 
                   (lambda () (prompt (resume-here)))) 
             value))
  (define (generator) (prompt (control-state)))
  generator)

(define step (make-step))

(equal? '(1 2 3) (list (step) (step) (step)))

control-state is a variable holding a reference to a function with no arguments, doing what the python function would do. yield is a variable holding a reference to a function which does the interrupt magic. generator is a variable holding a reference to a (sugary) function which does the initial step.

The logic of the above algorithm is roughly as follows - mark thy state, do what you need to do, when you encounter control stop, remember your state by wrapping it in a function (resume-here), remember where do you what to continue from ( the set! control-state ...). Notice that the value of control-state is replaced. This is possible since the resume-here continuation is a function and a first class value.

So what next. Boilerplate code is tedious to write. And there is a lot of boilerplate code there. And every function, which needs to follow this recipe, will need to duplicate it.

So scheme macros to the rescue. It is not a trivial problem. A 'generator' function can look something like this:

Drupal, computer science, random bits

There are quite a few interesting computer science artefacts in Drupal. I'm going to try to highlight a few of them. Sorry, this is unfinished, and might not be finished ever, but I'm just trying to capture a snapshot of confused meandering thoughts.

Very late, dynamic binding

Late binding came to popular life with the advent of object oriented languages. Essentially it means binding of values, for example functions, to names at object creation, as opposed to compile or link time. In drupal, this can happen at any time, more even, it is algorithmic - that is you can change at runtime what is to be executed at a specific control point. The hook system is one of the ways to do it.

Models, records, databases, friends and foes

the beginnings of a long on DB related pattern discussion.

First sketches

Ok, Let's recap:

  • Relational database (SQL) tables are records. We can view them (in our heads) as record type definitions.
  • JOINS define (some kind of) derived union types.
  • SQL CROSS JOINS are full cartesian products, i.e. union types where no name folding/aliasing is done by the imaginary type system

How can we model that in php?

Classes (and objects) are records. Arrays are records too. Both are candidates for doing the job. We need to be able to somehow represent this (meta) type information and manipulate it. The aim is to have a natural feeling/looking abstraction of the database in php. It should be flexible enough, for us to modify the relations at runtime, as we need. It should allow us in the long run to have a near optimal speed and not too much complications in the end code. That is a tough cookie.

Problems of active record and friends

After a lot of silence caused by lot's of work (good, my bills are happy) and a continuous diversion into scheme, haskell, dylan and other interesting languages, I'm back into php speaking land. Funny feeling that. So here comes the beginning of something I've been continuously rubbing my (leftovers of) brains against.

copout This is not php specific, but since it discusses frequent problems occuring in php apps, I've labeled it with as php related as well. Makes it easier maintaining the site you see.

I have a problem with active record as a pattern. It is a logical one. Essentially it turn a (SQL) database row into an object. The class of the object represents the SQL table or view. Of course you can add behaviour to those classes and objects, i.e. activating those records.

All that is nice and dandy, but when you start talking about relations and constructing dynamically queries and corresponding objects, we start hitting the limitations of what I will call from now on the naive active record.

The problem is not trivial at all. It boils down to how to make peace between the host language type system (for example classes) and the SQL's dynamic compound types - the relations expressed as SELECT variants for example. It gets even harder when we decide to reverse the direction, so that we actulally want to update the database. Yuk! As though somebody actually does that!

macro processor - initial notes

Most of my work on this is on paper, and some code sketeches. There quite a few non-trivial issues, and hard decisions to be made. Since I have the knack of loosing or recycling the various paper bits lying around, I decided to keep at least some of them in my diary, aka this blog.

initial requirements

convenient syntax and semantics
convenience matters. The syntax should be obvious. The meaning of the syntax should be easy to grasp.
minimal core
the core of the macro processor should be minimal. This will help managing the code quality. The minimal core should be enriched using macros.
retargetable/multi-targets
using one and the same 'surface syntax' to produce different target code. It should be possible to have a one to many code generation. Example application - one definition producing javascript, php and sql code. Alternative example: one definition - three code targets - debug, deployment + a set of autogenerated tests
target and syntax agnostic
no assumptions should be made (at least at top level) about the target or syntax. We should be able to accomodate any theoretically parsable syntax, within reason, of course. Subject to parsing strategy selection, etc...

macros, higher-order functions, datatypes and design patterns

While reading, researching and experimenting for my macros project the terms in the title get mixed a lot. I'll try to summarise what is my understanding of a lot of the above. To be fair, most of these words are heavily overloaded with sometimes contradictory and or ovelapping meanings, so this short is only about my personal summary, no pretence for a general study or lit review.

Generic programming intuitevely corresponds to design patterns, or a tool to implement them. There should be a distinction between the functional or algorithmic side of the term and the structural or data (type) genericity.

Powered by Drupal, an open source content management system