Problems of active record and friends

After a lot of silence caused by lot's of work (good, my bills are happy) and a continuous diversion into scheme, haskell, dylan and other interesting languages, I'm back into php speaking land. Funny feeling that. So here comes the beginning of something I've been continuously rubbing my (leftovers of) brains against.

copout This is not php specific, but since it discusses frequent problems occuring in php apps, I've labeled it with as php related as well. Makes it easier maintaining the site you see.

I have a problem with active record as a pattern. It is a logical one. Essentially it turn a (SQL) database row into an object. The class of the object represents the SQL table or view. Of course you can add behaviour to those classes and objects, i.e. activating those records.

All that is nice and dandy, but when you start talking about relations and constructing dynamically queries and corresponding objects, we start hitting the limitations of what I will call from now on the naive active record.

The problem is not trivial at all. It boils down to how to make peace between the host language type system (for example classes) and the SQL's dynamic compound types - the relations expressed as SELECT variants for example. It gets even harder when we decide to reverse the direction, so that we actulally want to update the database. Yuk! As though somebody actually does that!

A first look into the meaning of SQL higher types

What are the SQL tables, views and joins?

Records actually. The sql tables are simple records with named fields. The name of the record type is the name of the table. The type definition is the table definition. In SQL you have the usual projectsion operators, like table.filed_name. More interestingly you can actually construct new temporary record types. For example in the statement:
SELECT n.one, n.two, n.three FROM n;
We construct a new (anonymous) type whose slots inherit the names from the record where they are copied from. Well, you could assign a different name if you so desire.

Joins

If we think of tables as record types and rows as records, then JOIN is an operator on record types not records. It is more complicated than that, but for the time being let's stick to this meaning.

Consider the following example:
A::{ a, b, c} JOIN B::{b, c, d} -> {A.a, A.b, A.c, B.b, B.c, B.d}
as used in
SELECT A.*, B.* FROM A,B
;; alternatively
SELECT A.*, B.* FROM A CROSS JOIN B;

The result type is a union, or more precisely a cartesian product, of the arguments of the JOIN op. This resembles multiple inheritance, with a crucial difference - you preserve all ops of the parents (the tables arguments of the JOIN). This causes one of the majour problems with active record - we don't have a language, let alone php doing it, that allows us to naturally model this using the language itself, rather than constructing strange hacks ourselves. The opposite direction - folding inheritance and friends into SQL meets similar kinds of problems.

The rest of the JOINS are essentially specialised products (unions) which can be modelled with some extra operators on types, which are dutifully skipped for the time being.

Views are 'named' joins, so they are not much more interesting. What is interesting is the fact that you can use anywhere, where you can use tables.

So what do we have? Some kind of a mental model, meta-language, if you like, expressing how to construct records, or what are these records made of, (which is the same). If we were to implement that language, mapping as much as possible native language constructs, we would have the means to reason about our objects and automagically construct database queries of the appropriate kind. Maybe even optimise the queries with respect of what is required for our application, rather that simply relying on the generic SQL optimiser or a handy local SQL guru.

Active record comes to play after we know the record. The question is how to get to that point. The answer lies in the tools of the native language. I'll describe later some possible approaches in php, probably not good, but you are welcome to improve them and tell me how to do it better.