Most of my work on this is on paper, plus some code sketches. There are quite a few non-trivial issues and hard decisions to be made. Since I have a knack for losing or recycling the various bits of paper lying around, I decided to keep at least some of them in my diary, aka this blog.
Take a step back, close your eyes and try to imagine a macro transformer. Good. Now let's look at what is going on.
The thing magically finds a pattern in a program tree and replaces it with a template, substituting along the way whatever the pattern's variables matched. There are a few possible follow-up strategies from here.
A single expansion pass stops when there is no more input left.
The expansion is looped until there is nothing left to do in the output(s). With this strategy there is definitely a possibility of non-convergent macros, that is, macros whose expansion never stops producing new matches.
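To make the two strategies concrete, here is a minimal sketch in Python. Everything in it is my own assumption for illustration: program trees are nested tuples, pattern variables are strings starting with "?", and the names match, substitute, expand_once and expand_fixpoint are made up. It is not a design decision, just a way to see the difference between one pass and looping.

```python
def match(pattern, tree, env=None):
    """Return a dict of variable bindings if pattern matches tree, else None."""
    env = dict(env or {})
    if isinstance(pattern, str) and pattern.startswith("?"):
        # A variable matches anything, but must match consistently.
        if pattern in env and env[pattern] != tree:
            return None
        env[pattern] = tree
        return env
    if isinstance(pattern, tuple) and isinstance(tree, tuple):
        if len(pattern) != len(tree):
            return None
        for p, t in zip(pattern, tree):
            env = match(p, t, env)
            if env is None:
                return None
        return env
    return env if pattern == tree else None


def substitute(template, env):
    """Fill the template with the bindings found by match."""
    if isinstance(template, str) and template.startswith("?"):
        return env[template]
    if isinstance(template, tuple):
        return tuple(substitute(t, env) for t in template)
    return template


def expand_once(tree, rules):
    """Single pass: children first, then this node; output is not re-examined."""
    if isinstance(tree, tuple):
        tree = tuple(expand_once(child, rules) for child in tree)
    for pattern, template in rules:
        env = match(pattern, tree)
        if env is not None:
            return substitute(template, env)
    return tree


def expand_fixpoint(tree, rules, max_passes=100):
    """Loop single passes until nothing changes; give up after max_passes."""
    for _ in range(max_passes):
        new_tree = expand_once(tree, rules)
        if new_tree == tree:
            return tree
        tree = new_tree
    raise RuntimeError("macro expansion did not converge")


rules = [
    (("unless", "?c", "?b"), ("when", ("not", "?c"), "?b")),
    (("when", "?c", "?b"), ("if", "?c", "?b", "nil")),
]
program = ("unless", "x", ("print", "y"))

one_pass = expand_once(program, rules)
# ("when", ("not", "x"), ("print", "y"))
looped = expand_fixpoint(program, rules)
# ("if", ("not", "x"), ("print", "y"), "nil")
```

The single pass leaves a "when" in the output because it never looks at what a rule produced. The looped version picks it up on the next pass. The max_passes guard is the crude answer to non-convergence: a rule such as ("loop", "?x") rewritten to ("loop", ("loop", "?x")) keeps growing forever, so the loop has to be cut off somewhere.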
Here are some observations and comments that should help with the specification.
The grammar of the parser should be changeable. We need to be able to add, and maybe redefine and remove, grammar rules. This will allow the macros to be embedded directly into the grammar being parsed. The first obvious candidate for this probably comes from the domain of combinator parsers. We should strive to minimise backtracking, though.
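A rough sketch of what a changeable grammar could look like with combinator parsers, again in Python. The Grammar class and the lit, seq and alt combinators are hypothetical names I made up here. The one idea that matters is that rules refer to each other by name and are looked up at parse time, so redefining a rule changes every rule that uses it. Note that alt is ordered choice (first alternative that succeeds wins), which is an assumption of this sketch and only one of the possible answers to ambiguity.

```python
class Grammar:
    """Rule table that can be changed while the parser is in use."""

    def __init__(self):
        self.rules = {}  # rule name -> parser function

    def define(self, name, parser):
        """Add a new rule or redefine an existing one."""
        self.rules[name] = parser

    def remove(self, name):
        del self.rules[name]

    def ref(self, name):
        """Late-bound reference: the rule is looked up when parsing runs."""
        def parser(tokens, pos):
            return self.rules[name](tokens, pos)
        return parser

    def parse(self, name, tokens):
        """Return the parsed value if the rule consumes all tokens, else None."""
        result = self.rules[name](tokens, 0)
        if result is None or result[1] != len(tokens):
            return None
        return result[0]


def lit(word):
    """Match exactly one token."""
    def parser(tokens, pos):
        if pos < len(tokens) and tokens[pos] == word:
            return word, pos + 1
        return None
    return parser


def seq(*parsers):
    """Match all parsers one after another."""
    def parser(tokens, pos):
        values = []
        for p in parsers:
            result = p(tokens, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return tuple(values), pos
    return parser


def alt(*parsers):
    """Ordered choice: the first alternative that succeeds wins."""
    def parser(tokens, pos):
        for p in parsers:
            result = p(tokens, pos)
            if result is not None:
                return result
        return None
    return parser


g = Grammar()
g.define("atom", alt(lit("x"), lit("y")))
g.define("expr", g.ref("atom"))
g.define("stmt", seq(g.ref("expr"), lit(";")))

before = g.parse("stmt", ["unless", "x", "y", ";"])  # None, no such form yet

# Embed a macro-like form by redefining "expr"; "stmt" is left untouched.
g.define("expr", alt(seq(lit("unless"), g.ref("atom"), g.ref("atom")),
                     g.ref("atom")))
after = g.parse("stmt", ["unless", "x", "y", ";"])
# (("unless", "x", "y"), ";")
```

Since "stmt" only holds a reference to "expr" by name, it accepts the new form without being rebuilt. Removing a rule that others still refer to would fail at parse time with a KeyError, which is something a real design would have to handle.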
It is obvious that we need an ambiguity resolution strategy. There is no simple answer to that. We need to order all matching rules and select the best one, striking a balance between convenience and speed. Obviously, shortest match will remove backtracking for most if not all cases. There is always the option to investigate the packrat strategy (or a variant of it) as well. We need a measure which incorporates match length, specificity, possibly position or some other order, and, as a last resort, priority. The problem with fixed priority levels is that they are static and counterintuitive. We could introduce relative order instead, that is, being able to state something like "rule1 is to be considered with a higher priority than rule2".
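One possible shape for such a measure, sketched in Python. The Candidate record, the pick_rule function, the higher_than set of (winner, loser) name pairs and the prefer_longest switch are all assumptions of mine, and so is the order in which the criteria are applied: match length first, then specificity, then relative order, then static priority as the last resort.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    rule: str          # name of the rule that matched
    length: int        # how many tokens the match consumed
    specificity: int   # e.g. number of literal (non-variable) pattern items
    priority: int = 0  # static priority, used only as a last resort


def pick_rule(candidates, higher_than=frozenset(), prefer_longest=True):
    """Select one winner among the matching rules, or raise if still ambiguous."""
    if not candidates:
        raise ValueError("no matching rules")
    sign = 1 if prefer_longest else -1

    def score(c):
        return (sign * c.length, c.specificity)

    # 1. Match length, then specificity.
    best = max(score(c) for c in candidates)
    tied = [c for c in candidates if score(c) == best]

    # 2. Relative order: drop every rule that another tied rule outranks.
    #    Only direct pairs are checked; no transitive closure is computed.
    names = {c.rule for c in tied}
    undominated = [c for c in tied
                   if not any((other, c.rule) in higher_than for other in names)]
    pool = undominated or tied  # a cycle would leave nothing, so fall back

    # 3. Static priority as the last resort.
    top = max(c.priority for c in pool)
    winners = [c for c in pool if c.priority == top]
    if len(winners) > 1:
        raise ValueError("ambiguous rules: " +
                         ", ".join(sorted(c.rule for c in winners)))
    return winners[0]


matches = [
    Candidate("if_else", length=5, specificity=3),
    Candidate("if", length=3, specificity=2),
    Candidate("stmt", length=5, specificity=1),
]
longest = pick_rule(matches).rule                          # "if_else"
shortest = pick_rule(matches, prefer_longest=False).rule   # "if"

tie = [Candidate("rule1", 4, 2), Candidate("rule2", 4, 2)]
ordered = pick_rule(tie, higher_than={("rule1", "rule2")}).rule  # "rule1"
```

The relative order only comes into play when length and specificity cannot decide, and a tie that survives all three steps is reported as an error instead of being settled silently. Whether longest or shortest match should be the default is exactly the convenience versus speed trade-off, so the sketch leaves it as a switch.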
How do we deal with left- and right-associative rules, given that associativity has direct consequences for pattern matching?
A freak idea: since there are quite a few parameters for resolving ambiguities, we could code a learning parser, that is, a parser that learns the proper ordering of the rules based on examples and feedback. To keep things standardised, we would simply need to provide standard axioms, maybe axioms per namespace and/or domain.
You know, you might be right, but when you have that nagging thing in your top box, you can do nothing but follow it.
You are crazy for attempting this. Good on you!