This is the first of a multi-part series of posts on the creation of the Mite Migrations DSL.

In this post I will show you how to write a non-trivial transformer that truly customizes our Domain Specific Language so that we're not using POBoo (Plain Old Boo).

Syntax Matters

A fundamental principle of DSLs is that syntax matters.  A friendly syntax makes a language easier to read, easier to understand, and makes programmers more productive.  We are not slaves to the machines.  The machines are slaves to us!  They should understand a simple syntax that humans are comfortable with.

Now that we've established that, let's get started, shall we?

Download the sample project to follow along.

Desired CreateTable Syntax

This is the syntax that I am going for:
CreateTable "Cats":
    Int32 "Id", { "identity" : true, "primary" : true }
    String "Name", { "length" : 50 }


Int32 and String are constructors for subtypes of Column.  Don't worry about that now; I'll define them for you later.

Also, since this is a language and not XML configuration, we should be able to define and use methods and variables in our DSL script.  The following should work:


tableName = "Cats"

getKey = do(name, dataType):
    return Column(name, dataType, { "primary" : true })

CreateTable tableName:
    getKey "Id", DbType.Int32


I know that I haven't explained everything that's going on here yet, but bear with me.  I'm just demonstrating the type of syntax that I want to create.

Also, this is a first draft of the syntax.  Eventually the need for strings will be alleviated via the use of sigils.

Boo Basics

In Boo, a method may be invoked without parentheses.  Therefore Int32 "Id", { "identity" : true, "primary" : true } and Int32("Id", { "identity" : true, "primary" : true }) are equivalent.  Leaving the parentheses out of a method call makes the code much more readable.

The { "something" : somevalue } expressions are hash literals.  Boo.Lang.Hash is based on Hashtable from the .NET Framework.

In Boo, constructors are called without the new operator.  For a class called Int32, calling Int32(...) is a call to the constructor.

Rhino DSL

As in my last posting, I will again be using Rhino DSL to help customize the Boo compiler.

Note: Since my last posting, AnonymousBaseClassCompilerStep has been renamed ImplicitBaseClassCompilerStep.

Implicit Base Class

The foundation of DSL creation with Rhino DSL is to define a base class with a method (or methods, but we'll talk about this later) in which to insert the DSL code.  The DSL code will then be able to implicitly call the base class methods.  Thus, the base class methods provide the API for the DSL.

public abstract class MigrationBase
{
    protected void CreateTable(params Column[] columns)
    {
        ...
    }

    public abstract void Run();
}

The DSL script will be loaded into the abstract Run method of MigrationBase and the CreateTable call in the example will actually call the CreateTable method on MigrationBase.  This is a simple yet extremely powerful concept. 
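
To make the idea concrete, here is roughly what a compiled script amounts to if you wrote it by hand in C#.  This is only a sketch under my own assumptions: SketchColumn and SketchMigrationBase are cut-down stand-ins for the real classes, GeneratedMigration is a made-up name, and the Columns list exists only so the result can be inspected.

```csharp
using System.Collections.Generic;

// Simplified stand-in for the Column class; only the name is kept.
public class SketchColumn
{
    public SketchColumn(string name) { Name = name; }
    public string Name { get; private set; }
}

// Same shape as MigrationBase, but CreateTable records what it receives.
public abstract class SketchMigrationBase
{
    public readonly List<string> Columns = new List<string>();

    protected void CreateTable(params SketchColumn[] columns)
    {
        foreach (SketchColumn column in columns)
            Columns.Add(column.Name);
    }

    public abstract void Run();
}

// Hypothetical hand-written equivalent of the generated class:
// the body of the DSL script becomes the body of Run().
public class GeneratedMigration : SketchMigrationBase
{
    public override void Run()
    {
        CreateTable(new SketchColumn("Id"), new SketchColumn("Name"));
    }
}
```

The point is that the script never declares a class or a method.  The compiler step supplies both, so everything the script calls resolves against the base class.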

For completeness' sake, here is the Column class, along with its Int32 and String subtypes:

public class Column
{
    public Column(string name, DbType dataType, Hash options)
    {
        Name = name;
        DataType = dataType;
        if (options.ContainsKey("primary"))
            Primary = (bool) options["primary"];
        if (options.ContainsKey("identity"))
            Identity = (bool) options["identity"];
        if (options.ContainsKey("length"))
            Length = (int) options["length"];
    }

    public string Name { get; set; }
    public bool Primary { get; set; }
    public bool Identity { get; set; }
    public int Length { get; set; }
    public DbType DataType { get; set; }

    public override string ToString()
    {
        return Name;
    }
}

public class Int32 : Column
{
    public Int32(string name) : base(name, DbType.Int32, new Hash())
    {
    }

    public Int32(string name, Hash options) : base(name, DbType.Int32, options)
    {
    }
}

public class String : Column
{
    public String(string name, Hash options) : base(name, DbType.String, options)
    {
    }
}

Transformers

The Boo compiler is extensible, which is what makes it perfect for writing DSLs.  Compilation runs through a pipeline made up of compiler steps (46 in the regular Boo compiler pipeline, last I checked), and the compiler lets you interact with that pipeline.  A compiler step can inspect the parsed code and change the abstract syntax tree.

We will customize the compiler pipeline by adding two steps.  The first step we will add will be the ImplicitBaseClassCompilerStep from Rhino DSL.

The second will be the compiler step that runs the transformer that will turn the block in our sample DSL CreateTable code into arguments.  I'll show you how to make the transformer and the compiler step in just a moment.  Before that, let's talk about blocks.
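
Where the two steps end up matters, so here is a toy model of the ordering.  It is a sketch only: ToyPipeline is a hypothetical class of mine that stores step names as strings, not Boo's CompilerPipeline, and the step names in the usage note are placeholders rather than the real list of steps.

```csharp
using System;
using System.Collections.Generic;

// Toy model of a compiler pipeline: an ordered list of named steps.
// Illustrative stand-in only; this is not Boo's CompilerPipeline API.
public class ToyPipeline
{
    private readonly List<string> steps = new List<string>();

    public ToyPipeline(params string[] initialSteps)
    {
        steps.AddRange(initialSteps);
    }

    // Inserts a step at a fixed position.
    public void Insert(int index, string step)
    {
        steps.Insert(index, step);
    }

    // Inserts a step immediately ahead of an existing one.
    public void InsertBefore(string existingStep, string newStep)
    {
        int index = steps.IndexOf(existingStep);
        if (index < 0)
            throw new ArgumentException("No such step: " + existingStep);
        steps.Insert(index, newStep);
    }

    public string Describe()
    {
        return string.Join(" -> ", steps.ToArray());
    }
}
```

Starting from new ToyPipeline("Parsing", "ExpandMacros", "EmitAssembly"), calling Insert(1, "ImplicitBaseClass") and then InsertBefore("ExpandMacros", "BlockToParameters") gives Parsing -> ImplicitBaseClass -> BlockToParameters -> ExpandMacros -> EmitAssembly.  Both of our steps run after parsing and before macros are expanded.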

Blocks

In Boo, when you see an indented section of code after a line ending in a colon, the indented section is a block.  Therefore, in this code:

CreateTable "Cats":
    Int32 "Id", { "identity" : true, "primary" : true }

The Int32 constructor call is inside of a block.  Now look at the definition of CreateTable on our MigrationBase.  It expects the columns to be passed as arguments.  A block is not the same as arguments.  Therefore, we need to translate the abstract syntax tree of the Boo program to match what we want.

We need to perform this translation before the ExpandMacros compiler step runs.  Why the ExpandMacros compiler step?

Macros

When the Boo compiler encounters an unknown keyword, it treats it as a macro.  I highly suggest reading up on macros.

A Boo MacroStatement takes arguments and a block.  The syntax is as follows:

SomeMacro arg1, arg2, ...:
    block

DepthFirstTransformer

We will write a transformer to walk the abstract syntax tree (AST), turning the statements inside of the block into arguments to the macro.  Boo provides a helpful class to inherit from to make this easier: DepthFirstTransformer.

DepthFirstTransformer walks an AST calling a method each time a node is encountered.  You can inherit from DepthFirstTransformer and override the appropriate methods to add your own handling for those nodes.  In this case, we need to deal with macro blocks, so we will override the OnMacroStatement method.

Here is the overridden method:
public override void OnMacroStatement(MacroStatement node)
{
    if (methods.Contains(node.Name))
    {
        if (node.Block != null)
        {
            var expressions = TransformBlock(node.Block);
            foreach (var expression in expressions)
                node.Arguments.Add(expression);
            node.Block = null;
        }
    }
    base.OnMacroStatement(node);
}
TransformBlock takes a block and returns Expression[].  These expressions are then added to the macro's argument list.

The variable methods in the if statement is a string array that contains the names of the methods that we wish to transform.  In this case, we will be passing in only "CreateTable".  It is set in the constructor.

private string[] methods;

/// <summary>
/// Creates an instance of BlockToParametersTransformer.
/// </summary>
/// <param name="methods">Names of the methods whose
/// invocations should have blocks changed to arguments.
/// </param>
public BlockToParametersTransformer(params string[] methods)
{
    this.methods = methods;
}
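
If you want to see the effect of OnMacroStatement in isolation, here is a toy model of it.  This is a sketch under my own assumptions: ToyMacro and ToyBlockToArguments are hypothetical types, and plain strings stand in for real AST expressions.  None of this is Boo's API.

```csharp
using System.Collections.Generic;

// Toy stand-in for a macro statement: a name, arguments, and an optional block.
// Strings stand in for AST expressions; this is not Boo's MacroStatement.
public class ToyMacro
{
    public string Name;
    public List<string> Arguments = new List<string>();
    public List<string> Block; // null when the macro has no block
}

public static class ToyBlockToArguments
{
    // Moves every block statement of a listed macro onto its argument list,
    // then clears the block, mirroring the shape of OnMacroStatement.
    public static void Transform(ToyMacro node, params string[] methods)
    {
        if (System.Array.IndexOf(methods, node.Name) < 0 || node.Block == null)
            return;

        foreach (string statement in node.Block)
            node.Arguments.Add(statement);
        node.Block = null;
    }
}
```

A ToyMacro named "CreateTable" with one argument and two block statements ends up with three arguments and no block.  A macro whose name is not in the list is left untouched.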

Now for the interesting part, the TransformBlock method.

private static Expression[] TransformBlock(Block block)
{
    var expressions = new List<Expression>(block.Statements.Count);
    foreach (Statement statement in block.Statements)
    {
        if (statement is ExpressionStatement)
        {
            expressions.Add((statement as ExpressionStatement).Expression);
        }
        else if (statement is MacroStatement)
        {
            var macroStatement = statement as MacroStatement;
            if (macroStatement.Arguments.Count == 0 && (!macroStatement.Block.HasStatements))
            {
                // Assume it is a reference expression
                var refExp = new ReferenceExpression(macroStatement.LexicalInfo);
                refExp.Name = macroStatement.Name;
                expressions.Add(refExp);
            }
            else
            {
                // Assume it is a MethodInvocation
                var mie = new MethodInvocationExpression(macroStatement.LexicalInfo);
                mie.Target =
                    new ReferenceExpression(macroStatement.LexicalInfo, macroStatement.Name);
                mie.Arguments = macroStatement.Arguments;

                if (macroStatement.Block.HasStatements)
                {
                    // If the macro statement has a block,
                    // transform it into a block expression and pass that as the last argument
                    // to the method invocation.
                    var be = new BlockExpression(macroStatement.LexicalInfo);
                    be.Body = macroStatement.Block.CloneNode();

                    mie.Arguments.Add(be);
                }

                expressions.Add(mie);
            }
        }
        else
        {
            throw new InvalidOperationException(
                string.Format("Can not transform block with {0} into argument.",
                    statement.GetType()));
        }
    }
    return expressions.ToArray();
}

A lot is going on in this method, so let's tackle it piece by piece.

if (statement is ExpressionStatement)
{
    expressions.Add((statement as ExpressionStatement).Expression);
}

This is the trivial case.  If we encounter an expression statement, then we just take the expression and add it to the list of expressions to return.

Next we check to see if we have a MacroStatement.

If we do, then we have to see if it has any arguments or a block.

var macroStatement = statement as MacroStatement;
if (macroStatement.Arguments.Count == 0 && (!macroStatement.Block.HasStatements))
{
    // Assume it is a reference expression
    var refExp = new ReferenceExpression(macroStatement.LexicalInfo);
    refExp.Name = macroStatement.Name;
    expressions.Add(refExp);
}

If it has no arguments and no block, we assume that this is a ReferenceExpression that has not been bound yet.  We transform it into a reference expression and add that to our list of expressions to return.  This is a very important step, otherwise local variables declared inside of the DSL script would not be usable inside of our block!  I suggest that when you download the sample project, you experiment without this clause and try to use the DSL and see what happens.

Ok, so if it has arguments or a block, this macro statement must be intended to be a method call.  Therefore we do this transformation:
else
{
    // Assume it is a MethodInvocation
    var mie = new MethodInvocationExpression(macroStatement.LexicalInfo);
    mie.Target =
        new ReferenceExpression(macroStatement.LexicalInfo, macroStatement.Name);
    mie.Arguments = macroStatement.Arguments;

    if (macroStatement.Block.HasStatements)
    {
        // If the macro statement has a block,
        // transform it into a block expression and pass that as the last argument
        // to the method invocation.
        var be = new BlockExpression(macroStatement.LexicalInfo);
        be.Body = macroStatement.Block.CloneNode();

        mie.Arguments.Add(be);
    }

    expressions.Add(mie);
}

If we have encountered something that is neither an ExpressionStatement nor a MacroStatement, there's nothing we can logically do with it, so we throw an InvalidOperationException.
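
The branching in TransformBlock boils down to a small decision table, which I find easier to reason about on its own.  The sketch below is a simplification of mine: ToyStatementKind and ToyStatementClassifier are hypothetical names, and the method returns a description of the outcome instead of building AST nodes.

```csharp
using System;

// Hypothetical stand-in for the statement types TransformBlock distinguishes.
public enum ToyStatementKind { Expression, Macro, Other }

public static class ToyStatementClassifier
{
    // Describes what a block statement becomes as an argument,
    // following the three branches of TransformBlock.
    public static string Classify(ToyStatementKind kind, int argumentCount, bool hasBlock)
    {
        if (kind == ToyStatementKind.Expression)
            return "expression used as-is";

        if (kind == ToyStatementKind.Macro)
        {
            // No arguments and no block: treat the name as a variable reference.
            if (argumentCount == 0 && !hasBlock)
                return "reference";

            // Anything else is assumed to be a method call.
            return hasBlock
                ? "method invocation with trailing block argument"
                : "method invocation";
        }

        throw new InvalidOperationException("Can not transform statement into argument.");
    }
}
```

A bare name such as a local variable falls into the "reference" case, a line like Int32 "Id", { ... } is a macro with two arguments and therefore a method invocation, and a macro with a block gets that block appended as its last argument.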

Plumbing

public class MiteEngine : DslEngine
{
    public MiteEngine()
    {
        Storage = new FileSystemDslEngineStorage();
    }

    protected override void CustomizeCompiler(
        Boo.Lang.Compiler.BooCompiler compiler,
        Boo.Lang.Compiler.CompilerPipeline pipeline,
        string[] urls)
    {
        compiler.Parameters.Ducky = true;
        pipeline.Insert(1, new ImplicitBaseClassCompilerStep(typeof(MigrationBase), "Run", "BlockToParameters"));
        pipeline.InsertBefore(typeof(ExpandMacros), new BlockToParametersCompilerStep());
    }
}

class Program
{
    public static void Main(string[] args)
    {
        DslFactory factory = new DslFactory();
        factory.Register<MigrationBase>(new MiteEngine());
        var bc = factory.Create<MigrationBase>("test.boo");
        bc.Run();
        Console.ReadLine();
    }
}
Test.boo

someVariable = Int32("Age")

someMethod = def(x as Column):
    return x

someMethodWithABlock = def(x as ICallable):
    return x()

CreateTable:
    Int32 "Id", { "primary" : true, "identity" : true }
    someVariable
    someMethod String("name", { "length" : 55 })
    someMethodWithABlock:
        return String("nickname", { "length" : 55 })


Hook up something to show that the transformation is working inside of CreateTable on MigrationBase:

protected void CreateTable(params Column[] columns)
{
    foreach (Column column in columns)
        Console.WriteLine(column);
}

Now try it out and see the results:

Id
Age
name
nickname

Yay, success.

Next time, we'll make it actually create the table in a database!

Here's the sample project.

If you have any questions or comments, don't hesitate to post them.