Building Expression Evaluator with Expression Trees in C# – Part 3
Introduction
In part two of this series we built an expression evaluator capable of parsing expressions with parentheses. In this part we are going to add support for expression with variables.
Adding support for single variable to expression evaluator
The simplest case is to make the following test pass:
[Test]
public void Can_Process_Simple_Variables()
{
decimal a = 2.6m;
Assert.That(engine.Evaluate("a", a), Is.EqualTo(a));
}
First of all we need to add a second parameter to Evaluate method. To keep the existing tests green we will also specify a default value for it. Running the test causes Evaluate method to throw an exception as it does not recognize variables. To avoid exception during parsing the expression when we encounter a letter we will read the variable name and as variables are similar to numeric operands store them on expression stack. To represent the variables in expression tree we are going to use Expression.Parameter Method which is a factory method for creating instances of ParameterExpression class. The method for parsing variables looks like this:
private ParameterExpression ReadParameter(TextReader reader)
{
var operand = string.Empty;
int peek;
while ((peek = reader.Peek()) > -1)
{
var next = (char)peek;
if (char.IsLetter(next))
{
reader.Read();
operand += next;
}
else
{
break;
}
}
var parameter = Expression.Parameter(typeof(decimal), operand);
return parameter;
}
This change solves the issue of parsing variables but running the test causes an InvalidOperationException as we are not passing any parameters to Expression.Lambda method. To fix this issue we need to keep track of all variables that we encounter while parsing the expression and pass them to Expression.Lambda method. As we need to pass the value of the parameter to the compiled expression tree we also have to change type of the delegate.
The Evaluate method now looks like this:
public decimal Evaluate(string expression, decimal variable1 = 0)
{
if (string.IsNullOrWhiteSpace(expression))
{
return 0;
}
operatorStack.Clear();
expressionStack.Clear();
using (var reader = new StringReader(expression))
{
int peek;
while ((peek = reader.Peek()) > -1)
{
var next = (char)peek;
if (char.IsDigit(next))
{
expressionStack.Push(ReadOperand(reader));
continue;
}
if (char.IsLetter(next))
{
var parameterExpression = ReadParameter(reader);
parameters.Add(parameterExpression);
expressionStack.Push(parameterExpression);
continue;
}
if (Operation.IsDefined(next))
{
var currentOperation = ReadOperation(reader);
EvaluateWhile(() => operatorStack.Count > 0 && operatorStack.Peek() != '(' &&
currentOperation.Precedence <= ((Operation)operatorStack.Peek()).Precedence);
operatorStack.Push(next);
continue;
}
if (next == '(')
{
reader.Read();
operatorStack.Push('(');
continue;
}
if (next == ')')
{
reader.Read();
EvaluateWhile(() => operatorStack.Count > 0 && operatorStack.Peek() != '(');
operatorStack.Pop();
continue;
}
if (next != ' ')
{
throw new ArgumentException(string.Format("Encountered invalid character {0}", next), "expression");
}
}
}
EvaluateWhile(() => operatorStack.Count > 0);
var compiled = Expression.Lambda<Func<decimal, decimal>>(expressionStack.Pop(), parameters).Compile();
return compiled(variable1);
}
The new test now passes but we have successfully broken previous test when we do not pass any variable. To make all tests pass again we will need to build the delegate based on number of parameters and pass corresponding variable if there is one. The simplest solution is to build the appropriate delegate based on number of parameters like this:
if (parameters.Count > 0)
{
var compiled = Expression.Lambda<Func<decimal, decimal>>(expressionStack.Pop(), parameters).Compile();
return compiled(variable1);
}
else
{
var compiled = Expression.Lambda<Func<decimal>>(expressionStack.Pop()).Compile();
return compiled();
}
This works but it only supports one parameter and this approach cannot be used when we do not know how many variables will be in the expression. Also, the Func delegate supports only 16 parameters and it would be very ugly to use it.
Adding support for multiple variables
A better solution is to build and expression tree that accepts an array of variables. This way when there is no parameter we will simply pass an empty array and when there are several parameters we will simply pass them. The delegate signature will be the same in all situations so we will not need any if else statements to handle different scenarios.
To help us with the implementation of new approach we are going to add these tests to our test suite:
[Test]
public void Can_Process_Simple_Variables()
{
decimal a = 2.6m;
decimal b = 5.7m;
Assert.That(engine.Evaluate("a", a), Is.EqualTo(a));
Assert.That(engine.Evaluate("a+a", a), Is.EqualTo(a + a));
Assert.That(engine.Evaluate("a+b", a, b), Is.EqualTo(a + b));
}
[Test]
public void Can_Process_Multiple_Variables()
{
var a = 6;
var b = 4.32m;
var c = 24.15m;
Assert.That(engine.Evaluate("(((9-a/2)*2-b)/2-a-1)/(2+c/(2+4))", a, b, c),
Is.EqualTo((((9 - a / 2) * 2 - b) / 2 - a - 1) / (2 + c / (2 + 4))));
}
The expression tree that we are going to build will have one array parameter. To access individual members of this array in the expression we will use Expression.ArrayIndex Method We will also change the Evaluate method to accept variable number of parameters by params keyword. Let’s look at the modified code:
public decimal Evaluate(string expression, params decimal[] arguments)
{
if (string.IsNullOrWhiteSpace(expression))
{
return 0;
}
operatorStack.Clear();
expressionStack.Clear();
var arrayParameter = Expression.Parameter(typeof(decimal[]), "args");
using (var reader = new StringReader(expression))
{
int peek;
while ((peek = reader.Peek()) > -1)
{
var next = (char)peek;
if (char.IsDigit(next))
{
expressionStack.Push(ReadOperand(reader));
continue;
}
if (char.IsLetter(next))
{
var parameterExpression = ReadParameter(reader, arrayParameter);
expressionStack.Push(parameterExpression);
continue;
}
if (Operation.IsDefined(next))
{
var currentOperation = ReadOperation(reader);
EvaluateWhile(() => operatorStack.Count > 0 && operatorStack.Peek() != '(' &&
currentOperation.Precedence <= ((Operation)operatorStack.Peek()).Precedence);
operatorStack.Push(next);
continue;
}
if (next == '(')
{
reader.Read();
operatorStack.Push('(');
continue;
}
if (next == ')')
{
reader.Read();
EvaluateWhile(() => operatorStack.Count > 0 && operatorStack.Peek() != '(');
operatorStack.Pop();
continue;
}
if (next != ' ')
{
throw new ArgumentException(string.Format("Encountered invalid character {0}", next),
"expression");
}
}
}
EvaluateWhile(() => operatorStack.Count > 0);
var compiled = Expression.Lambda<Func<decimal[], decimal>>(expressionStack.Pop(), arrayParameter).Compile();
return compiled(arguments);
}
You will notice that there are several changes from the previous version: There is only one parameter in the expression tree we are building, we are passing this parameter to the ReadParameter method and we are generating the same type of delegate for all cases.
The only remaining change is need in ReadParameter method so that it returns correct item from the parameter array when we encounter a variable. We also need to make sure that we return the same item when the same variable is used again the expression. For this we will need to keep track of the variables and return corresponding item from the ReadParameter method. Here is how it works:
private Expression ReadParameter(TextReader reader, Expression arrayParameter)
{
var parameter = string.Empty;
int peek;
while ((peek = reader.Peek()) > -1)
{
var next = (char)peek;
if (char.IsLetter(next))
{
reader.Read();
parameter += next;
}
else
{
break;
}
}
if (!parameters.Contains(parameter))
{
parameters.Add(parameter);
}
return Expression.ArrayIndex(arrayParameter, Expression.Constant(parameters.IndexOf(parameter)));
}
As you can see whenever we encounter a variable we add it to parameter collection if it is not already present and then build an array index expression using corresponding index.
Using expression tree visualizer we can see how the expression tree that we build using the expression evaluator for last test case looks like:
Conclusion
Our new implementation of expression evaluator now supports expressions that contain numbers as well as variables. In the next post we will add several overloads to make passing parameters easier and add a new overload for better performance when we need to evaluate same expression with different parameter values. Full source code will be made available on github in couple of days.