.Net, Azure and occasionally gamedev

Using Roslyn to evaluate conditions

2019/02/09

Warning: Using Roslyn to evaluate and execute arbitrary strings in your program opens it up to remote code execution attacks.

In most cases there will be a better solution, such as a plugin system or parser generators.


Requirements

In my HomeApp solution I want triggers and actions. Each trigger has a condition attached that (when true) will execute the actions.

I want to read them from a configuration file so users can easily define them themselves.

Example:

sensor('temperature') > 30 || after('6 PM')

Conditions react to changes in sensor values and continuously reevaluate to determine if actions should be executed.

An example would be "turn the garden lights on after 8 PM when it's not raining":

sensor('rain') <= 0 && after('8 PM')

Exploring the Roslyn scripting API

Parser generators are normally the right tool for this job, but the grammars they require can be quite verbose and tedious to write.

I decided to use Roslyn instead, as C# conditions are intuitive to write and offer everything I need.

The Roslyn scripting API is also very simple to use:

// returns false
bool result = await CSharpScript.EvaluateAsync<bool>("1 > 2");

For my application I imposed a few rules to reduce the risk of remote code execution:

  1. The string to be parsed can only be read from a local config file. It cannot be sent via the internet.
  2. The local config file must be manually created by someone and cannot be synced over the internet.
  3. The application does not have write access to the config file.
  4. The application has no other means of receiving the conditions from remote sources.

This reduces the attack surface significantly: any attacker would first need system-level access (at which point code injection into my config file is the least of my problems).


As the example above showed, it is very easy to parse and evaluate strings to any desired type using Roslyn.

Under the hood, Roslyn compiles the expression into an in-memory assembly and then executes it in the host runtime.

To inject state, you can define an arbitrary class and pass an instance of it as the globals object.

Roslyn then exposes all public members (fields and properties) of this class to the script as global variables.

public class Globals
{
    public int X;

    public int Y;
}

//..

var globals = new Globals { X = 5, Y = 3 };
// expression now depends on the input (true in this case)
bool result = await CSharpScript.EvaluateAsync<bool>(
    "X > Y",
    ScriptOptions.Default,
    globals);

This feature makes it possible to evaluate expressions against the current state of the application, which makes the approach very flexible.
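Since the conditions are reevaluated continuously, compiling the same string again on every evaluation is probably wasteful. The following is a minimal sketch of compiling a condition once and reusing the result via CSharpScript.Create and CreateDelegate. The CompiledCondition class and its Compile helper are made-up names for illustration, and the Globals class simply mirrors the sample above:

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CodeAnalysis.CSharp.Scripting;
using Microsoft.CodeAnalysis.Scripting;

public class Globals
{
    public int X;
    public int Y;
}

// Hypothetical helper, not part of the Roslyn API
public static class CompiledCondition
{
    public static ScriptRunner<bool> Compile(string condition)
    {
        // globalsType tells Roslyn which members the script may use as globals
        Script<bool> script = CSharpScript.Create<bool>(
            condition,
            ScriptOptions.Default,
            globalsType: typeof(Globals));

        // The delegate can be invoked many times with different globals instances
        return script.CreateDelegate();
    }

    public static async Task Main()
    {
        ScriptRunner<bool> isXGreater = Compile("X > Y");

        bool first = await isXGreater(new Globals { X = 5, Y = 3 });  // 5 > 3
        bool second = await isXGreater(new Globals { X = 1, Y = 3 }); // 1 > 3

        Console.WriteLine($"{first}, {second}");
    }
}
```

Whether this is worth the extra code depends on how often the conditions change and how often they are evaluated; for a handful of rarely evaluated conditions, EvaluateAsync alone may be enough.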

Using Roslyn to execute conditions from strings

Setting up and using Roslyn is quite easy. The official documentation has all the necessary details (namespace, etc.).

To get from "string in config" to "result of the expression" only a few steps are needed:

First: I delimit string parameters with single quotes instead of the usual double quotes because the whole condition is stored in a JSON configuration file.

Double quotes would have meant escaping, as in "sensor(\"temperature\")", which is just annoying to write and look at.

The fix is a simple string replacement

condition = condition.Replace("'", "\"");

before sending it to Roslyn.

I don't ever plan to allow char literals in the conditions, so this is fine.
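To illustrate the round trip from config file to C# expression, here is a small sketch using System.Text.Json. The "condition" key and the ConditionLoader class are assumptions made up for this example, not the actual format of my config file:

```csharp
using System;
using System.Text.Json;

// Hypothetical helper for illustration only
public static class ConditionLoader
{
    public static string LoadCondition(string json)
    {
        using (JsonDocument document = JsonDocument.Parse(json))
        {
            // "condition" is an assumed key name
            string raw = document.RootElement.GetProperty("condition").GetString();

            // Turn the single-quoted parameters into regular C# string literals
            return raw.Replace('\'', '"');
        }
    }

    public static void Main()
    {
        // No escaped quotes are needed inside the JSON value
        string json = "{ \"condition\": \"sensor('temperature') > 30\" }";

        Console.WriteLine(LoadCondition(json));
        // prints: sensor("temperature") > 30
    }
}
```

Note that the blanket replacement also rewrites an apostrophe inside a parameter (for example a sensor id containing one), so ids need to avoid single quotes.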

Second: I provide the methods that the conditions are allowed to call.

By default, Roslyn requires fully qualified names in order to call static methods. For example, to use Math.Max you can always write:

System.Math.Max(5, 3)

Importing a class in Roslyn is akin to "using static [class]" in regular C#: all static methods of the imported class are available everywhere in the script without a namespace or class name prefix.

So I created a class in C#:

public static class ScriptFunctions
{
    // returns value of a sensor (e.g. temperature)
    // note that I defined the function lowercase (not C# idiomatic)
    // as I want the user to type all methods/names lowercase
    public static int sensor(string sensorId)
    {
        // GetSensor omitted for now
        var sensorValue = GetSensor(sensorId).LastValue;
        return sensorValue;
    }
}

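and imported it in Roslyn via:

var options = ScriptOptions.Default
    // reference the assembly containing ScriptFunctions so the script can resolve the type
    // (needed when it is not already referenced through the globals type)
    .WithReferences(typeof(ScriptFunctions).Assembly)
    // make all static methods in this class available globally
    .WithImports(typeof(ScriptFunctions).FullName);

This makes it possible to use all static methods of this class without any prefix: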

string condition = "sensor('temperature') > 30";

condition = condition.Replace("'", "\"");
// result now depends on the actual value of the sensor
bool result = await CSharpScript.EvaluateAsync<bool>(
    condition,
    options,
    globals);

Roslyn also has a few more advantages compared to parser generators:

Any parsing error is reported by Roslyn via a CompilationErrorException that carries the well-known C# error messages (which are very clear compared to those of many parser generators).

For example, the invalid method call (upper case 'S')

"Sensor('temperature') > 30"

will yield:

(1,1): error CS0117: 'ScriptFunctions' does not contain a definition for 'Sensor'
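A minimal sketch of how these messages could be collected, for example to show them to the user who wrote the config file. The ConditionChecker class and GetErrorsAsync are hypothetical names; the Diagnostics property on CompilationErrorException is what carries the individual compiler errors:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.CodeAnalysis.CSharp.Scripting;
using Microsoft.CodeAnalysis.Scripting;

// Hypothetical helper for illustration only
public static class ConditionChecker
{
    // Returns an empty list if the condition compiles and runs,
    // otherwise one formatted message per compiler error
    public static async Task<IReadOnlyList<string>> GetErrorsAsync(
        string condition,
        ScriptOptions options = null)
    {
        try
        {
            await CSharpScript.EvaluateAsync<bool>(condition, options ?? ScriptOptions.Default);
            return Array.Empty<string>();
        }
        catch (CompilationErrorException ex)
        {
            // Each diagnostic formats itself as "(line,column): error CSxxxx: message"
            return ex.Diagnostics.Select(d => d.ToString()).ToList();
        }
    }

    public static async Task Main()
    {
        // No imports are configured here, so "Sensor" is unknown to the script
        var errors = await GetErrorsAsync("Sensor(\"temperature\") > 30");

        foreach (string error in errors)
        {
            Console.WriteLine(error);
        }
    }
}
```

Exceptions thrown by the script at runtime are not compilation errors and still propagate to the caller in this sketch.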

On top of that, setting breakpoints in the functions themselves works as expected: any code executed by Roslyn will hit them and can be debugged.

While this is a given for most parser generators as well, stepping through their code is often far more difficult due to the extra indirections. With Roslyn it's only business logic and nothing else.

Roslyn globals workaround

The only real problem I faced while using the Roslyn scripting API was the globals object.

Injecting an instance makes all of its public members available globally inside the script. They are not, however, accessible from the statically compiled C# code that the script calls into.

Given the sample from before:

public class Globals
{
    public int X;

    public int Y;
}

//..

var globals = new Globals { X = 5, Y = 3 };

X and Y are now globally available in the script. However, using them in any function inside the ScriptFunctions class results in the "The name 'X' does not exist in the current context" compiler error.

The workaround I implemented looks like this in my C# class:

public static class ScriptFunctions
{
    public static Globals Globals { get; set; }

    public static int sum()
    {
        // code will compile
        return Globals.X + Globals.Y;
    }
}

// prepend an assignment that copies the script globals into the static property
string expression = "ScriptFunctions.Globals = new Globals {" +
    "X = X," +
    "Y = Y" +
    "};" +
    $"return {condition};";
// inject globals instance into roslyn
bool result = await CSharpScript.EvaluateAsync<bool>(
    expression,
    options,
    globals);

By injecting the assignment code before the evaluation of the actual condition, the static property is set correctly and contains the values provided by the globals.

This gives the static methods access to those values, so the logic can depend on the application state.

I've wrapped up everything so far in a GitHub repo (link below) to demonstrate the usage.

Further improvements:

Ideally, the scripted code would run in a separate app domain or process to prevent it from accessing things it shouldn't. This would complicate the solution a lot, as some state (sensor data) would need to be transferred to the new app domain or process as well (not to mention that .NET Core currently has no concept of app domains).

For the globals workaround itself, I see two improvements:

First, using nameof instead of a hard-coded string makes the assignment less error-prone during refactorings.

expression = $"{nameof(ScriptFunctions)}.{nameof(ScriptFunctions.Globals)} = new ...";

Obviously, this reduces readability a bit, but it is at least safer for refactorings.

Second, whenever the globals object is extended with another property, the string assignment needs to be extended as well to transfer the state of the new property.

The easy fix would be another wrapper class:

public class Globals
{
    public Data Data { get; set; }
}

public class Data
{
    public int X;

    public int Y;
}

Then only a single assignment is needed (Data = Data) and all of its members are transferred automatically.
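Both improvements could be combined roughly like this. It is only a sketch of the string building: ExpressionBuilder and Wrap are made-up names, the classes repeat the wrapper sample above, and I'm assuming the generated assignment resolves inside the script the same way the hand-written one does:

```csharp
using System;

public class Data
{
    public int X;
    public int Y;
}

public class Globals
{
    public Data Data { get; set; }
}

public static class ScriptFunctions
{
    public static Globals Globals { get; set; }

    public static int sum()
    {
        return Globals.Data.X + Globals.Data.Y;
    }
}

// Hypothetical helper for illustration only
public static class ExpressionBuilder
{
    public static string Wrap(string condition)
    {
        // nameof keeps the generated script in sync with renames
        string target = $"{nameof(ScriptFunctions)}.{nameof(ScriptFunctions.Globals)}";
        string member = nameof(Globals.Data);

        // A single assignment transfers everything inside Data
        return $"{target} = new {nameof(Globals)} {{ {member} = {member} }}; return {condition};";
    }

    public static void Main()
    {
        Console.WriteLine(Wrap("sum() > 7"));
        // prints: ScriptFunctions.Globals = new Globals { Data = Data }; return sum() > 7;
    }
}
```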


I've uploaded a demo implementation to GitHub that showcases everything described in this post (without the improvements).

tagged as C# and Roslyn