Thursday, March 1, 2012

Algebraic data type interop: F# - C#

In a previous post I wrote about encoding algebraic data types in C#. Now let's explore the interoperability issues that arise when defining and consuming algebraic data types (ADTs) cross-language in C# and F#. More concretely, let's analyze construction and deconstruction of an ADT and how to keep operations as idiomatic as possible while also retaining type safety.

Defining an ADT in F# and consuming it in C#

In F#, ADTs are called discriminated unions. The first thing I should mention is that the F# component design guidelines recommend hiding discriminated unions when they are part of a general .NET API. I prefer to interpret it like this: if you can hide it with minor consequences, or you have stringent binary backwards compatibility requirements, or you foresee it changing a lot, hide it. Otherwise I wouldn't worry much.

Let's use this simple discriminated union as an example:

type Shape =
| Circle of float
| Rectangle of float * float

Construction in C# is pretty straightforward: F# exposes static methods NewCircle and NewRectangle:

var circle = Shape.NewCircle(23.77);
var rectangle = Shape.NewRectangle(1.5, 2.2);

No, you can't use constructors directly to instantiate Circle or Rectangle: F# compiles these constructors as internal. No big deal really.

Deconstruction, however, is a problem here. C# doesn't have pattern matching, but as I showed in the previous article you can simulate this with a Match() method like this:

using System;

static class ShapeExtensions {
    public static T Match<T>(this Shape shape, Func<double, T> circle, Func<double, double, T> rectangle) {
        if (shape is Shape.Circle) {
            var x = (Shape.Circle)shape;
            return circle(x.Item);
        }
        // Anything that is not a Circle is assumed to be a Rectangle
        var y = (Shape.Rectangle)shape;
        return rectangle(y.Item1, y.Item2);
    }
}

Here we did it as an extension method on the consumer side of things (C#). The problem with this is that if we add another case to Shape (say, Triangle), this will still compile successfully without even a warning, but fail at runtime with an invalid cast, instead of failing at compile-time as it should!

It's best to define this in F# where we can take advantage of exhaustively-checked pattern matching, either as a regular instance member of Shape or as an extension member:

open System
open System.Runtime.CompilerServices

[<Extension>]
type Shape with
    [<Extension>]
    static member Match(shape, circle: Func<_,_>, rectangle: Func<_,_,_>) =
        match shape with
        | Circle x -> circle.Invoke x
        | Rectangle (x,y) -> rectangle.Invoke(x,y)

This is how we do it in FSharpx to work with Option and Choice in C#.
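The "regular instance member" alternative mentioned above is not shown in this post, so here is a minimal sketch of how it could look. The `area` helper and the explicit delegate construction are only there for illustration; they stand in for what a C# caller would write with lambdas.

```fsharp
open System

type Shape =
    | Circle of float
    | Rectangle of float * float
    // Instance member variant of Match; from C# this reads as shape.Match(...)
    member this.Match(circle: Func<float, 'T>, rectangle: Func<float, float, 'T>) : 'T =
        match this with
        | Circle r -> circle.Invoke(r)
        | Rectangle (h, w) -> rectangle.Invoke(h, w)

// Hypothetical consumer: builds the delegates explicitly, as a C# caller would with lambdas
let area (shape: Shape) =
    shape.Match(
        Func<float, float>(fun r -> Math.PI * r * r),
        Func<float, float, float>(fun h w -> h * w))
```

If a Triangle case were added to this definition, the F# compiler would report the match inside Match as incomplete (warning FS0025), which is the prompt to add a third parameter; every C# call site then stops compiling until it handles the new case.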

Defining an ADT in C# and consuming it in F#

Defining an ADT in C# is already explained in my previous post. But how does this encoding behave when used in F#?

To recap, the C# code we used is:

using System;

namespace DiscUnionInteropCS {
    public abstract class Shape {
        private Shape() {}

        public sealed class Circle : Shape {
            public readonly double Radius;

            public Circle(double radius) {
                Radius = radius;
            }
        }

        public sealed class Rectangle : Shape {
            public readonly double Height;
            public readonly double Width;

            public Rectangle(double height, double width) {
                Height = height;
                Width = width;
            }
        }

        public T Match<T>(Func<double, T> circle, Func<double, double, T> rectangle) {
            if (this is Circle) {
                var x = (Circle) this;
                return circle(x.Radius);
            }
            var y = (Rectangle) this;
            return rectangle(y.Height, y.Width);
        }
    }
}

Just as before, let's analyze construction first. We could use constructors:

let shape = Shape.Circle 2.0

which looks like a regular F# discriminated union construction with required qualified access. There are, however, two problems with this:

  1. Object constructors in F# are not first-class functions. Try to use function composition (>>) or piping (|>) with an object constructor. It doesn't compile. On the other hand, discriminated union constructors in F# are first-class functions.
  2. Concrete case types lead to unnecessary upcasts. shape here is of type Shape.Circle, not Shape. This isn't much of a problem in C# because it upcasts automatically, but F# doesn't, and so a function that returns Shape would require an upcast.

Because of this, it's best to wrap constructors:

let inline Circle x = Shape.Circle x :> Shape
let inline Rectangle (a,b) = Shape.Rectangle(a,b) :> Shape
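To see what the wrappers buy you, here is a small self-contained sketch. The classes `CircleCase` and `RectangleCase` are hypothetical stand-ins written in F# for the nested C# classes, only so the snippet runs without the C# assembly; `shapes` and `circleFromDiameter` are likewise made up for the example.

```fsharp
// Hypothetical stand-ins for the C# case classes, so this sketch runs on its own
[<AbstractClass>]
type Shape() = class end

type CircleCase(radius: float) =
    inherit Shape()
    member this.Radius = radius

type RectangleCase(height: float, width: float) =
    inherit Shape()
    member this.Height = height
    member this.Width = width

// Wrappers: ordinary functions that already return the base type
let Circle r = CircleCase r :> Shape
let Rectangle (h, w) = RectangleCase(h, w) :> Shape

// Both cases fit in one Shape list without any upcast at the use site
let shapes : Shape list =
    List.map Circle [1.0; 2.0] @ [Rectangle (1.5, 2.2)]

// The wrapper composes like any other function
let circleFromDiameter = (fun d -> d / 2.0) >> Circle
```

Since the wrappers return Shape, the upcast is written once, in the wrapper, instead of at every place a case is constructed.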

Let's see deconstruction now. In F# this obviously means pattern matching. We want to be able to write this:

let area =
    match shape with
    | Circle radius -> System.Math.PI * radius * radius
    | Rectangle (h, w) -> h * w

We can achieve this with a simple active pattern that wraps the Match method:

let inline (|Circle|Rectangle|) (s: Shape) =
    s.Match(circle = (fun x -> Choice1Of2 x),
            rectangle = (fun x y -> Choice2Of2 (x,y)))

For convenience, put this all in a module:

module Shape =
    open DiscUnionInteropCS

    let inline Circle x = Shape.Circle x :> Shape
    let inline Rectangle (a,b) = Shape.Rectangle(a,b) :> Shape
    let inline (|Circle|Rectangle|) (s: Shape) =
        s.Match(circle = (fun x -> Choice1Of2 x),
                rectangle = (fun x y -> Choice2Of2 (x,y)))

So with a little boilerplate you can have ADTs defined in C# behaving just like in F# (modulo pretty-printing, comparison, etc, but that's up to the C# implementation if needed). No need to define a separate, isomorphic ADT.

Note that pattern matching directly on the concrete type of a Shape would break easily, just as the C#-side Match did when we defined the ADT in F#. Because the active pattern wraps the original Match, any change to the original definition also changes Match(), and the active pattern then breaks accordingly at compile-time. If you need binary backwards compatibility, however, it's going to be more complex than this.
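As a rough illustration of that fragility, the sketch below matches on concrete types with type-test patterns. Again, `CircleCase`, `RectangleCase` and `TriangleCase` are hypothetical F# stand-ins for the C# case classes so that the snippet is self-contained, and `area` and `failsAtRuntime` are names invented for the example.

```fsharp
open System

// Hypothetical stand-ins for the C# case classes
[<AbstractClass>]
type Shape() = class end

type CircleCase(radius: float) =
    inherit Shape()
    member this.Radius = radius

type RectangleCase(height: float, width: float) =
    inherit Shape()
    member this.Height = height
    member this.Width = width

// A case added later; nothing forces the function below to be updated
type TriangleCase(b: float, height: float) =
    inherit Shape()
    member this.Base = b
    member this.Height = height

// Matching on concrete types: the wildcard silently assumes "Rectangle"
let area (s: Shape) =
    match s with
    | :? CircleCase as c -> Math.PI * c.Radius * c.Radius
    | _ ->
        let r = s :?> RectangleCase
        r.Height * r.Width

// Compiles without complaint, but the downcast throws for the new case
let failsAtRuntime =
    try
        area (TriangleCase(3.0, 4.0) :> Shape) |> ignore
        false
    with
    | :? InvalidCastException -> true
```

With the active pattern built on Match, the same change would instead surface as a compile error at the point where Match is called with too few arguments.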

In the next post I'll show an example of a common variant of this.

By the way it would be interesting to see how ADTs in Boo and Nemerle interop with F# and C#.
