Monday, June 8, 2009

Semantic MathML

I don't know what they were thinking, but there is a much better way to encode MathML in RDF. One can be tempted to assign an RDF property to every element in MathML, but that wouldn't be the OpenMath way. Since MathML3 is getting more and more dependent on OpenMath, it seems appropriate to encode as much as possible using this combined system. MathML3 is split up into several sections: Presentation, Pragmatic Content, and Strict Content. The last one requires only 10 XML Elements to be understood by MathML3 processors, namely: m:apply, m:bind, m:bvar, m:csymbol, m:ci, m:cn, m:cs, m:share, m:semantics, m:cerror, and m:cbytes. This provides for a great economy of thought, and a chance to easily define a total mapping from MathML to RDF. MathML3 already defines a mapping from MathML2 to Strict Content MathML3, so this is the only set of Elements we need to consider. These are the prefixes we will use:

@prefix ari: <> .
@prefix fns: <> .
@prefix sts: <> .
@prefix sm: <> .
@prefix m: <> .

These are the rdfs:Class's we will define:

  • sm:Content (all MathML Content)
  • sm:Number (for <cn/>, subclass of sts:NumericalValue)
  • sm:String (for <cs/>, subclass of xs:string)
and these are the rdf:Property's we will define:
  • sm:apply :: Property * sm:Content
  • sm:applyTo :: Property * (List sm:Content)
  • sm:bind :: Property * (List sm:Content)
  • sm:bindOp :: Property * sm:Content
  • sm:bindIn :: Property * sm:Content
  • sm:error :: Property sm:Error sm:Content
  • sm:errorWas :: Property sm:Error (List sm:Content)
Strict Content MathML3 can be translated with the following algorithm:
  • <m:cn> NUM </m:cn>
  • "NUM"^^sm:Number
  • <m:cs> TEXT </m:cs>
  • " TEXT "^^sm:String
  • <m:ci> Name </m:ci>
  • Use blank node identifier (like _:Name)
  • <m:csymbol> Symbol </m:csymbol>
  • Use URIs ( for <m:csymbol cd="arith1">plus</m:csymbol>, or the value of the definitionURL for MathML2)
  • <m:share/>
  • Use URIs
  • <m:cbytes> DATA </m:cbytes>
  • " DATA "^^xs:base64Binary
  • <m:cerror> Symbol Content* </m:cerror>
  • [] rdf:type sm:Error ;
    sm:error Symbol ;
    sm:errorWas LIST(Content) .
  • <m:apply> Symbol Content* </m:apply>
  • [] sm:apply Symbol ;
    sm:applyTo LIST(Content) .
  • <m:bind> Symbol Vars* Content* </m:bind>
  • [] sm:bindOp Symbol ;
    sm:bind LIST(Vars) ;
    sm:bindIn LIST(Content) .

This allows all of Strict Content MathML to be encoded in a set of RDF triples. In order to have the full semantics of MathML and OpenMath there would have to be an OM-interpretation, OM-entailment, and so on, just as there is with every other application of RDF. Although I have not worked out the details of this, an OM-interpretation of an RDF graph that contains these properties would have to treat sts:mapsTo special, and maybe map it to an appropriate rdfs:Class. As an example of how this might be used, this is how the Haskell code (\x -> x + 2) would be translated into RDF using the properties listed above as follows:

_:f sm:bindOp fns:lambda ;
sm:bind LIST(_:x) ;
sm:bindIn LIST(
[ sm:apply ari:plus ;
sm:applyTo LIST(_:x "2"^^sm:Number) ]).

Now that RDF has lambdas, the possibilities are endless!

1 comment:

  1. Hi Andrew, nice suggestion, indeed better than the previous attempts I have seen. Still I'm quite sceptical whether it's useful at all to break mathematical expressions, i.e. ordered trees, down into RDF triples. (I have blogged on that here: So my question is: Are you using your RDF representation for anything? How successful is this application?