Journal Quantum Jim's Journal: A Different RDF/XML Serialization

I think the Official RDF/XML Serialization has the right heart, but makes a few mistakes. Here's how I would design an RDF/XML serialization:

I'd start with having the root element be anything. It isn't important in the long run.

I'd use a striped syntax to start with, then just iterate properties. That is, instead of:

<root> <type> <property> <type> <property>

I'd use:

<root> <type> <property> <property> <property>

This is because:

  1. 90% of the time, when the rdf:type abbreviation is taken advantage of, it is for shallow trees.
  2. Contrary to the term doctype, I believe that an element's name represents its behavior rather than type. Since properties describe something happening (this associates with that via the property), I think they are a good fit for element names.
  3. rdf:parseType is super annoying!

So how would you identify resources after the first property in the chain? Just add an rdf:about attribute of course for named nodes and rdf:id for blank nodes! Thus the following RDF/XML:

<e:root
xml:base="tag:example.com,2005-06-04:ex:"
xmlns:e="tag:example.com,2005-06-04:ex:"
xmlns:r="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<e:Person r:about="jimmy">
<e:address r:type="Address">
<e:street>123 Name Road</e:street>
<e:city>New City</e:city>
<e:state>ST</e:state>
<e:zip r:dt="usformat">12345</e:zip>
</e:address>
<e:wrote r:about="mybook"
r:type="Book"
e:pages="42"
>
<e:id r:dt="isbn">123-45-6789</e:id>
</e:wrote>
</e:Person>
</e:root>

That identifies the triples:

e:jimmy
r:type e:Person ;
e:address [
r:type e:Address ;
e:street "123 Name Road" ;
e:city "New City" ;
e:state "ST" ;
e:zip "12345"^^e:usformat
] ;
e:wrote e:mybook .
e:mybook
r:type e:Book ;
e:pages "42" ;
e:id "123-45-6789"^^e:isbn .
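
To make the rules so far concrete, here is a rough Python sketch of a reader for the example above. It is only one way to read the rules, not a reference parser. The names parse, describe, value_of and the Literal tuple are made up for illustration, relative names are resolved by plain concatenation with xml:base (which is what the triples above imply), and blank nodes get invented labels like _:b1.

```python
import xml.etree.ElementTree as ET
from collections import namedtuple

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
R = "{" + RDF + "}"  # how ElementTree spells the r: prefix
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"
Literal = namedtuple("Literal", "text datatype")  # hypothetical helper type

DOC = """\
<e:root xml:base="tag:example.com,2005-06-04:ex:"
        xmlns:e="tag:example.com,2005-06-04:ex:"
        xmlns:r="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <e:Person r:about="jimmy">
    <e:address r:type="Address">
      <e:street>123 Name Road</e:street>
      <e:city>New City</e:city>
      <e:state>ST</e:state>
      <e:zip r:dt="usformat">12345</e:zip>
    </e:address>
    <e:wrote r:about="mybook" r:type="Book" e:pages="42">
      <e:id r:dt="isbn">123-45-6789</e:id>
    </e:wrote>
  </e:Person>
</e:root>"""


def uri(name):
    """Turn ElementTree's "{namespace}local" into namespace + local."""
    if not name.startswith("{"):
        return name
    namespace, _, local = name[1:].partition("}")
    return namespace + local


def parse(doc):
    root = ET.fromstring(doc)  # the root element itself adds nothing to the graph
    base = root.get(XML_BASE, "")
    triples, blank_count = [], 0

    def node_id(elem):
        nonlocal blank_count
        about = elem.get(R + "about")
        if about is not None:
            return base + about  # assumption: plain concatenation with xml:base
        blank_count += 1
        return "_:b%d" % blank_count  # invented blank node label

    def describe(elem, subject):
        # Attributes outside the r: namespace are literal-valued properties.
        for name, value in elem.attrib.items():
            if not name.startswith(R) and name != XML_BASE:
                triples.append((subject, uri(name), Literal(value, None)))
        # Every child element is a property of the subject.
        for prop in elem:
            triples.append((subject, uri(prop.tag), value_of(prop)))

    def value_of(prop):
        datatype = prop.get(R + "dt")
        if datatype is not None:  # typed literal
            return Literal(prop.text or "", base + datatype)
        if not prop.attrib and len(prop) == 0:  # plain literal
            return Literal(prop.text or "", None)
        node = node_id(prop)  # named or blank resource
        rdf_type = prop.get(R + "type")
        if rdf_type is not None:
            triples.append((node, RDF + "type", base + rdf_type))
        describe(prop, node)
        return node

    for typed in root:  # first level: the element name is the rdf:type
        subject = node_id(typed)
        triples.append((subject, RDF + "type", uri(typed.tag)))
        describe(typed, subject)
    return triples


for triple in parse(DOC):
    print(triple)
```

The sketch skips XML literals, r:refs and r:ns entirely; it only covers the typed first level, nested properties, r:about, r:type, r:dt and plain attributes.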

If Slashdot didn't have braindead indenting, that would look pretty and be easy to read/parse. Note that I'm using r:dt to identify data types. It is too bad I can't use xsi:type, but anyhoo. :-/ Now what if you want XML content to be interpreted as a literal instead of with the semi-striped syntax? Just add r:dt="the datatype". RDF predefines a datatype for XML content (rdf:XMLLiteral); just use it! OR you could just start the element's content with non-whitespace text. So that you have (with the xmlns not shown):

<e:doc>
<e:Body>
<h:div>
<h:p>This is <h:strong>strongly</h:strong> advised.</h:p>
</h:div>
</e:Body>
</e:doc>

That means the following:

[
r:type e:Body ;
h:div "<h:p>This is <h:strong>strongly</h:strong> advised.</h:p>"^^r:XMLLiteral
]

Note no datatype was defined in this case, but it was added to the output. Also note that the entire element is included with the literal. This involves a little bit of backtracking, but not much. (An alternative is to say an element with any text content MUST be a literal, but this could potentially involve too much backtracking.) That example is exactly the same as (ignoring the insignificant whitespace):

<e:doc>
<e:Body>
<h:div r:dt="http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral">
<h:p>This is <h:strong>strongly</h:strong> advised.</h:p>
</h:div>
</e:Body>
</e:doc>

Now what about elements with no content? In that case nothing special happens. Check out this example (again no xmlns for brevity):

<u:urlset>
<u:Page>
<h:link
h:ref="http://example.com"
h:type="text/pudding"
h:rel="food"
/>
</u:Page>
</u:urlset>

That means the same as:

[
r:type u:Page ;
h:link [
h:ref "http://example.com" ;
h:type "text/pudding" ;
h:rel "food"
]
]
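
The attribute rule on its own is tiny. Here is a hedged Python sketch of just that step: every attribute of the empty element becomes a literal-valued property of a blank node. The namespace URIs for u: and h: are placeholders (the example above leaves the xmlns out), and attributes_as_properties is a made-up name.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs; the example above omits the xmlns declarations.
DOC = """\
<u:urlset xmlns:u="urn:example:u#" xmlns:h="urn:example:h#">
  <u:Page>
    <h:link h:ref="http://example.com" h:type="text/pudding" h:rel="food" />
  </u:Page>
</u:urlset>"""


def attributes_as_properties(elem):
    """Map each attribute of an element to a predicate URI -> literal value pair."""
    properties = {}
    for name, value in elem.attrib.items():
        # ElementTree spells namespaced names as "{namespace}local".
        namespace, _, local = name.lstrip("{").partition("}")
        properties[namespace + local] = value
    return properties


link = ET.fromstring(DOC)[0][0]  # urlset -> Page -> link
print(attributes_as_properties(link))
```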

Now what if you really wanted an element name to be parsed as the r:type of the resource? Well, then just say r:type="'". Yes that is a hack.

To say that some elements link to a resource instead of a literal, use the r:refs attribute on the root element (yes, I said the root was ignored... in the graph). This is on top of some predefined ones I haven't told you about, basically all the properties in the RDF/RDFS/OWL specs that link to resources instead of literals:

<e:root
xmlns:e="ex:eee#"
r:refs="ex:eee#Prop1 ex:eee#Prop2 ex:eee#Prop3" />
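
Reading r:refs is just a whitespace split. A minimal Python sketch, assuming r: is bound to the usual RDF namespace as in the first example (the snippet above does not declare it) and using the made-up function name resource_properties:

```python
import xml.etree.ElementTree as ET

# Assumption: r: is the standard RDF namespace, as in the first example.
R_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

DOC = (
    '<e:root xmlns:e="ex:eee#" xmlns:r="' + R_NS + '" '
    'r:refs="ex:eee#Prop1 ex:eee#Prop2 ex:eee#Prop3" />'
)


def resource_properties(root):
    """Return the set of property URIs that r:refs marks as resource-valued."""
    return set(root.get("{" + R_NS + "}refs", "").split())


root = ET.fromstring(DOC)
print(sorted(resource_properties(root)))
```

A parser would then check each property's URI against this set before deciding whether the element's content names a resource or holds a literal.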

And so on. Blank nodes are identified by r:node, r:id is basically my equivalent to r:ID (not reification), and r:arc is used for reification. r:resource is the exact same thing as r:about in my language (and really, it is the same in the official RDF/XML serialization).

Finally, if you want to use "namespace" prefixes instead of full URIs (even though that's evil), then you need to use a different namespace for "r:", "http://www.w3.org/2005/06/04-rdf-syntax-abbr-ns#", and declare all prefixes in the attribute r:ns like so:

<e:root
xmlns:e="..."
xmlns:r="..."
r:ns="e ... prefix2 uri2 prefix3 uri3
f ... prefix5 uri5"
/>
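
For what it's worth, r:ns could be read like this in Python. This is only a sketch: parse_prefixes and expand are made-up names, the prefix/URI values in the usage lines are placeholders rather than the elided ones above, and treating an unknown prefix as "already a full URI" is my assumption.

```python
def parse_prefixes(value):
    """Split an r:ns value into a prefix -> URI mapping (whitespace-separated pairs)."""
    tokens = value.split()
    if len(tokens) % 2:
        raise ValueError("r:ns must hold prefix/URI pairs")
    return dict(zip(tokens[0::2], tokens[1::2]))


def expand(name, prefixes):
    """Expand "prefix:local" if the prefix is declared; otherwise leave the name alone."""
    prefix, sep, local = name.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix] + local
    return name  # assumption: anything else is already a full URI


# Placeholder values for illustration only.
prefixes = parse_prefixes("e ex:eee#\n f ex:fff#")
print(expand("e:Prop1", prefixes))
print(expand("http://example.com/x", prefixes))
```

Splitting on any whitespace means the attribute value can wrap across lines, as it does in the snippet above.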

You get the idea. There's more, but I'm tired. Hope it spurs your imagination!
