User Journal

Journal Journal: A Brief Introduction to RDF

A Brief Introduction to the Resource Description Framework
===========================================================

Note: The following is a draft.  Will revise and webify.  ASCII art is formatted for 80 character lines.

Definitions
===========
RDF is a language framework for describing directed graphs.  First, some definitions:

* A _directed graph_ is a bunch of nodes connected by arrows.
* The node at the beginning of an arrow is called the _subject_.
* The node at the end of an arrow is called the _object_.
* The arrow is called the _predicate_.
* The whole thing is called a _statement_.

Example Graphs in Pictures
==========================
Here's a graph composed of only one statement:

-----------   predicate    ----------
| subject |--------------->| object |
-----------                ----------

A subject node for one statement can be the object node of another statement:

-----------   predicate    -----------   predicate2    -----------
| subject |--------------->| object1 |---------------->| object2 |
-----------                -----------                 -----------

Node and Arrow String Values
============================
Now each node can have a string value:

* It could have no string value; these nodes are called _blank nodes_.

* It can also have a URI value; a URL is an example of a URI.  In fact, every "node" with the same URI value is actually the same node (i.e. they are not separate nodes).

* Objects can also be any arbitrary string - including the empty string ("") - and these types of objects are called _literals_.  Subjects, by contrast, can only be URIs or blank nodes, never literals.

Note that every predicate MUST have a URI value; predicates can't be blank or arbitrary strings.  It is also a VERY BAD IDEA to let predicate URIs end with anything other than the characters a-z, A-Z, or 0-9 (that is, the end of the URI SHOULD be a valid XML name).

URI Examples
============
Here are some examples of URIs that could be subjects, predicates, or objects.  There is no way of knowing in isolation which is which.  Each URI is DIFFERENT (even if they point to the same thing on the web).

1. http://example.com/subject
2. http://example.com/subject/index.html
3. http://example.com#predicate
4. file:///C:/Files/old/Video%20Game/mario.jpg
5. http://www.google.com/search?q=mario

If the URIs are all predicates, then 1, 3, and 5 are formatted best.  Here are some real-world examples of predicates:

* http://purl.org/dc/elements/1.1/creator
* http://www.w3.org/2000/01/rdf-schema#isDefinedBy

Here are some example strings that could be literals:

* http://purl.org/dc/elements/1.1/creator
* Mary had a little lamb
* <strong><acronym>HTML</acronym> is cool</strong>
* 47

Datatypes
=========
A literal can also have a property called its _datatype_.  For example, the literal string "47" is also an integer, so it could be described by a datatype that means integer.  This isn't required, of course, and many literals have no defined datatype.

The datatype is like a predicate and can only be a URI.  Here are some real-life examples of datatypes:

* http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral
* http://www.w3.org/2001/XMLSchema#int
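
As a rough illustration of what a datatype buys you, here is a minimal JavaScript sketch that reads a literal differently depending on its datatype.  The object shape ({ value, datatype }) and the function name literalToNative are invented for this example; only the xsd:int URI comes from the list above, and every other datatype is simply passed through as a string.

```javascript
// Hypothetical representation: a literal is { value: string, datatype: URI string (optional) }.
var XSD_INT = "http://www.w3.org/2001/XMLSchema#int";

// Interpret a literal according to its datatype.
// This sketch only knows about xsd:int; everything else stays a string.
function literalToNative(literal) {
  if (literal.datatype === XSD_INT) {
    return parseInt(literal.value, 10);
  }
  // No datatype, or one this sketch does not handle: keep the plain string.
  return literal.value;
}
```

Without the datatype, "47" stays a string; with it, the same literal can be treated as a number.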

Review of Values of Nodes and Arrows
====================================
In review:

* A subject can either have:
  * No string value.
  * URI string value.

* A predicate MUST have ONLY a:
  * URI string value.

* An object can either have:
  * No string value.
  * URI string value.
  * Literal string value.

* A literal object can optionally have:
  * A datatype

* A datatype MUST have ONLY a:
  * URI string value.
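
The rules in this review can be sketched as a small JavaScript check.  This is only an illustration under assumed names: the node shapes ({ kind: "blank" }, { kind: "uri", value }, { kind: "literal", value, datatype }) and the functions isUri and isValidStatement are hypothetical, and the sketch does not verify that a URI string is syntactically valid.

```javascript
// Hypothetical node shapes used by this sketch:
//   { kind: "blank" }
//   { kind: "uri", value: "http://..." }
//   { kind: "literal", value: "any string", datatype: "http://..." (optional) }

// True for a node that carries a non-empty URI string (syntax is not checked).
function isUri(node) {
  return node.kind === "uri" && typeof node.value === "string" && node.value.length > 0;
}

// Apply the review rules to one statement { subject, predicate, object }.
function isValidStatement(s) {
  // A subject is blank or a URI, never a literal.
  var subjectOk = s.subject.kind === "blank" || isUri(s.subject);
  // A predicate is always a URI.
  var predicateOk = isUri(s.predicate);
  // An object is blank, a URI, or a literal (the empty string is allowed)
  // with an optional datatype.
  var objectOk = s.object.kind === "blank" || isUri(s.object) ||
    (s.object.kind === "literal" && typeof s.object.value === "string" &&
     (s.object.datatype === undefined || typeof s.object.datatype === "string"));
  return subjectOk && predicateOk && objectOk;
}
```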

Graphs as Databases
===================
Graphs can be thought of as collections of statements, and statements can be thought of as subject/predicate/object tuples.  I tend to think of statements as English sentences and graphs as paragraphs or stories.  The analogy holds up pretty well.

Graphs can also be looked at as tables in a database.  Here's an example (the first set is the heading):

Subject
  Predicate
    Object
      (Datatype) .

http://example.com#subject1
  http://purl.org/dc/elements/1.1/creator
    "Jimmy"
      http://www.w3.org/2001/XMLSchema#string .

http://example.com#subject1
  http://purl.org/dc/elements/1.1/creator
    "John" .

http://example.com#subject1
  http://purl.org/dc/elements/1.1/creator
    http://example.com/subject2 .

http://example.com/subject2
  http://purl.org/dc/elements/1.1/creator
    "Jimmy"
      http://www.w3.org/2001/XMLSchema#string .
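
To make the table view concrete, here is a minimal JavaScript sketch that stores the four statements above as rows and looks them up by pattern.  The row layout (s, p, o, isLiteral, datatype) and the function name match are assumptions made for this example, not part of RDF itself; an omitted field in the pattern acts as a wildcard.

```javascript
var DC_CREATOR = "http://purl.org/dc/elements/1.1/creator";
var XSD_STRING = "http://www.w3.org/2001/XMLSchema#string";

// One row per statement. isLiteral separates quoted strings from URI objects.
var graph = [
  { s: "http://example.com#subject1", p: DC_CREATOR, o: "Jimmy", isLiteral: true, datatype: XSD_STRING },
  { s: "http://example.com#subject1", p: DC_CREATOR, o: "John", isLiteral: true, datatype: null },
  { s: "http://example.com#subject1", p: DC_CREATOR, o: "http://example.com/subject2", isLiteral: false, datatype: null },
  { s: "http://example.com/subject2", p: DC_CREATOR, o: "Jimmy", isLiteral: true, datatype: XSD_STRING }
];

// Return every row whose s, p, and o equal the fields given in the pattern.
// A field left undefined in the pattern matches anything.
function match(rows, pattern) {
  return rows.filter(function (row) {
    return (pattern.s === undefined || pattern.s === row.s) &&
           (pattern.p === undefined || pattern.p === row.p) &&
           (pattern.o === undefined || pattern.o === row.o);
  });
}
```

For example, match(graph, { s: "http://example.com#subject1" }) selects the first three rows, much like a WHERE clause on the subject column.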

Conclusion
==========
There's a lot more, but these are the basics.
User Journal

Journal Journal: **Updated** Reading Files in Internet Explorer with JScript

My original article on Reading Files in Internet Explorer with JScript no longer seems to work with IE 6 SP 2. It seems that the behavior of node.nodeValue changed between revisions of Internet Explorer. Therefore use the following pattern instead:

function loadFile(url) {
    // Create an <xml> element (an IE XML data island) and load the file synchronously.
    var newFrame = document.createElement("xml");
    newFrame.async = false;
    return newFrame.load(url) ? getNodeString(newFrame) : "";
}

function getNodeString(node) {
    // A node with its own value yields that value; otherwise concatenate its children.
    return (node.nodeValue !== null) ? node.nodeValue : getChildrenString(node.childNodes);
}

function getChildrenString(children) {
    var nodeString = "", numOfChildren = children.length;
    for (var ii = 0; numOfChildren > ii; ii++) nodeString += getNodeString(children[ii]);
    return nodeString;
}

That can be trivially turned back into a single function if you prefer. I think the new code is a little more elegant than the previous version. Here's a demo (that I may take down if I run out of online disk space). I haven't tested that with IE 6 SP 1, since I no longer have access to that version!
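
For the curious, a single-function version of the text-gathering part might look roughly like the sketch below.  The name nodeText and the fakeDoc stand-in objects are made up for illustration; the sketch only assumes nodes that expose nodeValue and childNodes, and it has not been tried in any version of IE.

```javascript
// Single-function variant: return a node's own value, or else the
// concatenated text of all of its descendants.
function nodeText(node) {
  if (node.nodeValue !== null) {
    return node.nodeValue;
  }
  var text = "";
  for (var i = 0; i < node.childNodes.length; i++) {
    text += nodeText(node.childNodes[i]);
  }
  return text;
}

// Stand-in objects mimicking a tiny parsed document (not real DOM nodes).
var fakeDoc = {
  nodeValue: null,
  childNodes: [
    { nodeValue: "Hello, ", childNodes: [] },
    { nodeValue: null, childNodes: [{ nodeValue: "world", childNodes: [] }] }
  ]
};
```

Calling nodeText(fakeDoc) walks the nested stand-ins and joins their text in document order.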

Oh, and I gladly dedicate this code to the public domain. It isn't as if I could copyright something that small and trivial anyway, IMHO.

P.S. Slash's support for source code is still lacking....

User Journal

Journal Journal: I've Been Quoted!

Java.net's Patterns Community recently (12-12-2004) produced this horribly generic quote on their front page! Sounds like something a PHB would say, methinks.

JavaPedia: Patterns
One of the richest pages in the JavaPedia is the page devoted to Patterns. A recent post to the discussion suggests you "refactor toward that pattern or away from it depending on whether the pattern's consequences, strengths, and trade-offs are appropiate for the program."

I wonder who the idiot is who wrote that... Oh yeah, it was me! Not the best quote in the world, especially taken out of context like that. Hopefully people will follow the link and read the rest of the paragraph. Since that quote is awfully generic, I'll elaborate:

E.G. Initially one JavaScript library I was developing used a Strategy for determining the method of loading external files (hard to do in JS). I didn't intend to use the pattern. It just came about after some coding, so I formalized the relationship.

Furthermore, the concrete strategy was selected by a Factory and a Little Language; it was a really small language and took about ten minutes to code and test. Again, the patterns weren't inserted on purpose, they just emerged. However, while it seemed a good idea at the time, it ended up complicating things. So I ended up using a simple if/then/else statement! The obvious solutions are just missed sometimes after a long time coding, I guess.
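
A rough sketch of the two shapes described above, side by side.  This is not the library's actual code: the loader functions, the strategy table, and the hasXmlHttp flag are all invented for illustration, and the loaders just return labelled strings instead of fetching anything.

```javascript
// Hypothetical loaders; each stands in for one way of fetching a file.
function loadWithXmlElement(url) { return "xml-element:" + url; }
function loadWithXmlHttp(url) { return "xmlhttp:" + url; }

// Strategy + factory flavour: each strategy is an object with a load method,
// and a factory hands one back by name.
var loaderStrategies = {
  xmlElement: { load: loadWithXmlElement },
  xmlHttp: { load: loadWithXmlHttp }
};
function createLoader(name) {
  return loaderStrategies[name];
}

// Plain if/else flavour: the same decision without the extra indirection.
function loadExternalFile(url, hasXmlHttp) {
  if (hasXmlHttp) {
    return loadWithXmlHttp(url);
  } else {
    return loadWithXmlElement(url);
  }
}
```

With only two or three ways to load a file, the if/else version says the same thing in fewer moving parts, which is the refactoring "away from" the pattern that the quote was getting at.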

I hope that makes more sense now.

P.S. Just joking with you, Java.net admins! :-)

Addendum - I think I figured out why they used that quote. Java.net is currently pimping an excerpt of Refactoring to Patterns by Joshua Kerievsky. I swear that I never heard of it before today. Will check out after I finish Better, Faster, Lighter Java. (QJ 14-12-2004.)

User Journal

Journal Journal: Preview of Nemo - RDF in Saxon

Here's something I hacked together in the last two days (after a few weeks of learning). The project is called Nemo, since I just watched the excellent movie, Finding Nemo. :-) It is a set (well, just two right now) of Saxon extension functions that use the Jena framework to access RDF/XML datasources in XSLT stylesheets. This would allow much more flexible transformation involving RDF data, like the Dublin Core. You use RDQL to query datasources. The project is inspired by (but not based on) RDFTwig from Norman Walsh.

It isn't that impressive right now, but I have a lot to do (1st priority - get a domain so I can create a good package name). The program is copyrighted by me, James Francis Cerra, and this preview should be considered under the GPL, Version 2 (get the license text from the link). I currently intend to release Nemo under a less restrictive license later (after I put more thought into it). Here's the code for the two classes:

File: Nemo.java

import java.io.FileInputStream;
import java.io.FileNotFoundException;

import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.rdql.Query;
import com.hp.hpl.jena.rdql.ResultBinding;

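public class Nemo {
    // Reads an RDF/XML file and returns an iterator over the results of an RDQL query.
    static public RDFSequence find(String sourceURI, String queryString) throws FileNotFoundException {
        Model model = ModelFactory.createDefaultModel();

        // Note: sourceURI is opened as a local file path.
        model.read(new FileInputStream(sourceURI), "", "RDF/XML");

        Query query = new Query(queryString);

        query.setSource(model);

        return new RDFSequence(query);
    }

    // Looks up the value bound to the named query variable in one result row.
    static public Object get(ResultBinding result, String val) {
        return result.get(val);
    }
}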

File: RDFSequence.java

import net.sf.saxon.om.SequenceIterator;
import net.sf.saxon.om.Item;
import net.sf.saxon.value.Value;
import net.sf.saxon.xpath.XPathException;

import com.hp.hpl.jena.rdql.Query;
import com.hp.hpl.jena.rdql.QueryEngine;
import com.hp.hpl.jena.rdql.QueryResults;

public class RDFSequence implements SequenceIterator {
    RDFSequence(Query q) {
        initialQuery = q;
        QueryEngine qe = new QueryEngine(q);
        results = qe.exec();
        currentItem = null;
    }

    public Item current() {
        // Get the current value in the sequence (the one returned by the most recent call on next()).
        return currentItem;
    }

    public RDFSequence getAnother() throws XPathException {
        // Get another SequenceIterator that iterates over the same items as the original,
        // but which is repositioned at the start of the sequence.
        return new RDFSequence(initialQuery);
    }

    public Item next() throws XPathException {
        // Get the next item in the sequence.
        if (results.hasNext()) {
            currentItem = Value.asItem(
                Value.convertJavaObjectToXPath(
                    results.next(),
                    null
                ),
                null
            );
            return currentItem;
        } else {
            return null;
        }
    }

    public int position() {
        // Get the current position.
        return results.getRowNumber();
    }

    private Query initialQuery;

    private QueryResults results;

    private Item currentItem;
}

Here's an example of how to use it:

<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:jfc="java:Nemo"
extension-element-prefixes="jfc"
exclude-result-prefixes="jfc xsl"
>

<xsl:variable name="query"><![CDATA[SELECT ?x, ?y
WHERE (?x <http://www.w3.org/2001/vcard-rdf/3.0#FN> ?y)]]></xsl:variable>

<xsl:variable name="datasource" select="'datasource.rdf'" />

<!-- Try to make the output look half decent -->
<xsl:output method="text" />

<xsl:template match="/">
From:
<xsl:value-of select="$datasource" />
Query:
<xsl:value-of select="$query" />
Result:
<xsl:for-each select="jfc:find($datasource, $query)">
Name: <xsl:value-of select="jfc:get(.,'y')" />
URI: <xsl:value-of select="jfc:get(.,'x')" />
<xsl:text>
</xsl:text>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>

Hope you find that useful, and please excuse slash's lame support for inline code blocks.

User Journal

Journal Journal: RANT: Slashdotters are dumb!

If you want insightful comments based on practical experience and factual evidence, don't come to /. - as evidenced by the recent article on Yet Another RDF Browser. So many people just don't get it!

Take this fellow who thinks that XSL is gathering dust. I guess it isn't used much with document markup languages such as DocBook - Oh wait, O'Reilly does. Idiots. :-{ I guess Mozilla and Mozilla-derived browsers don't access bookmarks, browsing histories, and persistent data through RDF - Oh wait, Mozilla does. I guess too much metacrap will overwhelm the semantic web - Oh wait, RDF is just like a more advanced version of the WWW. I'm not the only one who saw that. I guess Google is doomed, right? :-/

If /.'s collective consciousness is this bad with things that I am well versed in, I wonder about the subjects that I don't know much about...

Programming

Journal Journal: Replacing QNames and Entities in XML Content

Danny Ayers recently posted an article, Exorcising QNames.

First the background. Recently, the xml:gods have been co-opting XML Namespaces for abbreviating URIs in many RDF serializations and XML Schema languages. Unfortunately, experience is starting to show that using QNames in XML content may be a terrible antipattern. Even the xml:gods are starting to doubt whether or not to overload QNames.

Now Danny Ayers has presented an interesting alternative using an element to map the results from a prefix to a namespace. I must be too critical, as I think his idea is a little clunky, while his comments are generally supportive. For one, his idea doesn't allow the abbreviations on the root element of a particular scope. In that case, it must be placed outside the intended element:

<animals
xmlns="http://example.com/animals/"
xmlns:nicer="http://purl.org/stuff/nicer/"
>
<nicer:map
nicer:prefix="cls"
nicer:namespace="http://purl.org/stuff/colors/"
/>
<dog
xmlns="http://purl.org/stuff/dogs/"
paw="cls:golden"
>
<nose>cls:brown</nose>
</dog>
</animals>

The mechanism also just seems to be a replacement for entity declarations. Well then, why not use entity declarations! That SGML-derived syntax is built into every conforming XML processor, by definition! Furthermore, one doesn't end up polluting the DOM very much with them. These seem like concrete benefits over his proposal. E.g.

<!DOCTYPE animals [
<!ENTITY cls "http://purl.org/stuff/colors/">
]>
<animals
xmlns="http://example.com/animals/"
>
<dog
xmlns="http://purl.org/stuff/dogs/"
paw="&cls;golden"
>
<nose>&cls;brown</nose>
</dog>
</animals>

But we all know that DOCTYPEs are evil. (right? right?) And of course these "entities" are really meant as namespaces for XML content rather than shortcuts for replaced content, even though they act as normal entities. If XML 2.0 is not going to be backwards compatible with XML 1.0/1.1, then I propose that the xml:gods at the W3C (asgard) combine these "weak entities" with XML namespaces. Use &name; for traditional entities and &name: for namespace entities (again, which act the same in effect if not in intent):

<?xml version="2.0" ?>
<animals
xmlns="http://example.com/animals/"
xmlns:cls="http://purl.org/stuff/colors/"
example="&cls:ex"
>
<dog
xmlns="http://purl.org/stuff/dogs/"
paw="&cls:golden"
>
<nose>&cls:brown</nose>
</dog>
</animals>

Of course, it is too late for that solution. :-(

XML 1.0/1.1 could still be used if another namespace were introduced and the % sign were overloaded as an escape marker, e.g.

<!DOCTYPE animals [
<!ENTITY ent "brown entity">
]>
<animals
xmlns="http://example.com/animals/"
xmlns:cns="http://example.com/contentnamespace/"
cns:cls="http://purl.org/stuff/colors/"
example="%cls:ex"
>
<dog
xmlns="http://purl.org/stuff/dogs/"
paw="%cls:golden"
>
<nose>%cls:&ent;</nose>
</dog>
</animals>

Then again, two independent escaping systems in one language is silly! Another downside is that a new entity resolver step has to process the DOM before handing it off to the rest of the application. I think that those disadvantages would prevent any significant adoption. So we're stuck with XML ENTITIES for the immediate future.
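
To give an idea of what that extra resolver step would involve, here is a minimal JavaScript sketch that expands %prefix: markers in a string using a prefix-to-namespace map.  The function name expandContentNamespaces and the map layout are assumptions for illustration; the sketch assumes ordinary entities such as &ent; were already expanded by the XML parser, accepts only ASCII prefix names, and leaves unknown prefixes untouched.

```javascript
// Expand "%prefix:" markers using a prefix -> namespace URI map.
// Markers with an unknown prefix are left exactly as they were.
function expandContentNamespaces(text, namespaces) {
  return text.replace(/%([A-Za-z_][A-Za-z0-9_.-]*):/g, function (whole, prefix) {
    return Object.prototype.hasOwnProperty.call(namespaces, prefix)
      ? namespaces[prefix]
      : whole;
  });
}

// The mapping declared by cns:cls in the example above.
var contentNamespaces = { cls: "http://purl.org/stuff/colors/" };
```

An application would have to run something like this over every attribute value and text node that may carry a marker, which is exactly the extra pass complained about above.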

P.S. Slashdot's comment system still sucks. Ignore any nonsense "&ampn bsp;" and white spaces in the code. Arrrrrrrr!

User Journal

Journal Journal: Response to Lach

This is a response to Lach's comments on my response to Simon Willison's blog on Presidential Candidate Endorsements. I am posting it here out of courtesy, since it goes off-topic. Read his comments for the appropriate context, else this seems like gibberish (well, even more gibberish than usual ;-).

Lach, I was confused by Simon Willison's intent. I agree with him that disappointment is a better description of his response. In fact I probably wouldn't have commented if that was his term. However, I want to respond to your comments and questions - even if just to clarify my thoughts.

  1. That's disappointment. If you feel profound sadness - i.e. depression - then it's depression.
  2. Dialectical Behavioral Therapy suggests that one's feelings are based on one's thoughts. So if Simon was depressed then, from my personal experience, one possible reason would be wanting validation. Since he was not depressed, this is a moot point. (I should be calling him Simon Willison or Mr. Willison since I never met him, but oh well.)
  3. I never said that they were perfectly equal in good and bad points; however, I agree that I used the wrong term. I was trying to say that any reasonable analysis must conclude that different opinions - i.e. favorable or unfavorable opinions of a particular candidate - can each be formed from rational arguments. Which leads to...
  4. I'd like to state that I'm on the fence right now on choosing either Bush or Kerry. Not because I haven't given this much thought, but because I agree with Bush on certain issues, Kerry on others, and neither on still more issues. So I am examining how important these issues are and how they interact. That is a hard analysis, especially since some of my opinions are probably either wrong or poorly derived.

    However, my experience with Bush's supporters is very different from yours. Most seem to be normal people - neither overtly religious nor of exceptionally below-average intelligence. (In fact, on average they seem to have average intelligence - in agreement with the definition of "average intelligence".) However, this has nothing to do with a positive opinion of Bush. To think that it does is a logical fallacy.

Hope that clarifies my thoughts.

User Journal

Journal Journal: Intellectual Masturbation

Dave Shea of Mezzoblue recently posted a blog about long-term love affairs and exploration. Doesn't that sound dirty? ;-)

Actually, he was ranting about how much he enjoys the pursuit of knowledge. Dave Shea says he could spend hours upon hours reading trivia from sites such as Wikipedia and the Internet Movie Database, among others. Give them a ring.

I can relate: I've spent hours of joy researching the bounty of information located not only on web sites but also in journals, magazines, and in the bliss of the library. However....

Warning: The following opinion is exaggerated for effect.

I don't think that kind of exploration is as noble and fun as he makes it out to be. It certainly feels enriching; however, reading such trivia doesn't help in my understanding of it. Not really. Nothing is really discovered - just taught through literature.

The kind of exploration that I really enjoy is through experimentation and trial. I don't just read about cooking but create a meal. I don't want to just learn about the circuit but wire the thing together and view the results. I don't just want the physics taught to me; I want to derive the equation myself. I don't just learn XUL - I write apps with it.

I find experimentation much more satisfying. Researching is important and rewarding; however, it is mostly useful to me as a guide, overview, or link for free association. I want to understand a thing and not just learn about it!

User Journal

Journal Journal: I am not a Troll!

I just don't understand /.'s moderators. Why was this post on Google's browser possibly being based on MSHTML moderated as a troll by someone? I sincerely meant everything that I said. I tried to be polite and not condescending - even proactively admitting a social blunder. I also backed up my analysis with some (admittedly anecdotal) evidence. I simply don't understand why it is being moderated like that. What am I doing wrong?

Several of my other posts in the last couple of weeks were moderated down for suspect reasons. I never try to hurt anyone's feelings. This is making me a little paranoid. I am beginning to feel like an outsider in these discussions... (ironic for a "nerd's news portal.") Why is that?
