User Journal

Journal Journal: DOCTYPE declarations for versioning information

Over at Anne's journal there is a debate about using DOCTYPE declarations as versioning information. For example, the external subset for HTML 3.2 is different from the external subset for the HTML 4.01 family. There are also different external subsets for each "subversion" in the HTML 4.01 family, i.e. the Transitional, Strict, and Frameset versions. Some people think this doesn't work. My opinion is that DOCTYPE declarations can be used to specify version information, for the following reasons:

  1. Say you have two documents with identical DOCTYPE declarations: the internal subset, the external subset, and the root element declaration are all the same. Then you can say their syntactic doctypes are identical. The structure of each document must conform to the same SGML rules.

  2. The two documents could have different semantic doctypes though. That is, the content in one document could mean something totally different in the other one even though they both conform to the same syntax rules in the DTD. For example, say the "rel" attribute is defined with character data content in the DTD. One document may specify that the rel attribute specifies a relationship and should be a character string; the other could specify that those attributes specify links and should be URI references.

  3. There could also be syntactic differences not captured by the syntactic doctypes. For example, one spec may indicate that "name" and "id" attributes MUST have identical content if both are present; however, this is impossible to specify in a DTD.

  4. Therefore, the content of the external and internal subsets cannot be used to differentiate between languages or versions of a single language like HTML. There could be syntactic or semantic differences not captured by the doctype.

  5. External subsets are specified with either a Formal Public Identifier (FPI) or a URI reference to a DTD. However, in practical applications an FPI uniquely identifies a resource just like a URI, so I'm going to assume both are the same for this line of reasoning, and I'll call both DTD names. DTD names have the following property: the owner of the name gets to determine what it means. For HTML's DTD names specified by the W3C, only the W3C gets to say what they mean, for example.

  6. Therefore, it is legal for a DTD name to indicate the doctype for a single language, including semantic and syntactic requirements not captured in the content of the DTD. The meaning of the HTML 4.01 Strict doctype string is unambiguous even though the DTD's content may not specify all of the semantic or syntactic requirements of HTML 4.01 Strict.

  7. If you change the external subset's DTD name, it may or may not refer to the same language. Even if the external subset contains the same content, other requirements not encoded by the DTD could be different. Therefore, you can ONLY use the canonical DTD names for unambiguously identifying the resource with third parties.

  8. Even when using canonical DTD names, if you change the root element then you might no longer conform with the specification. For example, changing the root element while using HTML 4.01 Strict's DTD name violates the global structure semantics of HTML 4.01 Strict. Therefore, the document is not valid HTML even though it is syntactically valid SGML. Note that the DTD name still specifies the particular language you are using even though the content obtained by resolving the DTD name is not enough to validate the document as HTML.

  9. If you add an internal subset, then the meaning of those changes is undefined even though the syntax of those changes is specified as well as it can be by the content of the subset. The content of the internal subset simply cannot capture the semantic requirements, or all the possible syntactic requirements, that you specify. Therefore if anything in the internal subset conflicts with HTML 4.01 Strict's external subset, or additional elements or attributes or attribute lists are defined, then the resulting language is not HTML even though it is valid SGML, for example.

  10. There is an exception to the above point. If the internal subset contains entity declarations with valid HTML content as their content (even though the entity by itself may not be valid HTML content), and those entity declarations don't interfere with the HTML DTD, then the meaning of those entities is clear (it is defined by SGML) and the specified syntax of the external subset is unchanged. Therefore, it is still HTML of the specified version, for example.

Thus if you use the same internal subset content (with an exception), the same external subset declaration, and same root element declaration as the HTML language version you are declaring, then your document is HTML. If you change anything (with an exception), then it can never be unambiguously determined to be HTML by a computer.

If you can specify different languages using the above rules, then you can specify different versions of a language family using the above rules. Every version of a language family is a different language. They may share certain semantics, but they are not compatible except as explicitly defined by the language family's specification.

Therefore, you can use DOCTYPE declarations to specify version information. Q.E.D.
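
To make the conclusion concrete, here is a minimal sketch of how a program might read the version off a DOCTYPE declaration. The four public identifiers are the ones the W3C published; everything else (the function name declared_version, the regular expression, and the decision to reject any internal subset outright rather than implement the entity-only exception in point 10) is my own simplification, not anything a spec requires. It only handles PUBLIC declarations.

```python
import re

# Canonical DTD names (FPIs) owned by the W3C, mapped to the version they name.
CANONICAL = {
    "-//W3C//DTD HTML 3.2 Final//EN": "HTML 3.2",
    "-//W3C//DTD HTML 4.01//EN": "HTML 4.01 Strict",
    "-//W3C//DTD HTML 4.01 Transitional//EN": "HTML 4.01 Transitional",
    "-//W3C//DTD HTML 4.01 Frameset//EN": "HTML 4.01 Frameset",
}

# Root element name, optional PUBLIC identifier and system identifier,
# optional internal subset in square brackets.
DOCTYPE_RE = re.compile(
    r'<!DOCTYPE\s+(?P<root>[^\s\[>]+)'
    r'(?:\s+PUBLIC\s+"(?P<fpi>[^"]*)"(?:\s+"(?P<system>[^"]*)")?)?'
    r'\s*(?P<subset>\[.*?\])?\s*>',
    re.IGNORECASE | re.DOTALL,
)


def declared_version(document):
    """Return the HTML version named by the DOCTYPE, or None if ambiguous."""
    match = DOCTYPE_RE.search(document)
    if match is None:
        return None  # no DOCTYPE (or one this sketch does not understand)
    if match.group("root").lower() != "html":
        return None  # point 8: a different root element is not HTML
    if match.group("subset"):
        return None  # point 9: simplification, rejects even harmless entities
    # Point 7: only the canonical DTD name identifies the language.
    return CANONICAL.get(match.group("fpi"))
```

For example, declared_version('<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">') gives "HTML 4.01 Strict", while the same declaration with a BODY root element or an unknown public identifier gives None.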

User Journal

Journal Journal: URI References and data: URI strings

Not to overload sw@w3, I'm going to propose answers to my questions on URI References in RDF. Any comments are greatly welcomed, since I don't know if I am correct!

1) RDF describes URI references - not URI strings themselves - in anticipation of IRIs (AKA a 'URIRef'). URIRefs are always encoded in UTF-8 too. Correct?

Correct. (of course ;-)

2) My main question concerns converting a URIRef into a URI. Say we have the URIRef:

<data:,Hello, World>

Is that legal? Would that be converted into the URI:

<data:,Hello%2C%20World>

Since the comma is illegal in the URI after the first one?

Not entirely. The comma is legal in both the context and in RDF, so that URIRef would properly be converted to the URI:

<data:,Hello,%20World>

Note that I think (hope) that only the RDF semantics section applies and not the context in the scheme.
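
Here is a rough sketch of that conversion, under assumptions I should spell out. I am assuming the escaped set is: control characters, space, anything outside US-ASCII, and the "delims" and "unwise" characters from RFC 2396 section 2.4.3 other than "#", "%", "[" and "]". The name uriref_to_uri is made up for this example. Note that this sketch leaves a literal "%" alone, which bears directly on question 3 below, so treat that choice as an assumption rather than a ruling.

```python
# Characters assumed to need escaping besides controls and non-ASCII octets:
# space, plus RFC 2396 "delims" and "unwise" minus '#', '%', '[' and ']'.
DISALLOWED = set(' <>"{}|\\^`')


def uriref_to_uri(uriref):
    """Percent-escape the UTF-8 octets of a URIRef that a URI may not contain."""
    out = []
    for byte in uriref.encode("utf-8"):
        ch = chr(byte)
        if byte <= 0x1F or byte >= 0x7F or ch in DISALLOWED:
            out.append("%{:02X}".format(byte))  # escape the octet
        else:
            out.append(ch)  # permitted US-ASCII character, keep as-is
    return "".join(out)
```

With those assumptions, uriref_to_uri("data:,Hello, World") keeps both commas and only escapes the space.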

3) If there was an ambiguous situation, how would it be represented as a URIRef? For example take the URI (yes, it is an unusual case where the name contains a slash - it is just an example):

<http://example.com/name%2Fslash/>

Would that be converted to the URIRef:

<http://example.com/name/slash/>

(I don't think so). But wouldn't the URIRef:

<http://example.com/name%2Fslash/>

be converted to the URI:

<http://example.com/name%252Fslash/>

That is all true, so the URI:

<http://example.com/name%2Fslash/>

cannot be represented as a URIRef. However that's not a problem, even though that means the scope of URIRefs is smaller than the scope of URI strings. RDF concerns URIRefs and not URI strings. So this is a non-issue.

User Journal

Journal Journal: Howto: lose twenty pounds the wrong way

As of today I am 179.5 pounds - exactly twenty pounds less than I was two months ago! Yes, I was getting quite fat and I still have a lot to lose. However, I feel great and am finally down a weight class for the tourney. Realistically a target of 175 is required to improve my speed (less dead weight to carry around), but I'll try for 169 if I can. How did I do it?

Note: I am not a doctor. This is what I did and it could be harmful to you (or even me without realizing it). I am not a professional. Get professional advice and a doctor's help before attempting anything. Use at your own risk!

  1. First few weeks (lost 3-5 pounds for me):

    See a dietitian and eat according to how much you spend. You are a bank; to lose weight you must bounce your checkbook! :-p So keep track of what you eat and how much you exert through exercise (use the machine's readout or use estimates). Healthy food (fruits, vegetables) improves your "energy" so you can exercise more easily. I used to go to Subway, but I stopped recently for other reasons. Limit your bread, cheese, and portions; increase your cereal/milk, fruits/vegetables, and number of meals. Eating only a few meals makes your body go into sleep-mode.

  2. Next weeks (lost 5-7 pounds more for me):

    1. Continue the previous stuff.

    2. Run run run. Do 25-35 minute workouts two-three times a week. Use an elliptical if your joints hurt or you just don't like running in general. You get a similar workout, yet your joints feel much better! Use the "weight loss" workout - oscillating between high and low inclines - with some resistance (start low). Make sure you stay within your target heart rate zone. If you don't, then you will tire easily and quit.

    3. Lift weights. A strong upper body increases your metabolism and also helps make running easier. Make sure you don't hurt yourself and always use a spotter, but try to ramp up the weight to 90% or failure during reps. (Most people don't lift enough.) Try to do it two-three times a week between running days. It helps to work with a few friends, as it is safer and you push each other harder.

    4. Do a martial art one-two times a week. I joined a wrestling club at my university, then did some introductory jiu-jitsu at the one next door when the other club adjourned for the summer. You will suck at it if you never did it before (or are out of practice). That's OK!

  3. Next Weeks (lost 7-9 pounds):

    1. Continue the previous stuff, if you can. Remember to eat more as you exercise more. You need the energy, but your body will burn food more quickly rather than conserve it in fat stores. I modified my schedule to allow other stuff, but I want to add in more running/weight lifting.

    2. Train at your martial art school two days a week.

    3. Train at your martial arts club two days a week too (or go to your school). They won't be as good, but the sparring will give you a workout.

    4. Do your martial arts drills two more days a week in two-minute periods (i.e. two minutes of the first drill, two minutes rest, two minutes of the second, two minutes rest, etc.). Do it by yourself or with your group. This improves both your art and your conditioning.

    5. Friday is fun day!

That's what I did so far. Right now I'm working out six days a week with Friday off to party/have fun. I don't do everything as much as I can, and I'm far from perfect (missing days, not doing something, slacking off, overeating). The important thing seems to be to go back into the rhythm before falling into couch-potato mode.

Time is a problem, but I try to force exercising. I'm 10x more productive after working out, and I think the quality of my computer programming has improved. Abstract algebra just doesn't kick in the fight-or-flight response! :-)

It might not work for you, and I don't think my workout is anywhere near scientific or great. It seems to be working for me right now, so I hope it gives you ideas or inspiration to lose weight.

Note: I am not a doctor. This is what I did and it could be harmful to you (or even me without realizing it). I am not a professional. Get professional advice and a doctor's help before attempting anything. Use at your own risk!

Slashdot.org

Journal Journal: Drunk with Power.

/. loves me! In the last three months, I have received moderation powers about 14 times. In the last three weeks, about seven times. In the last week, three times. In fact I used up my moderation points this morning and received five more this evening! As my power grows so does my ego.

I just wish there were more interesting stories. The moderation of most stories is horrible if my judgement about topics that I know something about is any indication. However, I usually moderate only in those stories since I can't fairly judge the worth of a comment outside of my field. So people, I implore you to submit fascinating articles about engineering, physics, programming, science, the semantic web, etcetera!

User Journal

Journal Journal: Racial Slurs and Jiu-jitsu Tourney

On Racial Slurs:

While practicing jiu-jitsu with a friend, he referred to me with a racial slur at one point. I don't care personally. In fact, it made me fight that much harder when sparring. We're still cool, and I understand it was in jest. I don't think my friend is a racist.

The topic still preoccupies my brain however. I don't feel offended - more like puzzled. Why should I care? Why do I still think about it? Then there are also labels like 'nerd' and 'geek' that can also be used as pejoratives. (He didn't use them.) Why is it not a hate crime when we are bullied for liking science but it is when other groups are bullied? I can't think of any answers to these questions.

Jiu-jitsu Tourney:

Much to my surprise I found a summary of my match (Men's No-Gi Beginner) at the Kumite Classic. I'm rapidly forgetting what happened during the match, which usually happens since I'm too busy to commit it to memory! ;-) However, I was under the impression that I lost due to a key lock. I confess that I don't yet know what an americana is though - hey I only have a month and a half of experience so far!

User Journal

Journal Journal: On the Top Bands/Artists...EVER. At this moment. ever.

On the subject of Top Bands/Artists... EVER. An honorable mention must go to Jerry Goldsmith for The Enterprise and Leaving Drydock from Star Trek: The Motion Picture Soundtrack. With that pseudo-symphony Goldsmith defined the adjective bombastic. I must rank that work among the best classical ever with Bach, Beethoven, Mozart, and the like.

P.S. Even so, I like rock too.

User Journal

Journal Journal: A Different RDF/XML Serialization

I think the Official RDF/XML Serialization has the right heart, but makes a few mistakes. Here's how I would design an RDF/XML serialization:

I'd start with having the root element be anything. It isn't important in the long run.

I'd use a striped syntax to start with, then just iterate properties. That is, instead of:

<root> <type> <property> <type> <property>

I'd use:

<root> <type> <property> <property> <property>

This is because:

  1. 90% of the time, when the rdf:type abbreviation is taken advantage of, it is for shallow trees.
  2. Contrary to the term doctype, I believe that an element's name represents its behavior rather than type. Since properties describe something happening (this associates with that via the property), I think they are a good fit for element names.
  3. rdf:parseType is super annoying!

So how would you identify resources after the first property in the chain? Just add an rdf:about attribute of course for named nodes and rdf:id for blank nodes! Thus the following RDF/XML:

<e:root
xml:base="tag:example.com,2005-06-04:ex:"
xmlns:e="tag:example.com,2005-06-04:ex:"
xmlns:r="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<e:Person r:about="jimmy">
<e:address r:type="Address">
<e:street>123 Name Road</e:street>
<e:city>New City</e:city>
<e:state>ST</e:state>
<e:zip r:dt="usformat">12345</e:zip>
</e:address>
<e:wrote r:about="mybook"
r:type="Book"
e:pages="42"
>
<e:id r:dt="isbn">123-45-6789</e:id>
</e:wrote>
</e:Person>
</e:root>

That identifies the triples:

e:jimmy
r:type e:Person ;
e:address [
r:type e:Address ;
e:street "123 Name Road" ;
e:city "New City" ;
e:state "ST" ;
e:zip "12345"^^e:usformat
] ;
e:wrote e:mybook .
e:mybook
r:type e:Book ;
e:pages "42" ;
e:id "123-45-6789"^^e:isbn .

If slashdot didn't have braindead indenting, that would look pretty and be easy to read/parse. Note that I'm using r:dt to identify data types. It is too bad I can't use xsi:type, but anyhoo. :-/ Now what if you want XML content to be interpreted as a literal instead of with the semi-striped syntax? Just add r:dt="the datatype". RDF predefines a datatype for XML documents; just use it! OR you could just start the element's content with non-whitespace text. So that you have (with the xmlns not shown):

<e:doc>
<e:Body>
<h:div>
<h:p>This is <h:strong>strongly</h:strong> advised.</h:p>
</h:div>
</e:Body>
</e:doc>

That means the following:

[
r:type e:Body ;
h:div "<h:p>This is <h:strong>strongly</h:strong> advised.</h:p>"^^r:XMLLiteral
]

Note no datatype was defined in this case, but it was added to the output. Also note that the entire element is included with the literal. This involves a little bit of backtracking, but not much. (An alternative is to say an element with any text content MUST be a literal, but this could potentially involve too much backtracking.) That example is exactly the same as (ignoring the insignificant whitespace):

<e:doc>
<e:Body>
<h:div r:dt="http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral">
<h:p>This is <h:strong>strongly</h:strong> advised.</h:p>
</h:div>
</e:Body>
</e:doc>

Now what about elements with no content? In that case nothing special happens. Check out this example (again no xmlns for brevity):

<u:urlset>
<u:Page>
<h:link
h:ref="http://example.com"
h:type="text/pudding"
h:rel="food"
/>
</u:Page>
</u:urlset>

That means the same as:

[
r:type u:Page ;
h:link [
h:ref "http://example.com" ;
h:type "text/pudding" ;
h:rel "food"
]
]
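
The mapping used in the examples above can be sketched as a small parser. This is only my reading of the rules so far, with several assumptions: r:about, r:type and r:dt values are joined to xml:base by plain concatenation, a property element counts as a resource when it has child elements or any attribute other than r:dt, literals are returned as (text, datatype) pairs, and the XML literal case is not handled at all. The names parse, name and the "_:bN" blank node labels are invented for this sketch.

```python
import itertools
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"


def name(tag):
    """Turn ElementTree's '{namespace}local' form into namespace + local."""
    if not tag.startswith("{"):
        return tag
    ns, _, local = tag[1:].partition("}")
    return ns + local


def parse(xml_text):
    """Return (subject, predicate, object) triples; the root element is ignored."""
    root = ET.fromstring(xml_text)
    base = root.get(XML_BASE, "")
    blank = ("_:b%d" % n for n in itertools.count())
    triples = []

    def node_id(elem):
        # Named node if r:about is present, otherwise a fresh blank node.
        about = elem.get("{%s}about" % RDF)
        return base + about if about is not None else next(blank)

    def describe(elem, subject):
        # Attributes: r:type gives a type, other non-RDF ones are literals.
        for attr, value in elem.attrib.items():
            if attr == "{%s}type" % RDF:
                triples.append((subject, RDF + "type", base + value))
            elif not attr.startswith("{%s}" % RDF):
                triples.append((subject, name(attr), (value, None)))
        # Child elements are properties of the subject.
        for prop in elem:
            is_node = len(prop) > 0 or any(
                a != "{%s}dt" % RDF for a in prop.attrib)
            if is_node:
                obj = node_id(prop)
                triples.append((subject, name(prop.tag), obj))
                describe(prop, obj)
            else:
                dt = prop.get("{%s}dt" % RDF)
                literal = (prop.text or "", (base + dt) if dt else None)
                triples.append((subject, name(prop.tag), literal))

    for typed in root:
        subject = node_id(typed)
        triples.append((subject, RDF + "type", name(typed.tag)))
        describe(typed, subject)
    return triples
```

Fed the e:Person document, this yields the jimmy and mybook triples listed earlier, with the address as a blank node.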

Now what if you really wanted an element name to be parsed as the r:type of the resource? Well, then just say r:type="'". Yes that is a hack.

Some properties link to a resource instead of a literal. Some are predefined, though I haven't told you about them (basically all the properties in the RDF/RDFS/OWL specs that link to resources instead of literals). To declare more, use the r:refs attribute on the root element (yes, I said the root was ignored... in the graph):

<e:root
xmlns:e="ex:eee#"
r:refs="ex:eee#Prop1 ex:eee#Prop2 ex:eee#Prop3" />

And so on. Blank nodes are identified by r:node, r:id is basically my equivalent to r:ID (not reification), and r:arc is used for reification. r:resource is the exact same thing as r:about in my language (and really, it is the same in the official RDF/XML serialization).

Finally, if you want to use "namespace" prefixes instead of full URIs (even though that's evil), then you need to use a different namespace for "r:", "http://www.w3.org/2005/06/04-rdf-syntax-abbr-ns#", and declare all prefixes in the r:ns attribute like so:

<e:root
xmlns:e="..."
xmlns:r="..."
r:ns="e ... prefix2 uri2 prefix3 uri3
f ... prefix5 uri5"
/>
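
As a small sketch of how a parser might read that attribute, assuming the value is nothing more than whitespace-separated prefix/URI pairs and that a prefixed name expands by concatenation (parse_ns and expand are names made up for this example, and the ex:eee# and ex:fff# URIs are placeholders):

```python
def parse_ns(value):
    """Split an r:ns attribute value into a {prefix: uri} mapping."""
    tokens = value.split()
    if len(tokens) % 2:
        raise ValueError("r:ns must contain prefix/URI pairs")
    # Even positions are prefixes, odd positions are the matching URIs.
    return dict(zip(tokens[0::2], tokens[1::2]))


def expand(qname, prefixes):
    """Expand 'prefix:local' to a full URI when the prefix was declared."""
    prefix, sep, local = qname.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix] + local
    return qname  # undeclared prefix: assume it is already a full URI
```

For instance, with prefixes = parse_ns("e ex:eee# f ex:fff#"), expand("e:Prop1", prefixes) gives "ex:eee#Prop1".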

You get the idea. There's more, but I'm tired. Hope it spurs your imagination!

User Journal

Journal Journal: Jiu-jitsu Lessons for Life

On a whim, I decided to enter the no-gi Kumite Classic this past weekend. I learned several valuable life lessons there:

  1. By not eating, you can lose up to 3.5 pounds in a day.
  2. It pays to be punctual; the line gets long quickly.
  3. Overestimating your enemy is just as bad as overestimating yourself.
  4. The best don't talk too much. Ironically I shouldn't be posting in my journal then.
  5. It hurts to lose, even when it was expected. It hurts a lot.

So I lost my first and only match of the day after about 1.5-2 minutes. Hey, I couldn't tell the time - too busy grappling. That embarrassment will fuel me for next year.

PS: I have four moderator points to use today. What are some interesting stories to apply my god-like powers of approval?

User Journal

Journal Journal: Kevin Rose, I'm sorry!

Kevin Rose, if you are reading, I'm really sorry for not using caches of your sites. I never realized how bad the /. effect could be until now. I apologize for being so rude.

User Journal

Journal Journal: First Real Fight (we're #4?)

For the last several weeks, I've been learning to grapple at the CMU grappling club. Well, I won my first real fight at the Cinco de Mayo Round Robin. What everyone at the club called fights were basically what I (former wrestler) call matches, since there was no striking allowed. But "fights" sounds more badass. :-P Now I just got to win twice next time.... ;-)
