Challenging the Ideas Behind the Semantic Web - Slashdot

Slashdot is powered by your submissions, so send in your scoop

×

Challenging the Ideas Behind the Semantic Web 144

Posted by ScuttleMonkey on Wednesday July 19, 2006 @01:39AM from the there-isn't-any-deception-on-the-internet dept.

mytrip writes to tell us that after a recent presentation to the American Association for Artificial Intelligence (AAAI) Tim Berners-Lee was challenged by fellow Google exec Peter Norvig citing some of the many problems behind the Semantic Web. From the article: "'What I get a lot is: "Why are you against the Semantic Web?" I am not against the Semantic Web. But from Google's point of view, there are a few things you need to overcome, incompetence being the first,' Norvig said. Norvig clarified that it was not Berners-Lee or his group that he was referring to as incompetent, but the general user."

This discussion has been archived. No new comments can be posted.

Challenging the Ideas Behind the Semantic Web

Search 144 Comments Log In/Create an Account

Comments Filter:

Problems w/ the Semantic Web (Score:5, Insightful)

by CTalkobt ( 81900 ) writes: on Wednesday July 19, 2006 @01:49AM (#15741458) Homepage

is the users.

Not the ones searching but the ones creating the content.

They'll be some idiot out there (like there is now) that will code his data in a way that guarantees that he gets the most page views etc. So often searched terms will turn up on search indexes and other ilk.

It's a loosing proposition unless you come up with filters but then they have their own set of problems.

Share
twitter facebook
Googlebombing (Score:5, Insightful)

by QuantumFTL ( 197300 ) * writes: on Wednesday July 19, 2006 @02:11AM (#15741498)

The biggest problem with the semantic web is spam. If you can trust the tags, it's a beautiful idea. If you can't, it's worse than useless - it's a waste of time. Google has the right idea, automatic extraction of semantics from content. If there's no real content, then (hopefully) that will be reflected in the semantic analysis.

Me, I estimate we're 5-10 years away from doing anything terribly useful with all of this stuff, but I can definitely envision the day when an internet without semantics seems as distant as an internet without Google.

Share
twitter facebook
Incompetence of users such as Slashdot editors... (Score:5, Insightful)

by rsidd ( 6328 ) writes: on Wednesday July 19, 2006 @02:14AM (#15741502)

Thanks for the illustration of what Norvig meant. How is "Google Director of Search and AAAI Fellow Peter Norvig" (original article) semantically equivalent to "fellow Google exec" (Slashdot summary)? The latter suggests that Tim Berners-Lee too is a Google exec, which would be news to him.

Share
twitter facebook
Filtered semantic webs might work (Score:2, Insightful)

by dr_pump95 ( 869367 ) writes: on Wednesday July 19, 2006 @02:18AM (#15741512)

Semantic webs (emphasis on plural) produced by editors such as those at /. or in the consumer-rated style of Digg, Del.icio.us etc might actually work. Trusting authors to do it right is a disaster, as Norvig suggests.

Share
twitter facebook
Always bet on the million monkeys (Score:5, Insightful)

by IvyMike ( 178408 ) writes: on Wednesday July 19, 2006 @02:26AM (#15741528)

It's really, really difficult to get people to follow rules. We're lazy, we're incompetent (yes), and some of us are evil. I still don't think I truly understand how RDF is supposed to work exactly, and it doesn't even seem like it will be fun to try.

On the other hand, it's really easy to release a million monkeys and let the create what they will. It's not so easy to sort through what they end up producing, but Google does a surprisingly good job of this.

It reminds me of the early days of the Web, when companies like CompuServe and AOL wanted to design and own all content. On the other hand, an internet server with httpd let anybody make a ~/public_html directory and put up whatever they wanted to. The million monkeys won that battle. I think they'll win this one, too.

Share
twitter facebook
Re:Googlebombing (Score:5, Insightful)

by Wastl ( 809 ) writes: on Wednesday July 19, 2006 @02:28AM (#15741537) Homepage

The "Semantic Web" is not about search engines, as you and many other posters seem to believe. It is about representing Web content in a structured, formal way that is more easily accessed by machines, going beyond simple presentation. This can be used for searching, but also for many other applications, e.g. integration, exchange, personalisation, ... .

Spam content on the Semantic Web is in no way different to spam content on the normal Web (well, except that it is formal). This also means that a search engine that is capable of working with Semantic Web data has exactly the same issues with trust as traditional search engines. Except that on the Semantic Web, trust can be expressed formally as well. Similar to the authorities in Google, whose outgoing links make a statement about the trustworthiness of other sites, an "authority" on the Semantic Web can make statements about the trustworthiness of other sites. However, these statements are explicit, and they could also be used to state that another site is *not* trustworthy.

Google has the right idea, automatic extraction of semantics from content.

Google does not extract any semantics from content. It merely analyses the linking between websites and connects that with keywords. No semantics here.

Sebastian

Parent Share
twitter facebook
Blaming the user is never right (Score:4, Insightful)

by robolemon ( 575275 ) writes: <nertzy@noSpaM.gmail.com> on Wednesday July 19, 2006 @02:30AM (#15741540) Homepage

From http://www.7nights.com/asterisk/archive/2004/03/do nt-blame-the-users [7nights.com]

Blaming the users for anything should raise a huge red flag that you've got some usability problems.

Maybe the Semantic Web should aim to be useful to people rather than require people to be useful to it. There has to be a better way than trying to educate droves of people to a problematic and vulnerable design.

Share
twitter facebook
Re:Semantic web is currently fragile technology (Score:5, Insightful)

by znu ( 31198 ) writes: <znu.public@gmail.com> on Wednesday July 19, 2006 @02:36AM (#15741550)

The full semantic web scheme really ignores a lot of what the Internet has taught us about what technologies succeed. It's not about grand visions and long specifications, it's about simple stuff that solves real problems of limited scope. Look at RSS, for instance; it's about the simplest thing which could do the job it does.

I think we'll eventually realize most of the benefits of the semantic web, but it won't be a result of a grand vision imposed from the top down and implemented all at once. It'll probably be though increasing adoption of microformats [microformats.org], which don't try to classify and specify everything, and are implemented entirely using existing web standards.

Parent Share
twitter facebook
Re:Blaming the user is never right (Score:2, Insightful)

by QuantumFTL ( 197300 ) * writes: on Wednesday July 19, 2006 @03:00AM (#15741588)

Blaming the users for anything should raise a huge red flag that you've got some usability problems.

Bollocks! The fact that flying an F22 is probably fatal for untrained grandmothers does not mean it has "usibility problems" - not every task in life is meant to be done by idiots, and the more effort is put into idiot proofing software, the less is put into reliability, functionality, and extensibility for the rest of us. Some things are too hard for a segment of the population to do, and ontologically tagging complex relationships between data entries may simply be beyond the average user. That's not a bug, that's a challenge.

There's too many generalizations like "blaming the user is always wrong" and "security through obscurity is not useful" that are incorrect under many conditions, and /. posters and moderators seem to be doing their best to propagate these. People have finite time, money, intelligence, knowledge, skill and experience. Not everything can be easy enough for everyone to use.

Parent Share
twitter facebook
A bad example: FreeDB (Score:1, Insightful)

by h_benderson ( 928114 ) writes: on Wednesday July 19, 2006 @03:30AM (#15741637)

Exactly. Equally worse to the malicious content creator is the incompenent one.

As an analogy, look at FreeDB. It should be obvious that the CD information database loses much of its worth if entries are not double-checked for errors before submittal. Yet, there is so much crappy entries in FreeDB that it's just not funny anymore.

Ergo: Don't count on people to adequately tag or label content. It won't work.

Parent Share
twitter facebook
I See Value in the Semantic Web (Score:2, Insightful)

by n is prime ( 989698 ) writes: on Wednesday July 19, 2006 @04:08AM (#15741728)

Even if we are inherently lazy, and even though some people seem to be generally against the idea, it doesn't make any sense to me not to employ this and experiment with it. Norvig is an AI guru, and his ideas on the Semantic Web may be interesting, but Google is not against the idea. Google's GData looks to me like a primitive Semantic Web. Even if only 10% of web masters adopt the system, querying to find a set of results that have been tagged as certain meta-data can come up with some interesting results. If the results are interesting enough, more webpages will include meta data tags. Also, being inherently lazy argues for not spending time writing tags all over your code, so why would anyone take the time to sabotage the system. While I understand the difficulties of the spamming problem, there are plenty of cookies on the internet anyway. I think the same inherent problem in the Semantic Web exists with PageRank. In PageRank what happens is a web page will say the same words over and over to acheive a higher ranking in the semantic analysis of the page, and thus the page will be a top result when entering a query with related words. But I think PageRank works pretty well overall. Google's next step with PageRank is to filter all the spam sites that just say the same words. Security in the Semantic Web would also be to filter those sites with obviously spammy RDF or OWL tagging. Overall the Semantic Web is a cool project that could lead to really smart searches, with axioms involving how different meta-tags are related to each other. I'm in favor of the new technology.

Share
twitter facebook
RDF Ability vs. RDF Techincal Complexity (Score:3, Insightful)

by Anonymous Coward writes: on Wednesday July 19, 2006 @04:09AM (#15741733)

The idea of RDF is applicable to much more than public innerweb content. I've spent the last 7 months researching and developing an RDF backed system for my company's core products. Everyone should think of the value of RDF beyond the scope of trust, and then it becomes easy to realise methods of simple non-web implementation. We can all spend the next 5 years pondering how we're going to figure trusted content providers for RDF web data, or we can just start developing apps for sources which understand themselves as trusted (ie. data input from an individual, employees of a company, and any group where the individual must be accountable for their actions). Whats more important than the blind trust of sources, is data verfication. There are ways to run data input from one user by another user, without doing it in an infringing, demanding way, for validation. I'd like to go into detail of exactly what I mean by all this, but I don't want to violate any portion of my NDA or tip off industry competition (I know that sounds retarded, sorry). If RDF does gain popularity, I can say it will from within the private sector, not the public. Genious implementation may bring RDF to the public sector, but thats not something I would say is guaranteed to happen.

Current technical obstacles to creating any RDF applcation: The matter of complexity of its integration into DB backed systems (popular methods), and instatiated class marshaling within not-so-object oriented languages. The technical design and implementation of a standards compliant RDF system has been extremely difficult for me. I don't think it would ever be possible to get RDF data represented nearly as minimally as you could with simple relational tables (although formally no more bloated than bloaty XML). RDF also creates many long linked relationships; this tends to create some serious performance issues in querying the data. Lastly, I hate XML, and you can't always correctly export from RDF to XML (capable type to incapable type) in a correct manner.

Share
twitter facebook
Too complicated (Score:1, Insightful)

by Chris Graham ( 942108 ) writes: on Wednesday July 19, 2006 @05:51AM (#15742000) Homepage

There is no way that regular people, even the majority of intelligent educated people, are going to be able to use it. It's a ridiculous pipe-dream. Think how hard it is to get people to understand broken-down logical arguments where everything is already layed down for them, and now imagine trying to make them understand how to conceputalise their own data domains and define their own relationships. Maybe 2% of people could do it properly, and then 1% of those would end up in a profession that would use the skills.

When programmers write software for general use we have to think how to make things easy multiple levels below the level we have to think at. The vast majority of people are not able to think technically, and do not have patience - and that's because most people in this world find it uncomfortable to do anything that isn't centred around a social or emotional act.

Developers find users can't do programming, so the programming language becomes a graphical interface. The users can't navigate the graphical interface via a structure based on logic, so the screens get built into an icon based organisaion with a well-defined 'workflow'. The user can't think logically about how to use the graphical interface, so help is written to explain how it works and what it can do. The help is too general so specific examples are given. There are too many examples and the user can't be bothered to read them, so a colleague stands next to them and they learn to mimmick their colleague.
This isn't an extreme situation - it is typical of the vast majority of users. Now think about the inherent technical complexities of OWL and RDF, and imagine people actually using it for real problems? There's no way to hide what is a purely logical and structural framework for organising extensive data, behind pretty pictures and simple examples.

Share
twitter facebook
Re:Are you just another Anti-Semanticist? (Score:3, Insightful)

by Crayon Kid ( 700279 ) writes: on Wednesday July 19, 2006 @06:00AM (#15742024)

I don't think he's "another Anti-Semanticist". He's just saying that the whole semantic Web concept is based on this: that people will classify content properly and in good faith. Let's be fair, what are the chances of it not being abused? And if so, doesn't it mean that the semantic Web is doomed from the start?

Think of all the things that were fouled by abuse. Email was a very sweet thing until it got perverted by spam. Newsgroups too. If the possibility for abuse exists, it will happen.

Parent Share
twitter facebook
Re:Semantic web is currently fragile technology (Score:3, Insightful)

by Crayon Kid ( 700279 ) writes: on Wednesday July 19, 2006 @06:12AM (#15742057)

Look at RSS, for instance; it's about the simplest thing which could do the job it does.
But may I point out, in addition to your comment, that such technologies have fared well as long as the human element is closely involved with them. RSS, social bookmarks, tags, microformats.

On the other hand, Tim Berners-Lee seems to stress the fact that the semantic Web is all about AI doing content classification for us. So I think it's time we remember the old joke, "artificial inteligence is no match for natural stupidity". Or for human malice, I should add.

I see a problem in all this AI involvement. It's a single point of failure of sorts, if you will, similar in a way to the one involved in precisely identifying people's identity: the more you trust an automated system, the more badly you'll be burned if the system is abused into reporting the wrong thing.

The theory is wonderful, so's the Web, the Internet, computers and so on. But they are used by people. I have a hard time believing people will behave and resist the temptation to abuse this system just like they have abused countless others before.

Parent Share
twitter facebook
Re:Incompetence of users such as Slashdot editors. (Score:4, Insightful)

by TrappedByMyself ( 861094 ) writes: on Wednesday July 19, 2006 @06:40AM (#15742132)

Bingo! You've just proven that the incompetence spreads beyond MySpace.
The problem with the semantic web movement is this: You have the web guys from the W3C who got famous by building kinda crappy, but effective technology (HTTP, HTML, etc...) going goo goo gah gah over PhD Ontologists from the AI community. They team up and build these great things that the average person (including the people who think they are really really smart, like the Slashdot editors), has no chance in hell of using effectively. What'll happen, is that eventually there will be useful Semantic content and Intelligent Agents doing great things, but that work will be done by a select few. The unwashed masses will still be the domain of Google.

Parent Share
twitter facebook
Re:Semantic web is currently fragile technology (Score:4, Insightful)

by Bogtha ( 906264 ) writes: on Wednesday July 19, 2006 @06:52AM (#15742161)

Tim Berners-Lee seems to stress the fact that the semantic Web is all about AI doing content classification for us.

I don't think I've seen him stress that in the sense that the users are dissassociated from the process. The Semantic Web is all about representing things like tags, microformats, etc, in a generic way.

For example, if comment moderation was defined in terms of a relationship between a person, a comment, and an opinion, that doesn't mean a computer would be moderating comments, it just means that the same mechanism could be applied across multiple websites, without having to build moderation into the websites themselves. You could mod Dvorak -1, Troll, and everybody who lists you in their FOAF file using a browser that supports it, would see that moderation.

Just because the focus is on making the software smarter, it doesn't mean that it's about replacing user opinions with computer opinions. In fact, the majority of Semantic Web stuff I've seen have been all about codifying user opinions to make them more accessible to computers, and thus, more easily exposable to the end-user in a useful way.

Parent Share
twitter facebook
Re:Blaming the user is never right (Score:2, Insightful)

by file terminator ( 985503 ) writes: on Wednesday July 19, 2006 @07:25AM (#15742251)

Are you just trolling, or are you really serious?

The poster said that the links are next to each other. Unless you have seen the site in question, I don't think you are in any position to bash its layout.

There are people I seriously think shouldn't be on the Internet. Heck, there are people I think shouldn't even own a computer. Besides IT-related issues, there are also people I don't think should be allowed to drive a car, use a credit card, raise children, have dogs, etc.

An interesting aspect is that many of these people, especially when it comes to technology but also seen in other areas, somehow think that others are somehow obliged to help them with their activities. Have kids, but expect others to raise them. Get in debt, but expect others to lend you money to deal with the bills. Get a computer, but expect others to hold you by the hand whenever you use it. (Caveat: There are things you can't control, and those you can. If you get in debt because you are between jobs and need a place to live and food to eat, that's one thing. If you get in debt because you must have a better car than your neighbour, that's another.)

I dunno about others, but around the 20th time I show someone how to copy a file, it already feels old. At some point you start wondering whether the user really cannot learn how to copy a file, or doesn't want to learn. In either case, you're screwed.

The same goes for userfriendlyness. I'm all for userfriendlyness. If a task inheritedly is so simple that it can be generalized into "click here" and the designer does just that, great! (I'd argue that there is a class of tasks that can't, and also yet another class that can, but only at the expense of configurability--take a set of defaults and assume they'll be good enough for everyone.) But even then, you'll see users that will say it is too difficult--they might not know where they saved the program, they might not yet have figured out how to run any other program than Internet Explorer, they might feel intimidated by the button, they might refuse to run the program because they read somewhere/someone told them that they should not run programs they got from the Internet.

And that's where userfriendlyness falls apart, and that is when the battle is picked. You can either strive to improve on userfriendlyness forever, aiming to make it automatic and/or intuitive for 100% of your userbase, a goal that will never be reached, essentially wasting all your time for that final fraction... or you can set your percentage lower (exactly how low is up to you) and spend more time on actually developing your program.

And quite frankly, barring a completely horrid homepage, if a user can find the "Forums" link but not the "Download" link right next to it (or maybe not understanding that "download" means "get program")... that user might not be the kind of user you want to spend time supporting.

Parent Share
twitter facebook
Complex? Opportunities for spammer? Don't think so (Score:3, Insightful)

by CaptSolo ( 899152 ) writes: on Wednesday July 19, 2006 @07:38AM (#15742297) Homepage Journal

As I pointed out in the previous comment [slashdot.org] authoring data on the semantic web is no more difficult than authoring RSS or XML.

Yes, figuring out for the first time how to represent your data in RDF (or XML for that matter) can be difficult. Imagine if everyone was trying to come up with an RSS standard on his own instead of using RSS export functionality of his content management tool. That's why we need good guidelines how to publish information on the semantic web. And RDF export functionality (plugins) similar to what RSS plugins are doing.

As for opportunities for spammers and mischief - don't think so.

Why? - If you look at the Semantic Web "layer cake" [wikipedia.org] you will notice such technologies as digital signatures, encryption and trust being part of the scheme. They allow to identify the author of data and ensure he is what he claims to be. There is nothing wrong with your application if it only accepts signed and trusted data. And there is nothing preventing authors of the data from signing the contents. Since the semantic web is a new technology and we already know about problems that spam and misuse can present it is more not less prepared to fight spam.

Note1: Semantic web should be viewed as an integral part of the existing web, not its opposite. Might even be that it can provide an additional layer that will help to combat spam and other problems you mention here. Who knows.

Note2: Spammers will always try to come up with new exploits. We all have to be prepared for this and think how to close the holes they are using. But saying that newer (a further development of existing [web]) is necessarily more opportunities for spammers in wrong.

Parent Share
twitter facebook
Re:A bad example: FreeDB (Score:5, Insightful)

by kthejoker ( 931838 ) writes: on Wednesday July 19, 2006 @08:22AM (#15742422)

Ugh, this is the major misconception of proper Semantic Web implementation.

There are two user types of Semantic Web materia: the individual user and the group.

The individual user only cares about context. It's like a Proustian adventure for him. If he tags Slashdot as "blatherscyte" because that's how he views it, then that's valid. If he tags it as "cmdrTaco" because he is stalking Rob, then that's valid, too. And if he tags it as "monkey" because one time he was petting a monkey while he viewed the site, then that's valid, too. It's like the old saying, "Whether you think you can or think you can't, you're right." There are no wrong semantics for the individual user, because it is his context alone which defines the usefulness of a tag.

For this reason, the individual user should be allowed to tag freely and without limits, and also be able to edit or remove tags later.

----

Now for the group, they have a different goal. Context does them no good, because they don't have the same context. Their goal then is consensus. Take your problem at FreeDB. The simple solution is to let people vote on the accuracy of disputed tags. Or flag ones they view as incorrect, and then review those that meet a certain threshold for flagging. Basically, you want the group to filter out things that don't apply to the group, WHILE maintaining individual context. You don't delete the tags that the group has rejected - you just hide them from the person who has come to view the group tags.

I think this dichotomy of group vs. individual is what has gotten us into trouble with the Semantic Web. To use one example, I think delicious' big mistake was to show you "popular" tags for a given link. What that does is encourages you not to create your own tags, but instead just piggyback on popularity. Over time, this creates homogeny, which is great for the group, but not for the individual user. Sure, they can probably find that link again in a minimal amount of time, but if an individual tag might help them find it faster, but they shunned individual tags for groupthink, so much the worse for them.

And on the flipside if you don't provide proper weighting and trust metrics into your tagging system, you are opening yourself up to not only abuse and inappropriate behavior, but also to the "incompetence" mentioned in the article, which is not so much incompetence as a zero-filter. It's like reading Slashdot at -1. It's kind of a touchy-feely way to look at it, but in Web 2.0 thinking, it's bad to delete content; just filter it out instead. It's bad to censor opinions from the software side; let each user do their own stifling. Give the users complete control over the content, and they will find models that work. It's that simple.

The main problem with the Google guy's point is that philosophically, Google is more groupthink than individual user, because they're a search engine. They value consensus over context. In the future, perhaps they will value context a little bit more than they do. Until then, they have to stand where they stand, because they can't let context into their system. They've tried some clunky mechanisms to do so (Personal Search, anyone?) but until they get it right, the Semantic Web won't have any value to them.

Parent Share
twitter facebook
Re:It's really, really difficult... (Score:3, Insightful)

by Bogtha ( 906264 ) writes: on Wednesday July 19, 2006 @09:42AM (#15742848)

The representation isn't the problem. The problem is agreeing what the the relationships mean.

That problem is not the problem that RDF addresses. It just gives you the tools so that you can concentrate on solving that problem instead of worrying about all the crap underneath. It's like XML doesn't address semantics, it just gives you tools so you can focus on semantics without worrying about parsing.

What does "#friend" mean? Does it mean the same thing to program X as it does to program Y? How can you tell?

You read the specification for the vocabulary you are working with. For example, here's the FOAF specification [xmlns.com].

To use the XML analogy again, the XML specification doesn't tell you what particular element types mean, because that's outside XML's scope. You read the specification for the XML document type, e.g. XHTML, to find out what an element type means.

who gets to decide what #friend means, and whether this is a global or local definition?

You're forgetting that #friend is just shorthand for a URI. It's not a literal string "friend". If Slashdot choose to expose their friend data with URIs like http://slashdot.org/rdf/#friend [slashdot.org] that doesn't have any bearing on the meaning of Friendster's data if they use URIs like http://friendster.com/rdf/#friend [friendster.com]. They are two separate URIs with two separate meanings that the owners of the domain have chosen.

I know, the next thing you are wondering is how this is of value if everybody makes up their own URIs. Well the answer is, if they want interoperability, they don't just make up their own URIs. Just like people using XML get together, agree on concrete definitions and write specifications like XHTML, the same things happen with RDF vocabularies, people get together and decide what they think #friend should mean, write a specification like FOAF, and use the same URIs.

These are questions that I've never heard answered in any believable manner.

Ignore all the hype from PHBs, this isn't about computers magically understanding arbitrary documents. This is about expressing relationships in a standard way. Of course you need some way of agreeing on what relationships mean, which is why people write specifications. RDF doesn't solve that problem, it's outside RDF's scope. RDF is much smaller and more focused than you think, it's not magic.

Parent Share
twitter facebook
Semantic Web is just backwards (Score:3, Insightful)

by snowwrestler ( 896305 ) writes: on Wednesday July 19, 2006 @11:20AM (#15743520)

It's trying to impose structure on something that is not very structured--human thought. Even the use of the word "semantic" points out the futility of the exercise, as it indicates language and changes in meaning--not structure.

Semantics is a human discipline--it is focused inward, not outward. Likewise the proper place for semantic technology is in the client, not the content. Building "semantic web sites" makes no sense. Google is absolutely right on this one--Web sites should simply be what they are, and it is up to the client to assign meaning and remember connections. Google provides a variety of tools that help people do just that.

Why should I have to tag everything I read online? I don't tag things I see in real life. I just remember them and make connections in my mind. If we want computers to be actually useful to us as assistants and not just stupid tools then they will need to begin to operate the same way. That is a very tough problem, yes. But it is the way we are headed, and the "semantic web" is IMO just a bad hack until we get there.

Furthermore the idea of trustworthyness and authority online is ridiculously complicated. I can't think of a harder problem in all of AI. It's much harder to determine if someone knows what they're talking about, or if they are trustworthy, than it is to simply identify the topic of an article. And we're still struggling with the latter.

Parent Share
twitter facebook
Re:A bad example: FreeDB (Score:4, Insightful)

by kthejoker ( 931838 ) writes: on Wednesday July 19, 2006 @01:05PM (#15744411)

That was the entire point of my post! The group benefits from standardization, but the individual suffers. The Semantic Web is an attempt to give power back to the individual user. Subjectivity is a crucial element of the system, and sanitized, standardized, NPOV systems deny the individual subjectivity.

Delicious is very smart in that it left the *option* for customised tags, but they are clearly saying by implication that the best tags are the ones everyone else is using. My point being that the idea of a "standardized vocabulary" is antithetical to the ideals of the Semantic Web. We don't want a democracy of ideas; we want a free market of ideas!

Think of the concept "funny." Let's say I asked you to go to 100 different random sites and tag them as funny or not funny. Let's say that of the sites you listed as funny, it was clear you enjoyed witty, New Yorker-style humor, and not fart jokes. But let's say 99 other people did the same thing, and they did the opposite: they clearly enjoyed the fart jokes, and hated the New Yorker wit.

Now if you asked this seeded engine for a recommendation of a new, 101st site that was funny, should it give you fart jokes, or New Yorker style? This is the power of the Semantic Web. What's funny to you, isn't funny to everyone else. Why should you be punished for that? And if a total n00b comes to our engine for a recommendation, they get the fart jokes page, because it assumes they're like everyone else. But if they start marking those sites as not funny, eventually it'll figure out they're more like you, and start giving them sites that you like.

Now, will delicious ever do that? Of course not, because it doesn't offer any discrimination to you on the word funny. You get the democratic version of funny. Fart Jokes for all. And that's what "standardization" has to offer. So, no, you can keep that; I want the Internet to understand who I am, and what I like, not what everyone else likes. And if they HAPPEN to coincide, that's fine, so much the better - things are popular because of the people, after all - but they shouldn't have to.

Parent Share
twitter facebook

There may be more comments in this discussion. Without JavaScript enabled, you might want to turn on Classic Discussion System in your preferences instead.

Related Links Top of the: day, week, month.

413 commentsChatGPT Leans Liberal, Research Shows
347 commentsAmazon CEO Says 'It's Probably Not Going To Work Out' For Employees Who Defy Return-to-Office Policy
327 commentsHotel Owners Start To Write Off San Francisco as Business Nosedives
323 commentsChina is Building Nuclear Reactors Faster Than Any Other Country
315 commentsChina is Calling in Loans To Dozens of Countries

Thus spake the master programmer: "After three days without programming, life becomes meaningless." -- Geoffrey James, "The Tao of Programming"