
Detecting Conflict-Of-Interest on the Semantic Web

CexpTretical writes "At the 15th International WWW Conference in Edinburgh, Scotland, the Refereed Track on the Semantic Web accepted many thorough and interesting academic papers on Semantic Web research, on subjects related to the question of where the Web is in the Semantic Web. One such paper, nominated for the Best Paper Award, Semantic Analytics on Social Networks: Experiences in Addressing the Problem of Conflict of Interest Detection, hits on the whole subject of validation and/or verification in the brave world of so-called "Web 3.0" topologies/frameworks/architectures. The paper describes a "Semantic Web application that detects Conflict of Interest (COI) relationships"."
  • by jlowery ( 47102 ) on Friday December 08, 2006 @03:10PM (#17165502)
    Perhaps the harder problem is detecting any interest in the Semantic Web.
  • their campaign contributions, former colleagues etc.
    That would make for an interesting web application and an interesting election year...
  • by Ranger ( 1783 ) on Friday December 08, 2006 @03:20PM (#17165620) Homepage
    Calling it a conflict-of-interest is really a matter of semantics. The conflict arises when people see the words semantic web. They are somewhat interested but are conflicted in not admitting they don't know what the word semantics means and are too embarrassed to look it up.
    • by eln ( 21727 )
      The Internet means never having to feel embarrassed about looking things up. For example, let's say I answered your post by saying, "Semantics is the study of the relationships between various signs and symbols and what they represent." Clearly, you would have no way of knowing whether I went to dictionary.com and pulled a random definition out, or if I just knew that off the top of my head.

      • by Saikik ( 1018772 )
        I didn't RTFA, but I think he was talking about Anti-Virus

        This web thing sounds like the perfect way to catch a lot of nasty bugs.
  • by radtea ( 464814 ) on Friday December 08, 2006 @03:29PM (#17165722)
    This is an excellent paper that highlights many of the issues that will be encountered as the naive realists promoting the semantic web hit the hard fact that data quality is poor and identification is hard. From the paper's conclusions:

    The goal of full/complete automation is some years away. Currently, quality and availability of data is often a key challenge given the limited number of high quality and useful data sources. Significant work is required in certain tasks, such as entity disambiguation.

    As a practical tool the Semantic Web has all of the problems that no-fly lists have. People share names with each other and one individual may appear under multiple names. Datasets are radically incomplete, and an awareness of the possible uses to which data may be put will encourage the less scrupulous amongst us to deliberately devalue datasets by including misleading or incomplete information.
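The name problem described above can be illustrated with a minimal Python sketch. The records, the field layout, and the `normalize` helper are all hypothetical and are not taken from the paper; the sketch only shows that a name-based key can split one person into several and merge distinct people into one.

```python
from collections import defaultdict

# Hypothetical records: (record_id, name as written, affiliation).
RECORDS = [
    (1, "J. Smith", "Univ. A"),
    (2, "John Smith", "Univ. A"),  # assumed: same person as record 1
    (3, "John Smith", "Univ. B"),  # assumed: a different person
]


def normalize(name):
    """Crude key (an assumption): lowercase surname plus first initial."""
    parts = name.replace(".", "").lower().split()
    return f"{parts[-1]}_{parts[0][0]}"


def group_by_key(records, key_fn):
    """Group record ids by whatever key the given function derives from the name."""
    groups = defaultdict(list)
    for rec_id, name, _affiliation in records:
        groups[key_fn(name)].append(rec_id)
    return dict(groups)


# Exact string match: splits records 1 and 2, merges records 2 and 3.
by_exact = group_by_key(RECORDS, lambda n: n)

# Normalized match: merges all three, including the different person.
by_norm = group_by_key(RECORDS, normalize)
```

Neither key recovers the assumed truth (records 1 and 2 together, record 3 apart), which is why extra evidence such as affiliation or co-authors is usually needed.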

    Even without deliberate poisoning of the data, it is doubtful that standard vocabularies will be used in sufficiently consistent ways by various institutions and individuals to create homogeneous (and therefore useful) datasets. For example, people who do multi-centre cancer trials expend an enormous amount of energy on data curation and auditing, which includes actual site visits to institutions and periodic audits of data, as well as centralized control of what gets into the final database. And this is for data collected by cancer centers and cancer docs who are nominally committed to following precise protocols and have been given training in what the fields in the various forms are supposed to mean. Yet centres can and do get delisted from studies due to lack of compliance.

    The same thing can be seen in nominally standardized data formats like MAGE-ML and its cousins: industry-standard XML-based languages for marking up genomic datasets. There are specific elements that are intended for particular pieces of data, but a depressing amount of the time companies decide to put the really important stuff in a catch-all element, because "it's easier" than understanding the well-documented and clearly defined format.

    Likewise, medical images created in DICOM format by major equipment manufacturers not infrequently have clear and blatant violations of the DICOM standard, despite over a decade of effort to ensure a reasonable level of compliance. And these are not subtle violations, but missing required fields, or incorrect data in required fields ("because all our images are 512x512 why should we have to fill in the width and height all the time? It's easier to just leave them zero.")
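The kind of violation described above (image size fields missing or left at zero) is easy to check for mechanically. Here is a minimal sketch, assuming the header has already been parsed into a plain dict keyed by (group, element) tag. The dict layout and the `find_violations` helper are hypothetical; only the tag numbers for Rows and Columns come from the DICOM standard.

```python
# DICOM tags for image height and width: Rows (0028,0010), Columns (0028,0011).
REQUIRED_POSITIVE = {
    (0x0028, 0x0010): "Rows",
    (0x0028, 0x0011): "Columns",
}


def find_violations(header):
    """Return a list of problems with the required size fields.

    `header` is a hypothetical stand-in for a parsed file: a dict that maps
    (group, element) tags to values. This is not a real DICOM parser.
    """
    problems = []
    for tag, name in REQUIRED_POSITIVE.items():
        value = header.get(tag)
        if value is None:
            problems.append(f"{name} missing")
        elif not isinstance(value, int) or value <= 0:
            problems.append(f"{name} has invalid value {value!r}")
    return problems


good = {(0x0028, 0x0010): 512, (0x0028, 0x0011): 512}
lazy = {(0x0028, 0x0010): 0}  # height left as zero, width omitted

good_problems = find_violations(good)
lazy_problems = find_violations(lazy)
```

A check like this catches the blatant cases only; it says nothing about whether a plausible-looking value is actually correct.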

    People are stupid and lazy. I know I am. And we use the same words to mean different things, and different words to mean the same thing. The Semantic Web requires people to be smart and hardworking, and to use standardized vocabularies in standardized ways. Decades of failed or at best partially successful data exchange protocols strongly suggest that these requirements will not be fulfilled.
    • Allow me to sum up your post: Semantic Web - Sounds great. Good luck with all that.
      • by maxume ( 22995 )
        The upshot is that it more or less inspired 'Web 2.0'; people realized that some data was better than no data, and that correctness didn't matter as much as a few people (rdf) wanted. So along came delicious and flickr and so forth, and there was the brief period of excitement-chasing where 'folksonomies' replaced ontologies and blah blah blah, but people pretty much said, 'No, I just like to be able to look at all my vacation pictures at once' and things died down, but flickr really is a better photo manage
    • People are stupid and lazy. I know I am. And we use the same words to mean different things, and different words to mean the same thing. The Semantic Web requires people to be smart and hardworking, and to use standardized vocabularies in standardized ways. Decades of failed or at best partially successful data exchange protocols strongly suggest that these requirements will not be fulfilled.

      A quite standardized vocabulary actually exists in Wikipedia (markup language, templates, categories).
      Here is a list o
    • Re: (Score:3, Insightful)

      So there are 9 authors on this paper. Which one are you and which one is the submitter of the article?
  • People that know the most about something have an interest in it *gasp* Smart people in X like to hang around other smart people in X, sometimes even go to conferences *whoa*

    This is old news in academia, and yet things still work pretty darn well there. That is because reputation is important, and as soon as you do something unethical or even just stupid, you're toast.

    If a field gets too "inbred" as far as their research/reviewing goes (e.g. a group always present at workshops at 3rd rate conferences, and th
  • It is far more widely useful to view this as a means of finding people who have common interest and may actually be collaborating. Imagine building a database of relationships, appointments, investments, etc. that concentrates on the rich and famous, businessmen, politicians, and others with power and then running these algorithms on those relationships. Imagine examining those relationships in the context of subjects (relationship strengths should differ depending on the subject through which the relation
  • Even though I looked at the paper, I still have no fucking idea what this is about.

    I vote that next year the Best Paper Award go to "Looseleaf Paper" because it has both holes AND lines.
  • This was interesting enough, I guess, as a really high-level description of a process that could be used to build a semantic web.

    That said, the researchers picked the domain very carefully - to guarantee a positive-looking result, I guess - but I don't see how this could scale to the web in general, a place where, well, nobody knows you're a dog.
  • ...but in the article, did you notice their example, "Swoogle"?

    Being just curious enough to add a .com after that, I then wondered if they chose that site on purpose. Doubtful, but positive reinforcement DOES work at times...

  • It's all very well these hucksters [amazon.com] peddling [amazon.com] the semantic web [wikipedia.org] to funding bodies who don't know any better [europa.eu], as long as they don't start pretending it's anything other than the new FIPA [fipa.org] - a collection of committees generating specifications that the world will continue to ignore [w3.org].
  • The semantic web assumes everyone in the world will play nice and publish his data using standard schemas.

    This is estimated to happen soon after Microsoft switches to a POSIX standard operating system, the RIAA supports buying music in Ogg Vorbis format, Sony and Microsoft agree on a common Blu-DVD format, and airline companies really tell you how they compute their ticket prices. And the rupture.

    Seriously... the idea is beautiful in theory, but in practice people do not want their data
    • Re: (Score:1, Interesting)

      by Anonymous Coward
      They've done a good job shutting pricegrabber and froogle down, haven't they? No, wait ...

      Seriously, some people do want their data to be available, and if that makes them more competitive, why not?
  • had their own Interwebs. Good for them!
