Super-Fast RDF Search Engine Developed
The Register is reporting that Irish researchers have developed a new high-speed RDF search engine capable of answering search queries with more than seven billion RDF statements in mere fractions of a second. "'The importance of this breakthrough cannot be overestimated,' said Professor Stefan Decker, director of DERI. 'These results enable us to create web search engines that really deliver answers instead of links. The technology also allows us to combine information from the web, for example the engine can list all partnerships of a company even if there is no single web page that lists all of them.'"
Official DERI Website (Score:3, Informative)
DERI [www.deri.ie]
TMA: Too Many Acronyms (Score:3, Insightful)
Re:TMA: Too Many Acronyms (Score:5, Funny)
OMG: Oh my God!
WTF: What the fuck?
BBQ: Barbecue.
HTH
Re: (Score:2)
I also wondered why anyone would need a search engine to go through Steve Jobs' notes. But... "RDF" could be "Radio Direction Finding".
Wiki has several more suggestions [wikipedia.org]. The one I think this thread is about is at the bottom of the list.
Re:Official DERI Website (Score:5, Funny)
Re: (Score:1)
Re: (Score:1)
but not the "yars2" world-record-busting supergadget.
This could be huge (Score:5, Interesting)
Next up: Ontology spam (Score:5, Insightful)
Re: (Score:2)
Re: (Score:2)
For a while, yes. But as long as there is a cash-per-page-view market, the onslaught of adverspam will reach every corner of the web. It can't be stopped as long as there is money to be made there.
Certainly the big "pure knowledge" sites will defend themselves, as Wikipedia does, but that is an ar
Re: (Score:1)
Re: (Score:2)
Agreed. As long as there are bullshit artists in the world, they will find ways of expressing themselves.
I don't think so. I think that liars work at
Re: (Score:2)
And I'm sure that next generation search engines will create clever ways of detecting and punishing ontology spam (e.g., noting the dissonance between the text and the tags)
Re: (Score:2)
Re: (Score:2)
transparency ==> translucency ==> opacity
Or, to put it in website design terms: "It's not blue enough.
Re: (Score:2)
Re: (Score:2)
But... does that make a parrot an ornithological ontologist?
Re: (Score:2)
Re: (Score:2)
I'd say consistent ontology is a bigger challenge (though also one that doesn't need to be anywhere near completely solved for all kinds of useful applications to exist.) Trust mechanisms built on RDF aren't really all that big of a challenge: trust relationships are fairly basic, straightforward relationships of exactly the type RDF was designed to express from the outset, after all
Re: (Score:2)
Re: (Score:3, Insightful)
Re: (Score:2)
Re: (Score:1, Insightful)
Re: (Score:2)
Re:This could be huge (Score:4, Insightful)
Re: (Score:1)
AND is it public? Can I hook up to it and send some queries just to see for myself how fast it is?
Re: (Score:1)
Re: (Score:1)
Re: (Score:2)
Re: (Score:2, Interesting)
Re: (Score:2)
until somebody comes up with the Dewey Decimal System for all knowledge, it won't mean much
it's coming: http://metadata-stds.org/19763/index.html [metadata-stds.org]
Last year I attended an excellent seminar track on content and knowledge at this : http://www.xmlsummerschool.com/ [xmlsummerschool.com] - and one of the speakers had a great example - he wanted to be able to search for a guitar amp speaker cabinet that would handle the 100w (that's RMS) output of his Marshall amp, and fit in the boot/trunk of his car - I forget the make/model, let's say it's a Ford whatever... anyhow, the point is that the semantic search app would need to
Why would I want to search... (Score:2)
Re: (Score:2)
Re: (Score:1)
Links! (Score:4, Insightful)
I need both: answers *and* links! Many times when I search the web, I don't know for sure what I am searching for, let alone being able to ask a specific question...
Re: (Score:2)
Re: (Score:2)
RDF could do very useful things, like throwing up a disambiguation question at the top of the results page when you've not made it clear what you want, or filtering out the plague of typosquatter/content free price comparison/'be the first to write a review of this item' sites, but so could a bit more intelligence built into Google.
Re: (Score:2)
Re: (Score:3, Funny)
No thanks, I don't need Clippy in my search engine.
Re: (Score:2)
In your example, I'm guessing you might find the option to filter dow
Re: (Score:2)
However, I think contextual disambiguation questions like what you're suggesting are already served by "search within results" queries. Proposing likely criteria for narrowing down the results would be, I think, a disservice. It pigeonholes sites, but worse than that, pigeonholes searches. This leads to easy gaming of the search system -- SEO would cause pretty much every site to make sure it's associated with the typical disambiguation terms, thus removing the utility
Re: (Score:2)
The terms wouldn't be 'typical disambiguation terms', as they would be generated freshly from the content of the pages that appear in the searc
Re: (Score:2)
So, I think what you're suggesting is that the search engine prompt those terms to help people narrow their search? Didn't Ask Jeeves try this and miserably fail -- and if
Re: (Score:2)
When they come up with a computer that will be able to ask questions for me, then I will be impressed
Search solved. World hunger next. (Score:4, Funny)
Re: (Score:1)
Having solved the problem of search, and providing a breakthrough product that has consciousness to what was previously mere series of tubes
This breakthrough makes it possible to use the Interweb as a tube of artificial intelligences capable of answering such questions as "Who is Neuromancer?" and "Why is the number 42 so important, anyway?" as well as organize a successful revolution by moon colon
Re: (Score:2)
We've already got a cure for cancer.
Hype (Score:5, Insightful)
Yet another
Re: (Score:1)
Yet another /. post bitching about /. articles, yet adding absolutely no value of their own.
Seriously. Do you have anything to add to the discussion or were you simply karma whoring?
Re: (Score:2, Insightful)
RDF? (Score:4, Funny)
almost boundless (Score:2)
I'll prove him wrong (Score:4, Interesting)
This is without a doubt the greatest invention in the history of time!
There, I just proved the professor wrong. Muahaha.
Re: (Score:2, Funny)
Some people can overestimate the importance
Re: (Score:2)
Gotta think logically...
contradictory (Score:1)
Cannot be overestimated (Score:5, Insightful)
The importance of any event can be overestimated and quite often is overestimated. It is called hype.
When speaking of XML, XHTML and the semantic web, the word "overestimated" fits just fine.
If this were not the case, HTML would long since be dead and the whole web would be based on pure XML with meaningful tags.
-- Do not read me, I am a stupid tag
Re: (Score:1)
I've seen this sentiment regarding "HTML" vs "XML" on Slashdot so often; let me set the record straight:
Many sites use XML on the back-end, either as an interchange format with a DB, or to store and to generate HTML. I would dare say that *most* web-based applications of XML generate HTML, rather than XML, as the final output format. Outputti
Re: (Score:2)
Could be interesting, but missing details (Score:5, Interesting)
What kind of queries are they running? There are several different RDF query languages (think of SeRQL, RDQL, N3, SPARQL, etcetera) and some of them support quite complex queries. Quickly finding the answers to a simple query is just a matter of an indexed lookup and not very special. But, like in SQL, much more complex expressions can be generated that require complex index operations on the query execution level. Having implemented an RDF database that supports SPARQL queries an order of magnitude faster than the software the W3C uses for their experiments (which, admittedly, doesn't have performance as a prime requirement), I know that it's possible to do simple things fast, but the interesting part is handling RDF queries that don't easily map to efficient database operations.
Which brings me to the most important point: where is their detailed report? Can I get the software somewhere and perform my own tests? The article is too vague to draw any conclusions about what their RDF database does, and how good it is. I'd love to read up on it, but I can't seem to find the information.
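To make the "simple lookup vs. complex query" distinction above concrete, here is a minimal sketch in Python. It is not how the DERI engine (or any real RDF store) is implemented; the `TripleStore` class, its method names, and the `ex:` identifiers are all made up for illustration. A single triple pattern is answered by one hash-index lookup, while a query with two patterns sharing a variable already needs a join.

```python
from collections import defaultdict


class TripleStore:
    """Toy in-memory RDF store (hypothetical, for illustration only)."""

    def __init__(self):
        # Index keyed by (predicate, object) -> set of subjects.
        self.by_predicate_object = defaultdict(set)

    def add(self, s, p, o):
        """Store one statement (subject, predicate, object)."""
        self.by_predicate_object[(p, o)].add(s)

    def subjects_with(self, p, o):
        """Simple query: one pattern (?s p o), answered by a single index lookup."""
        return set(self.by_predicate_object.get((p, o), ()))

    def join(self, p1, o1, p2, o2):
        """Harder query: two patterns sharing ?s, answered by joining two lookups."""
        return self.subjects_with(p1, o1) & self.subjects_with(p2, o2)


store = TripleStore()
store.add("ex:acme", "ex:partnerOf", "ex:globex")
store.add("ex:acme", "rdf:type", "ex:Company")
store.add("ex:initech", "ex:partnerOf", "ex:globex")
store.add("ex:initech", "rdf:type", "ex:Charity")

# One pattern: who is a partner of ex:globex?
partners = store.subjects_with("ex:partnerOf", "ex:globex")

# Two patterns: which *companies* are partners of ex:globex?
company_partners = store.join("ex:partnerOf", "ex:globex", "rdf:type", "ex:Company")
```

With only one index, a pattern such as (s ?p ?o) would need a full scan, which is roughly why real stores keep several index orderings and why join planning, not the single lookup, is where the hard performance work tends to be.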
Here's the Tech Report (Score:5, Informative)
We have a Technical Report available at http://www.deri.ie/fileadmin/documents/DERI-TR-20
From the abstract:
"We present the architecture of an end-to-end search engine that uses a graph data model to enable interactive query answering over structured and interlinked data collected from many disparate sources on the Web.
In particular, we study distributed indexing methods for graph-structured data and parallel query evaluation methods on a cluster of computers.
We evaluate the system on a dataset with 430 million statements collected from the Web, and provide scale-up experiments on 7 billion synthetically generated statements."
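For readers wondering what "distributed indexing" and "parallel query evaluation" might look like in the small, here is a rough Python sketch. It is an assumption-laden toy, not the method from the technical report: the shard count, the choice to hash on the subject, and every function name below are hypothetical, and threads stand in for the machines of a cluster.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

NUM_SHARDS = 4  # hypothetical cluster size


def shard_of(subject):
    """Stable hash so a given subject always lands on the same shard."""
    return zlib.crc32(subject.encode("utf-8")) % NUM_SHARDS


def build_shards(statements):
    """Distribute (s, p, o) statements across shards by subject hash."""
    shards = [[] for _ in range(NUM_SHARDS)]
    for s, p, o in statements:
        shards[shard_of(s)].append((s, p, o))
    return shards


def scan_shard(shard, predicate, obj):
    """Evaluate the pattern (?s predicate obj) against one shard."""
    return [s for (s, p, o) in shard if p == predicate and o == obj]


def parallel_query(shards, predicate, obj):
    """Send the same pattern to every shard at once and merge the answers."""
    with ThreadPoolExecutor(max_workers=NUM_SHARDS) as pool:
        parts = list(pool.map(lambda sh: scan_shard(sh, predicate, obj), shards))
    return sorted(s for part in parts for s in part)


statements = [
    ("ex:acme", "ex:partnerOf", "ex:globex"),
    ("ex:initech", "ex:partnerOf", "ex:globex"),
    ("ex:hooli", "ex:partnerOf", "ex:acme"),
    ("ex:acme", "rdf:type", "ex:Company"),
]
shards = build_shards(statements)
result = parallel_query(shards, "ex:partnerOf", "ex:globex")
```

Each shard only has to look at its own slice of the data, so adding machines shrinks the per-node work; the cost moves to merging results and to queries whose joins cross shard boundaries.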
Re:Here's the Tech Report (Score:4, Insightful)
Mod parent up! (Score:1)
Re: (Score:2)
Also, what has your team done with utilizing GPUs as graph-parsing engines?
I know just enough about triple-stores and RDF to know this really is big news. Congrats.
use google to search their site (Score:2)
site:www.deri.ie technical report 2007 4 20
SUPER Speed (Score:2, Funny)
Re: (Score:2, Informative)
If you're going to steal a joke, you need to make sure to replace all references to the original. Find / Replace works great for this.
Two things... (Score:2)
Second, the problem with "the semantic web", if you're relying on people providing the metadata themselves, is the reliability (trustworthiness?) of the person creating the metadata. There's a reason the meta name="keywords" tags aren't a significant factor, if a factor at all, in any of the major search engines' ranking systems.
Re: (Score:1)
Second: There are other ways to get metadata - e.g., via SIOC (see http://sioc-project.org/). But true, trust is an issue. And some people in DERI Galway are working on ranking algorithms on top of the search engine.
Web of Data (not just metadata) (Score:1)
One of the misconceptions about the Semantic Web is that it's only about metadata, when in fact it's about a Web of Data, e.g., data currently locked in databases, blog engines or social software sites. (related: SemWeb FAQ entry [w3.org] on "Does the Semantic Web require me to manually markup all the existing web-pages ... ?")
A very, very si
sounds fishy (Score:3, Interesting)
Of course a search based on meta data is going to be faster and more accurate, but only when the meta data is correct. We've had this since the beginning of the interweb; people would load up their pages with bogus meta data just to generate search traffic. Because of this dishonesty, search engines have had to resort to other methods of evaluating and indexing pages (for example, based on actual content).
I don't see any difference between this new RDF and that old stuff.
Re: (Score:1)
RDF is just a way to express knowledge. In answer to "any difference between this new RDF and ..." you may take a look at the W3C Semantic Web FAQ [w3.org] (published very recently).
Now, like you said, what we find depends on what we feed into search engines and on the engines themselves. In this regard, what's needed is work on better search engines and ranking algorithms, and the work described here is an important step on this path. There's a link to a technical report and more details posted (by a developer) in another Sla [slashdot.org]
Re: (Score:2)
Save the hype (Score:2)
Re: (Score:1)
Of course building a web of data is more demanding - the infrastructure is far more complicated.
But we have made tremendous progress over the last few years - to the point where currently structured data coming from applications like Wikis, Mailing Lists, Bulletin Boards can, should and will be integrated. And progress
Developer on this project (Score:3, Informative)
Re: (Score:1)
Re: (Score:2)
I'm using Firefox under Windoz and I could not access the article either. It's a bad URL.
Re: (Score:1)
go to www.deri.ie
click on "World Record 7 Billion Triples"
scroll down on the resulting page and click on the word "here" in the last sentence.
get busy reading
I suspect that this is not a browser-related problem but a server problem. The link in the OP and the one mentioned here are the same.
Fixed URL (Score:2, Informative)
2 all: remove the ending slash '/' from the URL above, it will work then.
Correct link: http://www.deri.ie/fileadmin/documents/DERI-TR-2007-04-20.pdf [www.deri.ie]
Re: (Score:2, Informative)
This is great and all (Score:2)
Reality Distortion Field (Score:2)
RDF is a bad idea (Score:2, Insightful)
Re: (Score:1)
So once you have the source recorded one is able to do trust computations with the graph and its source - e.g., using PageRank-like algorithms. Some sources can be assigned a low trust value, others can g
Re: (Score:2)
Then the fast search engine is not really proving its speed on the real problem... only a problem sub-set. I see now I was missing the part about the quad. (None of the linked materials talked about them either I don't think) Thank you for educating me on that point.
Re: (Score:2)
I did have the opinion that the only way to provide a truly useful search system was to create an engine that could read and understa
What the hell is RDF? (Score:2)
Overestimation (Score:2)
I agree. In my estimation, this could well foretell the cure to AIDS, cancer, world hunger, war, and genital warts.
42 (Score:1)
The first answer will be 42.
Re: (Score:1)
The first answer will be 42.
That, it turns out wasn't the hard part, it's figuring out the query!
-Jason
Re: (Score:1)
If that is the answer to "Life, the Universe, and Everything", then ALL the answers should be 42.
Give me a couple minutes and I'll write the code for your search engine.
Re:Great!! (Score:5, Informative)
Re: (Score:3, Informative)
Re: (Score:1)
Re: (Score:2)
Re: (Score:1)
Re: (Score:2)
Re: (Score:1)
...and all they get is the usual, automated Slashdot moaning.
At least you've realized the truth, this being ./, everyone owns a spambot that searches for keywords in stories. You get this particular crapflood when the bots realize that something with the words "revolutionary" and "internet" (and possibly also "semantic") has appeared. Everything here is just Markov chain output. Welcome to the future.
Also, there's an intelligent thread further up where the lead researcher posted that no one seems to be responding to :(