Seven Search Engine Evolutions for '07
Posted by CowboyNeal on Sat Dec 16, 2006 10:13 AM
from the new-and-improved dept.
eldavojohn writes "I found a short but interesting list of predicted evolutions of search engines that will most likely be implemented in 2007. While some are vague and obvious like a better human interactive experience, there are others that are worth looking into like alternative means of indexing and using semantics — not keywords — for matching documents. The author of this list is Dr. Riza Berkan, also the author of 'Fuzzy Systems Design Principles.'"
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
Still waiting for the following paradigm shift. (Score:4, Interesting)
(http://www.nightlifemagazine.ca/ | Last Journal: Thursday March 24 2005, @12:46PM)
I'm tired of having to sift through hundreds of SEO-rigged garbage sites and/or blogs to find what I'm looking for. When I'm looking for something, I'd like to find it. Not some new shittier variation of it.
Good times.
cant be wrong (Score:2, Funny)
(http://www.etrangementmoelleux.info/)
Slashvertisement (Score:3, Insightful)
Even if you could radically change the way a search engine works, you then face an even bigger task: Forcing users to radically change their searching habits to fit your search engine.
And what the hell is "QDEXing"? Google reveals nothing, therefore we can conclude it does not, in fact, exist.
Re:Slashvertisement (Score:4, Insightful)
In 2007? (Score:4, Informative)
Ah, let me just tag this article 'semanticweb'... there, much better now...
As early as 2007? Now I don't really believe that.
It may get partially implemented, and probably only in English.
Maybe Chinese as well.
Most of the other languages will have to wait for quite a while beforehand...
Not to say semantic search is a bad idea or anything... I, for one, would like to see some image-, audio- and video-search based on some kind of semantics, not tags and names... but that'll just have to wait.
So important, MS put it into Vista (Score:1)
(http://www.overclockingwiki.org/)
Let me be the first to say... (Score:4, Funny)
(Last Journal: Thursday December 09 2004, @09:25AM)
Seven Spam Evolutions for '07.. (Score:1, Informative)
Can someone tag the article "spam"?
Change number 6: done (almost) (Score:1)
Ok, someone could say it's the perfect way to permit abuse, and a lot of work still has to be done, but it's a smart proposal to start from. Don't you think?
http://www.yoople.net/ [yoople.net]
Get your own house in order (Score:5, Interesting)
Semantic Searches? (Score:3, Funny)
(http://www.insidebet.com/)
Google is here to stay (Score:1)
(http://www.jasonrippel.com/ | Last Journal: Tuesday March 06 2007, @11:20AM)
The way forward (Score:1)
(http://www.britwood.co.uk/)
Google directory. Bringing you the future today.
Evolution, you say? (Score:2)
(http://www.creimer.ws/ | Last Journal: Friday January 26 2007, @12:40PM)
At least one is already being done (Score:1)
Doesn't ask.com give you this functionality already?
This is Great (Score:1)
(http://www.myspace.com/over_engineered | Last Journal: Tuesday November 28 2006, @11:20AM)
That was a pretty good article, even though most of the stuff on there was pretty obvious (for most of us /.'ers) to begin with.
I think it was only inevitable that internet searching focuses more on the "type as you speak" initiative rather than the older term-by-term searching of the past. This would be great for us, but I really see that the benefits would cater more to the average man/woman who already has a difficult time searching because they are using "the wrong terms."
I really think that Google will be the first search engine to implement most of these changes, since their user base and R&D are already through the roof. I think that Microsoft will also implement this soon with Live, since a sizable portion of their research teams is testing searching based on semantics as well.
A lot of this is available now (Score:4, Informative)
4. The first time that a single query will bring a gallery of
results equivalent to running multiple queries about the
meaningful variations of the same topic.
5. The first time a search engine will let users evaluate answers
on the spot by displaying uninterrupted and coherent text
snippets, often letting searchers forgo having to click through
to links and saving time.
Both of these have been available for a couple of years: e.g. searching on the single query "semantic web" using CQ web [q-phrase.com], reveals clusters such as these:
fuzzy sets
fuzzy systems
neural networks
set theory
soft computing
artificial intelligence
control systems
expert systems
Each of these is linked to a specific page of results using sentences instead of snippets, e.g. for artificial intelligence:
1. This paper will present the foundations of fuzzy systems...noteworthy objections to its use with examples drawn from current research in the field of artificial intelligence.
Fuzzy Systems - A Tutorial [austinlinks.com]
2. The most obvious implementation for the fuzzy logic is the field of artificial intelligence.
Fuzzy Logic [ufl.edu]
3. Ultimately it will be demonstrated...fuzzy systems makes a viable addition to the field of artificial intelligence and perhaps more generally to formal mathematics.
Fuzzy Systems - A Tutorial [austinlinks.com]
4. The paper gives examples of the fuzzy logic applications with emphasis on the field of artificial intelligence.
Fuzzy Logic [ufl.edu]
5. A collection of articles and other technical resources for artificial intelligence.
PC AI - Fuzzy Logic [pcai.com]
7 Things Hakia Will Promise but Fail to Deliver (Score:2)
(http://strathearns.org/wds)
So maybe they'll implement decades-old tech? (Score:5, Insightful)
(http://home.cfl.rr.com/diehardanddragon/)
mark
Weighted sorting is all I want (Score:5, Interesting)
(http://www.emenoh.com/ | Last Journal: Monday April 17 2006, @10:08PM)
Give me a slider control that instantly filters the results... ie: have the first 100 results waiting for me with 20 showing, then let me adjust the weight of my keywords until I get the list I am looking for with individual items falling off or being added to the list as I adjust the controls.
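A rough sketch of what that slider could do behind the scenes, assuming the engine has already cached a batch of results along with per-keyword match counts. The `Result` class, the `term_counts` field, and the scoring formula are all made up for illustration; no real search engine API is implied.

```python
from dataclasses import dataclass


@dataclass
class Result:
    title: str
    term_counts: dict  # hypothetical: how often each query keyword matched


def rerank(results, weights, visible=20):
    """Re-score cached results with the current slider weights.

    Items whose weighted score drops to zero fall off the list;
    the rest are sorted best-first and cut to `visible` entries.
    """
    def score(result):
        return sum(w * result.term_counts.get(term, 0)
                   for term, w in weights.items())

    ranked = sorted(results, key=score, reverse=True)
    return [r for r in ranked if score(r) > 0][:visible]


# Pretend these came back from one query and are waiting client-side.
cached = [
    Result("Jaguar cars review", {"jaguar": 3, "car": 5}),
    Result("Jaguar habitat", {"jaguar": 4, "cat": 2}),
    Result("Big cats of the Amazon", {"cat": 6}),
]

# Slider position A: the user cares about cars, not cats.
cars_first = rerank(cached, {"jaguar": 1.0, "car": 2.0, "cat": 0.0})
# Slider position B: the user cares about cats, not cars.
cats_first = rerank(cached, {"jaguar": 1.0, "car": 0.0, "cat": 2.0})
```

Because only the weights change, the re-sort can run on every slider movement without another round trip to the server.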
It Is About Time (Score:4, Insightful)
Bring it on NOW.
Press Release (Score:2)
(http://www.bobgregg.com/)
maybe a search octopus instead of a puppy (Score:2)
(http://asecular.com/)
About freakin' time. (Score:1)
Hakia results are less relevant than Google (Score:1)
(http://nektra.blogspot.com/)
Don't advertise with the Slashdot guys' help when you have nothing new to offer.
"Answer Engines" are doomed (Score:2)
(http://pornel.net/)
This is already a problem to some extent - Nielsen wrote about this in 2k4 [useit.com].
Seven predictions in Web search '07 commented (Score:2)
(http://www.iccs.inf.ed.ac.uk/~s0239229/ | Last Journal: Wednesday May 23, @03:28AM)
working in the field:
> 1. The first time a search engine will have an alternative to
> indexing; new technology like QDEXing will be developed.
Indexing is a pre-requisite for fast access of retrieval results.
Even distributed peer-to-peer indices that are a very attractive
idea suffer from bad performance due to the absence of a monolithic
index owned by an organization with huge bandwidth.
> 2. The first time ontological semantics will be used that will
> enable a search engine to perceive concepts beyond words and
> retrieve results with meaningful equivalents.
The problem with applying ontology based search is the disambiguation,
i.e. the mapping from natural language words (terms) to the unambiguous
nodes of the ontology (concepts). Automatic disambiguation needs to be
pretty good in order to help in search, but this is unfortunately still
an open research problem.
> 3. The first time that search results will include highlighted best
> sentences as a result of semantic analysis rather than bolded
> keywords as a result of finding incidences.
This prediction seems to mix presentational issues (bold layout) with
processing issues. The problem with the former is that flagging a whole
sentence bold perhaps isn't well liked: current technology could already
do it, yet nobody does. The problem with the latter is what exactly is
meant by "semantic analysis" - "deep" automatic natural language processing is
still a very costly operation and may not be an option as early as 2007 to
be applied to a whole Web index.
> 4. The first time that a single query will bring a gallery of
> results equivalent to running multiple queries about the
> meaningful variations of the same topic.
We would not notice this, since it would be carried out internally.
However, this processing-intensive step could (preferably) be replaced
by a result-equivalent change in the ranking algorithm.
> 5. The first time a search engine will let users evaluate answers
> on the spot by displaying uninterrupted and coherent text
> snippets, often letting searchers forgo having to click through
> to links and saving time.
Giving answers is certainly an emerging trend, cf.
http://www.infonortics.com/searchengines/sh05/sli
But it may take longer than one year to become pervasive.
The repeated mention of snippets seems to suggest that the author of
this set of predictions has found fault with snippets and considers
this a priority, whereas most people - at least on desktop PCs - seem
to be okay with the way results are summarized today.
> 6. The first time a search engine will have a dialogue utility that
> will help point out best answers or suggest a Gallery for a more
> engaging human-like search experience.
Further work in interactive search is certainly ongoing; in some sense
a dialog feature is already operative, as real search engine logs show
that users keep re-phrasing and refining their searches all the time
to converge to the results they desire.
> 7. The first time a search engine's data will grow by detection of
> new knowledge rather than by detection of new pages. Search
> engine growth by knowledge will be the new direction for the
> industry for 2007.
This depends on a universally accepted notion of knowledge, and how to
measure/acquire it automatically. Perhaps one of the strengths of modern
search engines is that NO commitment to any kind of theory of knowledge
has to be made; it works - for better or worse - because all it needs
are strings.
Dragging Results (Score:1)
(http://www.norwinter.com/)
The way forward is to allow people to reorder their results and to delete spam results. This way we'll have a search engine that actually learns what people want and acts appropriately. Sites like Digg and Reddit are on to something in this sense. They use 'swarm' technologies to determine what is most relevant in a certain narrow category at a certain time.
As another commenter mentioned, there is already something like this: Yoople [yoople.net]. A couple of months back I wrote that Google's SearchMash [playingwithwire.com] was secretly playing around with something like that too.
Re:Wildcard search , will it ever be available??? (Score:1)
(http://douglasheld.net/)
Re:Wildcard search , will it ever be available??? (Score:1)
(http://wyman.us/)
Many search engines implement "stemming" as a means of covering the most common requirement for sub-word query predicates. Stemming allows both "cat" and "cats" to be treated as the same word. A system that provides stemming will often use "stems" as the smallest unit of data indexed rather than words. Thus, in the index, there would only be one entry for "cat" which represents both the words "cat" and "cats."
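To make the "one entry for cat" idea concrete, here is a minimal sketch of a stem-keyed inverted index. The `naive_stem` function is a deliberately crude stand-in that only strips a trailing plural "s"; it is an assumption for illustration, not how any production engine or a real stemmer such as Porter's works.

```python
from collections import defaultdict


def naive_stem(word):
    """Toy stemmer: lowercase, then strip one trailing plural 's'."""
    word = word.lower()
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def build_index(docs):
    """Map each stem to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.split():
            index[naive_stem(token.strip('.,!?"'))].add(doc_id)
    return index


def search(index, query):
    """Look up a single-word query by its stem."""
    return index.get(naive_stem(query), set())


docs = {
    1: "The cat sat.",
    2: "Cats chase mice.",
    3: "Dogs bark.",
}
index = build_index(docs)
cat_hits = search(index, "cat")    # finds documents 1 and 2
cats_hits = search(index, "cats")  # same stem, same documents
```

Note that the index holds a single key, "cat", for both surface forms, which is exactly why a sub-word query such as "ca" cannot be answered from it.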
Of course, character-based search exists today for small corpora -- grep is the best-known system, I think. But character-based search on a large corpus is a very difficult problem that, using today's algorithms, would require massive computing capacity to provide to even a small number of users. However, in some specialized applications, this can be justified. One technique for reducing the resource cost of character-based regex on large collections is to index bigrams, trigrams, quads, etc. (i.e. use multi-character sequences as the base "symbol" or smallest unit indexed). However, such n-gram-based systems still don't deliver the serving capacity that one would expect from something like a Google or Yahoo!.
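A small sketch of the n-gram idea, using trigrams and a plain substring query rather than a full regex. Handling arbitrary regexes would mean extracting the trigrams a pattern requires, which is left out here. All function names are illustrative.

```python
from collections import defaultdict


def trigrams(text):
    """Return the set of all 3-character sequences in text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


def build_trigram_index(docs):
    """Map each trigram to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for gram in trigrams(text.lower()):
            index[gram].add(doc_id)
    return index


def substring_search(index, docs, needle):
    """Find documents containing needle as a character substring."""
    needle = needle.lower()
    grams = trigrams(needle)
    if not grams:
        # Needle shorter than three characters: the index cannot help,
        # so every document is a candidate.
        candidates = set(docs)
    else:
        # A match must contain every trigram of the needle.
        candidates = set.intersection(*(index.get(g, set()) for g in grams))
    # Trigram overlap is necessary but not sufficient, so verify.
    return {d for d in candidates if needle in docs[d].lower()}


docs = {
    1: "fuzzy logic tutorial",
    2: "neural networks",
    3: "logical fallacies",
}
tri_index = build_trigram_index(docs)
logic_hits = substring_search(tri_index, docs, "logic")  # matches mid-word too
```

The index narrows the candidate set cheaply, but the final verification pass still scans text, which is part of why this approach scales worse than word-based indexing.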
One "mixed" strategy that has sometimes been employed is to support a more complex query that has a "first part" which is word-based (as in traditional search engines). This first-part is used to select a subset of all the documents in a collection. Once the candidate-result-set is produced, one would run a character-based regex (similar to grep) over the results. Of course, you can probably see that you can easily hack up such a two-phase search yourself by sending a word-based query to a search engine and then doing the character-based regex on the result pages.
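The two-phase approach described above might look roughly like this. It assumes a tiny in-memory word index standing in for the search engine; in the do-it-yourself version, phase 1 would be a query to a real engine and phase 2 would run over the fetched result pages. Function names and sample documents are invented for the example.

```python
import re
from collections import defaultdict


def build_word_index(docs):
    """Map each lowercase word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(doc_id)
    return index


def two_phase_search(index, docs, words, pattern):
    """Phase 1: word-based selection. Phase 2: regex over the survivors."""
    # Phase 1: keep only documents containing every query word.
    postings = [index.get(w.lower(), set()) for w in words]
    candidates = set.intersection(*postings) if postings else set(docs)
    # Phase 2: run the character-based regex over the small candidate
    # set, much like piping search results through grep.
    regex = re.compile(pattern)
    return sorted(d for d in candidates if regex.search(docs[d]))


docs = {
    1: "Error code E1042 in search module",
    2: "Search module returned error code E77",
    3: "Error code E1999 in parser",
}
word_index = build_word_index(docs)
hits = two_phase_search(word_index, docs, ["search", "error"], r"E1\d{3}")
```

The regex only ever sees documents that passed the word filter, so its cost is bounded by the size of the candidate set rather than the whole collection.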
I fear we are stuck with word-based search for the foreseeable future. While folk often ask for character based regex, the reality is that it is simply too hard to implement today. Also, the size of the market that has a real need, as opposed to a desire, for this capability is much smaller than one might think....
bob wyman