The Internet

Better Search Engines 137

prostoalex writes "Scientific American is seeking better Web searches. They report on all sorts of innovations happening outside the Google-Yahoo-MSN zone that the press is usually reporting on, including GPS-enhanced searches from University of Maryland, Shape Retrieval and Analysis from Princeton, musical search engine from New Zealand Digital Library Project, and some of the projects that A9 and have been working on."
This discussion has been archived. No new comments can be posted.

  • by Jailbrekr ( 73837 ) on Tuesday January 25, 2005 @06:32PM (#11474029) Homepage
    If we could whitelist sites and reduce the total number of advertisements cluttering the search, the existing search algorithms would work quite nicely.

    It is a pipe dream, I know. :(

  • by fembots ( 753724 ) on Tuesday January 25, 2005 @06:33PM (#11474034) Homepage
    a user can record a query by playing notes on the system's virtual keyboard. Or he or she can hum the song into a computer microphone.

    I tried that, but I was so out-of-tune the search engine returned all songs from Britney Spears.
  • What I want (Score:5, Interesting)

    by Anonymous Coward on Tuesday January 25, 2005 @06:34PM (#11474049)
    Is just some better work done on recognizing essentially similar documents. Like, if I perform a search, and 40% of the returns are the same wikipedia article copied to different sites, it would be nice if the search engine could only show me one (wikipedia). Or, like, if I'm searching for some kind of error I got while using Linux. Most of the returns I get will be various old Linux mailing lists, but only some of them will be relevant to my problem. There must be some way the search engine could logically organize them for me so that I could more clearly identify that block of returns that is most applicable to my problem of the moment.
    • What you can do in that case is take a random quote from the article that is distinct and use a -"exact unique text here"
    • Re:What I want (Score:4, Informative)

      by me at werk ( 836328 ) on Tuesday January 25, 2005 @06:49PM (#11474231) Homepage Journal
      CopyScape can recognize copied material, but its purpose is only finding website plagiarism. It would, however, definitely find all the Wikipedia forks, unless a fork is a really old copy and the page has since had a major rewrite.

      If google could integrate copyscape into their search, you would be happy.
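One common way to spot "the same article copied to different sites" is to compare overlapping word windows (shingles) between pages. The sketch below is only an illustration of that idea, not how CopyScape or Google actually work; the function names, the 3-word window, the 0.8 threshold, and the example URLs are all assumptions.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        # Very short texts become a single shingle (or nothing if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: |intersection| / |union|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def collapse_duplicates(results, threshold=0.8):
    """Keep only the first result of each near-duplicate group.

    results is a list of (url, text) pairs in ranking order; the
    threshold is an arbitrary illustrative value.
    """
    kept = []  # (url, shingle set) for every result kept so far
    for url, text in results:
        sig = shingles(text)
        if all(jaccard(sig, seen) < threshold for _, seen in kept):
            kept.append((url, sig))
    return [url for url, _ in kept]
```

Because the first hit in ranking order wins, the original article survives only if the engine already ranks it above its mirrors. An old fork with a major rewrite would share few shingles with the current page, so both would be kept.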
      • And if it was an old fork, and a major rewrite, I'd appreciate the opportunity to compare the two versions. Also, if the algorithm didn't eliminate one of the two results in this scenario, then by definition they're two separate results, each containing unique information that the other does not have. Giving you one copy of the old Wiki and one copy of the new Wiki seems like exactly what such an algorithm should do, in this scenario.
        • You seem to have missed the history page. You can see past versions of the page, and usually comments give hints as to when it had a major rewrite.
          • I couldn't care less about the Wikipedia "history page", in this context. We're discussing what a good search algorithm should do, in the case that it gets two different Wiki pages, with different content, on the same topic.

            My conclusion is that the search algorithm should return both pages in its search results, not only making me aware of the change over time, but giving me a quick and easy way to compare and contrast the two versions.

            Obviously, if I was restricting my search to the canonical Wiki of t
    • Re:What I want (Score:3, Interesting)

      by Anonymous Coward
      What I want is a button that lets me restrict my search for a thing to either a review of the thing, a forum/blog discussing the thing, places to buy the thing, or specs/datasheets on the thing. So many times I type in a product name only to get two dozen "find prices/read reviews on X" -- none of which actually have reviews ("be the first to review X!") or even more than a couple of not-so-great prices. A filter could be done by creating a statistical fingerprint of the page.

      I also want to be able to so
      • Re:What I want (Score:2, Informative)

        by HugeFatty ( 745805 )
        I agree that the things you have listed are problems, and that they'd sure be nice to solve. I just wanted to address one of them for now, as I have been trying to deal with it myself.

        The hidden text problem that you mention is a surprisingly hard problem to deal with, as there are so many ways to do it.

        You have:

        • The <font> tag
        • CSS (several ways, such as visibility: hidden, changing the colors, using the z-order, etc.), both internal and externally linked (for which the search engine must downlo
    • Re:What I want (Score:2, Interesting)

      "Like, if I perform a search, and 40% of the returns are the same wikipedia article copied to different sites, it would be nice if the search engine could only show me one (wikipedia)."

      Like, I agree. I have done some searches and simply found the same text on page after page. It would be nice if the search engine could provide some sort of hierarchy. It could say: here is the authoritative source, and here are all the sources that quote it.

      I did say it would be nice, but it really isn't necessary, or it wo
    • Google already identifies similar or exactly identical results. Sometimes it returns a message saying that it has suppressed similar results like the one it is displaying.

      So you might say that they have to improve their similarity detection algorithm, but I'm quite certain that they are working on that already.

      A related problem is to find parts of a page that are "just" menu structure, like links on the left or on the right that are less important than the actual content. That information could then be us
  •! (Score:3, Funny)

    by joshsnow ( 551754 ) on Tuesday January 25, 2005 @06:36PM (#11474075) Journal
    and some of the projects that A9 and have been working on

    I want a search engine with a Genie-Jeeves. Imagine: I snap my fingers, smoke streams from my monitor, materialising into Jeeves, complete with tray, glass and a bottle of that beer I couldn't quite bring to mind when I clicked the search button...
    • I snap my fingers, smoke streams from my monitor, materialising into Jeeves...

      I once snapped my fingers and smoke streamed from my monitor. Unfortunately, it didn't materialize into Jeeves, and my monitor never worked after that.
  • From The Daily WTF:

    I want a website directory, like a yellow pages, or Yahoo. I want any web user to be able to add a link, under the relevant categories available,,real estate,travel,games etc. I would like the links to be approved before they appear. I want the search results displayed in the following fashion: A URL text, or URL image, with a little description underneath. I want the following tools - top 50 searches, most popular links, a search facility. A space across the top of the p
    • I want a big house, I want lots of money, I would like for everyone to bow to me as their supreme ruler, I want the girls to like me, I want super powers... You See, We Never Get What We Want...
  • metadata (Score:2, Interesting)

    by subrama6 ( 157306 )
    as we get into video search and the like, aren't searches dependent on the quality of the metadata associated with the item? i just tried, and was impressed that typing in "bauer" got me stills from recent episodes of 24. but surely that's based solely on the fact that "bauer" was a tag for the still. at that point, why is new search technology impressive? it's the metadata that makes it possible. am i missing something?
    • Most people are too lazy to enter good metadata in webpages, even assuming a good, standard way of doing so. However I see some future in sites like GlobalSpec where your search is narrowed down until you know exactly what type of product you're looking for, and then extra metadata for that class of products is exposed in the search (usually in the form of sets of checkboxes and ranges of metric-aware values) -- allowing you to search by thread size on bolts, but not on other equipment where it's not releva
  • Clusty = Innovative (Score:5, Informative)

    by int2str ( 619733 ) on Tuesday January 25, 2005 @06:38PM (#11474114)

    Aside from the horrible name, clusty (a clustering search engine) is very innovative and easy to use. I hope more search engines will adopt similar technology soon.

    Link to search engine

  • GPS-enabled search (Score:3, Interesting)

    by jxyama ( 821091 ) on Tuesday January 25, 2005 @06:39PM (#11474122)
    GPS-enabled search would be excellent, as more and more people will probably start accessing the web on their cell phones. (Already happening in Japan, afaik.)
  • by Humorously_Inept ( 777630 ) on Tuesday January 25, 2005 @06:41PM (#11474136) Homepage
    It has been available as a service on mobile phones for something on the order of two years. The same thing, called TuneTracker, is available in Canada now under the MuchMusic brand. Put your phone up to the mystery tune and you'll get the song title and artist's name back in an SMS message.

    I'd like to see a search engine that can intelligently filter results for the word "review." When I search for a product review, I do not want some hole-in-the-net online store's product page with a link to 0 customer-submitted reviews.
  • by Anonymous Coward by up and running?
  • > GPS-enhanced searches from University of Maryland, Shape Retrieval and Analysis from Princeton, musical search engine from New Zealand Digital Library Project, and some of the projects that A9 and

    Jeeves, what sort of music would you recommend to my friends if I told you I was listening to a sculptor singing the plaintext of the curvy-shaped thingy that talks about 38 57' 6.5" N 77 8' 44" W to the tune of The Hymn of the Soviet Union?

  • by Anonymous Coward
    Personally I use the BBC Search engine. Not only does it seem to provide relevant results, it also has recommended links (info here) which are editorially selected.

    The site seems to return far less porn probably due to the fact they "use a combination of technology and regular human checks to detect and block offensive websites. We aim to be the safest search engine in the UK"

    Also slashdot is the first return for "IT News" under the web tag.
  • by Anonymous Coward
    I need a search engine that lets me search for information vs products vs forum posts vs whatever else is on the internet.

    I get frustrated when I'm trying to research a new technology and most of the search results are for commerce sites.
  • From the article "letting your wireless PDA, for instance, pinpoint the nearest restaurant"

    Even if there isn't some kind of Windows software to do this with GPS, I can do it right now by punching in a few digits on mapquest.

    Sure, it's restricted to their sponsors or somesuch. But I don't see this as being any different.
  • by rueger ( 210566 ) * on Tuesday January 25, 2005 @06:46PM (#11474196) Homepage
    Nice article which summarized many of the problems with contemporary search engines.

    My experience is that a few years ago you could type, say, "baked gorgonzola" into Google and be sure to get a useful result pretty near the top. These days, though, what you want is likely to be on page three or four, after a dozen links to price comparison sites.

    There really is no such thing as a quick Google search any more. It almost invariably involves multiple formulations of your query, and probably trolling through at least two or three pages of results.

    Whether that's because of Google, or the sheer volume of content on the web, or sites that capitalize on Google's weaknesses is something I don't know.
  • Sure, sure... (Score:3, Insightful)

    by susano_otter ( 123650 ) on Tuesday January 25, 2005 @06:58PM (#11474320) Homepage
    And the moment any one of these other technologies becomes at all useful, except in certain limited applications, the technology will be acquired by one of the search engines that everybody actually cares about (coughGooglecough), and the functionality will be added to their Internet search solution.
  • Synonyms (Score:1, Interesting)

    by e4ward ( 731937 )
    Odd, just before reading this story I wanted to google for:

    searchterm1 searchterm2 bogus

    and how I would have liked the search engine to actually search for:

    searchterm1 searchterm2 (bogus OR fake OR spurious OR wrong OR specious OR ...etc)

    by being able to specify a qualifier on bogus eg, bogus:synonyms

    • It's available! (Score:5, Informative)

      by ByteMangler_242 ( 618623 ) on Tuesday January 25, 2005 @07:39PM (#11474738)
      You can do this in Google: searchterm1 searchterm2 ~bogus. The tilde will look for synonyms. You can see which ones hit by reading the bold results which are neither searchterm1 nor searchterm2. I use ~howto and ~cheats often.
    • searchterm1 searchterm2 ~bogus

      ~ is the Google synonym operator. To play with it, try ~word -word, so you only get the synonyms.

      Ah, Google, is there anything it can't do?
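For anyone curious what the expansion asked for above amounts to, here is a minimal sketch of rewriting a ~term into an OR group. It is not Google's implementation: the SYNONYMS table is a hypothetical stand-in for a real thesaurus, and expand_query is a name made up for this example.

```python
# Hypothetical synonym table; a real engine would use a far larger thesaurus.
SYNONYMS = {
    "bogus": ["fake", "spurious", "wrong", "specious"],
}


def expand_query(query, synonyms=SYNONYMS):
    """Rewrite every ~term in the query into a parenthesized OR group."""
    parts = []
    for term in query.split():
        if term.startswith("~") and len(term) > 1:
            word = term[1:]
            alternatives = [word] + synonyms.get(word.lower(), [])
            if len(alternatives) > 1:
                parts.append("(" + " OR ".join(alternatives) + ")")
            else:
                # No known synonyms: fall back to the plain word.
                parts.append(word)
        else:
            parts.append(term)
    return " ".join(parts)


print(expand_query("searchterm1 searchterm2 ~bogus"))
```

With the table above, this prints searchterm1 searchterm2 (bogus OR fake OR spurious OR wrong OR specious), which is the query the original poster wanted to avoid typing by hand.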
  • Does that mean I will be able to search for porn with 38DD's?

    Did I say that out loud?
  • Enhancements to normal search engines are great and will always be important, but better is to go beyond that to searching, indexing and retrieving actual information. Services like AskJeeves and company originally promised true question answering, and other, more experimental, projects like UW's Know-It-All promise to operate over information, not webpages.

    Perhaps these are just very generalized search engine enhancement...but I think it's a new way of thinking that will become very important over the n

  • I have been using it less and less.

    Trying others. Google is way too spammed with commercial sites in their finds.

    Their technology is very low tech to me.
  • I've been playing with Flickr, since it was mentioned on Slashdot a few days ago.

    Flickr has a really nice API for retrieving images. I used Perl and ImageMagick to build a database that provides this amusing tool for searching images by color: r. php

    And a related project:

    - Jim
  • Anyone here who's a scientist ever try to use "google scholar"? Unfortunately, it's not very good. What I'd like to see (as an astrophysicist) is some way to do a search that combines results from difficult-to-navigate scientific sites, such as NASA's ADS abstract service, the Spires HEP database, and the preprint database. Finding what you need on these individual sites is often a pain, and to be able to search a compilation of them would sure be nice for me...
    • Part of your problem has been solved: NASA's ADS can search at the same time as it searches its own database. Near the top of the page you can pick these databases to query:
      Physics/Geophysics
      ArXiv Preprints

      Collect them all!
  • [shameless plug] We make File Journal, which has a lightning-fast search and displays the results in an Outlook 2003 style interface. This means you can group results by folder, date, file type etc. It can find (and restore) deleted or renamed files too. Link in my sig if you want to take a look...
  • I can just do a search using the stuff from Commujism to find more real pr0n!
  • When I get pages and pages of crap that we all know are ads, I wish I could just check a box, block this domain from future searches.

    Click on enough of them and a user might just see search results similar to circa 96
  • ... some method of telling the search engine that the link is dead.
    All it needs is a tick-box beside each link.

    If we all co-operated about it, the quality of searches would be improved considerably for all of us.

    Google: Please patent the idea on my behalf, I think & hope it's sufficiently trivial, yet innovative & revolutionary, to impress the USPTO.

    PS: I mean co-operation, not the tick-box.
    • I like that idea, but seeing as how much Google's ad system is being spammed these days, the system probably wouldn't last long, with competitors blocking out each other's pages with bots and if it comes down to it, trained monkeys clicking away at a terminal.

      it was the best of times, it was the blorst of times? you stupid monkey!
  • by Nobody You Know ( 750014 ) on Tuesday January 25, 2005 @10:49PM (#11476344)
    The number one search engine feature that would make my life infinitely easier would be precise proximity operators in search engine syntax.

    (For those who don't have a clue what I'm talking about, LEXIS-NEXIS, among others, allows you to run searches like foo w/5 bar (the word "foo" within 5 words of the word "bar"), or even foo pre/5 bar (the word "foo", followed, within five words, by the word "bar"). Good proximity engines allow you to search not only within x words, but also to order terms, to specify root words within terms, etc.)

    It would be great to have people reviewing and whitelisting page results, but that takes human interaction. Implementing precise proximity operators, though, can give you nearly the same benefits without any of the human cost.

    Many people here have suggested eliminating ad text from search results, but if history is any indication, any algorithmic system that we can come up with to do so will be circumvented pretty quickly. The one way to fix this is to allow me to say that I want the word "modperl" within 10 words of "solaris", rather than just specifying any page that contains both terms. That will get rid of 95+% of ads right away.

    Surely, with all the bright people at Google, this is something that they can figure out pretty easily.
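The two operators described above can be sketched in a few lines. This is an illustration only, not LEXIS-NEXIS's implementation: the function names are made up, and "within n words" is assumed to mean that the two word positions differ by at most n.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def positions(tokens, word):
    """Return every index at which word occurs in the token list."""
    target = word.lower()
    return [i for i, tok in enumerate(tokens) if tok == target]


def within(text, a, b, n):
    """a w/n b: a occurs within n words of b, in either order."""
    tokens = tokenize(text)
    pos_a, pos_b = positions(tokens, a), positions(tokens, b)
    return any(0 < abs(i - j) <= n for i in pos_a for j in pos_b)


def precedes(text, a, b, n):
    """a pre/n b: a is followed, within n words, by b."""
    tokens = tokenize(text)
    pos_a, pos_b = positions(tokens, a), positions(tokens, b)
    return any(0 < j - i <= n for i in pos_a for j in pos_b)
```

A page where "modperl" and "solaris" appear in the same sentence passes within(page, "modperl", "solaris", 10), while a keyword-stuffed page with the two terms dozens of words apart does not. A real engine would answer this from a positional index instead of rescanning the page text for every query.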
  • What I want to know is why, although I am sitting on South Island New Zealand, when I type in do I end up in Canada?
    • because most of our visitors are international, they get much better response times from our Canadian mirror. (And also then we don't pay as many exorbitant data charges).

      Our server here assumes that clients that don't resolve to *.nz are from overseas, and redirects them to the mirror. (Yes, I know this isn't perfect. It's not my fault.)
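The rule described above boils down to a suffix check on the client's reverse-DNS name. A minimal sketch, assuming the hostname has already been looked up and is None when the lookup fails; should_redirect is a made-up name, not the site's actual code.

```python
def should_redirect(client_hostname):
    """Return True when the client should be sent to the overseas mirror.

    client_hostname is the reverse-DNS name of the client, or None/empty
    when the address does not resolve (an assumption for this sketch).
    """
    if not client_hostname:
        # No reverse DNS at all: the rule treats the client as overseas.
        return True
    name = client_hostname.lower().rstrip(".")
    return not name.endswith(".nz")
```

This shows why the heuristic misfires: a New Zealand visitor whose ISP hands out addresses with no reverse DNS, or with a name outside .nz, is indistinguishable from an overseas one.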
  • People are the problem. What's needed is for people to get off their butts and learn to exploit the technology to its full capability. These people could learn how to use Google more efficiently if they read something like Johnny Long's Google Hacking. Link: acker0704.pdf

    However, this will never happen because the average Joe is inherently lazy, so we'll have to spoonfeed all the techno-numpties with technological updates until they stop complaining.

    • Dunno when you geeks will try to look at computers as a tool for the common man. What you are suggesting is similar to a doctor saying "don't manufacture medicines, because if the people do exercise and keep their body healthy, they wouldn't need most medicines". Well that just doesn't happen. You probably are a couch potato yourself.

      Btw, I consider myself to be a geek, but I don't look at computers with a narrow mind. I don't use text consoles when I can get more done using GUI based tools. Make the compu
      • We are talking about search technology not medicinal technology. Therein lies your fallacy.

        Search technology is supposed to help find valid information from the query one enters. The better the query entered the more efficient the search.

        I read an interesting essay once from New Scientist that stated that in an information age our questions are more important than our answers. Our society has become a culture of answers. Look at pop-culture for an example. We have an abundance of quiz shows, facts on d

  • I think I can speak for everyone else in the entire world when I say that *nothing* sounds as worthless as the aforementioned items.

    Maybe I'll write an article about how the Oldsmobile is a fantastic find.

  • This one connects you with people searching for similar keywords. I guess the idea is to have another set of helping eyes.
