Microsoft

Inspecting MSN Search 345

Posted by CmdrTaco
from the understanding-the-heart-of-the-beast dept.
ins0maniac writes "I compared Yahoo, Google and MSN's image search. I noticed that MSN's search had images from only a few sites. I searched for the keywords britney spears, randomly checked a few pages up to page number 20, and found that the 400 images were from only 3 domains :| 5in9.com, celebritypicturesarchive.com and nabou.com. This is totally weird, as it doesn't seem like a search engine but a collection of a few online galleries." There are a number of other interesting notes in the entry about the new search engine. Also, Britney.
  • by JaffaKREE (766802) on Wednesday February 02, 2005 @09:23AM (#11550074)
    I already have all 400 of those.
  • by bigtallmofo (695287)
    This is a standard Microsoft tactic. It shouldn't surprise anyone.

    1. Launch a web site in a particular genre but don't actually have any real functionality
    2. Distribute a press release
    3. PROFIT!!

    • by PocketPick (798123) on Wednesday February 02, 2005 @09:55AM (#11550309)
      Your logic is true for more than just webpages. It spans basically Microsoft's entire software library. Ballmer arrogantly stated that its "one mistake" was that it didn't get involved in the 'search' industry earlier, but anyone who has followed Microsoft's trail can tell you that they're late to the table more often than not. And even when they are on time, the product is often a faulty or damaged good that doesn't operate at the level of competing products.

      Ex.
      -IE debacle, where Microsoft played catch-up to Netscape and other existing browsers after neglecting the need for a browser in earlier years.
      -Direct3D, which played second fiddle to OpenGL for years in usability and features till Microsoft finally began adopting parts of OpenGL's paradigm for computer graphics.
      -The modern desktop GUI. A product of Apple in many respects, but later was adopted by Microsoft.
      -Powerpoint, Visio and other 'Office' products. They were created by other companies, and then consumed by Microsoft.

      And the list goes on and on. Today they're trying the same with hand-held media players (derived from the success of iPods), search technologies (coming from Yahoo, Google, and other successful search/advertisement ventures), spyware detection and many other Microsoft 'Innovations' that are soon to hit the market.
      • Yep, this is very true throughout Microsoft's history. Whoever is modding you down is a sadly misinformed MS toadie.
      • by drsquare (530038) on Wednesday February 02, 2005 @10:51AM (#11550880)
        Well, I think this demonstrates that in order to be successful you don't need to be the first to do something, but the first to do it successfully. They might not have come up with a lot of their leading products, but in the end they came up with something that beat the competition in the market, i.e.:

        Internet Explorer: Played catch-up to Netscape, caught it up, then overtook it. Now it's the world's most widely used and best-known browser, and Netscape was beaten into obscurity.

        Direct3D: Might have been behind OpenGL, but they took the qualities of OpenGL and made a product that at least matches it on features and blows it out of the water in regards to market share.

        Modern Desktop GUI: Yes they were playing catch-up with Apple, who in turn got the concept from Xerox, but they worked on the idea and now they have practically the whole desktop market saturated so much that even a possibly technically-superior free operating system struggles to get a foothold.

        Office products? Yes they may have been created by other companies, but Microsoft took them, and all 'Clippy' jokes aside, they turned it into a very decent product and it's dominated the market, and the 'other companies' are languishing on the sidelines.

        You may like to bash Microsoft for taking on other people's ideas, but what company only sells things they've entirely invented from scratch? Apple didn't invent MP3 players, Google didn't invent search engines, I don't see you bashing them, the originators of most technologies are dead and buried because they didn't do anything with them.

        In the real world, if you invent something, unless you patent it or implement it successfully, no-one cares that you invented it.
        • by Hatta (162192) on Wednesday February 02, 2005 @11:01AM (#11551021) Journal
          It would be OK if Microsoft's success with "borrowed" ideas was because they implemented them better than anyone else. But they don't. They're successful because they abuse their monopoly status. And that's worthy of bashing.
          • by menkhaura (103150)
            All right, but how did they get a monopoly in the first place, with products as bad as theirs?
            • by mkldev (219128)
              One lucky initial contract and a lot of predatory marketing?

        • by Gr8Apes (679165)
          Egads, this is a +5 insightful troll? Well, here goes nothing.

          MS didn't beat the competition, the competition was unfairly bludgeoned into defeat.

          IE: bundled with OSes. Unfair tying when you're a monopoly.

          Direct3D: bundled with OS. Marketing BS, promises, and incentives to developers, who fell for it.

          "Modern desktop GUI": you're kidding, right? Next and OS/2 were the most advanced, arguably even by today's standards. (Haven't seen Be)

          Office Products: Again, you must be kidding. They used threats of inc
    • by reporter (666905) on Wednesday February 02, 2005 @09:58AM (#11550333) Homepage
      The search engine at Micro$oft (M$) currently has indexed about 1 billion web pages [pcworld.com], but Google has indexed several times that amount. Given time, M$ will eventually index more pages. Eventually, M$ will catch up.

      The current barrier to entering the market for search engines is low. The technology is relatively simple as the multitude of search-engine companies will attest.

      The advantage that M$ has, over Google, is its huge R&D budget. M$ labs is the modern-day equivalent of the venerable Bell Laboratories, which is shriveling under the management of Lucent. M$ has plucked numerous professors from the computer science departments at top universities by offering incredibly high salaries.

      • reporter(666905) said:
        "The advantage that M$ has, over Google, is its huge R&D budget. "
        --
        Today's news said Google had raked in money which exceeded by many times its expectations - to the tune of several millions in advertising revenues alone. And it has a share base of more than a billion. So money is not a problem as far as Google is concerned.

        reporter(666905) also said:
        "M$ has plucked numerous professors from the computer science departments at top universities by offering incredibly high salaries
      • The advantage that M$ has, over Google, is its huge R&D budget.

        Not really. Their advantage is that they have a monopoly on consumer OSes. They will bundle their search with their OS, which is already bundled with their browser and their media player. Most users will never think to install something better, or go to Google when there is a search already built in. It does not have to be as good as Google. It does not have to be close. It only has to be functional enough to barely work for most people

        • When they add the search button to only link (or automatically, same thing) to m$earch, then it's over for google. At least for all the people who don't know any better.

          M$ will make it one click easier to use m$earch instead of any other search, and it will matter.

          But google will be here for quite a while.

          I seriously couldn't do my job without google. It is by far, the best tool I've ever had. I tried about 10-30 searches in m$ (all of which gave me the info on top 3-4 of page 1 in google), I had to go
    • by pointyhairedmba (698579) on Wednesday February 02, 2005 @11:12AM (#11551162)
      The notion of a "fast follower" is well known in the business world. You let other companies develop new technology and *most importantly* educate the market with their dollars. Then you enter the market as a fast follower with your product where you have learned from others' mistakes and successes. In many industries, it's actually an advantage to be a fast follower. For example when the cost of educating the market is so large as to suck off cash from other critical activities.

      Finally, MS has never really been known as an industry leader. They are a huge marketing machine. There's nothing wrong with that, you just have to realize that you don't have to be a market leader to be a success. I think that classic "tech" people often forget this.

  • by DaneelGiskard (222145) on Wednesday February 02, 2005 @09:24AM (#11550082) Homepage
    ... that's why I love science. You can find the best reasons to do the weirdest things ... ;-)
  • A revenue stream.. (Score:5, Interesting)

    by phuturephunk (617641) on Wednesday February 02, 2005 @09:25AM (#11550091)
    ..Is a revenue stream. The galleries in question probably pay for dominance. Yeah, this seems contrary to a full free search, but at least the results are on subject.

    The real task, it would seem, would be to find a way to have the engine return the proper pictures for the proper searches (so typing in Daddy's birthday doesn't result in pictures of some 50 something dude banging some barely legal chick with a party hat on.)

    Stuff like that.
  • search filtering (Score:2, Insightful)

    by Anonymous Coward
    So, did you turn it off before the search? I did.
  • Errrr.... (Score:5, Insightful)

    by JamesD_UK (721413) on Wednesday February 02, 2005 @09:27AM (#11550101) Homepage
    So from one single query to the MSN search engine we're meant to draw some form of conclusion? Could it just be that the search engine has determined these domains to hold the best results and just returned these images?

    Other searches [msn.co.uk] don't appear to be similar. I'm guessing that perhaps these companies have paid for higher placement on the example used in the article?

    • search for "linux" (Score:5, Interesting)

      by CausticPuppy (82139) on Wednesday February 02, 2005 @09:32AM (#11550145) Homepage
      ...and the very first link on the page (under "sponsored sites") is:

      www.microsoft.com
      Windows outperforms Linux: Industry case studies and test lab results provide insight into the advantages of the Microsoft®...

    • Yeah, since it is a new service, maybe it just hasn't crawled that many sites yet? This is exactly the same as the arguments when the Xbox came out. "There are only 50 games for the Xbox and hundreds for PS2!" Yeah, it was the Xbox's first day and the PS2 had been out for a long time. It's just typical slashdot MS bashing.
      • Yeah, since it is a new service, maybe it just hasn't crawled that many sites yet?

        I guess if you consider about a year as "new." Surely in a year it has managed to crawl more sites, otherwise it is a pointless search engine, as it will always be out of date and behind the times.
    • Could it just be that the search engine has determined these domains to hold the best results and just returned these images?

      Should the search engine determine which domain has the best pictures, or should I be able to get *all* the results and determine that myself?

      I do agree that basing a conclusion on a single search result is a bad idea (and that the cache has a ways to go), but if your reasoning is true that's a good reason not to use MSN's search. Along the lines of "determining the best domains,"
  • by Anonymous Coward on Wednesday February 02, 2005 @09:28AM (#11550115)
    I searched for "britney spears nude goat dildo sparcstation" and didn't find a single thing.
  • by gambit3 (463693) on Wednesday February 02, 2005 @09:29AM (#11550120) Homepage Journal

    I'm going to have to perform this experiment myself.

    In the interest of the truth, you know.
  • nothing like having an "excuse" to search for britney spears images, eh? :P

    we /.-ers certainly hate her music but apparently she's not as painful to look at. :D

    msn search may not be as good as google/yahoo, but the prominent-cleavages-to-image-number ratio is quite high for all three search engines. who's complaining? :P

  • I seem to recall (Score:5, Insightful)

    by way2trivial (601132) on Wednesday February 02, 2005 @09:30AM (#11550127) Homepage Journal
    discussions that- if google put adwords on the image search results, they were potentially crossing the line of using copyrighted works without permission- to turn a profit - perhaps MSN is only image searching/displaying where they have been given permission to display copyrighted images...
  • mirrordot link (Score:5, Informative)

    by linhux (104645) on Wednesday February 02, 2005 @09:30AM (#11550128) Homepage
    http://mirrordot.com/stories/5defdb2c0e9cac7c89624a2594f96717/index.html [mirrordot.com]

    mirrordot doesn't seem to have archived all the images yet though...
  • by TexTex (323298) * on Wednesday February 02, 2005 @09:31AM (#11550137)
    For research, I checked out some of those pictures returned by the Britney search.

    Many of the thumbnails displayed aren't the same picture that's retrieved when you click on the link. So, their cache must be outdated already. When I'm browsing thumbnails, I expect...no I demand...my search engine to return the appropriate photos!

    • When I'm browsing thumbnails, I expect...no I demand...my search engine to return the appropriate photos!

      You know, it's a wonder nobody has started spoofing image thumbnails by returning a different image when a Googlebot comes by.

      Surfer: Mmmmm... Hot, nude bored housewives...

      *click*

      Website: Hello.jpg!
  • Expectations (Score:4, Insightful)

    by FullMetalAlchemist (811118) on Wednesday February 02, 2005 @09:31AM (#11550138)
    I don't really expect anything from MSN search at this point, it will require some major fine-tuning to become really powerful.

    On the other hand, I don't expect any reviews of MSN search to be any good so early on either. Simply because, if you're a googler or some other search engine user, you like what that one offers for a reason; switching is hard.
    • Switching isn't hard if the initial product sucks.

      I used to use Altavista and I hated it with a passion. It would return reams of junk sites stuffed with meta keywords. As soon as I discovered Google I jumped ship immediately simply because Alta Vista stunk and Google didn't. And it still doesn't.

      Of course when Google works so well, switching is going to be hard. After all, what does MSN Search (or A9.com for that matter) do that I need? Sure they might have some unique features, but to take A9.com as a

    • Those of us who've been around a while know the well-worn pattern:

      (1) MS sits on arse for years doing no innovation while another company produces an innovative, excellent, useful product and spends several years refining it and making it even better

      (2) Start to take notice as another company starts to get a lot of limelight in some mainstream market "space" it never occurred to you to enter

      (3) Announce intention to compete.

      (4) Spend the next couple of years with half-hearted attempt to play catch-up, p

  • by Shardin (696999) on Wednesday February 02, 2005 @09:31AM (#11550141)
    I'm no MS supporter, but do you think this might be because the new search engine has been crawling the web for a fraction of the length of time Yahoo and Google have been crawling the web?
    • do you think this might be because the new search engine has been crawling the web for a fraction of the length of time Yahoo and Google have been crawling the web?

      But it's only catalogued three domains. What, is it searching depth-first?
      • Then again this is based on the submitter "randomly checking a few pages up to page number 20."

        Not very scientific.

      • The MSN bot has been singlehandedly responsible for about a fivefold increase in my site's visits stats. My site is of no interest to anyone, and very rarely changes at all. Google rarely spiders me, and can find anything I've got on my site.

        I don't know about depth searching, but there's definitely something wrong with msn's search strategy.

    • by ceeam (39911)
      Actually they all claim to _totally_ refresh their DBs in 2-7 days, IIRC. How does it matter for how long they have been doing this then?
      • by Otter (3800)
        Their page search DBs, yeah. But Google's image search updates on the order of months, not days. (Remember when they didn't have Abu Ghraib images for a while and Taco decided it was Crushing of Dissent by Karl Rove?) Presumably they update so slowly for a reason, one that might apply to MSN as well.
    • I'm no MS supporter, but do you think this might be because the new search engine has been crawling the web for a fraction of the length of time Yahoo and Google have been crawling the web?

      Perhaps, but they still have been crawling the web for months... should be plenty of time.
    • Seems unlikely, as the MSN search has been under development and actively crawling for something like a year now at least. I've seen its bot crawling my own websites many months back.

    • msnbot has been quite aggressive in the past month crawling my sites more often than googlebot or Slurp (Inktomi/Yahoo)

      I don't think "they haven't been doing it as long" counts for much when you're talking about a company with tens of billions in its warchest and a targeted mission to topple competitors.
  • Slashdotted (Score:5, Interesting)

    by mreed911 (794582) on Wednesday February 02, 2005 @09:33AM (#11550156)
    The original article has been /.'ed already, but there's a cogent point to be made:

    Unless the images are titled, tagged, annotated, etc., there's no good way to index them.

    If I just throw a bunch of images up on a web site, there's no good technology, other than some pretty advanced facial recognition stuff, that can determine who, or what, a particular picture represents.

    Change the resolution, color depth, etc. and I change the checksum for the image, so the index fails to recognize that one picture is the "same" as another, just resized, etc.

    I see a lot of that on Google's image search - but can't find a way around it, either.
    • Re:Slashdotted (Score:3, Informative)

      Change the resolution, color depth, etc. and I change the checksum for the image, so the index fails to recognize that one picture is the "same" as another, just resized, etc.

      So resize the image to a standard max size and depth (256x256 max size jpeg with retained aspect ratio), then hash the individual luminance data into a thumbprint that can be compared. Checking for dupes becomes easier and similarity checks are doable.
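
      A rough sketch of that thumbprint idea, for the curious. This is only one way to read the comment, not anything a search engine is known to run: it assumes the image is already decoded into a 2D list of 0-255 luminance values, shrinks it to a fixed 8x8 grid, and records one bit per cell for "brighter than average". The names downsample, thumbprint and distance are made up for illustration.

```python
# Sketch: normalize an image to a fixed size, then hash its luminance.
# Assumption: images are plain 2D lists of 0-255 luminance values
# (decoding real JPEGs is out of scope here).

def downsample(pixels, size=8):
    """Shrink a 2D luminance grid to size x size by averaging blocks."""
    height, width = len(pixels), len(pixels[0])
    small = []
    for row in range(size):
        y0 = row * height // size
        y1 = max(y0 + 1, (row + 1) * height // size)  # never an empty block
        out_row = []
        for col in range(size):
            x0 = col * width // size
            x1 = max(x0 + 1, (col + 1) * width // size)
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            out_row.append(sum(block) / len(block))
        small.append(out_row)
    return small


def thumbprint(pixels, size=8):
    """Return an integer with one bit per cell: 1 if brighter than the mean."""
    flat = [value for row in downsample(pixels, size) for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def distance(a, b):
    """Hamming distance between two thumbprints (0 means identical)."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    small = [[x * 16 for x in range(16)] for _ in range(16)]
    large = [[(x // 2) * 16 for x in range(32)] for _ in range(32)]  # same picture, 2x size
    print(distance(thumbprint(small), thumbprint(large)))
```

      A resized copy lands on the same thumbprint, and "similar" images can be ranked by Hamming distance instead of requiring an exact checksum match.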

  • it looks like... (Score:5, Interesting)

    by jxyama (821091) on Wednesday February 02, 2005 @09:39AM (#11550207)
    MSN image search is returning results where the image filename actually contains "britney" and "spears." as far as i saw on the first page of the results, all the files have "...britney_spears....jpg" name. if such is the algorithm being used, this severely limits the number of possible hits.

    this is contrary to google image search where it's not simply searching for filenames. google search seems to understand that images of britney spears need not have "britney" and "spears" in the filename.
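
    To make the difference concrete, here is a small sketch of the two behaviours guessed at above. Neither engine's real algorithm is known; the record fields (filename, alt_text, page_text) and both function names are hypothetical.

```python
import re


def tokens(text):
    """Lowercase a string and split it into alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def matches_filename_only(query, image):
    """Guess at the stricter behaviour: every query word must be in the filename."""
    return tokens(query) <= tokens(image["filename"])


def matches_with_context(query, image):
    """Guess at the looser behaviour: query words may come from surrounding text too."""
    haystack = " ".join([image["filename"], image["alt_text"], image["page_text"]])
    return tokens(query) <= tokens(haystack)


if __name__ == "__main__":
    # Hypothetical records standing in for crawled images.
    named = {"filename": "britney_spears_01.jpg", "alt_text": "", "page_text": ""}
    unnamed = {
        "filename": "img_0042.jpg",
        "alt_text": "Britney Spears at the awards",
        "page_text": "Photos from the show",
    }
    for image in (named, unnamed):
        print(image["filename"],
              matches_filename_only("britney spears", image),
              matches_with_context("britney spears", image))
```

    The second record is invisible to a filename-only match, which would be one explanation for a much smaller pool of hits.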

  • Too New. (Score:5, Insightful)

    by Deathlizard (115856) on Wednesday February 02, 2005 @09:40AM (#11550211) Homepage Journal
    The MSN Search right now is too new to get an accurate reading on how it is going to ultimately perform.

    Google has been around for years spidering sites where MSN Search has only been around for a few months.

    The real test is going to be a year from now, when it's had more than enough time to spider a good portion of the web. Even Google's search paled in comparison to Altavista at first until at least 6 months passed. After a year passed its searches were much better since a good portion of the web was spidered by it.

    At this point in the game, it would have to be an absolutely amazing site to take Google out, and I don't think MSN Search is the site that's going to do it.
    • Google crawls over most of the web every other day. It does NOT take a year to spider the whole web... a month or two at most. If it took a year to spider the web, the results would be too out of date to be useful anyway.
  • by Sophrosyne (630428) on Wednesday February 02, 2005 @09:41AM (#11550213) Homepage
    http://search.msn.com/images/results.aspx?q=kelly+ripa+camel+toe&srch_type=2&FORM=QBIN [msn.com]
    I'm sorry, but this is where I draw the line-- it's completely unusable
  • by traffi (800888) on Wednesday February 02, 2005 @09:47AM (#11550254) Homepage
    Searching for 'bill gates' [msn.co.uk] in MSN returns the page Bill Gates As Mabus [mabus.biz]. Apparently this project is dedicated to finding the human manifestation of the anti-Christ.

    None of the first 10 results (searching from the uk) return his homepage.

    Searching with Google [google.com] turns up Bill Gates' Web Site - Home Page [microsoft.com].

    Which means: Stick to Google.
    • Likewise, I did an image search on msn.com for " William H. Gates, III [msn.com]", which only returns 11 images of Bill. There's also four pictures of some older guy who's apparently Bill's dad, and three pictures of Gates Hall, a building named after either of the two (some kind of law school).

      Performing an MSN Image Search for "Bill Gates [msn.com]", returns 2,134 images from a variety of web servers, although newsimg.bbc.co.uk seems to be the most popular server that offers images. They apparently didn't screen these, beca

  • MSN Search [msn.com]

    while the first few results are still remotely related to what I expected (sex offender registries, sex - by teens for teens), the ninth link is cool:

    Microsoft Corporation
    The entry page to Microsoft's Web site. Find software, solutions, answers, support, and Microsoft ... Last Updated: Monday, January 31, 2005 - 12:00 A.M. Pacific Time Manage
    * www.microsoft.net


    I'm amazed how stupid and desperate these guys must be.
  • To Turn off SafeSearch:

    goto:
    http://search.msn.com [msn.com]

    click settings:
    [Which will bring you to:]
    http://search.msn.com/settings.aspx?ru=%2f&FORM=SEHP [msn.com]

    Try not to get confused and think you're using google...

    On the third section from the top click "off"

    You'll find the "Save" button in the lower right hand corner if you scroll down.

    I was going to read through the source code and post a GET link which would turn it off for you... but I'm not about to read through that code at 8:45 in the morning
  • This guy's nameservers are down. It's not that the webserver is down; you can browse it by the IP address listed in his whois information. It's that the webserver has a default Apache start page as its default and his domain as a vhost, but none of his nameservers are up to resolve requests for his domain.

    I'm amazed not only that so many posts were made "about" the story from various diagonal points of view, but without anyone actually browsing his site. It's even more interesting that his story got pos
    • you can add the hostname and the IP to /etc/hosts (linux), or search for hosts in hidden and system files on your windows drive and add it there. that will give you a way to resolve the hostname to the IP and properly access the vhost
  • "I personally dont think that Britney would have more than 10k pictures online (or may be offline too?)" (from the review)

    At last count I have 2,347 pictures of Britney Spears.

    And I don't even like her music...:-)

    I wonder how many photos of the Corrs MSN can find...I've got 2,080 photos of them...

    How about 1,228 of Salma Hayek?

    1,406 of Angelina Jolie?

    1,083 of Carmen Electra?

    24 of Chelsea Clinton? Waitaminnit, WTF?

  • by Phoe6 (705194) on Wednesday February 02, 2005 @10:00AM (#11550357) Homepage Journal
    Try to provide feedback: it does not go through. I tried to tell MSN that there is no way to get back to the main page after searching. It pops up a window containing the submission information and then gives up. BTW, I use Firefox and did not bother to check the same on IE.
  • An error occurred while attempting to run a script on this page.
    http://www.msn.com/ line 149:
    TypeError: Attempt at calling a function that expects a HTMLDocument on a Window.
  • Just did a quick comparison of search.msn.com, google.com and www.yahoo.com. Here are my results:

    Search term: microsoft sucks
    Google: results about 862,000
    Yahoo: results about 762,000
    MSN: results about 1,856,364

    There's a joke in there somewhere dying to get out.
  • What a weird result. I don't think it's a "sponsored link" effect, though. It looks, instead, like the ordering algorithm clusters sites, so that sites with lots of pictures of Spears show up near the front, and sites with fewer images show up later. If you hack the query to look at pages around 200, you find many more sources on each page.
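
    A toy version of that clustering guess, to show why the early pages would look like three galleries. This is a sketch of the hypothesis only, not MSN's actual ranking; cluster_by_domain and the example domains are invented for illustration.

```python
from collections import Counter


def cluster_by_domain(results):
    """Order (domain, url) pairs so domains with the most images come first.

    Sorting on (-count, domain) keeps each domain's images together, and
    because sorted() is stable, images keep their original order within
    a domain.
    """
    counts = Counter(domain for domain, _ in results)
    return sorted(results, key=lambda item: (-counts[item[0]], item[0]))


if __name__ == "__main__":
    # Hypothetical crawl results, interleaved across three sites.
    results = [
        ("small.example", "s1"), ("big.example", "b1"), ("mid.example", "m1"),
        ("big.example", "b2"), ("big.example", "b3"), ("mid.example", "m2"),
        ("big.example", "b4"), ("big.example", "b5"),
    ]
    for domain, url in cluster_by_domain(results):
        print(domain, url)
```

    With an ordering like this, a gallery holding hundreds of matching images fills page after page before any smaller site gets a turn.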
  • Britney? (Score:2, Informative)

    by Anonymous Coward
    That just isn't the best example for research, since this term is very competitive, so the results are bound to be heavily manipulated by search engine optimizers. One cannot draw any meaningful conclusions from such "research".
  • Found 1,474 images...

    Including some I never saw before...

    And yes, a lot of the images link to 404 pages, but I've seen that on Google, too.

  • I don't think it does fuzzy searching as well.

    Looking through various search queries in Google and MSN I noticed that Google finds images in pages that barely make a mention of the keyword (and does it accurately). MSN, on the other hand, returns pages that have more references to the keyword I am searching for.

    I'm not sure why this is, I guess it's just the algorithm they use to index.

    I'm betting MSN will improve a bit, it takes a while to index the net to the level that Google did. It takes a long while.

    I'll be s
  • by Punchinello (303093) on Wednesday February 02, 2005 @10:41AM (#11550726)
    I searched for keywords britney spears and randomly checked few pages upto page number 20 and found that the 400 images were only from 3 domains :| 5in9.com, celebritypicturesarchive.com and nabou.com.

    Perhaps this guy didn't know the default setting on MSN search is to group results by domain. Maybe he should try again with this setting off if he wants to see more variety of domains providing Britney pics.

  • by maggard (5579) <michael@michaelmaggard.com> on Wednesday February 02, 2005 @10:48AM (#11550833) Homepage Journal
    What continues to surprise me is how image searches ignore the information embedded in images. EXIF & IPTC (& NewsML) all have fields for author, caption, longitude & latitude, keywords, etc. Yet none of the search engines appear to pay any attention to these.

    Many pictures include this sort of search-rich information, either from the camera or added manually, using cataloging software. Google's Picasa 2 [picasa.com] freeware (Windows only) embeds its keywords just so. Microsoft Research's excellent freeware (Windows only) World-Wide Media eXchange [slashdot.org] tools do the same for geo-coding photos. There are numerous other tools that can do the same, leading to a significant set of internally 'tagged' material.

    So, why aren't the search engines taking advantage of this? They're already loading the images and creating thumbnails, how much extra work is it to extract any additional information in the file and use that in its indexing too, especially compared to the potentially increased accuracy?
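
    As a sketch of how little extra work the indexing side might be: assume the caption and keyword fields have already been pulled out of each file into a plain dict (the extraction step itself, and the field names caption and keywords, are assumptions here, as are build_index and search). Feeding them into an inverted index is then a few lines.

```python
import re
from collections import defaultdict


def words(text):
    """Lowercase a string and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(images):
    """Map each word in an image's embedded caption/keywords to its URL.

    `images` is {url: metadata dict}; the metadata is assumed to have been
    read from the file's embedded fields by some earlier step.
    """
    index = defaultdict(set)
    for url, meta in images.items():
        text = " ".join([meta.get("caption", "")] + list(meta.get("keywords", [])))
        for word in words(text):
            index[word].add(url)
    return index


def search(index, query):
    """Return URLs whose embedded metadata contains every query word."""
    terms = words(query)
    if not terms:
        return set()
    return set.intersection(*(index.get(term, set()) for term in terms))


if __name__ == "__main__":
    # Hypothetical images with generic filenames but useful embedded tags.
    images = {
        "http://photos.example/dsc_0001.jpg": {"caption": "Sunset over the harbour",
                                                "keywords": ["boston", "harbour"]},
        "http://photos.example/dsc_0002.jpg": {"keywords": ["boston", "skyline"]},
    }
    print(search(build_index(images), "boston harbour"))
```

    Neither filename says anything useful, yet both images become findable from their embedded tags alone.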

  • Personally I'm just pleased I'm 1st for 'fuck microsoft' [msn.co.uk] :)
  • by ClickNMix (218488) on Wednesday February 02, 2005 @11:03AM (#11551057) Homepage
    This just looks like a bug, plain and simple. If you go to settings, there is an option to group images from the same site, checked by default, but unchecking it has no effect. So if one site, as in this case, has A LOT of images, it's going to be a long way before you get to the next site. Which you can, pretty easily.

    Everything about this article is just based on one dumb-luck search, and not a lot else it seems. Sure it's Microsoft, so it's easy to get all het up, whereas if Google made the same mistake, everyone would be much more likely to try to figure out what the real deal was.
  • by sameerdesai (654894) on Wednesday February 02, 2005 @11:08AM (#11551116)
    Reminds me of a joke.

    A scientist is conducting experiments on cockroach behavior. First day he cuts off one leg of a cockroach and shouts walk. The cockroach is able to walk with a limp. Second day he cuts off the second leg and shouts walk. The cockroach is still able to move around. Third day he cuts off its third leg and shouts walk. The cockroach tries hard to move and is able to do that. Fourth day he cuts off its last leg and shouts walk and obviously the cockroach is unable to move. The conclusion: When you cut off all four legs of a cockroach, the cockroach goes deaf!!!
