The Internet

Wikipedia's Search Engine Plan

Posted by CmdrTaco
from the just-because-you-can dept.
jasonoik writes "Wikia, the commercial company founded by Wikipedia's Jimmy Wales, reveals plans for a new, editable search engine. They say that the goal of the project is to get 5% of the search market. The service does not yet an official release date. The article also leaves open the possibility that the search results may contain ads, and concludes by listing figures of the web advertisement market." Update: 03/11 17:24 GMT by KD : Wikia and Wikipedia are separate companies.
This discussion has been archived. No new comments can be posted.

  • by Anonymous Coward on Sunday March 11, 2007 @10:22AM (#18307390)
    ...which sounded delicious.
  • by Moryath (553296) on Sunday March 11, 2007 @10:24AM (#18307396)
    "Do No Evil" became "Be as corrupt and evil as possible."

    An "editable search engine"? Great, now even MORE of the searches I run will pop up ads for v14GR4 and enhancements for body parts I don't possess, nevermind those linkspam sites that just insert the entire fucking dictionary in metacode.

    You searched for: Bill Gates
    you got: 400 pictures of penises, vaginas, and one picture of a penis covered in something that looks like it came out of the OTHER opening.
    • Re: (Score:3, Funny)

      by Anonymous Coward
      enhancements for body parts I don't possess

      ...you.. YOU'RE A GIRRRRRLLL!!!

      400 pictures of penises, vaginas, and one picture of a penis covered in something that looks like it came out of the OTHER opening.

      ...I love you!
    • by shudde (915065) on Sunday March 11, 2007 @10:46AM (#18307518)

      You searched for: Bill Gates you got: 400 pictures of penises, vaginas, and one picture of a penis covered in something that looks like it came out of the OTHER opening.

      The system works.

    • by vertinox (846076) on Sunday March 11, 2007 @12:53PM (#18308242)
      An "editable search engine"? Great, now even MORE of the searches I run will pop up ads for v14GR4 and enhancements for body parts I don't possess, nevermind those linkspam sites that just insert the entire fucking dictionary in metacode.

      True, but to be fair I wish you could have some sort of voting system based off unique IPs.

      Every time I do a search for something, chances are I'll come across a site or two that is listed that is totally crap, spam, or blatantly used some sort of method to get hits with the search.

      If I could only vote "This is spam!", "This is crap!", "This has nothing to do with the search query!", and "Ban this site from all search engines for all time!" then I think we would see relevant results more often than not.
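A minimal sketch of the one-vote-per-IP idea above, in Python. Everything here is an assumption for illustration: the `VoteTally` class, the label names, and the in-memory storage are hypothetical and do not come from any real search engine.

```python
from collections import defaultdict

# Hypothetical labels loosely based on the comment above; not a real API.
LABELS = {"spam", "crap", "off-topic"}


class VoteTally:
    """Counts votes against a result, at most one per (url, label, IP)."""

    def __init__(self):
        # Maps (url, label) to the set of IP addresses that voted that way.
        self._voters = defaultdict(set)

    def vote(self, url, label, ip):
        """Record a vote; return False if this IP already cast the same vote."""
        if label not in LABELS:
            raise ValueError(f"unknown label: {label}")
        voters = self._voters[(url, label)]
        if ip in voters:
            return False
        voters.add(ip)
        return True

    def count(self, url, label):
        """Number of distinct IPs that applied this label to this URL."""
        return len(self._voters.get((url, label), ()))


tally = VoteTally()
tally.vote("http://example.com/", "spam", "10.0.0.1")
tally.vote("http://example.com/", "spam", "10.0.0.1")  # duplicate, ignored
tally.vote("http://example.com/", "spam", "10.0.0.2")
print(tally.count("http://example.com/", "spam"))
```

Deduplicating by IP only stops one machine from voting twice; it does nothing against a large pool of distinct addresses.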
      • Re: (Score:1, Insightful)

        by Anonymous Coward
        I bet there are more zombies that can outclick you.
      • by jackv (1068006)
        Is Google intentionally allowing this to happen, or is it the sheer task of indexing billions of pages that creates this problem? If I search for something very specific, like a system error code, then usually the results are very good. If, on the other hand, I search for something that has many associations, like Bill Gates, then really you need a more controlled search. Which is, of course, where Yahoo originally was very good, with the controlled search
    • Re: (Score:1, Insightful)

      by UbuntuDupe (970646) *
      "Do No Evil" became "Be as corrupt and evil as possible."

      Actually, it became "Don't be evil, unless necessary for the greater advancement of the human race." Just a heads-up.
    • by Tubusy (806092)
      It's for that very reason that this may come under "it's a crazy idea, but it might just work". People are often outraged by the crap that comes up on the major search engines and contributing citizens would work within their field of interest to keep a good signal-to-noise ratio going. In that sense, it is just like the WP - sure it's open to abuses but if enough people get involved civilisation might just be born. Enough people will be determined by the need, and good will, of a critical mass of users; ag
  • by dreamchaser (49529) on Sunday March 11, 2007 @10:25AM (#18307400) Homepage Journal
    Just imagine what all those malcontents out there with too much time on their hands will do with this! It could be truly amusing.

    Not *everything* works best when edited by the hordes.
    • while my post just above got modded "troll"?

      Someone gave a wikinazi mod points. :(
    • by SolitaryMan (538416) on Sunday March 11, 2007 @12:14PM (#18308018) Homepage Journal

      Just imagine what all those malcontents out there with too much time on their hands will do with this! It could be truly amusing. Not *everything* works best when edited by the hordes.
      This is *exactly* what was said about Wikipedia at first. With things like this, you have to *try* to know for sure, so while this idea *may* not work, it is definitely worth trying.
      • by osu-neko (2604)

        This is *exactly* what was said about Wikipedia at first. With things like this, you have to *try* to know for sure, so while this idea *may* not work, it is definitely worth trying.

        Are you suggesting we use an empirical methodology to discover the worth of an idea rather than just talking out our asses about why it certainly will or won't work without having tried it? You do know this is Slashdot, right? To any possible question, the answer is immediately and painfully obvious to anyone with half a b

      • by McFadden (809368)
        It's a nice idea, but before they try building an entirely new search engine, why don't they fix the one on the wikipedia site. It's absolutely fucking useless, and incapable of even the most simplistic fuzzy search. Spell the name of a person wrong (which is entirely possible if it's an obscure or foreign name) by as little as one letter, and you're likely to get zero matches.

        Quick example: The president of South Africa is called Thabo Mbeki. He's the president of a country, so he'd rank as someone y
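For what it's worth, a one-letter typo like the kind described above is easy to tolerate with approximate string matching. This is a rough sketch using Python's standard `difflib`, not how MediaWiki search actually works; the `suggest_titles` helper, the title list, and the misspelling "Thabo Mbeke" are all made up for illustration.

```python
import difflib


def suggest_titles(query, titles, limit=3, cutoff=0.8):
    """Return article titles that approximately match the query.

    Matching is case-insensitive; `cutoff` is difflib's similarity
    threshold (0.0 to 1.0), chosen here arbitrarily.
    """
    lowered = {title.lower(): title for title in titles}
    hits = difflib.get_close_matches(
        query.lower(), list(lowered), n=limit, cutoff=cutoff
    )
    return [lowered[hit] for hit in hits]


# Hypothetical index of article titles.
TITLES = ["Thabo Mbeki", "Nelson Mandela", "South Africa"]

# A query that is wrong by one letter still finds the article.
print(suggest_titles("Thabo Mbeke", TITLES))
```

Comparing the query against every title like this would not scale to millions of articles; a real engine would need an index built for approximate lookup.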
  • Fucking inaccurate (Score:5, Informative)

    by Ignorant Aardvark (632408) <cydeweys AT gmail DOT com> on Sunday March 11, 2007 @10:30AM (#18307438) Homepage Journal
    Wikia is not the "company" behind Wikipedia. The Wikimedia Foundation, which is a non-profit foundation, is what's behind Wikipedia. Wikia is a totally separate for-profit company that is run by Jimbo Wales.
    • by tomstdenis (446163) <tomstdenis.gmail@com> on Sunday March 11, 2007 @10:34AM (#18307460) Homepage
      Bah, you and your facts. Obviously you're not a Wikipedia editor. Feel the wikiality flow through you.

      Tom
      • Bah, you and your facts. Obviously you're not a Wikipedia editor.

        Damn right I'm not a Wikipedia editor. I'm a Wikipedia administrator. And it pisses me off to no end when we get lumped together with Wikia, which we really and truly have absolutely nothing to do with other than sharing the same wiki software (of course, there's thousands of other sites out there that also use MediaWiki).
        • Re: (Score:3, Funny)

          by tomstdenis (446163)
          Maybe there should be a WP article about this? :-)

          I can feel my Karma burning ...
          • by solevita (967690) on Sunday March 11, 2007 @12:00PM (#18307918)
            Don't worry, I'm sure there's some 24 year old somewhere who's been lecturing at top universities on the subject of something-or-other for about 20 years now, they're so enraged by this whole incident (hence the foul language) that they're typing a page up right now. Then they'll make themselves a nice little badge for their user page that reads something like "Justice Squad: Defender of Wikipedia", talk on MSN for a bit and wonder what it would be like to talk to a real girl.

            At least I think that's how Wikipedia works.
        • by Joebert (946227) on Sunday March 11, 2007 @12:03PM (#18307950) Homepage

          I'm a Wikipedia administrator.

          Right, I suppose next you're going to tell us about your PhD & other certifications right ?
        • And it pisses me off to no end when we get lumped together with Wikia, which we really and truly have absolutely nothing to do with other than sharing the same wiki software
          And the same founder?
        • by Eivind (15695)
          Not thousands. Hundreds of thousands.
    • Re: (Score:2, Funny)

      by matt me (850665)
      >Fucking inaccurate
      This is Slashdot, you insensitive clod!
    • Wikia is not the "company" behind Wikipedia.

      As if the editors need to be told this by an Ignorant Aardvark [slashdot.org]!
    • by suv4x4 (956391) on Sunday March 11, 2007 @11:08AM (#18307626)
      "Wikia is not the "company" behind Wikipedia. The Wikimedia Foundation, which is a non-profit foundation, is what's behind Wikipedia. Wikia is a totally separate for-profit company that is run by Jimbo Wales."

      Your requirements for a news service are too stringent: they at least got the names kinda matching kinda nice. Plus maybe they meant behind Wikipedia in a more physical and sarcastic manner.

      Plus, they seem to be in the middle of some sort of reorganization there, every article is from a new, different department. It must be hell to do this AND still run the site without interruption.

      I want to applaud the Slashdot team for their professionalism: guys, we're behind you.
    • Or whatever other term you want to use for bullshit, misleading, false claims of disparity between a corporation and a corporate shell game.
    • by owlnation (858981)

      Wikia is not the "company" behind Wikipedia. The Wikimedia Foundation, which is a non-profit foundation, is what's behind Wikipedia. Wikia is a totally separate for-profit company that is run by Jimbo Wales.

      No. Not really. It's only separate for admin and accounting purposes. Ultimately Jimbo Wales is the driving ego behind both of these. I know many have claimed here that Jimbo is more distanced from Wikipedia than the media reports - this is however, clearly untrue.

      Jimbo is as much hands on in both o

      • by Eloquence (144160) on Sunday March 11, 2007 @01:19PM (#18308404) Homepage

        As an elected Board member of the Wikimedia Foundation, I can assure you that your opinion is incorrect. The Board of Directors of the Wikimedia Foundation has 7 members, of which Jimmy is one. He is the Chair Emeritus, which is a title we have given him to recognize his historic role, but which does not have any legal powers or responsibilities associated with it. The Chair of the Foundation is a nice French woman named Florence Devouard; I am the Executive Secretary.

        As a tax-exempt 501(c)(3) non-profit organization, the Wikimedia Foundation also maintains a strict conflict of interest policy. So, Jimmy is not permitted to make any propositions which would advance the corporate interests of Wikia, and has indeed been completely excluded from discussions where his involvement in Wikia was relevant. (This, of course, also goes for any other corporate interests Board members may have.) In this way, Jimmy actually has less influence to promote Wikia as a Board member than he would have as a mere community member.

        Jimmy retains some community influence specifically in the English Wikipedia, but that influence is not legally anchored. He speaks frequently to the English language press, though Florence has also done a lot of interviews lately. People seem to construct from this all kinds of bizarre conspiracy theories which have no basis in reality. This is a shame, because the WMF is truly committed to making the world a better place, and needs all the support it can get.

      • Yes, except the community of News Limited editors don't frequently tell Rupert Murdoch to get knotted ...

        Wikimedia is not top-down at all. It's nonprofit politics all the way through. (Anyone in academia or the nonprofit sector should be recoiling in horror right now.) I'm occasionally amazed that somehow enough of the politics has been gotten past to get a useful web encyclopedia actually written.

        • Re: (Score:1, Insightful)

          by Anonymous Coward
          The encyclopedia is and always has been written by people who are either unaware of, or do not care about, the politics. Surely, as such an experienced contributor, you must be aware of that by now.
    • Yes. It's going to be interesting to see if Wales reports this conflict of interest. It should be reported on IRS Form 990 [irs.gov], under "Relationship to Other Organizations". That's where, if you're involved with both a for-profit and a non-profit in the same area, you have to report it.

      Form 990 is a public record. GuideStar [guidestar.org] has them all on line, although you have to register there.

  • All those bloggers-for-hire that are starting to find themselves unemployed suddenly have new embeded job opportunities.
  • ... is looking for missing 'have' words ...
  • wikiality (Score:4, Funny)

    by User 956 (568564) on Sunday March 11, 2007 @10:45AM (#18307510) Homepage
    Wikia, the company behind wikipedia reveal plans for a new, editable search engine. They say that the goal of the project is to get 5% of the search market.

    According to Wikipedia, that goal of 5% will triple in the next six months.
    • just an update, I've changed the entry in Wikipedia so that it's now expected to grow at 150% in the next six months!
    • According to Wikipedia, that goal of 5% will triple in the next six months.

      FYI, that's a Colbert reference. He tried to have mentions of the white elephant population tripling in 6 months added randomly to WP.
  • by Weezul (52464) on Sunday March 11, 2007 @10:46AM (#18307512)
    Maybe they're first project should be: make wikipedia's internal search work correctly! It can't even handle the most basic miss-spellings now.

    If your serious about this, don't compete with google, instead partner with google and make a wiki.google.com provide google's own search results & ads, but filtered and processed in various ways, which are handled by the wiki.

    For example, you want to give only unique sites/hits but this may depend upon the host's url.
    • by Rosco P. Coltrane (209368) on Sunday March 11, 2007 @10:52AM (#18307540)
      Maybe they're first project should be: make wikipedia's internal search work correctly! It can't even handle the most basic miss-spellings now.

      You know, I've never had problems with the wikipedia search engine. More often than not, I enter something I'm looking for and it finds the correct article 95% of the time, with the spelling corrected and the missing words inserted. Of course, I have a vague idea of how what I'm looking for is spelled in the first place, perhaps I'm helping the search engine, but really so far I'm really not disappointed with it.

      At any rate, flip through a real paper encylopedia and you'll find the "search engine" (the thesaurus) to be a real pain compared to anything Wikipedia can come up with, therefore I guess for an encyclopedia, I'm happy enough with it.
      • by suv4x4 (956391) on Sunday March 11, 2007 @11:15AM (#18307660)
        More often than not, I enter something I'm looking for and it finds the correct article 95% of the time, with the spelling corrected and the missing words inserted. Of course, I have a vague idea of how what I'm looking for is spelled in the first place, perhaps I'm helping the search engine, but really so far I'm really not disappointed with it.

        Everybody can do a search engine that works with the occasional typo. Real search engines know what I mean when I'm not even close [google.com].
        • Since porn etc searches make up a considerable % of Google's searches it probably makes Google the largest porn portal site by far. "Feeling Lucky isn't called that for nothing!

          Editable searching could be quite useful. From the search criteria you can guess the type of porn the person wants and direct them accordingly. After all they might type in "lawn mower" but you really know that deep down they want some shaved chick porn.

          • by suv4x4 (956391)
            Since porn etc searches make up a considerable % of Google's searches it probably makes Google the largest porn portal site by far. "Feeling Lucky isn't called that for nothing!

            Actually I remember google starting as the search engine of choice for people looking up code samples/tutorials and warez.

            I guess porn was in this number too.

            It's indicative of how a product becomes popular, by picking on the lowest possible common denominators and growing from there. I guess warez and porn are those denominators.
          • by Weezul (52464)
            You are a genius! But why editable? Why can't porn.google.com just turn all search results into porn?

            In fact, wikiporn [wikiporn.org] could make finding esoteric porn much easier.

            See also: http://www.urbandictionary.com/define.php?term=wikiporn [urbandictionary.com]

      • by zCyl (14362)

        You know, I've never had problems with the wikipedia search engine. More often than not, I enter something I'm looking for and it finds the correct article 95% of the time, with the spelling corrected and the missing words inserted.

        If you compare the success rate of wikipedia's search engine to that of using google with "searchterm site:wikipedia.org", you'll find the google one far more successful. It corrects spelling, prioritizes articles by significance, and usually does a much better job of listing th

        • You actually hit upon one of my uses for Google:

          Being a quick spell checker.

          If there is a word that I am writing and I don't want to bother with trying to look it up in a dictionary or can't think of the proper spelling, I'll punch it into google and ignore the search items themselves, other than to see how many other people suck at spelling as bad as I do and even published content with the misspelling.

          What is surprising is how many times even deliberate misspellings still turn up content on the Google sea
    • by shudde (915065)

      The parent has a good point..

      I'm currently messing around with turning Mediawiki into a basic CMS. Search has been a lot more effective at returning usable results since I changed over to Omega [xapian.org].

      • Re: (Score:2, Informative)

        by David Gerard (12369)
        There's a lotta extensions to help with CMS-like stuff - have a look around http://mediawiki.org/ [mediawiki.org] and ask on mediawiki-l and http://mwusers.com/ [mwusers.com] .

        Extensions are good because you can track the main releases and help make existing extensions to this end more robust, which is the secret open sauce.

        • by shudde (915065)

          Yeah I'm doing it for fun though (well my version of fun)... so I prefer just hacking away at mediawiki itself.

        • Do join the mailing list, though, and see if you can't get fixes into the main tree ... Mediawiki is desperately short of developers.

          And in Wikipedia ... the devs are the ones with the real power.

    • Re: (Score:3, Informative)

      by binaryspiral (784263)
      The "search" function you use on their website is a known weakness because it relies on MySQL to perform the actual search. They didn't spend a lot of time developing it into something more useful than a basic word finder.

      Even Wikipedia [wikipedia.org] recommends using an external search provider for speed and customization of search topics.
      • Actually, no - the default MediaWiki search is the crappy MySQL text search. The Wikimedia projects actually use another text search, written in Mono and based on Lucene. Which sucks ass a little less. Marginally.
    • But that would just make too much sense
    • I agree completely - the default wiki search needs major, major work. If they get this search software working and add it to Mediawiki, it'll be a major improvement. As a standalone search engine, however, I don't see the point.

      What's the advantage of having user-editable search results? Anyone can submit sites to Google already. I don't know the exact statistics, but I'd imagine that most sites that aren't complete trash end up getting accepted - my site is a jumble of code I put together to learn PHP a

      • by Dogtanian (588974)

        In a search engine, though, how can anyone say whether Mr. Bennet from Heroes or Mr. Bennet from Pride and Prejudice is more important?

        Your search returned results for two different subjects:-

        "Mr. Bennet" from "Heroes" (Click link for all results on this subject)
        (Top 5 results follow)
        [blah]
        "Mr. Bennet" from "Pride and Prejudice" (Click link for all results on this subject)
        (Top 5 results follow)

        Complete results list follows:-
        [blah]
        Displaying the top 10 results from each category:-
        [etc]

    • I search Wikipedia with Google...
    • It can't even handle the most basic miss-spellings now.

      It seems to work just fine for me... [wikipedia.org]
    • I searched for "white hous" -- this is what I got back:

      #1 Maui Interscholastic League
      #2 The Hospital (TV series)
      #3 Edith Matilda Thomas
      #4 Song Xian

      wtf is this shit?!
    • don't blame wikipedia for your inadequacies with the english language. first learn the difference between they're their and there, then your and you're and maybe also realize that 'misspelling' is a word.

      to stay on topic, i never really need to use wikipedia's search function. google typically lists the wikipedia page on my search topic as one of the first links in the results. plus i have my obligatory google search out of the way to continue my research post-wiki browsing.
  • Well, you'd better hope no one tries to search for a webcomic on this thing.
    • by Dogtanian (588974)

      Well, you'd better hope no one tries to search for a webcomic on this thing.

      No, on the contrary. A fanboy of said obscure webcomic will try to include it and make it a prominent result on all related searched, even if 99.999% of people aren't likely to be searching for it. For example:-

      You searched for "BBC". Results in order of importance follow:-

      #1 RESULT:- "Brian Robert Coleman", usually known as Bri, initials BRC, but in volume 3, episode 24, his friend once called him "BBC" by mistake because someone told him Brian's middle name was "Bob".
      #2 RESULT:- "Bob Brown cafe", a

      • by Dogtanian (588974)
        (#1 was supposed to read something like...

        "Brian Robert Coleman", a character in the video gaming comic strip "Furry vs. Obscura", usually known as Bri....)
  • With advertising funding all but inevitable, there's a chance the search results will be biased towards links to companies that pay more. Yes, I know Google works like this with its officially sanctioned adverts at the top & side of the search results, but what's to stop companies editing the main results to bias them in their own favour?
  • Disambiguation (Score:5, Insightful)

    by Sukhbir (961063) on Sunday March 11, 2007 @11:34AM (#18307768)
    The thing that really rocks about Wikipedia's search is the Disambiguation function. Even Google does not have something like this.
    • Re: (Score:3, Insightful)

      That has nothing to do with Wikipedia's search functionality. People are required to manually build a disambiguation page, collate entries, and redirect others.
    • by zobier (585066)

      The thing that really rocks about Wikipedia's search is the Disambiguation function. Even Google does not have something like this.
      Yeah actually they do [google.com.au] (that was the quickest example I could think of, not the best).
  • google --> site:wikipedia.org [put your search term]
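A small sketch of building that kind of site-restricted query programmatically. The `site_search_url` helper is hypothetical; the only assumptions are Google's usual `/search?q=` URL form and its `site:` operator.

```python
from urllib.parse import urlencode


def site_search_url(term, site="wikipedia.org"):
    """Build a Google search URL restricted to one site via the site: operator."""
    query = f"site:{site} {term}"
    return "https://www.google.com/search?" + urlencode({"q": query})


print(site_search_url("Thabo Mbeki"))
```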
  • ... but as it ages it becomes more difficult to so quickly find what you are searching for.

    there is an upside and down side to what is proposed.

    The upside is that you might get better results, the downside is that you might not get any result as to what you are searching for, unless.....

    It really all depends on how the programmers and users map all the possible findings.
    I'd imagine that some sort of thesaurus like plan of classification and tabular synopsis of categories could allow all to be found by providing
  • I'm hopeful (Score:2, Insightful)

    by sm62704 (957197)
    Google has become less and less relevant. Way too often I google for a search item, and that item isn't anywhere in the results page at all.

    So... this ain't my day. I tried to find a very good example of this, so I put, in quotes, the name of what I thought was a little known group even when they were still together 35 years ago and googled ["joe byrd and the field hippies" lyrics].

    Damn, Google must have fixed it. The last time I googled for that I got tons of lyrics sites, none of which had Joe Byrd. This
  • by ckedge (192996) on Sunday March 11, 2007 @12:13PM (#18308002) Journal

    They might not realize it, but they already have 50 percent of the search market. At least 50 percent of the "Intelligentsia" search market.

    Fifty percent of the stuff I used to "look up" through a google search - I now get through wikipedia. You just have to be smart enough to know that the info you are looking for is most likely in wikipedia. And it most often is. Especially since wikipedia is so open - they've got articles for tons and tons of things that no mainstream encyclopedia would ever touch. I no longer use "fan sites" or "episode guide companies" for the episode guides of TV Series, they're all in wikipedia, and the layout and presentation is even better.
    • Precisely (Score:3, Insightful)

      by metamatic (202216)
      Wikipedia is already a search engine, because the no original content rule means that it doesn't contain anything that isn't summarized from somewhere else, usually somewhere on the web.
    • I can see it now... Google acquires Wikipedia, news @ 11

      I don't know about 50%, but with me they've easily attained 5-10% of my searches.

      Adeptus

    • You make a good point, but this is mostly true only in english.
      In other languages you get much less from the wikipedia.
    • by PhotoGuy (189467)
      I find WikiPedia's search not as good as google (fixing typos and such), so I tend to do most of my searches with google, adding "wiki" as a keyword, and the relevant wiki articles typically show up as the first matches. Works well.
  • by brion (1316) on Sunday March 11, 2007 @12:32PM (#18308122) Homepage
    ...at least it would get corrected. ;)
  • by TheCreeep (794716) on Sunday March 11, 2007 @12:41PM (#18308178)

    The service does not yet an official release date.
    So when will it do an official release date?
  • Damn... (Score:2, Interesting)

    by dohzer (867770)
    ... I thought they were going to fix Wikipedia's search function. :(
  • by cyberianpan (975767) on Sunday March 11, 2007 @04:03PM (#18309570)
    Problem is this will require a small band of fanatics to do the editing. Now for the "central/core-cultural" stuff that you might expect in an encyclopedia this model may work but web searches are probably more long tail/niche. Not sure that the editing group could ever be representative. Furthermore the risk of bias on small sample size gets even larger. Some of the bias mightn't even be conscious: e.g. exhibiting a preference for a rigorous page over a "dummies guide" (which might be more popular/widely useful).

    Much better would be a behaviour-based search engine that inferred when users were happy or unhappy with results, e.g. the user doesn't come back for more searches or click more links on the existing results. Also, if a user does a "poor" search first & then uses "clearer" terms, the engine ought in future to suggest the "clearer" terms as an alternative search, or even return some of those results. Even better, the engine might "cluster" you with other similar users & return more relevant results (e.g. effectively inferring that you prefer rigorous complete guides rather than dummies' intros).

    This would be simpler & actually rely on the wisdom of masses rather than some central command editors, in fact this type of thinking was behind PageRank.
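A toy sketch of the behaviour-based idea above, in Python. It assumes the engine can somehow observe whether a shown result satisfied the user (for example, no follow-up search); how that signal is inferred is left out, and the `ClickFeedbackRanker` class and its smoothing are purely illustrative.

```python
from collections import defaultdict


class ClickFeedbackRanker:
    """Re-ranks results for a query by observed user satisfaction."""

    def __init__(self):
        self._shown = defaultdict(int)      # (query, url) -> times shown
        self._satisfied = defaultdict(int)  # (query, url) -> times it satisfied

    def record(self, query, url, satisfied):
        """Log one impression and whether the user seemed happy with it."""
        key = (query, url)
        self._shown[key] += 1
        if satisfied:
            self._satisfied[key] += 1

    def score(self, query, url):
        """Smoothed satisfaction rate; results with no data score 0.5."""
        key = (query, url)
        return (self._satisfied.get(key, 0) + 1) / (self._shown.get(key, 0) + 2)

    def rerank(self, query, urls):
        """Order urls from highest to lowest score, keeping ties in input order."""
        return sorted(urls, key=lambda u: self.score(query, u), reverse=True)


ranker = ClickFeedbackRanker()
ranker.record("python guide", "dummies-intro", satisfied=False)
ranker.record("python guide", "rigorous-guide", satisfied=True)
print(ranker.rerank("python guide", ["dummies-intro", "rigorous-guide", "unseen-page"]))
```

The add-one smoothing keeps a single unlucky impression from burying a page. Clustering similar users, as suggested above, could be approximated by keying the counters on a user group as well as the query.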

  • . . . a much more promising commercial spinoff of Wikipedia which I profiled in a recent blog post [xodp.org].
