Searching For Google's Successor 282
weink writes "A new generation of scrappy search engines is emerging to challenge the dominance of mighty Google. An article at Wired News lists up-and-coming search engines: WiseNut, Teoma, Lasoo, CURE, and Vivisimo. Take a look, and give them a try. But I still say that nothing is better than the almighty Google."
AllTheWeb (Score:1)
digitoday.no [digitoday.no] (Norwegian only) reported today on further enhancements of the AllTheWeb search engine. I have done my best to translate some of the article into English:
- Fast will soon change to a new and improved crawler which will find three times as many web pages. That way, Fast will soon cover the whole Internet.
Fast estimates that the web today consists of billions of pages, but by removing duplicates and "garbage" the number will decrease dramatically. They estimate that their search engine will cover 1.8 billion web pages before Christmas.
One of the biggest improvements is the ability to index dynamic pages. Dynamic pages are web pages you can only access by pushing a button, choosing something from a menu, or filling out information in a form.
The whole article (in Norwegian) can be read here [digitoday.no]. I'm not a translator, and my English is pretty bad, so you are warned.
Google has already proven itself (Score:5, Insightful)
The thing that has most impressed me about Google isn't its technology, but the restraint and good sense they've shown in the Internet community. While every other search engine has tried a go at the portal route, Google has focused on simply being a search engine. They've continued to add features that improve the user's experience at the same time other engines sell their results to the highest bidder.
Some of the most annoying companies in existence came about because they pulled a massive bait and switch: they adopted a consumer-friendly strategy for the short term but changed once they got big enough to destroy the competition. Google has done remarkably little of this despite its impressive potential marketing position. Companies like this are where our business should go; it is our power as consumers to make decisions like this.
My point is that if/when something better than Google comes along, you should think twice before changing your homepage. When choosing a company, it's not just who provides the best product in the short term; you have to take the long term into account as well.
Re:Google has already proven itself (Score:2)
Meanwhile Google's pages are no more cluttered than absolutely necessary -- even the source is plain and simple.
Yahoo sorta follows this philosophy, but not strongly enough.
Re:Google has already proven itself (Score:3, Insightful)
Fact is, even the company itself may not know; sometimes the switch comes because the company didn't make it trying to execute Plan A, and Plan B is to sell out to advertisers or some acquisition. Anyone have either logic or heuristics to offer as far as trying to navigate these situations?
You are exactly right. There is no way to know what a company is going to do in the future, the best you can do is try to predict their future actions by looking at their past actions. In my mind Google has maintained an impressive consumer focus and generally done right by their users, this gives them positive points. A new company has no past to judge by, so I can't judge them positively or negatively. But the important thing to note here is that I'm judging both companies on their actions, not what they say. So in the end it doesn't matter what the new company says is in their future, Google still wins my business because they've proven themselves to me.
I brought up this whole line of reasoning because it seems like a lot of Slashdot readers are very anxious to "get ahead" or at least get in on the ground floor of the next big thing. We're riding the technology wave here, we don't want to stick with what's proven, we want to move ahead. It would be a shame to lose a good product just because we were taken in by some marketing hype. It happens enough in the real world, let's hope it doesn't happen here.
wisenut? no thanks (Score:2, Interesting)
Candidate Roundup (Score:5, Informative)
Teoma - needs to crawl a lot more before it becomes a viable alternative. Obviously it can find the easy stuff, but most people (I hope) don't use search engines to find the easy stuff. Results are easy to read, and categories meaningful and well placed. Phrase match is kinda cool, because you get to put back in your common words that Google disallows ("and", "the", etc).
Lasoo - lousy spelling looks terrible, even if it was intentional. Aside from that, what makes this different to Mapquest.com plus a Yellow Pages? I know which I'd rather use.
CURE - this search engine has reached its user limit so I'm not allowed to search. Boy, is that going to be popular
Vivisimo - is a metasearch engine, whatever the FAQ begs you to believe. If you like em, then sure, but speaking personally, they are of no particular use to me.
Google still rocks my world, with caching, fast fast oh so fast searching, and relevance that beats the crap out of everything ever. Rock on.
Heh. Google is still tops in my book. (Score:2)
Lasoo doesn't load
Vivisimo plain sucks. Nasty interface. Long load times.
Wisenut isn't bad, but it certainly isn't good.
Teoma has promise, but the searches tend to take a long time on arcane subjects. No easily accessible advanced search functions.
I won't even begin going into CURE. How dare they slander the 80s dark pop/goth/electronic group with an interface that cheesy. Nix the graphics and bring up the friggin' search box without the glitz.
Thanks, but no thanks, guys.
All I want from a search engine... (Score:3, Interesting)
Re:All I want from a search engine... (Score:3, Interesting)
Why does Google not include certain pages? (Score:2)
I also noticed that one of my pages didn't make it into Google and I'd really like to know why. It's linked from the top page and there is nothing different from the other pages. I linked to a PDF file on that page (also on my site) which also didn't get included. Unfortunately I don't have access_logs, so I cannot tell for sure whether the page got spidered at all. I'd really like to know what I'm doing wrong.
Better hardware than Google (Score:2, Interesting)
Why? Well. They have developed special hardware to do their search. And it's damn fast (that's where they got the name, I guess). However, the software running on their hardware isn't as good as Google, and I really wonder why...
My conclusion: The software Google is using should have run on AllTheWeb's hardware. That would have been one hell of a search engine.
No I don't like it, either...
Not PC (Score:2, Funny)
Researcher's Search Engine Other than Google (Score:3, Informative)
is Citeseer [nec.com]. It's popular among researchers since you can directly peek into papers...
Google also celebrates holidays! (Score:2)
Cache, Dmoz directory, PDF, Deja/usenet... (Score:5, Insightful)
Site slashdotted? Hit the cache
Want to see a dmoz.org directory? See it page ranked.
Doing science research? Find the answers in indexed PDF files.
And the list goes on...
Not to mention they do the right thing advertising-wise and run on Linux. Bring on the upstarts, but they'd better be prepared to do a good bit of work to knock down Google.
Re:Cache, Dmoz directory, PDF, Deja/usenet... (Score:2)
Re:Cache, Dmoz directory, PDF, Deja/usenet... (Score:2)
Re:Cache, Dmoz directory, PDF, Deja/usenet... (Score:3, Insightful)
As Google is actually making money off their operation, they are more likely to keep constantly improving their technology. I believe that this will help keep them in the forefront of the plethora of search engines out there.
Re:Cache, Dmoz directory, PDF, Deja/usenet... (Score:2)
who would have thought?
Re:Google insufficient (Score:2)
Google was most exciting to me... (Score:4, Insightful)
I do hope these other engines (many of which I've tried, and they ain't bad) offer up some competition, because a monopoly is bad even when the monopoly provider is so good. But in the meantime it's great to finally see a product succeeding so well based entirely upon its merit.
I found another feature of google's yesterday (Score:3, Informative)
Yesterday, I found a new feature that I enjoy. Try typing 'link:' followed by a site's address into the Google search. It tells you all the pages that link to that site.
I know if you own the site, you can check it out with an HTTP_REFERER, but that isn't always the case.
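For the own-your-site case, here is a rough sketch of tallying referers from a web server access log. It assumes the log is in the common "combined" format (request, status, bytes, referer, user agent); the sample lines and the example.org addresses are made up for illustration.

```python
import re
from collections import Counter

# Tail of a combined-format log line:
#   "request" status bytes "referer" "user-agent"
LOG_TAIL = re.compile(r'"[^"]*" (\d{3}) (\S+) "([^"]*)" "([^"]*)"\s*$')

def count_referers(lines):
    """Count how often each referring URL shows up in the log lines."""
    counts = Counter()
    for line in lines:
        match = LOG_TAIL.search(line)
        if not match:
            continue  # not a combined-format line, skip it
        referer = match.group(3)
        if referer and referer != "-":  # "-" means no referer was sent
            counts[referer] += 1
    return counts

# Hypothetical sample data, not taken from any real site.
sample = [
    '10.0.0.1 - - [01/Sep/2001:10:00:00 +0000] "GET /page.html HTTP/1.0" 200 1234 "http://www.example.org/links.html" "Mozilla/4.0"',
    '10.0.0.2 - - [01/Sep/2001:10:01:00 +0000] "GET /page.html HTTP/1.0" 200 1234 "-" "Mozilla/4.0"',
    '10.0.0.3 - - [01/Sep/2001:10:02:00 +0000] "GET /page.html HTTP/1.0" 200 1234 "http://www.example.org/links.html" "Lynx/2.8"',
]
referers = count_referers(sample)
for url, hits in referers.most_common():
    print(hits, url)
```

This only sees pages that visitors actually clicked through from, which is why the 'link:' search is handy for everything else.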
Easiest way to find the best search engine (Score:3, Funny)
All The Web? (Score:2)
I do not know how it stacks up to google but I know that it is pretty darn fast.
What did we learn... (Score:2)
Re:What did we learn... (Score:3, Informative)
Solving the keyword problem (Score:2)
Even with Google, I find that my keywords don't always match what the indexed sites use. Often it takes three or four tries to get the right keywords that will get me useful information.
Teoma sounds promising, since getting one site in a topic group can get you more in that topic group.
But are the search engines independent? (Score:4, Troll)
Since the search engines are becoming our pointers to information, they do have a lot of control over what information we see. It doesn't matter that some web server in Malaysia has a web page describing the complete meaning of life, the universe and everything, if it's not in the search engines.
If all search engines are controlled by the same government (and yes oh yes, they are controlled) the web suddenly becomes biased.
Try searching for "marlboro" on google. What would you expect ? The marlboro home-page ? Oh, no; we have the Marlboro College, poems, but no tobacco company home page. Coincidence? Well, a search for IRIX gives me the SGI home page, so I think the search engine works as designed - what do you think?
Re:But are the search engines independent? (Score:2)
Maybe you should have picked a different cigarette, say "Newport," "Salem," or "Raleigh?"
Re:But are the search engines independent? (Score:2)
Considering that (a) Marlboro is not a tobacco company but a brand of cigarettes, (b) they do not appear to have an official website, NO I am not surprised by the Google results.
Try searching for Marlboro at other search engines like Altavista and Yahoo, and you will get similar results.
Now try searching for "Philip Morris," the maker of Marlboro cigarettes, and you will find they are the very first link -- just as expected.
Conspiracy theories... how quaint.
Re:But are the search engines independent? (Score:2)
IRIX is a product of SGI. Entering IRIX gives me SGI. Entering Marlboro does not give me Philip Morris.
Now try searching for "Philip Morris," the maker of Marlboro cigarettes, and you will find they are the very first link -- just as expected.
If they didn't do this - the plot would be too obvious.
Conspiracy theories... how quaint.
Thank you 8)
Re:But are the search engines independent? (Score:2)
Contrary to popular belief, Google cannot predict what you expect to appear at the top. It can only present the most highly rated content. And how it decides that is well known -- it gives more weight to URLs that are linked from more places. And the more popular sites carry more weight in their linkage (presumably).
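For anyone who wants to see the "more links, more weight, and popular linkers count for more" idea in action, here is a toy sketch. It is a simplified version of the published PageRank idea, not Google's actual ranking code; the damping factor of 0.85, the iteration count, and the four-page graph are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Toy link-based ranking.

    `links` maps each page to the list of pages it links to.
    Returns a dict of page -> score; the scores sum to 1.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a small baseline score...
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # ...plus an equal share of the score of each page linking to it.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank

# Hypothetical mini-web: three pages link to "hub", and "hub" links to two of them.
graph = {
    "hub": ["a", "b"],
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub"],
}
ranks = pagerank(graph)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

In this toy graph "hub" comes out on top because everything links to it, and "a" and "b" outrank "c" only because the highly ranked "hub" links to them, which is the "popular sites carry more weight" part.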
Back to the point - cigarettes be damned ;) (Score:2)
So maybe I don't have hard evidence that google is indeed biased already.
But my initial point stands - are the search engines independent? It's pretty much indisputable (hmm.. indisputable on
Centralized control over information (or, pointers to information in this case) is a potential problem.
Am I wrong ?
So, how do we deal with this ? As a regular joe-user there's pretty darn little one can do to prevent this centralization from happening - or ?
Re:Back to the point - cigarettes be damned ;) (Score:2)
Uhhh... WHICH search engines? There are many. Independent from WHAT? The government? Uhhh, yeah I would say there's a pretty good chance that the American search engines are not in cahoots with the government. Call it a hunch.
it could indeed be a problem to the credibility of the web if say 99% of the information being returned by search engines is returned from engines controlled by one government.
First, the "web" has no credibility, it is not a person or even a single entity like a company.
Second, there would only be a problem if the dominant search engines were in countries without free speech rights. I'll go out on a limb and say the U.S. has one of the better standards of free speech in the world. The dominant search engines like Yahoo, Google, Altavista, etc. are all in the U.S. I don't see any problem with "credibility."
Centralized control over information (or, pointers to information in this case) is a potential problem.
Please explain how there is any centralized control over the search engines? They are all separate entities.
Am I wrong ?
About what? I can't figure out your argument.
So, how do we deal with this ? As a regular joe-user there's pretty darn little one can do to prevent this centralization from happening - or ?
It's pretty simple. The internet is enjoying a free-market economy. You use the search engines that give the best results. The search engine with most users wins. The search engine that returns illegitimate results, if there was such a search engine, would not be popular.
These things can work themselves out in a free market.
They miss the whole point. (Score:4, Interesting)
Should we stop trying? No, the need for relevant results hasn't been fulfilled, except in the most minimal ways. But we need to look for new answers. I think that to take this any further, it will mean going client-side. To make results more relevant requires too much cpu power, to aggregate it at the engine website. A client side agent, using google as a starting point, and sifting through the results, spidering through them, makes sense. Don't start whining about traffic increase, the same thing happens now, only it's the person himself doing the spidering.
Also, the entire keyword paradigm is at odds with all but the most simplistic searches. Sometimes I'm looking for a diagram, or I'm looking to buy a hard-to-find part. Some engines, like Lycos, allow you to search for audio or stills, but it borders on lameness. This needs to be expanded. You need to be able to tell the engine, "hey I'm just looking for general info" or "hey I want to buy something with these parameters". For instance, the diagrams I look for can be either gif/jpeg or ascii art. A decent engine/agent should have no trouble returning results that reflect these requirements. Same with the "buying" type of search: the electronic parts I'm looking for are not common items, and adding a keyword of "shopping cart" doesn't always cut it. As I see it, there are at least a few different types of searches that a person might make.
I want to buy this item (or a similar one)
I want to find info (of an encyclopedic nature)
I want to find leads about (I don't quite know what I'm looking for yet)
I want to hear news about...
I want to find this file/software (or a similar one)
I want to be entertained about/with...
These things all lend themselves perfectly to a client-side agent. Those websites that don't bother to tag images properly, and yet the image is just stylized text? An agent has the power to OCR it back to normal, something an engine could never hope to do. Get rid of all the mirrors? Google is better at this than any other engine, but can it compete with an agent that can recognize a text mirror of an html page, or vice versa? Or any of the other nifty little optimizations that aren't even obvious to me at the moment? Sure, there will be problems. I'm not sure Joe AOL will be able to accept that a proper search will take longer than it takes for a web page to load, but it still seems like the next killer app to me.
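To make the mirror-recognition idea a bit more concrete, here is a minimal sketch of how an agent might spot that a plain-text page and an html page carry the same content: strip the markup, normalize whitespace and case, and compare hashes. The function names, URLs, and sample pages are hypothetical, and this only catches exact matches after normalization (no fuzzy matching, and script/style contents are not filtered out).

```python
import hashlib
from html.parser import HTMLParser

class _TextExtractor(HTMLParser):
    """Collects the text between tags and ignores the tags themselves."""
    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)

def fingerprint(document):
    """Hash of the document's text with markup, case and spacing removed."""
    parser = _TextExtractor()
    parser.feed(document)
    parser.close()  # flush any text still buffered by the parser
    text = " ".join(" ".join(parser.chunks).split()).lower()
    return hashlib.sha1(text.encode("utf-8")).hexdigest()

def drop_mirrors(results):
    """Keep the first URL for each distinct fingerprint.

    `results` is a list of (url, body) pairs, e.g. fetched search hits.
    """
    seen = set()
    unique = []
    for url, body in results:
        fp = fingerprint(body)
        if fp not in seen:
            seen.add(fp)
            unique.append(url)
    return unique

# Hypothetical search hits: an html page, a text mirror of it, and another page.
html_page = "<html><body><h1>Emacs Install Notes</h1><p>Unpack the   tarball first.</p></body></html>"
text_page = "Emacs Install Notes\nUnpack the tarball first."
hits = [
    ("http://a.example/notes.html", html_page),
    ("http://b.example/notes.txt", text_page),
    ("http://c.example/other.html", "<p>Something else entirely</p>"),
]
print(drop_mirrors(hits))
```

A real agent would want something more forgiving than an exact hash (mirrors often add their own headers and footers), but the normalization step is the part that lets text and html copies match at all.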
Wisenut ignored my robots.txt (Score:4, Informative)
Hence, I refuse to use wisenut.
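For reference, checking robots.txt before fetching is not much work for a crawler. A minimal sketch using Python's standard urllib.robotparser module; the robots.txt contents, the "ExampleBot" user agent, and the example.com URLs are made up for illustration and say nothing about how WiseNut's crawler is actually written.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that keeps all crawlers out of /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

def allowed(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

for url in ("http://www.example.com/index.html",
            "http://www.example.com/private/page.html"):
    verdict = "fetch" if allowed(ROBOTS_TXT, "ExampleBot", url) else "skip"
    print(verdict, url)
```

A polite crawler would run a check like this for every URL before requesting it and simply skip anything disallowed.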
Re:Wisenut ignored my robots.txt (Score:3, Interesting)
I'm curious to see... (Score:2, Interesting)
www.alltheweb.com Works for Me (Score:2)
Of course, Google is now the only player in town for Usenet Searches [alltheweb.com] since they bought Deja (and if they're reading this, I want them to bring back Deja's hierarchical nesting features...)
Most of 'em look the same (Score:2)
As an example, I did a search on "lisinopril", the generic name for a blood pressure medication I take.
Whereas Google provided one "category" besides the search results, WiseNut provided 10 relevant categories to further break down my search (ACE Inhibitors, brand names, blood pressure, heart attacks, drug information, etc.).
Teoma provided 8 different categories, and vivisimo provided 11 categories and "more" option for more categories.
Personally, I find this to be a nice feature of all three of these engines. As for relevancy of the information, that's really a hard thing to quantify.
Given the choice, though, I'm going to add WiseNut and Teoma to my list of search engines that I use. Beyond the features mentioned above, they took one good idea from Google and that's to keep the search screen sparse and uncluttered.
Just my humble opinion...
Google's sense of humor (Score:5, Funny)
31337 H4x0r g00g13 [google.com]
Google in the language of "Bork, bork bork!" [google.com]
Igpay Atinlay Ooglegay [google.com]
I'm feeling lucky (Score:2, Funny)
Re:Google's sense of humor (Score:2)
Re:Google's sense of humor (Score:2, Funny)
Re:Google's sense of humor (Score:4, Insightful)
To me this really shows the personality behind Google. They are a company of friendly, caring people, which is apparent just by looking at All About Google, or looking at the story of one of their staff taking a bike trip [google.com].
Google is a company with culture, a web site with a personality and a huge Linux cluster that they show off to the world. IMHO, Google's corporate personality has helped make it the best. That personality is what keeps the staff working, coming up with new ideas and technologies that push the web forward.
I don't see that on any of these new engines, and I think that that will in some ways dig their graves, just as Altavista's selloff did. Remember when it was altavista.digital.com? Remember feeling that there were people behind that site who cared less about how much money AltaVista was making and more about improving search technology? Then it turned into its own enterprise, no longer Digital's experiment. When it became a garbage portal, it lost that wholesome goodness that it once had. RIP, AltaVista. Congrats Google, live long and prosper.
Re:Google's sense of humor (Score:2)
Google Linux search [google.com]
Google BSD search [google.com]
Google iMac colors search [google.com]
The best thing about Google... (Score:2)
Vivisimo is a bit hard to pronounce (and I almost spelled Visio).
[accent=British] "Teoma". That's a tinny word, don't you think?[accent off]
In all seriousness, naming choice is very important as you all know. If you can't remember the address, you won't go there. And don't say anything about bookmarks. I usually type in the URL of the sites I visit often.
Um, ask slashdot? (Score:5, Funny)
W.H.A.T. (Score:4, Informative)
I'm biased as I worked on it for a year, though.
no successor yet (Score:2, Funny)
No viable successor yet [google.com].
Search Engines We'd Like to See: (Score:4, Funny)
EinsteinExpress - When you absolutely, positively have to have next month's kernel patch yesterday...
SlashBot - The home addresses and personal phone numbers of FP'ers and goatse.cx linkers.
BootyCall - All porn all the ti... wait a second. We've got images.google.com for that! Sorry, my bad.
As much as I like google (Score:2)
Teoma discussed earlier on /. interesting article (Score:4, Informative)
--CTH
Index Size (Score:2, Interesting)
my 2 cents (Score:3, Informative)
1) They all try to distinguish themselves by stating "we're not just another search engine...". Basically, they are.
2) Wisenut is by far the least bloated, and it shows in terms of speed.
3) Lasoo combines "white pages" with a web directory. Clever, but putting it all on one page is a bit overkill IMHO.
4) None of them is as configurable as google.
However, it will be nice to see how they develop. They all need an innovative feature though, something to make the switch from google worthwhile.
Hmm.. (Score:4, Insightful)
Ok, yeah, I know how to use '-', but it's still annoying...
Re:Hmm.. (Score:3, Insightful)
Ok, I agree with the rpmfind mirrors, but I have to disagree on the newsgroup issue. Usually when I'm really stuck on something (e.g. a Linux SMP box hanging under high network load (which makes backups a real bitch), forcing me to power cycle: flawed APIC handling for the 3c905 ethernet card), I hit Google specifically LOOKING FOR NEWSGROUP discussions on the topic. Granted, I don't need 50 mirrored copies, but I definitely do want to see newsgroup archives indexed.
Re:Hmm.. (Score:2)
See, there's this thing called groups.google.com [google.com], and...
Re:Hmm.. (Score:2)
(ie: Linux SMP box hanging under high network load (which makes backups a real bitch), forcing me to power cycle : flawed APIC handling for the 3c905 ethernet card),
Out of curiosity, did you find a fix for this? I think that may explain the odd lockup I get on my system that I haven't been able to pin down...
Re:Hmm.. (Score:2)
Another workaround (that hits performance, but fixes the problem) is to use the "noapic" option at the boot prompt. Supposedly it's fixed in the Alan Cox kernel, but it doesn't seem to appear in the Linus version's changelogs. It may be fixed in 2.4.8.
Re:Hmm.. (Score:4, Insightful)
Re:Hmm.. (Score:2)
Re:Hmm.. (Score:2)
Sure, if I'm searching for something like 'how to setup my dvd drive on linux' I want a HOWTO (and I go to Yahoo for that), but for more obscure things (like maybe 'how to setup my mpeg decoder card on linux') the newsgroup and mailing list archives are very useful.
That's one of the main features of google for me.
Re:Hmm.. (Score:3, Funny)
Re:Hmm.. (Score:3, Informative)
Re:Hmm.. (Score:2, Informative)
So it sounds like theoretically the NEAR operator should be unnecessary.
Re:Hmm.. (Score:2)
Re:Hmm.. (Score:2)
Try searching for a "warez"* copy of gnu emacs 21.
You will get a million newsgroup mirrors. You will get thousands of results for emacs 20 that happen to have "21" in the url.
To get rid of some of these issues, including rpms, it helps if you use the logical nots:
foo-bar tar gz -rpm -re -re: -from:
Remember that Google will not let you include more than 10 words in your search, so go down to the bottom search box if you need to and "search within results".
*so named because its not released yet but does exist. Nothing like living in a world where free software is pirated.
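If you find yourself typing the same exclusions over and over, a tiny helper can assemble the query string. This is only a sketch: the function name is made up, and it assumes that excluded terms count toward the ten-word limit the comment mentions, which may not match how Google actually counts.

```python
MAX_WORDS = 10  # limit as described in the comment; treated here as an assumption

def build_query(include, exclude):
    """Join wanted terms and '-' prefixed unwanted terms into one query string."""
    terms = list(include) + ["-" + word for word in exclude]
    if len(terms) > MAX_WORDS:
        raise ValueError(
            "%d terms is over the %d-word limit; "
            "use 'search within results' for the rest" % (len(terms), MAX_WORDS)
        )
    return " ".join(terms)

query = build_query(["foo-bar", "tar", "gz"], ["rpm", "re", "re:", "from:"])
print(query)  # foo-bar tar gz -rpm -re -re: -from:
```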
God? (Score:2, Funny)
Getting smarter (Score:2, Insightful)
Re:Getting smarter (Score:2)
Meta-search (Score:2)
Maybe not on the web (where it might get threatened) but at least a command-line tool or CGI script.
Re:Meta-search (Score:2)
Likes, Dislikes, Pros, Cons (Score:2, Interesting)
Wisenut
Looks like google without cache, wiseguide provides a nifty preview of categories with matches.
Teoma
Match phrase button handy, no cache
Lasoo
Nice maps, but not a search engine for finding general topics, more geared to finding locations
CURE
Is this a search engine? Hit the user limit so got nowhere.
Vivisimo
The best of the lot. Nice frame layout, organization by category, but lacks ability to jump to page.
Google used for Cracking Databases. (Score:2)
Google catalogs open Administrator websites, and some of those websites have no or weak passwords. I reference google, since it does a good job of treeing websites. Search engines seem to be a good tool for looking for websites with weaknesses.
Example..
If you search on Google for "phpMyAdmin" you can find databases without password protection. You can do simple things like SQL queries for credit card information or even drop tables.
Lucky nobody has written a trojan that searches Google for unprotected databases and drops all tables. Oh wait, maybe they have....
One of the great features of Google (Score:5, Insightful)
Re:One of the great features of Google (Score:3, Offtopic)
'nuff said.
Re:One of the great features of Google (Score:3, Funny)
It's a good start, but since we still don't get any hits for...
http://images.google.com/images?q=natalie+portman+grits [google.com]
Re:One of the great features of Google (Score:2, Interesting)
Re:One of the great features of Google (Score:2)
No idea, but I also like their conversion of PDFs to text, and caching of the text.
I also love the cache because I can read sites that are 404'd. Great for digging up old specs on hardware.
Re:One of the great features of Google (Score:4, Interesting)
That kind of convenience is hard to beat by a general purpose search engine. The story changes if you start using meta information to narrow the search. Google does not do that as far as I know. However, using meta information inevitably narrows the scope of a search engine. Efficient distributed search engines for multimedia are currently emerging. E.g. Morpheus actually uses meta information attached to an mp3, allowing for searches for tracks of a particular album, more albums of the same artist and so on.
Re:One of the great features of Google (Score:3, Insightful)
Re:One of the great features of Google (Score:2)
Re:One of the great features of Google (Score:2)
You can also include a form on your site that does this for you and google can customize the search results page to match the layout of your site.
Nothing tops google for tech support (Score:2, Informative)
The Best Part of Google (Score:2, Insightful)
What about other document file formats? (Score:2)
They could also easily support compressed documents, e. g. pdf.gz or pdf.bz2.
If the import filter really "understands" the file format (if it knows where things are emphasized or in bold, or larger font, not just the result of pdftotext given to the indexer) the quality of the query results could be improved as well. Words in headings or larger font could be regarded as more relevant for a page (in a similar way that words in h1 or h2 are considered more relevant with HTML).
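Here is a small sketch of the weighting idea for the HTML case mentioned in the parenthesis: words inside headings or bold text count for more than words in ordinary body text. The specific weights and the tag list are invented for illustration; nothing here reflects how Google or any other engine actually scores terms.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Hypothetical weights: a word inside these tags counts this many times.
TAG_WEIGHTS = {"h1": 5, "h2": 3, "b": 2, "strong": 2, "em": 2}

class WeightedTermCounter(HTMLParser):
    """Scores each word by the heaviest tag it is currently nested inside."""
    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.scores = Counter()

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; tolerates unclosed tags like <br>.
        if tag in self.open_tags:
            while self.open_tags.pop() != tag:
                pass

    def handle_data(self, data):
        weight = max([TAG_WEIGHTS.get(t, 1) for t in self.open_tags], default=1)
        for word in re.findall(r"[a-z0-9]+", data.lower()):
            self.scores[word] += weight

def term_scores(html):
    """Return a Counter of word -> weighted occurrence count."""
    parser = WeightedTermCounter()
    parser.feed(html)
    parser.close()
    return parser.scores

page = "<h1>Emacs manual</h1><p>The emacs editor is <b>extensible</b>.</p>"
scores = term_scores(page)
print(scores.most_common(3))
```

An import filter for PDF or another format would need to produce the same kind of (word, emphasis) information from font sizes and styles, which is exactly what a plain pdftotext dump throws away.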
Re:Google destroyed Deja (Score:1)
Re:At least "google" is spelled correctly. (Score:3, Insightful)
Re:At least "google" is spelled correctly. (Score:2, Funny)
At least you didn't sit there and type in the hundred zeroes.
I would do it, but the lameness filter doesn't like it.
Re:Yahoo (Score:3, Informative)
http://search.yahoo.com/bin/search?p=Apple+Assembly+Line [yahoo.com]
Compare the results to this search submitted to Google:
http://www.google.com/search?sourceid=navclient&q=Apple+Assembly+Line [google.com]
(The first result is one of my pages. I made the rounds of several search engines a little while ago to check the page ranking. Yahoo is using Google's search results more or less unmodified.)
Re:Yahoo (Score:3, Informative)
Re:Altavista (Score:2)
Re:Altavista (Score:3, Insightful)
I remember... (Score:2, Funny)
Re:Altavista (Score:2)
Re:Lasoo (Score:2, Funny)
I haven't compared it to google yet, but I'd say Lasoo has its place in my utility belt. After typing in my address, I was able to click on "Bars" and now I know exactly how far my house is from each of the nearest local pubs! The distances are in meters, so I'll have to only drink imported beer and crawl metrically -- which kind of makes sense since I won't be on my feet anyway.
Re:Lasoo (Score:2)
Very simple process actually. I was able to quickly figure out that when you click, it zooms in, so click the "out" button to zoom out, then click on another region and click in, or simply type in your street address.
Re:phrase matching (Score:2, Informative)
Re:phrase matching (Score:2)