The Internet

Google's Search Appliance 250

An anonymous reader noted that Google is working on a Search Engine that you can install behind your corporate firewall for indexing your internal documents. It's a bit thin on information, but it looks like for as little (cough) as $20k, you can have your own google box. Not for everyone obviously ;)
This discussion has been archived. No new comments can be posted.

  • by yobbo ( 324595 ) on Monday February 11, 2002 @10:47AM (#2986691)
    People don't have THAT much pr0n do they?! :)
  • by larien ( 5608 ) on Monday February 11, 2002 @10:48AM (#2986700) Homepage Journal
    Certainly I'd see value as a user of a huge corporate intranet. Several times I've wanted to find information on some of our internal pages which, of course, I can't use google.com for because of the firewall. While there is an internal search engine, its results can be less than stellar and I've missed Google.

    Aside from anything else, it gives Google a revenue stream so they can continue to provide their services (web, image and usenet searches) for free; they need to find a valid business model, and hopefully this can contribute.

    • My former employer is currently in the process of evaluating a '4-node' 'Google-Box', very neat hardware, essentially a mini-rack about 2.5 feet high, with 4 1U rack servers (presumably a mini linux cluster), storage, and a UPS.

      The selling point for them:

      As a governmental organization, regulations stipulate they must be able to provide online content to the RCMP upon request, so it must be hosted on-site. As I'm sure most corporations have similar guidelines, this could be a big cash cow for google at some point.

      Google's top notch search technology, now on-site? Sign me up!

    • Aside from anything else, it gives Google a revenue stream ... they need to find a valid business model ...

      Google's "sponsored links" seem like a valid business model to me. Search on something generic like computers [google.com] and you'll see pastel links pop up with advertisements. I imagine people pay a nice chunk of change for those.
      • by leviramsey ( 248057 ) on Monday February 11, 2002 @11:51AM (#2987062) Journal
        Google's "sponsored links" seem like a valid business model to me. Search on something generic like computers [google.com] and you'll see pastel links pop up with advertisements. I imagine people pay a nice chunk of change for those.

        Google runs on two business models: the Sponsored Links model (and the Google Sponsored Links are much more effective than any other online advertising out there) and the sale of search services (to Yahoo!, Washington Post, et al).

        Fact is, Google's already profitable. Why? Because they didn't make the moronic mistakes that the other dot-coms did. Have you seen a Google Super Bowl ad? Have you seen a Google ad anywhere? Exactly. The Google model is, quite simply, you run a lean and mean ship that gets the job done well, and you make money.

        • > the Google Sponsored Links are much more effective than any other online advertising out there

          I guess you have some data to back that up? Why are Google's ads better than others? Because they annoy you less? When's the last time _you_ clicked on a google sponsor because of their compelling attraction?

          > Fact is, Google's already profitable.

          I guess you know that from their public financial statements, right? (sarcasm) Or maybe because you're on the board? Hmm, didn't think so.

          So, aside from being a google fan-boy (of which I am one myself), where do you get these wonderfully objective facts?
          • by jedrek ( 79264 ) on Monday February 11, 2002 @01:17PM (#2987555) Homepage
            Well... we had a 6% click-thru rate on our test run of 10.000 which cost us a whopping $110. I don't think that's too bad.
          • When's the last time _you_ clicked on a google sponsor because of their compelling attraction?

            Google's ads tend to be relevant to what I'm searching for, so I click on them often.

            Last summer I looked up filk music after seeing something about a "space-themed filk concert featuring Kathy Mar and..." at Stanford the day before the Mars Society convention. I searched for filk, and there was an ad to download some of Kathy Mar's music from mp3.com! [mp3s.com] I listened to what mp3.com had and then went to the concert. During the concert, I met Kathy and also met the guy who put the ad up.

            Oh, did you mean "What was the last time I bought something through Google adwords"? I haven't yet, but I am now a filk fan and plan to buy Prometheus Music's Space CD [prometheus-music.com] when it comes out. (Kathy's CD, which I didn't buy, is also a Prometheus CD.)

            I also ran $50 worth of ads for my non-revenue-generating bookmarklets site because I thought it would be a cool way to give Google money. I don't know how many people run ads without the intent of making money, though.
          • I think the biggest reason for thinking they're likely to be successful is that they're targeted; if you're looking for something in particular and you get an advert related to it, you're more likely to click on it than you are on $some_random_ignorable_banner.
      • Exactly. In rebuttal to the thread's parent, Google doesn't need to "find" a valid business model. They have one, and have had one for quite some time. Google is a profitable company (albeit a private one). They make money. If you make money, that is a valid business model.

    • I don't know why /. doesn't cache them already. It's not like it would be that difficult.

      IIRC /. was one of the few sites I could actually reach during 9/11, it would make a lot more sense for them to implement this themselves to save other sites from the destruction of /. readers :)

      Just my worthless .02
  • by hawaiianshirt ( 245591 ) on Monday February 11, 2002 @10:48AM (#2986701)
    Everywhere you look, companies are hawking products geared for searching internal documents. Google is making a good move; enter an expanding market as an established leader in searching.
    • I'm surprised it's taken Google so long to get in on the act; after all, Northern Light got into this just recently as well (can't be bothered to search for the old /. link to the story right now though).

      Three years ago I was involved in implementing a similar box, from Excalibur Technologies, for the company I was working for during my university gap year (it was there that I first started reading /. too ;-) The company was a massive multinational ex-British state owned utility and wanted, amongst other things, to have every single company document on the network and a database of all staff and their skillsets, so that as relevant business units were formed managers could place staff already on the books rather than get contractors in. The system sold for several hundred thousand pounds, so there's plenty of money in it even if it's only the big companies who are going to really need this kind of thing.

      Judging from the website Google clearly have some fantastic technology, and they certainly have the reputation, they should do very well.
    • The only problem with this that I can see is that most internal documents a company would be interested in aren't HTML documents that link to each other. So how are they going to page rank thousands upon thousands of stand alone .DOC files?
      • Google searches .doc files.

        http://www.google.com/help/faq_filetypes.html [google.com]

        1. What file types are returned in a Google search? There are 12 main file types searched by Google in addition to standard web formatted documents in HTML. The most common formats are PDF, PostScript, Microsoft Office formats:

        Adobe Portable Document Format (pdf)

        Adobe PostScript (ps)

        Lotus 1-2-3 (wk1, wk2, wk3, wk4, wk5, wki, wks, wku)

        Lotus WordPro (lwp)

        MacWrite (mw)

        Microsoft Excel (xls)

        Microsoft PowerPoint (ppt)

        Microsoft Word (doc)

        Microsoft Works (wks, wps, wdb)

        Microsoft Write (wri)

        Rich Text Format (rtf)

        Text (ans, txt) ~jeff

    • We've already spent way too much just for the software from someone else. Still have yet to launch it though. Google should have done this long ago as soon as they realized their software works. Well, ok, that's an oversimplification, but still, they worked on these corporate search programs before, and they just weren't up to par.
  • hmm. (Score:5, Funny)

    by raindog151 ( 157588 ) on Monday February 11, 2002 @10:49AM (#2986704) Homepage
    will it also index employee email?

    Searched the intranet for 'herbal viagra'.
    Results 1-10 of about 1,279,500. Search took 0.14 seconds.

  • Splendid! (Score:4, Interesting)

    by johnburton ( 21870 ) <johnb@jbmail.com> on Monday February 11, 2002 @10:49AM (#2986710) Homepage
    I see more of this in the future - if you want a search engine, buy one and put it on the network. If you want a web server, buy one and put it on the network. You want a disk server... Well you get the point.

    As hardware continues to get cheaper and software more expensive as it gets more complex it makes sense to do this rather than trying to configure multiple applications all on the same server.

    And good luck to google making money on this so they can keep their search engine fast and free of annoying advertisements.
    • That sounds great, until you take a step back and look at all the *crap* that people have tried to sell this way. Most of these products are just cheap PCs running a free UNIX, a little bit of other free software like a web server/router/firewall/sendmail, and maybe a little web config tool to help you set it up. I've seen products like this [shunra.com] sold for $30K or more! FYI the shunra is a horrible network simulator product that I evaluated at my last company - we ended up building something way better for $0 plus the cost of a PC, using FreeBSD and DummyNet. Look at all those lame-ass NAS boxes which cost $1500 and up. Why would I want to pay that kind of markup for the simplicity of setup, when the box is so severely crippled compared to a cheap PC? Unfortunately not everyone realizes how easy it is to do this stuff themselves, so there will always be a market for garbage like this.

      Now, there have been a few notable exceptions, and these are only the ones where the value of the software far exceeds that of the hardware needed to run it. This googlebox sounds like one of them. Another PC-based Internet appliance that is almost worth the $$$$ is Cobalt's Qube and Raq products - I wouldn't buy one myself because I know how to set up all that stuff w/o a pretty web UI, but I've heard great things from people who have purchased them.

      It's just too easy to get ripped off buying these appliances.
  • by egburr ( 141740 ) on Monday February 11, 2002 @10:53AM (#2986731) Homepage
    I've been looking (when not otherwise distracted) for a good search engine for my documents on my home network, on a linux server. So far, I haven't found anything I've liked (or that even seemed to work very well).

    I would like to find a search engine that will index:

    • text files
    • html files
    • PDF files
    • names of binary files
    Unfortunately, I am not able to spend much to purchase such a search engine (say $20, not $20K). This would be for my personal use, not for any kind of commercial use, and would not be funded except by my anemic hobby budget.

    Does anybody have any recommendations?

    • Try http://www.mnogosearch.org

      Brilliant search engine. It has parser for most file-formats (You can use pdf2txt to index your pdf-files). It even indexes your mp3's if you should happen to have some on your local net.

      Free (at least as in beer) for Unix. Binaries for Windows costs between $99 and $699.

    • by richieb ( 3277 ) <richieb.gmail@com> on Monday February 11, 2002 @11:15AM (#2986871) Homepage Journal
      Try htDig [htdig.org]. It does all these things and is free software. I used it on a corporate intranet in the past. Not as good as Google, but you can't argue with the price.

      • Can too (Score:3, Insightful)

        Finding that vital piece of information can be far more important than $20k, especially to a large organisation.

        • Finding that vital piece of information can be far more important than $20k, especially to a large organisation.

          Very true. However, try convincing the average corporate bean counter. So, instead install "htDig" and actually show that you can make $20K, with a search engine on the intranet. Once the people who use and need it are "hooked", you can proceed to getting Google (after all you should have supported software for "mission critical" functions, and you are much too important to administer htDig :-))

      • "Not as good as Google,"
        OK, fair enough. Have some suggestions for how to improve it? Unlike Google, you can tailor all the search weightings in ht://Dig.

        Either general suggestions like "titles should be weighted more" or parameter changes would be quite welcome.

        It's open source, it's yours. So don't you want to see it improve?

    • try ht://Dig. It's free and works with *nix. Info about pdf indexing is here: http://www.htdig.org/FAQ.html#q4.9 [htdig.org]
      It's a good solution for a small to medium sized website. If you run Linux, it might be on your install CD's, or might be installed already.
    • You actually gave me a very good idea that i think the community could benefit from. Because I'm not positive that what you're looking for is just an internal website search engine. My guess is that you're looking for something to search all documents in all directories (all readable by you anyway) on your local network.
      I can imagine this wouldn't be a tough task if you created a modified 'locate' command in perl with an updated updatedb script that would check for text files (cat those - store results in SQL database), strip html docs off tags (SQL those results), pdf2txt your pdf files and just store the names of binaries, heck you could even run "strings" on binaries if you were so inclined and store the results.
      Of course this would be much more disk and processor intensive than your typical updatedb, so you might only run it, say, once a month, or once every 2 weeks. But it could be a real life saver. The best thing to do would be to have one SQL server, with a cgi frontend, so you could just go to your webserver on your internal network, type in your query, and the engine would tell you on what machine in what directory you could find the document. I'm actually considering writing this now unless someone else has already done it; please reply if you know of a similar or identical system.
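A rough sketch of the "locate with contents" idea from this comment, in Python rather than perl and with SQLite standing in for the SQL server. Everything here is an assumption made for illustration: the `docs` table, the function names, and the choice to index only `.txt`/`.html` contents. PDFs would need an external converter, so in this sketch they are indexed by name only, like binaries.

```python
import os
import sqlite3
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the text between HTML tags, discarding the tags themselves."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def strip_html(markup):
    parser = _TextExtractor()
    parser.feed(markup)
    parser.close()
    return " ".join(parser.chunks)


def build_index(root, db_path=":memory:"):
    """Walk `root` and store one row per file: its path plus any extractable text."""
    db = sqlite3.connect(db_path)
    db.execute("CREATE TABLE IF NOT EXISTS docs (path TEXT PRIMARY KEY, body TEXT)")
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            ext = os.path.splitext(name)[1].lower()
            body = ""
            if ext in (".txt", ".html", ".htm"):
                try:
                    with open(path, encoding="utf-8", errors="replace") as fh:
                        body = fh.read()
                except OSError:
                    continue  # unreadable file: skip it
                if ext != ".txt":
                    body = strip_html(body)
            # Everything else (binaries, unconverted PDFs) is indexed by name only.
            db.execute("INSERT OR REPLACE INTO docs VALUES (?, ?)", (path, body))
    db.commit()
    return db


def search(db, term):
    """Return paths whose name or extracted text contains `term`."""
    pattern = "%" + term + "%"
    rows = db.execute(
        "SELECT path FROM docs WHERE path LIKE ? OR body LIKE ? ORDER BY path",
        (pattern, pattern),
    )
    return [row[0] for row in rows]
```

Calling `build_index("/home/me/docs", "index.db")` from a cron job would play the role of the modified updatedb, and `search(db, "budget")` the role of the modified locate. Note that `%` and `_` in a query act as SQL wildcards here, and a substring `LIKE` scan will not scale the way a real inverted index does.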
  • by BTWR ( 540147 ) <[americangibor3] [at] [yahoo.com]> on Monday February 11, 2002 @10:53AM (#2986736) Homepage Journal
    Google did exactly what us fanboys all whined and complained for - a company that made a good product (awesome search engine) without selling out (no popup ads). Google offered a free service, built up an enormous following, and now offers its premium service for a premium price, while ensuring its loyal customers continued free services. Forget eBay, Google is an Internet-Success-Story worthy of such praise!
  • by jellomizer ( 103300 ) on Monday February 11, 2002 @10:53AM (#2986737)
    The companies that are using the appliance are large corporations with hundreds, perhaps thousands, of computers and millions of files and documents to find. The real question is how much money the company is losing from people who have to redo misplaced documents, or make new ones which are similar to another document that someone else made a while back. In a large corporation, if a thousand people working at $20 an hour each take 1 hour to redo a document or spend time finding it, that makes up for the cost. Also, if it gives google more money, the better the chance the search engine stays free and without a ton of annoying advertising.
    • If you consider the amount of time needed to create a search engine like Google, you'll see that $20k is very cheap. At my company, IT charges our dept $100/hr, so $20k only gives you 200 man-hours. And, that's cheap! In talking with some of my friends, their IT dept charges almost $500/hr, which would only give you 40 man-hours. I'd much rather pay Google for their search engine than get a product from IT that they threw together in 40-200 man hours.
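The back-of-envelope numbers in these two comments can be checked in a few lines. This is only a sketch using the figures quoted in the thread ($20k price, $100 and $500 per hour chargeback, a thousand people at $20 an hour losing one hour each); the function names are made up for illustration.

```python
def hours_bought(budget, hourly_rate):
    """How many internal IT hours a budget covers at a given chargeback rate."""
    return budget / hourly_rate


def rework_cost(people, hourly_wage, hours_lost_each):
    """Labor cost of people re-creating or hunting for documents."""
    return people * hourly_wage * hours_lost_each


APPLIANCE_PRICE = 20_000  # entry price quoted in the story

print(hours_bought(APPLIANCE_PRICE, 100))  # 200.0
print(hours_bought(APPLIANCE_PRICE, 500))  # 40.0
print(rework_cost(1000, 20, 1))            # 20000
```

So on the thread's own figures, one lost hour per person across a thousand staff already equals the entry price of the box.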
  • by mESSDan ( 302670 ) on Monday February 11, 2002 @10:59AM (#2986778) Homepage
    From C|Net [com.com].

    It's a little more indepth than the India times article.

  • Is this new? (Score:2, Interesting)

    by TechnoLust ( 528463 )
    Our corporate intranet has an excite search on it, and the intranet is not accessible from the net. I doubt they would have paid $20k for it either. Does anyone else have something like this, because I was under the impression it was common to have an internal search engine?
  • Ouch. Try HTDIG. (Score:3, Informative)

    by Kozz ( 7764 ) on Monday February 11, 2002 @11:01AM (#2986798)
    Yes, quite CLEARLY it's only for those who've got some cash to blow. If you've got a modest-sized Intranet site, I would highly recommend htDig [htdig.org]. I've installed and configured it in several places and it works like a charm. Best of all, it's GPLed! Sure, it doesn't have all the fancy matching algorithms used by Google, but it does a damned good job nonetheless.
    • Re:Ouch. Try HTDIG. (Score:3, Informative)

      by ghutchis ( 7810 )
      Actually, saying it doesn't have all the fancy matching algorithms isn't really fair.

      Granted, we can't implement Google's patented things, but that's not to say we don't come close.

      Indexing the text of links to documents? Yes.
      http://www.htdig.org/attrs.html#description_factor

      Keeping track of the weight of links pointing to a document? Yes.

      Probably the big "missing link" is a proximity weighting. Interested? Help is always welcome!
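For anyone wondering what a proximity weighting might look like, here is a minimal sketch in Python. It is not ht://Dig code and not Google's method; the function name and the scoring formula (number of query terms divided by the shortest span of words containing all of them) are assumptions chosen to keep the example small.

```python
def proximity_score(text, terms):
    """Score in (0, 1]: 1.0 when the query terms are adjacent, lower as they spread out.

    Returns 0.0 if any term is missing from the text.
    """
    words = text.lower().split()
    wanted = {t.lower() for t in terms}
    # Positions of query terms, in document order.
    hits = [(i, w) for i, w in enumerate(words) if w in wanted]
    best = None
    counts = {}
    left = 0
    # Sliding window: find the shortest span of words that contains every term.
    for pos, word in hits:
        counts[word] = counts.get(word, 0) + 1
        while len(counts) == len(wanted):
            span = pos - hits[left][0] + 1
            if best is None or span < best:
                best = span
            left_word = hits[left][1]
            counts[left_word] -= 1
            if counts[left_word] == 0:
                del counts[left_word]
            left += 1
    if best is None:
        return 0.0
    return len(wanted) / best
```

A document where "search" and "appliance" sit side by side scores 1.0, while one where they are seven words apart scores 2/7. A real engine would fold this into the other weights (title, link text, and so on) and would tokenize properly instead of splitting on whitespace.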

  • Quick Indexing (Score:2, Insightful)

    by Mattygfunk ( 517948 )
    I could see one of the advantages that this would have is the ability to index pages/emails/whatever very quickly. No need for the wait that accompanies a index request on a web search engine because the spider will be around every hour or less in an intranet.
  • Surprisingly few corporations are willing to spend money indexing their internal document set, as other search engine companies discovered.

    Excite, Altavista, HotBot, Lycos all at one time or another tried to sell to the corporate market with little success. So either things have changed since, or Google management is repeating an old mistake from other companies...

    Moreover, companies such as Verity which specialize in corporate search engines have reported falling revenues as of late...

  • by HRH King Lerxst ( 79427 ) on Monday February 11, 2002 @11:03AM (#2986806)
    They just implemented this where I work, it's a vast improvement over what we had before. It even includes the cache and newsgroup features!!

    Two thumbs up!!
  • ... the ht://dig [htdig.org] search engine.

    In this climate of IT layoffs, I reckon it would prove cheaper and better to hire a programmer to take the GPL'ed ht://dig code and hack in some Google-like improvements.

    The major improvement needed is the ability to search on phrases, and to do boolean searches.

    Such a beefed up search/indexing system would not be subject to licensing fees, and would be freely redistributable (say, to other company offices).
    • I've never seen ht://Dig [htdig.org] before. Where I've needed search engines, I've deployed Harvest or WAIS.

      Aside from the GNU license and association with SourceForge, I'm not sure what advantages ht://Dig has over the other free/commercial indexing products. Perhaps somebody has a comparison page?

  • by powerlinekid ( 442532 ) on Monday February 11, 2002 @11:04AM (#2986810)
    At least then the search feature would work right and they can finally cache all those sites that we take down.
  • by base3 ( 539820 ) on Monday February 11, 2002 @11:05AM (#2986817)
    Google's product selling for $20,000, and being based on Linux, is a good counterweight to the FUD being spread by Microsoft et al that cries "If we write a product that so much as uses one GPL library, we have to GPL it. Waaaaa."

    Unless Google reimplemented their own operating system, or <shudder> ported it to Win2K, they have a very expensive product, that runs on Linux, that is not GPL.

    More power to Google--I'm glad to see them finding a way to make money without trashing their search engine, like happened with the previously good search engines that came before (e.g. Altavista, Lycos).

  • Note the date, gentlemen. If Google is selling wholesale software solutions, the countdown clock to paid searches begins today. I'm betting that in less than a year's time we'll be asked to pay for Google searches. Hopefully by that time someone will have figured out a good system for micropayments.

    Free is wonderful, but free doesn't scale when it comes to indexing the majority of the internet.
    • Look at the posts above - there is a link to a BBC report that said that Google is *already* profitable...

      So, if they're now profitable (actually, for the last 2 quarters), why should they charge money now? where's the logic?

      Another issue that someone mentioned here - Yes, Alta vista and other companies did try to sell their search engines and have fallen - but google got 2 points:

      1. They're number 1 in search on the net.
      2. Dead easy setup - plug the machines, give IP, and open your browser - from there you just have to setup where to get the data from and let the machines do the job. Nothing more...

      I wish good Luck for google - I always use it (gg: in konqueror)..
  • by ajm ( 9538 ) on Monday February 11, 2002 @11:12AM (#2986851)
    Part of the success of the google technology is based on the page rank system which depends on many people linking to pages and so "ranking" them. On a corporate site you don't have as many separate opinions (i.e. pages managed independently) so perhaps the page rank part of google won't be as successful. OTOH just having fast search of all the docs would be good here :)
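To make the point about link-based ranking concrete, here is a small power-iteration sketch of the published PageRank idea in Python. It is a textbook simplification, not Google's production code; the damping value of 0.85 and the toy link graphs are assumptions for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over `links`, a dict of page -> list of pages it links to."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages with no outgoing links is spread evenly over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for page, targets in links.items():
            for target in targets:
                new_rank[target] += damping * rank[page] / len(targets)
        rank = new_rank
    return rank


# Web-like graph: several independently managed pages all point at one hub.
web = {"a": ["hub"], "b": ["hub"], "c": ["hub"], "hub": ["a"]}

# Intranet-like graph: stand-alone documents that link to nothing.
intranet = {"doc1": [], "doc2": [], "doc3": []}
```

With `web`, the hub ends up with the highest rank because three separate pages vouch for it. With `intranet`, every document gets exactly the same rank, so the link signal tells you nothing and the engine has to fall back on text matching.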
  • It's not that expensive, considering the amount of money a corp wastes every year. If you put it in perspective - it is half of an average worker's yearly salary - and if management thinks it will save that much money over a year. . .
    Companies have private jets so the pres / vp can get wasted while traveling across the country - $20k is nothing.
    Google roxxor! :)
  • Document management (Score:4, Interesting)

    by stinky wizzleteats ( 552063 ) on Monday February 11, 2002 @11:17AM (#2986884) Homepage Journal

    This has a LOT more business application than appears on the surface. And $20K for such a solution is comparable to paying $50 for Red Hat to run a server.

    Back in my systems integration days, we had very many law firm clients who used document management to organize the truly prodigious quantity of information they had to deal with. Spending $50K on the solution was not unheard of even among small firms. In fact, they usually wound up spending $20K just on third party maintenance utilities to support their document management systems!

  • by SplendidIsolatn ( 468434 ) <splendidisolatn@ ... minus physicist> on Monday February 11, 2002 @11:18AM (#2986889)
    Sorry if this sounds uninformed, but I had always been under the impression that Google's Business Plan was based on the idea of a free public search engine and a commercial private one for companies, which would also offer more and better features.

    Isn't this just confirming what we already knew?

    On top of that, depending on the size of your intranet and how efficient/inefficient indexing already has been, $20K may be a bargain.

    Of course, how many companies are really going to have a use for it? For giggles, lets say the entire Fortune 500. That's 500 * 20K = 10,000 K = 10 Million Dollars US. In the grand scheme of things, that's a lot of money, but not a LOT of money. Perhaps they'll add on pay-per-use functions for even ritzier search features?

    Sigs? We don't need no goddamn sigs!

    • $20k is just the tip of the iceberg - there's also a good revenue stream to be had in those yearly support contracts for the software.
    • If you read the entire article you would know that there are two versions for sale, one small $20k box which can index up to 150,000 documents, and one "millions of millions" version which costs $250k.

      If a large company puts out all the revisions of all their documents it will be quite a lot of documents :). $250k is still quite cheap for something that will index all electronic documents the company has ever produced.

  • Like infoseek.... (Score:3, Interesting)

    by CDWert ( 450988 ) on Monday February 11, 2002 @11:21AM (#2986906) Homepage
    Years ago Infoseek offered a version of their search engine to index LARGE collections of documents. We had over 500,000. It was around 15k if I remember correctly. Python on a Sparc 20 (20k itself at the time with memory, processors, array and tapes). So we had almost 4k tied up in the whole thing. There was, if I remember correctly, a per-site or per-page fee in addition over so many documents. I made an error in a config file once and allowed it to traverse links; other than filling the hard drive quickly, the additional costing we did afterward to see how much it would be should we decide to keep those docs was hilarious.

    20k isn't bad at all if you're talking some serious indexing. We indexed 5 F500 companies' technical documents at the time, before they were all in house; this was 97-98. It was slick. I often wondered what happened to that software package.

    Anyone know what google is written in? I decompiled a fair bit of Infoseek's just to see what was what, and because I could :) Indexing LARGE repositories isn't easy and config can be a pain. 20k sounds ok to me. I have YET to see an open source solution that can handle VERY large document sets; there's ASPSeek, but it still has issues, and over about 2.5 million docs I hear it's a dead horse.
  • by Bluedove ( 93417 ) on Monday February 11, 2002 @11:24AM (#2986923) Homepage
    Rather than a google engine to index everything there, i'd rather have a WayBack Machine [archive.org] that allows me to see the variant versions of documents. (that aren't in a revision control system accessible to me)

    Wouldn't it be great for when they say "your code doesn't meet the specification of what the product needs to do" and you can use it to say "let's look to the wayback machine to see when you changed the spec but didn't bother telling me"


  • I know I'm biased (and ignorant), but Google is probably the best general-purpose search engine out there, with truly innovative quality filtering like PageRank(tm) and other very neat tricks. They have been around long enough that even the weakest of minds know Google. If this new retail product is as efficient and clean as their websearch, and well supported, they're going to make a killing! I really hope they find huge success, they've earned it.
  • by victim ( 30647 ) on Monday February 11, 2002 @11:40AM (#2986994)
    Just curious about people's opinions here. Google gets covered fairly regularly on slashdot. Usually when a company that uses software patents to protect its business from competition comes up on slashdot they get reamed along with the USPTO.

    slashdot [slashdot.org] talked about this in 1999 when the patent came up. It's 2+ years later now. google has mostly crushed the competing search engines because the results of their algorithm are preferred to other algorithms. Their revenue sources are not public, but I believe I read recently that half of their revenue is from advertisements and half from technology licensing.

    So, the point for discussion...

    The world's favorite search engine exists because of its software patent. This patent has caused great harm to the competing search engines. Is this ok because...
    • the software patent system is just fine
    • many software patents are silly, but this one is worthwhile.
    • it is a silly patent, but google is good enough that we forget about that.
    • no one cares how google got where they are. It is just good that they work well.
    • it is not ok.
    • Because they don't do evil or annoying things. That isn't a tremendous excuse, but it just works in practice. No intrusive ads, performance is always great for a free service, etc.

      Philosophically, however, I'd imagine that parsing/indexing patents are far more legitimate in many people's eyes, than say, one click purchasing patents.

      • by swb ( 14022 ) on Monday February 11, 2002 @12:52PM (#2987431)
        Because they don't do evil or annoying things. That isn't a tremendous excuse, but it just works in practice. No intrusive ads, performance is always great for a free service, etc.

        Tremendous excuse? I'd say its a future model for all businesses.

        Forget the tedious absolutism of the neosocialists -- that model will never be implemented anywhere (except at the barrel of a gun), and anyone who won't be happy until they get there will never be satisfied. However, a company that does a good job at what they do and produces something that they can either give away or appear to give away, without doing the annoying, evil greedy things that other companies do, should be the benchmark.

        For example, Mercedes Benz -- what if they still sold their really expensive cars to rich guys who would pay for them BUT they would also sell a car that went 200,000 miles without major service for $10k?

        I think the list goes on -- subsidize basic, honest products and services with expensive stuff that others are willing and able to pay for. It makes you a saint. I don't see why so many other businesses hold onto the "rape everyone" philosophy.
    • Nobody's perfect (Score:3, Interesting)

      So what, Google isn't a 100% libre-kosher company? Name any of their competitors that is. It's called "lesser of two evils".

      As far as I know, Google has never filed for frivolous "IP" lawsuits, they respect web standards, they provide gratis, decent service, they don't fuck with your browser, and they tell you who paid for word placement as opposed to just putting paying advertisers on top without mention. They also happen to use free software and give it good press.
    • by ethereal ( 13958 ) on Monday February 11, 2002 @12:35PM (#2987309) Journal

      I agree with the "many are silly, but this one is worthwhile". Google's approach was non-obvious, innovative, and really advanced the state of the art. It wasn't just another "do what we did before, but with a computer this time" patent.

      I'll admit that it helps that their site is non-painful to use, but that's just gravy. Google's search is so much better that even if their site was a pain, it would still be a worthwhile search tool.

      • I think one could argue that ease of use is part of what makes their results so useful.

        If it was too complex for the average computer user to pull the data they need, I doubt they could stay profitable. Currently it's the best, not only for the results, but for how the end user interacts with their system.

        It's amazing how often the "I'm Feeling Lucky" button gets exactly what you're looking for.

      • I agree with the "many are silly, but this one is worthwhile". Google's approach was non-obvious, innovative, and really advanced the state of the art.

        Since the "state of the art" advances more quickly in CS than it does in most areas, should we expect Google to place its original patent in the public domain after several years? Or do you think that in several years, someone will invent a completely different algorithm that yields better search results, rendering Google's patent obsolete?
    • Just curious about people's opinions here. Google gets covered fairly regularly on slashdot. Usually when a company that uses software patents to protect its business from competition comes up on slashdot, they get reamed along with the USPTO.

      This is only alright for google, because the average joe slashdot user doesn't have to pay anything to use their services. (proving further that it's all about the "free beer").

      Look at the .gif or .mp3 standards. When the creators asked for a certain amount of money per usage, slashdotters were in an uproar.
    • I contend they would have succeeded with or without the patent. Like the old Altavista, Google has a cohesive picture of what a search engine should (and shouldn't) be.

      The unwashed mass of portal-shopping-news-flowers-and-oh-yeah-searching engines might mimic the ranking scheme, but the vision and interface? I'd be less surprised if the giant pandas solved their endangerment problem by building underwater colonies.

    • It might be about what they patented... They aren't suing AltaVista for having a search engine. When Amazon sued BN it was because they provided a similar feature, not because they copied the code. But, what do I know....
  • to make them profitable. Google does so many things so well, and provides it all free to the world. It's not asking too much, I think, for them to ask companies to foot the bill for something like this if that's what it takes for them to continue to stay in business and keep doing all this neat wonderful free stuff.
    • Google is and has been profitable. They are a private company that makes a profit. I don't know where all this crap comes from about "finally Google can make a profit...". Google is expanding their already successful busines...

  • Why? (Score:2, Interesting)

    Google is a great search engine for the Internet, because it ranks pages according to how many other pages link to them. It's very democratic. I don't see how Google behind the firewall would be a viable product. What will it do, rate documents on how many other company documents link to them?

    There are a number of other existing indexing engines that are significantly cheaper and more mature. Google should stick to what it does best. I guess this shows they aren't very profitable and are looking for other sources of revenue.
  • Not kidding. I work for a very large multinational and the corporate search engine is an exercise in frustration. Its purpose in life seems to be to return bizarre and obscure documents as the results of its searches.

    $20k is nothing to shell out[1] for the capabilities that Google has.

    [1] In corporate terms.

    • When I was an Intranet webmaster at Motorola, we used 'FreeWAIS' for Intranet indexing, until Corporate security decided that indexing everything was a security risk :-)

      Not kidding. I work for a very large multinational and the corporate search engine is an exercise in frustration. Its purpose in life seems to be to return bizarre and obscure documents as the results of its searches.

      You actually got results returned from your search server?
      Lucky bastard. Our corporate Intranet search engine usually would just return 'Query Timed out'. Eventually they just took the search boxes off all the web pages.

      I've since built a simple Harvest [ed.ac.uk] index for the Intranet.

      It can be very interesting finding all of the 'cobweb' documents on intranet sites. Ancient documents relating to projects and managers long since vanished among other stuff that management would prefer to see forgotten...

      There are some cool features that are unique to Google, but I'm not sure if 'Convert PDF to HTML' and 'highlight search terms' are worth $20K.

  • Open source, right? (Score:3, Interesting)

    by Zico ( 14255 ) on Monday February 11, 2002 @12:16PM (#2987223)

    Right now Google tends to be among the bigger darlings of Slashdot, but will they remain that way if they release this product and it's not Open Source? 'Cause they're nuts if they're planning on charging $20K for it but making it Open Source. Are they traitors to the cause, or is it just another understandable case of "Money talks, bullshit walks" when it comes to Open Source and the Real World?

  • Google's claim to fame is its ability to rank results properly (something no other search engine ever got right). The rank, if I recall correctly, is _mostly_ based on links from other sites.

    Now, when you're indexing thousands of doc and pdf files on a company network, how many of those link to each other?

    And how many companies have internal newsgroups that can be searched? (No, Exchange shared folders don't count - or can Google index those as well?)
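
    To make the point concrete, here is a rough sketch of a link-based score in the spirit of what is described above. It is a simplified PageRank-style iteration, not Google's actual ranking code; the function name `link_rank`, the damping value, and the sample documents are all made up for illustration. When documents never link to each other, every score comes out identical, so the links tell you nothing.

```python
def link_rank(links, damping=0.85, iterations=50):
    """Simplified PageRank-style score (illustrative only).

    `links` maps each document name to the list of documents it links to.
    """
    # Collect every document that appears as a source or a target.
    docs = set(links)
    for targets in links.values():
        docs.update(targets)
    docs = sorted(docs)
    n = len(docs)

    # Start with an equal score for every document.
    rank = {d: 1.0 / n for d in docs}
    for _ in range(iterations):
        new_rank = {d: (1.0 - damping) / n for d in docs}
        for d in docs:
            targets = links.get(d, [])
            if targets:
                # Split this document's score among the documents it links to.
                share = damping * rank[d] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # No outgoing links: spread the score evenly over everything.
                share = damping * rank[d] / n
                for t in docs:
                    new_rank[t] += share
        rank = new_rank
    return rank


# Hypothetical web-like graph: several pages point at "docs".
web = {"home": ["faq", "docs"], "faq": ["docs"], "blog": ["docs"], "docs": ["home"]}

# Hypothetical intranet share: files that never link to each other.
intranet = {"report.doc": [], "budget.pdf": [], "memo.doc": []}

web_rank = link_rank(web)
intranet_rank = link_rank(intranet)
```

    In the web-like graph the heavily linked page floats to the top; in the unlinked file share all three documents tie, so any useful ordering would have to come from something other than links.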
  • Like duh!


    (Please think about it before you roast me.)
  • google's cheap (Score:2, Insightful)

    by sl0ppy ( 454532 )
    for $40000, you can get a sun e220, and run altavista's search engine on it. even then, if you want to integrate it, you still need to do 30-40 hours of work to make it all work right.

    having something for $20000 or so is a godsend, especially if it comes with its own hardware (even though its hardware is probably not as nice as an e220)... throw in that they'll probably do the work when it breaks, and this is a no-brainer for anyone needing to index even as few as 25000 pages.
  • No more 25-man midnight raids that cart off your entire data center. Now the FBI or BSA can just pick up your search appliance.
