
News for nerds, stuff that matters

Google to Offer API

Posted by CmdrTaco on Sat Apr 06, 2002 12:54 PM
from the now-thats-smooth dept.
philipx writes "From the ruby-talk archives here's a little interesting snippet from a post you have to check out: "Here at Google, we're about to start offering an API to our search-engine, so that people can programmatically use Google through a clean and clearly defined interface, rather than have to resort to parsing HTML." It goes on talking about SOAP and I think this is utterly cool."
This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • Cool, but.... (Score:4, Insightful)

    by Patman (32745) <pmgeahan-slashdot@@@thepatcave...org> on Saturday April 06 2002, @12:55PM (#3295739) Homepage
    This is very cool, but how long will it last? How will Google make money (and, by extension, stay open) when you don't even have to visit their site?
    • Considering most of their money is made through licensing their search technology to other searching outfits and businesses, they shouldn't have any trouble making money.
      • If you look at that snippet of Ruby code, you can see that there is a field for a key of some sort. I'm assuming Google will sell you this service and provide you with a key with which to use it. I know absolutely nothing about Ruby (other than its name), though this is the first thing that came to mind when I saw that code.
    • They could actually charge for a devkit or usage to break even on the project. Even if it did cost some money, I could see it being well worth the price, if it works well.

      I just wonder how it will tie into my app. Will it open my browser? Will the Google Bar plugin be the foundation?

      We'll just have to wait and see...

      • They could actually charge for a devkit or usage to break even on the project. Even if it did cost some money, I could see it being well worth the price, if it works well.

        I just wonder how it will tie into my app. Will it open my browser? Will the Google Bar plugin be the foundation?


        The post describes a SOAP web service, which in most cases amounts to an RPC call from your application of choice. Unlike RPC in days of yore, however, using SOAP to do RPC in applications is relatively easy. If you want to learn more about SOAP, I suggest reading A Gentle Introduction To SOAP [weblogs.com] by Sam Ruby for an overview of the protocol, and A Busy Developer's Guide to WSDL 1.1 [weblogs.com] to see how one could go from defining a WSDL file (as the Google sysadmin is trying to do) to actually accessing the web service remotely from a Java application.

        There is also a grab bag of resources on XML webservices [gotdotnet.com] at the .NET Framework community website.

        To answer your question: if the Google API is available as a web service, then it can be integrated into any application at all, from command line to dynamic web page to GUI application, as long as the host machine has network access.
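As a rough illustration of what "SOAP as RPC" means on the wire, the sketch below hand-builds a SOAP 1.1 request envelope in Ruby. The method name doGoogleSearch, the urn:GoogleSearch namespace, and the parameter names come from the script posted elsewhere in this thread. The helper names soap_envelope and xml_escape are made up for this example, and the exact envelope Google's service expects is an assumption, since only client code has been shown.

```ruby
# Sketch only: a hand-rolled SOAP 1.1 envelope. A real client would normally
# let a SOAP library generate this from the WSDL file.

XML_ESCAPES = { '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;' }.freeze

# Escape the characters that are unsafe inside XML text content.
def xml_escape(text)
  text.to_s.gsub(/[&<>"]/, XML_ESCAPES)
end

# Wrap one remote call (method name plus named parameters) in an envelope.
# `soap_envelope` is a hypothetical helper, not part of any library.
def soap_envelope(method, namespace, params)
  body = params.map do |name, value|
    "      <#{name}>#{xml_escape(value)}</#{name}>"
  end.join("\n")

  <<~XML
    <?xml version="1.0" encoding="UTF-8"?>
    <SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
      <SOAP-ENV:Body>
        <ns1:#{method} xmlns:ns1="#{namespace}">
    #{body}
        </ns1:#{method}>
      </SOAP-ENV:Body>
    </SOAP-ENV:Envelope>
  XML
end

# Parameter names taken from the posted script; the key is a placeholder.
puts soap_envelope('doGoogleSearch', 'urn:GoogleSearch',
                   'key' => 'xxxxxxxxxxxxxxx', 'q' => 'ruby & soap',
                   'start' => 0, 'maxResults' => 10)
```

The point is only that the "remote procedure call" is an XML document posted over HTTP, which is why any language with an HTTP client and an XML library can take part.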
    • ..and may work again. I think Google is in a unique position where they could make a value proposition by using a combination of a barter system and, of course, the monetary system. Consider this: Google could have one of two modes of payment:
      • 1) you pay a subscription fee ($/query, or flat fee, a combination of both like the utilities... whatever works best).
      • 2) for people that don't want to pay, or cannot afford to pay... put up a barter system. The way that would work is, the subscriber gives up clock cycles (in the SETI@home fashion) to build up 'karma' or 'virtual money' that can be used to pay for the subscription system mentioned previously. That way people who cannot or don't want to pay for the service directly in real dollars can keep their computers on and earn google-dollars to redeem later.
        • Google benefits from the monetary system in an obvious way. They also benefit from the barter system by vastly adding to the crunch power, which hopefully improves their indexing/grading system. Unused clock cycles which would otherwise be wasted can now earn some value for the users and at the same time give Google the 'value' for providing their service.

          So their 'open' system if presented in the form of barter could actually work for the advantage of both parties involved.

      • That's the beautiful thing about a site that doesn't fund themselves through the use of banner ads. It doesn't MATTER if you access their content through the main website, an affiliate like yahoo, or an api interface.

        They sell text ads on their main site. Yahoo pays them a fee. So how will the API users pay for themselves?

      • No, they don't sell any of the search result slots. They sell ads above and to the right of the search results, but these do not affect the search results themselves.

        - Amit
  • Cool feature (Score:4, Insightful)

    by Rock 'N' Troll (566273) on Saturday April 06 2002, @12:56PM (#3295742) Homepage Journal
    Good idea. By the way, shouldn't /. have a specific "Google" topic?
    • Perhaps a farmer picking apart a haystack, one piece at a time.
    • seriously,

      all of us do nothing but rave about google day and night

      for it is a search engine we love, with a company many of us have come to love

      I for one would love to see google have its own slashdot icon

      Come to think of it, there are plenty of USELESS icons none of us give a damn about

      the following are a few:

      Beanies [slashdot.org]
      E+ [slashdot.org](huh?)
      OS 9 [slashdot.org]

      Here's hoping for a new google icon!!

      Just my two cents, all taxes included

      Sunny Dubey

  • DoS Google? (Score:3, Interesting)

    by David Kennedy (128669) on Saturday April 06 2002, @12:56PM (#3295743) Homepage
    The only problem I can see with this is that there was a recent thread on here about Google blocking a lump of IP addresses as someone in there was automatically querying way too often and affecting their load.

    With the exposed API I could see, by malice or sheer accident, floods of queries coming in...
    • Re:DoS Google? (Score:4, Informative)

      by Otterley (29945) on Saturday April 06 2002, @01:23PM (#3295821)
      In order to DoS Google it doesn't really matter whether you bang on the front door or the back door.

      In fact, an attack through the front door will be more likely to succeed because you're hitting the rendering engine, which takes a lot more CPU time (believe it or not) than the search engine.

      OTOH the back door is lightweight and is as such advantageous for not only third parties but also Google itself to employ.

      Besides, if you're being abused, if you don't want to use technological avenues to keep miscreants away, you can always use legal ones.
  • Text ads... Open standards for content distribution... If only certain other sites [slashdot.org] would follow...
    • Open standards for content distribution... If only certain other sites...

      Ah, but they do [slashdot.org]

      • That's just the headlines... I want messages... moderations... articles... everything...
          • I'd be perfectly willing to subscribe if I got XML data. It's not about the ads, as you've said there are plenty of easier ways to get rid of ads (I use mozilla). It's about being able to write a PHP script to show replies to my posts on my PDA. It's about coming up with my own arbitrary moderation scheme (automatically -1 posts with the word "unconstitutional" in it).
    • Text ads... Open standards for content distribution... If only certain other sites would follow...

      Apples and oranges... Google's bread and butter is their patented PageRank technology, which they license for what I'm sure is a lot of money. Slashdot, having made the decision to open-source slashcode, does not have this option, so we're forced to endure banner ads and subscriptions as their only source of revenue. Ironic, eh? The people that screamed so loud about how long it took /. to release the source for slash are now bitching about subscriptions and banner ads. Like it or not, if slashcode were proprietary it could be sold and licensed and you wouldn't have to see ads here (or at least not the larger ones). Sourceforge figured this out too late, and is now trying to sell the SourceForge software as a source of income.

      Hopefully /. will wise up and figure out that if they ever want to make any real money they'll have to offer a real service, like consulting for companies/webmasters to set up slashcode for customers (like MySQL AB does). Too bad VA Linux went out of the hardware market. I think a pre-configured "Slash Appliance" (sort of like Google's Search Appliance [google.com]) would be cool as hell for companies needing an internal collaboration system. /. has really missed the boat here, IMHO.

      Shayne

  • by astrashe (7452) on Saturday April 06 2002, @01:02PM (#3295763) Journal
    This is really fantastic. I can already think of a dozen scripts or so that I'd like to write to take advantage of this. I love the fact that this is from a Ruby list, and it's about Google. It's not MSDN and MSN.

    They'll need a business model of some sort -- without the ads, and with the potential this has to hammer their servers, they'll need to meter access to the API in some way. But I'll pay -- where do I sign up?

    I'll bet that this is how they'll end up making most of their money a couple of years from now.
    • It's not MSDN and MSN.

      I'm curious as to whether people would actually want such functionality from MSDN. It's one thing to be able to do a Google search from a function call and get the results back as XML but do people want API docs and technical articles retrieved via getArticle() and getAPI() webmethods?

      One place where it might be useful however is KnowledgeBase articles [microsoft.com]. Perhaps a web service that retrieves a KB article given the Q number (e.g. Q123456) might be useful.

      Disclaimer: This post is my opinion and does not reflect the thoughts, strategies, intentions or opinions of my employer.
    • One way I'd consider of making money if I were google would be to merge the AdWord links with the top results so they are indistinguishable. If you have a paying account with them, you can opt to "clean up" the results.
    • They'll need a business model of some sort -- without the ads, and with the potential this has to hammer their servers, they'll need to meter access to the API in some way. But I'll pay -- where do I sign up?

      Another option is to give better access to paying customers: a paying customer is given unlimited use of the search, while private individuals (distinguished via IP/registration/...) would be limited to, say, one search per 5 seconds. It would be great to be able to use this API for some small things without having the hassle of paying. A 5-10 second delay isn't very bad in a small home situation, but is out of the question for any larger-scale applications.

      I'd say it would also be consistent with their current user-friendly business model, and give another jolt of good PR for them.
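The tiered scheme described above (unlimited for paying customers, roughly one search per 5 seconds for everyone else) could be sketched along these lines in Ruby. This is purely illustrative: the class name SearchThrottle, the client_id notion (an IP or registration ID), and the 5-second default are assumptions drawn from the comment, not anything Google has announced.

```ruby
# Sketch only: per-client throttling with a free tier and a paid tier.
class SearchThrottle
  # min_interval: seconds a non-paying client must wait between searches.
  # clock: injectable time source, which makes the class easy to test.
  def initialize(min_interval: 5.0,
                 clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @min_interval = min_interval
    @clock = clock
    @last_allowed = {} # client_id => time of the last permitted search
  end

  # Returns true if the search may run now, false if the client must wait.
  def allow?(client_id, paying: false)
    return true if paying # paying customers are never delayed

    now = @clock.call
    last = @last_allowed[client_id]
    return false if last && (now - last) < @min_interval

    @last_allowed[client_id] = now
    true
  end
end

throttle = SearchThrottle.new
puts throttle.allow?('203.0.113.7')               # first free search goes through
puts throttle.allow?('203.0.113.7')               # an immediate retry is refused
puts throttle.allow?('203.0.113.7', paying: true) # paying customers skip the wait
```

A refused request does not reset the timer here, so a client hammering the service still gets one search through every 5 seconds rather than being locked out indefinitely.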
  • Cool! (Score:2, Insightful)

    For my high school senior project I wrote a Java program that made specific searches on Google and parsed the results. I spent 3/4 of my time perfecting the nasty string manipulation to strip out the HTML and isolate individual results, URLs, etc. in my own database. Had the API come out two years ago, I would have spent a lot less time on that thing. Way to go Google!

    Could this be in response to the supposed competition from tokohma? Open up their results in some way to increase their usage?

  • by Anonymous Coward on Saturday April 06 2002, @01:03PM (#3295768)
    Google Terms of Service [google.com]. Some excerpts:

    Personal Use Only [...] You may not take the results from a Google search and reformat and display them

    No Automated Querying: You may not send automated queries of any sort to Google's system without express permission in advance from Google.
    So how useful might that API be if you can't do anything with it...

    • by red_dragon (1761) on Saturday April 06 2002, @01:16PM (#3295805) Homepage

      I'd assume that the API would be subject to a different set of terms and conditions than those for the main site. Given that it'll probably be a pay-for-use service (as another poster hinted at), it'd most certainly be that way.

    • You could write a program that would use Google as its back-end search engine for the Internet. That program could be sold and anyone can use it on their computer. For instance, I can imagine Apple using that API for their Sherlock search application. Just because the software that uses it is distributed doesn't mean that it violates the license. As long as the results are not redistributed (i.e. in your own public web site), and the search is initiated only upon request from the user and is not some kind of cron job, then it's okay. Apple's Sherlock and Mozilla's search tools both conform.
    • Um, doesn't it say you can do something with it, or did I miss a statement about personal use somehow?

      I suspect that, despite the outcry and outrage from some quarters, they're not simply going to give away their entire search engine API connected to their search farm. Perhaps they'll limit it, meter it, and even charge for it. All would be more than fair.
    • No contradiction (Score:5, Insightful)

      by fm6 (162816) on Saturday April 06 2002, @01:55PM (#3295913) Homepage Journal
      The "Personal Use" restriction means that you can't download results from Google, then pass them on as your own product. There's no restriction on downloading and reformatting results for your own use. Nor on applications that help you do it. There are already a lot of products that do this -- including plugins for all the major browsers.

      On the other hand, Google would obviously not want you to set up your own search site that passes queries to their engine, harvests the results, and presents them on your own site. That is the obvious target of the "Personal Use" restriction.

      As for the "Automated Query" restriction -- well, what do you think they mean by "Automated"? Programmatic access to their engine? They couldn't prevent that even if they wanted to. "Automated" obviously means programs that issue hundreds of queries for data mining purposes. Example: crawling the Groups archives to harvest email addresses.

      (This was a matter of some concern to me, when I noticed that the Google Usenet archives included all my company's private groups. I'd used my real corporate email, innocently thinking that the groups weren't accessible outside the company. But the spam volume is still very low. Their bot detection software must be quite good.)

      Note that making a simple API available doesn't enable any new kind of access to the Google engine. A clever programmer can already parse the HTML results. The API just makes it easier -- and gives Google another product they can sell licenses for.

      • Yes they can prevent it. I wrote a script to browse google's cache, and they somehow detected it and disallow my IP from running it. But I can still query through the browser.
  • Some unscrupulous players could surely abuse this by 'making their own' search engines that essentially rip off Google without any hassle whatsoever?
    OK, it can be done already, but this would make it possibly too easy...?

    Also, this will miss out the ads etc. that they get revenue from. I wonder what their long-term strategy is?

    • While it is possible in practice, I doubt they could pull it off. Let's say you do create a search engine and all of a sudden a huge number of requests come through. What do you do?

      Or let's say Google spikes the search results for some competitors to prove they are using Google.

      So sure, they could do it, but I doubt any popular site could get away with it for long.
    • Some unscrupulous players could surely abuse this by 'making their own' search engines that essentially rip off Google without any hassle whatsoever?

      Yahoo [yahoo.com] hasn't had enough problems with it to take it down. It's really nice being able to make my own PHP script to display customized stock quotes on my PDA.

    • If you look at the way it works right now, using the interface requires an authorization key.

      If you run the Ruby script, as is, the result is thus:

      #: Exception from service object: Invalid authorization key: xxxxxxxxxxxxxxx (SOAP::FaultError)

      If somebody starts abusing a particular key, it's a no-brainer for Google to shut the key off.
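To make the "shut the key off" idea concrete, here is a minimal Ruby sketch of a server-side key registry. Everything in it is assumed for illustration: the KeyRegistry class, the abuse_threshold setting, and the automatic revocation rule are invented here, and only the wording of the error message is taken from the fault shown above.

```ruby
require 'set'

# Raised when a request presents a key that is unknown or has been revoked.
class InvalidKeyError < StandardError; end

# Sketch only: tracks which keys are active and how often each was used.
class KeyRegistry
  def initialize(abuse_threshold:)
    @abuse_threshold = abuse_threshold # hypothetical per-key query cap
    @active = Set.new
    @usage = Hash.new(0)
  end

  def issue(key)
    @active.add(key)
  end

  def revoke(key)
    @active.delete(key)
  end

  # Called once per incoming query. Counts the query against the key and
  # revokes the key once it reaches the cap, so later queries are refused.
  def authorize!(key)
    raise InvalidKeyError, "Invalid authorization key: #{key}" unless @active.include?(key)

    @usage[key] += 1
    revoke(key) if @usage[key] >= @abuse_threshold
    true
  end
end

registry = KeyRegistry.new(abuse_threshold: 3)
registry.issue('demo-key')
3.times { registry.authorize!('demo-key') }

begin
  registry.authorize!('demo-key')
rescue InvalidKeyError => e
  puts e.message
end
```

Because every query carries a key, abuse is attributable to one account rather than to an IP range, which is presumably why a key is attractive compared with blocking addresses.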

  • I've been writing a bookmarking application that directs the user to Google and later remembers the last Google search so you can resume it. This API will simplify the interface significantly and open up a whole new world of possibilities.
  • by Anonymous Coward on Saturday April 06 2002, @01:13PM (#3295793)
    The first page I visit every morning [monolinux.com]
    ---

    The following is the preliminary code that a particular Google sysadmin (ian@) is trying out. He'd prefer to have a single WSDL file do all of the configuration (from Google's end to the client), but he first needs to get some advice from an experienced Ruby hacker.

    Also, let's keep in mind that this API will actually be decreasing Google pageviews and hits, which will in turn make their AdWords, AdWordsSelect, and textads less effective. So, it's our duty to continue to support Google and show them that the free/open source software people are behind them 100%. We know that Teoma just doesn't deliver, and Google's already got 3 billion pages indexed and cached.

    Support Google today, because they're the future of information indexing on the Web!

    --- begin code ---

    #!/usr/bin/ruby

    require 'soap/driver'

    endpoint = 'http://api-ab.google.com/search/beta2'
    ns = 'urn:GoogleSearch'
    key = 'xxxxxxxxxxxxxxx'
    service = 'file:GoogleSearch.wsdl'
    query = ARGV.shift || 'foo'

    soap = SOAP::Driver.new(nil, nil, ns, endpoint)

    # uncomment the next line to dump the traffic on the wire
    #
    #soap.setWireDumpDev(STDERR)

    soap.addMethodWithSOAPAction('doGoogleSearch', ns, 'key', 'q', 'start',
                                 'maxResults', 'filter', 'restrict',
                                 'safeSearch', 'lr', 'ie', 'oe')
    r = soap.doGoogleSearch(key, query, 0, 10, false, nil, false, nil,
                            'latin1', 'latin1')

    printf "Estimated number of results is %d.\n", r.estimatedTotalResultsCount
    printf "Your query took %6f seconds.\n", r.searchTime
  • by Idimmu Xul (204345) on Saturday April 06 2002, @01:16PM (#3295803) Homepage
    key = 'xxxxxxxxxxxxxxx'

    I haven't tried to get it to work yet, due to not having Ruby installed, but does this imply some sort of subscription service?

    Possibly a new way for them to raise revenue? I'm assuming that the bold line means the author's key has been blanked out so other people can't abuse this service for free?

    Lameness filter encountered. Post aborted! Reason: Too much repetition. :/

  • Are they going to release the source code to the search engine itself? That would be REALLY cool...

    We can finally find out how to implement their PigeonRank system...
  • by drok (78225) on Saturday April 06 2002, @01:23PM (#3295822)
    Last year Google temporarily had an XML interface available using a query like: http://www.google.com/xml?q=slashdot

    Of course, now it's just forbidden. I am surprised they would go back to such a service, since it would seem to wind up losing revenue for them, depending upon whether or not people are good about passing along whatever AdWords Google returns. They could expect the traffic to be low enough to not matter compared to the continued word-of-mouth benefit. Or access to the SOAP interface could be offered as a subscription model (pure speculation on my part).

    -Robert
    • See, with Google, the ads are really links that are for the most part relevant to the topic being searched for.

      If the results are returned using SOAP, then the backend surely would want to display the ads because a lot of the time, they are what the user is looking for.

      I know if I am looking to buy something and search Google for vendors, I am more likely to choose a vendor from the ads on the side. I figure it is a bit safer since these people actually have something invested in it.

      The only reason I can think that someone would filter out the ads is simply because they want to hurt Google. Who wants to hurt Google though?

      The click-through rate is probably going to make things hard, since there is no way to tell if a user clicked an ad. That just means a different gauge...
  • Ode to Google :) (Score:4, Insightful)

    by Khalid (31037) on Saturday April 06 2002, @01:29PM (#3295842) Homepage
    Google has been an enchantment for me since its beginning!

    They have always made the right decision! They have offered internet users an incredible asset! And I was so grateful when they decided to rescue Deja, a site I just don't know how I could live without!

    I view them as the most "honest and fair" site on the Net, and without any doubt the most useful too.

    Go Google! You are showing the right way to all these stupid, crappy portal sites which have invaded the net. I just hope you manage to stay in business and prosper for a loooooong, looooong time :)
  • it would have made the creation of my random google searcher [pbump.net] a bit easier, and faster.
  • Is this API going to have "a system and method for enabling information providers using a computer network such as the Internet to influence a position for a search listing within a search result list generated by an Internet search engine"? Because that is what Google is being sued for at the moment [slashdot.org]. Interesting that they chose now to release the API. It's almost as if, by showing that the function is an integral part of a different system (by way of this new API), they'd have a chance in the courts. I'll let you be the judge!
  • com.google.soap.search.GoogleSearchFault: Invalid authorization key: xxxxxxxxxxxxxxx
    at com.google.soap.search.QueryLimits.lookUpAndLoadFromINSIfNeedBe(Query
    ...

    Alas, looks like the rest of us won't be able to play with Google's beta SOAP service. Which makes quite a bit of sense - this would be a great way for Google to allow people to resell Google in a standardized way, be it from inside a program (scary, too easy to reverse engineer) or from some other web service (less scary. :-). It doesn't make much sense for Google to say, "Hey, world, come and use our search services for free without our ads."
  • by fm6 (162816) on Saturday April 06 2002, @01:57PM (#3295920) Homepage Journal
    They keep adding groundbreaking features to their products and throwing them out as if it were no big deal. Don't they know they're supposed to beat the PR drum every time one of their engineers burps? Bunch of commies!
  • by Anonymous Coward on Saturday April 06 2002, @02:37PM (#3296065)
    They do an output without HTML already, but it looks like they've restricted access to it. Compare
    http://www.google.com/search?hl=en&q=blah&output=protocol4
    with
    http://www.google.com/search?hl=en&q=blah&output=washingtonpost
    with
    http://www.google.com/search?hl=en&q=blah&output=unclesam

    -nonymous
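For anyone wanting to compare those variants from a script, a small Ruby sketch follows. It only builds the URLs and fetches nothing. The helper name search_url is made up for this example; the hl, q, and output parameters and the three output values are the ones listed above, and what each one actually returns (or whether access is still allowed) is not something this sketch can tell you.

```ruby
require 'uri'

GOOGLE_SEARCH = 'http://www.google.com/search'

# Build a search URL, optionally adding the `output` parameter seen above.
# URI.encode_www_form takes care of escaping the query text.
def search_url(query, output: nil, lang: 'en')
  params = { 'hl' => lang, 'q' => query }
  params['output'] = output if output
  "#{GOOGLE_SEARCH}?#{URI.encode_www_form(params)}"
end

# Print one URL per output format mentioned in the comment.
%w[protocol4 washingtonpost unclesam].each do |format|
  puts search_url('blah', output: format)
end
```

Keep the terms of service quoted earlier in the thread in mind before pointing anything automated at these URLs.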
  • by wka (23275) on Saturday April 06 2002, @02:41PM (#3296092)
    First, here's a link to a current XML API for accessing Google:

    http://www.google.com/xml?q=slashdot [google.com]

    You'll (probably) get an error page.

    I read about this on Scripting News [scripting.com] in February:

    Dave Winer made an inquiry [userland.com] to Google about accessing this XML API.

    Their initial response [weblogs.com] was not very helpful, asking for the link to be removed, and saying that the link is "obviously reserved for Google partners." Eventually, Google let Dave access the API. Now, he sounds like he's under NDA [userland.com] about this.

  • by Perdo (151843) on Saturday April 06 2002, @05:32PM (#3296684) Homepage Journal
    Slash either needs to get a Google box or use these APIs to fix their search feature. There is so much haystack data compared to good needles on Slashdot, and the search is so bad, that most of the great gems of knowledge that Slashdot has generated might as well have never existed. It can take an hour to find even a popular poster's comments.

    Need to reference John Carmack's comments? Sorting him out of the masses is next to impossible. Even a comment poster as prolific as Signal 11 (arguably Slashdot's first and greatest Karma Whore) is nearly impossible to find. First 30 matches of how many? Do you want to sort by hand through jeffy124's 700+ comments and 24 submitted stories just to find the pertinent one? Not to mention the benefit to Slashdot's editors of being able to follow a clear history of articles on a given subject to look for repeats and make more informed editorial commentary. If 90% of readers never read the comments, the editors owe that 90% the sort of editorial commentary attached to each story that only good research can provide.

    In fact, the editors could try it on an interim basis immediately, and provide the service to readers only if they had the resources. I sort of get the feeling that the editors are still thinking of slashdot as a small time blog run out of their apartment closet server.

    Run Google on Slashdot now and you get the news from three weeks ago. Incorporate a Google box or Google APIs into Slash so I could search today's news, and I would pay 10 cents of subscription funds per search in a heartbeat.

    Editors: look at the number of hits to your current broken search engine. Double that number because a dedicated google box would be so much better it would get used a whole lot more. Multiply that by 10 cents per search. See if the numbers work to afford the initial expenditure to get a nice yellow rack mount google box. Slashdot is sitting on a goldmine of data and no one can search it and Slashdot cannot profit from it without a nice pay per search subscription using the best engine available.
    • Google does not have a pay for placement plan - if you are making reference to the practice of changing the order of search results based on advertiser dollars.

      That was the very thing that turned people onto Google. I very much doubt that they would change that.