Google to Continue Storing Search Requests

isabotage3 writes "Although he was alarmed by AOL's haphazard release of its subscribers' online search requests, Google Inc. CEO Eric Schmidt said Wednesday the privacy concerns raised by that breach won't change his company's practice of storing the inquiries made by its users."
This discussion has been archived. No new comments can be posted.

  • not news (Score:5, Funny)

    by Lehk228 ( 705449 ) on Wednesday August 09, 2006 @10:11PM (#15878070) Journal
    what does google have to do with any of this? it's somehow news that google somehow is confident that they aren't a bunch of total fuckups like AOL?

    my girlfriend's cat could run an ISP better than AOL
  • by Anonymous Coward on Wednesday August 09, 2006 @10:12PM (#15878073)
    Everyone knows that Google is really a front for the NSA. Think about it: massive quantities of data, searches
    that can be correlated and traced back to individual users, gmail storing and 'indexing' all your mail. It's
    the government's wet dream.

    Just wait until Windows-Live services take off , and G-Drive as well. Why not have all your data ready for inspection
    by the nice people at the NSA.

    'scuse me, there's a knock on the door, the folks from the black pizza van prolly wanna ask for directions.
    • the links to the CIA are there for all to see

      Google was created by Hoover back when the internet was just a couple of tubes. Intelligence gathering by vacuum suction.
      The cleaner was just a commercial spin-off.
      Gmail ... G Male... G Men.

      how much more obvious can they be people?
  • The differance (Score:3, Interesting)

    by gomaze ( 105798 ) on Wednesday August 09, 2006 @10:14PM (#15878080) Homepage
    The biggest difference is that the majority of the AOL searches were done while users were logged into AOL. Thus it will be a bit harder, though not impossible, to trace what people search for back to them if they are not logged in. Here's hoping Google has a better lockdown policy than AOL.
    • Re:The differance (Score:3, Insightful)

      by CheeseTroll ( 696413 )
      Unless you have a gmail account, and don't remember to log out and delete your Google cookies.
      • And don't forget Google AdWords to boot.
      • There was a previous article a long ways back about how Google tracks search characteristics by IP address. This was before Gmail came out, but I wouldn't doubt they still use it. Your cookies and Gmail are not safe.

        What really concerns me is this: suppose someone searches for something like their SS# or CC#, worried that someone had gotten ahold of it and posted it online. They think that the search goes into a black hole when instead it's now going into a "memory hole," à la 1984.

    • Re:The differance (Score:5, Interesting)

      by AchiIIe ( 974900 ) on Wednesday August 09, 2006 @10:41PM (#15878179)
      Do not forget that
      a) Google keeps a permanent cookie
      b) If you ever used gmail, that same cookie has been linked to your permanent cookie

      We need something that will keep those results private, something such as:
      a) A Greasemonkey/Adblock setup to deny Google searches access to the cookie
      b) Automated searching tools that will pollute one's searches with fakes
      c) Deeper-level (i.e. Proxomitron / Privoxy) scripts that clear this out

      and while here, I would like to talk about, they have a fantastic privacy policy, I encourage you to read it: []
      • Re:The differance (Score:5, Interesting)

        by nolife ( 233813 ) on Wednesday August 09, 2006 @10:58PM (#15878236) Homepage Journal
        Permanent cookie?
        I set cookies to delete automatically when closing FF and have used some combination of tools or manually doing it at least weekly for years. I doubt mine is anything close to permanent.
        • Re:The differance (Score:5, Informative)

          by nmb3000 ( 741169 ) <> on Thursday August 10, 2006 @12:12AM (#15878471) Journal
          I set cookies to delete automatically when closing FF and have used some combination of tools or manually doing it at least weekly for years.

          I think this is kinda funny.

          The whole original point of cookies was to make a user's life easier. You don't need to log into Slashdot every time you visit the page. You only need to authenticate with GMail or Yahoo once a day to read email. Your shopping cart is remembered. Etc, etc, and yet people are so paranoid that they still clear them out on a regular basis.

          It's true that there's some data mining involved, but I think it's trivial enough that it's not worth the extra effort (IMO anyway). So what if Doubleclick (may they burn in Hell forever) knows that some guy visits Slashdot, ThinkGeek, and PennyArcade? I figure my privacy is fine as long as they cannot link the activity back to me personally. If that bothers you, whitelisting sites makes it pretty easy to weed out data miners, though it can become a pain when sites use cookies for navigation, shopping, etc.

          One tip I do have for IE users, is to try out the Restricted Sites zone. I've added a few sites to it and it drastically speeds up page loads. For example, used to be slow and ad-ridden with popups, but after adding it to Restricted Sites, it has no cookies and no JavaScript which means no ads, no popups, no nothing. Page loads are 500% faster.

          I use my Windows credentials to secure my computer and enjoy the typing saved by not clearing my cookies every ten minutes.
          • Re:The differance (Score:5, Insightful)

            by Jah-Wren Ryel ( 80510 ) on Thursday August 10, 2006 @01:35AM (#15878670)
            So what if Doubleclick (may they burn in Hell forever) knows that some guy visits Slashdot, ThinkGeek, and PennyArcade? I figure my privacy is fine as long as they cannot link the activity back to me personally.

            The ignorance in this statement is so staggering that I had to respond and lose the moderations I've made on other posts to this story.

            If you have any account online for which you have ever disclosed your true identity (like in order to make a purchase) then that account information can and will be cross-referenced with all of the tracking data that the tracking companies have been able to put together on you. They are exceptionally good at finding those information leaks and putting 1 and 1 and 1 and 1 together to make 4.

            Don't be lulled into a false sense of security even if you are the type to disable cookies. Cookies are not the only way Doubleclick and the like track people. Embedded images, tags, 3rd party style sheets with god knows what javascript, ip address correlation, etc. The bag of tricks is practically bottomless.

            I religiously use the following extensions to Firefox, with almost every site fully locked out, and even then I still leak personal information like a sieve:

            NoScript
            CookieSafe
            AdBlock Plus

      • by Anonymous Coward on Wednesday August 09, 2006 @11:31PM (#15878361)
        In Firefox 1.5.x

        Edit -> Preferences -> Privacy Tab -> Cookies -> Exceptions

        Then add the Google domains you wish to block/allow. This will result in many random cookies being generated by Google for each search done (as they will think you are a newcomer each time). Personally I white-list all my cookies, only allowing the sites I trust to set cookies, which are then automatically cleared when I close Firefox.

        Also, do not use GMail via the web interface; it is possible to use GMail via an email client residing on your computer. r=13273 [] er=13285 []

        From there you can use your choice of email Encryption/Steganography as you see fit.

        You can only be controlled, if you allow it.
        You can only be surveilled if you are unaware of it or ignore it.
        It's your choice.
        • Also do not use GMail via the web interface, it is possible to use GMail via an email client residing on your computer.


          From there you can use your choice of email Encryption/Steganography as you see fit.

          most excellent... but I'd still have to go online every now and again to selectively archive or delete items.

  • False Positives (Score:5, Insightful)

    by BrianMarshall ( 704425 ) on Wednesday August 09, 2006 @10:16PM (#15878089) Homepage
    I don't like it.

    If the government ever does hunt for people guilty of something by searching people's searches, they are going to get a lot of false positives. There are always more people interested in, for example, bombs, than there are bombers.

    • Re:False Positives (Score:5, Insightful)

      by kfg ( 145172 ) * on Wednesday August 09, 2006 @11:05PM (#15878261)
      If the government ever does hunt for people guilty of something . . .

      Who said they're hunting for guilty people?

    • Re:False Positives (Score:2, Interesting)

      by Zapd ( 29091 )

      There are always more people interested in, for example, bombs, than there are bombers.

      And then there are the clever bombers. The dangerous ones, that don't use Google or Ebay.
    • Re:False Positives (Score:5, Interesting)

      by nametaken ( 610866 ) on Thursday August 10, 2006 @04:15AM (#15879029)
      That's true, but I'm not worried about them finding out that I once read up on explosives. In fact, I'd be just fine if I trusted that they were only finding bombers with that stuff.

      I'm more worried that some day I'll be a reasonably successful businessman (however unlikely), with a big mouth. Then they'll go find all the most vulgar shit my friends and I have swapped via email and use it as a, "look what a f'ing weirdo this guy is... let's have DCFS take his kids because he replied 'ha ha' to that awful video way back in 2002."
    • From this [] statistical analysis of similar screening systems:

      The US Census shows that there are about 300 million people living in the USA. Suppose that there are 1,000 terrorists there as well, which is probably a high estimate. The base-rate would be 1 terrorist per 300,000 people. In percentages, that is .00033%, which is way less than 1%. Suppose that NSA surveillance has an accuracy rate of .40, which means that 40% of real terrorists in the USA will be identified by NSA's monitoring of everyone's ema

      • That doesn't make surveillance useless... it's a classic problem in information theory: precision vs. recall, or whatever you want to call it.

        Precision: What fraction of the search RESULTS is relevant
        Recall: What fraction of the RELEVANT data is identified by your search

        According to that article, NSA's recall is 40%, while 99.99% of innocent people are correctly cleared (strictly, that second figure is specificity rather than precision). This indicates that their surveillance strategies are actually rather good. The "problem" is that the population studied has many more innocent people than te
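The base-rate arithmetic in this sub-thread can be worked through in a few lines. This is only a sketch: the population, terrorist count, and 40% detection rate come from the quoted article, while the 0.01% false-positive rate is an assumption read off the "99.99%" figure above, and the function name is made up for illustration.

```python
def screening_precision(population, targets, recall, false_positive_rate):
    """Fraction of flagged people who are real targets."""
    true_positives = targets * recall
    # Everyone who is not a target can still be flagged by mistake.
    false_positives = (population - targets) * false_positive_rate
    return true_positives / (true_positives + false_positives)


# Figures from the quoted article, plus an ASSUMED false-positive rate of 0.01%.
precision = screening_precision(
    population=300_000_000,
    targets=1_000,
    recall=0.40,
    false_positive_rate=0.0001,
)
print(f"Share of flagged people who are real targets: {precision:.2%}")
```

Under these assumptions the flagged list is dominated by innocent people, which is the false-positive point made at the top of this thread: even a very low error rate swamps a very rare target.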
    • Well, we could just arrest all the people that were looking, too. Actually, I was going to make one of those cliched "guilty nothing to hide" references, but then it dawned on me -- if we arrested enough of America, won't it eventually get to a halfway point where we just all switch? We get inherent access to prisons for safety, and criminals get kicked out into the real world? I'd live in a cell block if it had free HBO and muffins.
  • by artifex2004 ( 766107 ) on Wednesday August 09, 2006 @10:17PM (#15878098) Journal
    "I know this one guy who asked me to cancel his account last week, and a couple days later his mom found out about his lesbian penguin grits fetish. Now, I'm not threatening you, or anything. I'm a reasonable guy. I'm just sayin', you might want to give that some more thought, Mr. cheating-on-wife-on-the-down-low..."
  • I, for one, don't mind all that much if Google saves my search inquiries, just so long as they keep the information private and (hopefully) anonymous. Google has also had a pretty damn good track record at doing just that.. Comparing them to AOL isn't even apples and oranges.. More like apples and live grenades...
  • Never? (Score:5, Insightful)

    by SandmanWAIX ( 674838 ) on Wednesday August 09, 2006 @10:32PM (#15878148)
    "We are reasonably satisfied ... that this sort of thing would not happen at Google, although you can never say never," Schmidt said during an appearance at a major search engine conference in San Jose.

    Well .. you could if you didn't store them.
  • by LiquidCoooled ( 634315 ) on Wednesday August 09, 2006 @10:33PM (#15878150) Homepage Journal
    Storing every single search performed by every person in the world across a whole epoch could pretty much give you the pulse of the world.
    Watching as news spreads and worries and concerns grow, or when good news occurs or even just good publicity, there are millions of people all adding entries into the real Hitchhiker's Guide.

    Google will be almost certain of knowing the current number one chart hit at any location on Earth at any time simply by the concentration of searches for that artist/song, it could follow gun culture or tv plotlines or anything flowing into its servers.

    In the right hands, this could become an amazing asset for the whole world. I believe the current owners of google are primed to achieve such a feat.

    I however wonder what will happen when Page and Brin are gone or are sidestepped by the government.
  • by talledega500 ( 994228 ) on Wednesday August 09, 2006 @10:46PM (#15878189)
    A search proxy prevents tying your IP and user identity to your search terms and to the results you click on. Get hip to it. A lot of services exist. This is my fav []
  • THEY AIN'T PRIVATE (Score:3, Insightful)

    by tomstdenis ( 446163 ) <> on Wednesday August 09, 2006 @10:52PM (#15878213) Homepage
    You sent it as PLAINTEXT over the INTERNET.

    This [or the thing against AOL] is not a story.

    I couldn't care less about Google releasing all the odd shit I look for. If I did, I would find a private search engine that worked over HTTPS.

  • Logging vs. Abuse (Score:4, Insightful)

    by otisg ( 92803 ) on Wednesday August 09, 2006 @11:04PM (#15878259) Homepage Journal
    This is to be expected, and Google is right. Of course they won't stop storing all this information about us. Sure, it can be used for all kinds of evil purposes (but they don't do evil, right?), it could be misused, as in the recent AOL example, or it could be used for all kinds of good things, such as having a search engine that knows what I want before I have time to enter my query.
  • by treerex ( 743007 ) on Wednesday August 09, 2006 @11:05PM (#15878263) Homepage
    Every search engine logs your queries. This is the way it is. If they tell you they don't log the queries, they're lying. The difference is that they don't make it available. In a previous life I worked with several search companies you've heard of on various search-related technologies, and they *never* released query logs. Even cleansed, the data were kept close to the chest. Queries are going to be logged with the IP address of the user. Some engines will track click-throughs on the results as well. That data is invaluable to a search engine.

    AOL's faux pas here was attaching personal information to the queries themselves: once that per-user identifier was attached all bets were off.

    If you are interested in working with query data, and do not work for a search company, you are shit out of luck, because you can't otherwise get this data. All of the research published on queries was done by Alta Vista, Google, Yahoo, Lycos, MSN... research on spelling correction of search queries is done by the same groups: they're the only ones with access to that data, until this AOL release (or older releases from other companies.)

    Having this data is a boon for researchers, but a net loss for people.

    • AOL's faux pas here was attaching personal information to the queries themselves: once that per-user identifier was attached all bets were off.

      Well, that, and not sanitizing the queries themselves--they should have at least tried to scrub any personal information in the queries as well--apparently people will stick all sorts of things into a search box!
      • They address this a bit in the readme that went out with the data: sanitizing the data corrupts the data, from a research perspective. And it is really difficult to do this adequately. Sure, you can scrub it with a regexp for SSNs or Phone numbers, but names? Using what name list? And what if the names being searched for aren't the name of the searcher, but someone else? That is valuable information. The point is you cannot do this easily, and never adequately.
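A minimal sketch of the "scrub it with a regexp" step mentioned above, assuming US-style SSNs (`ddd-dd-dddd`) and ten-digit phone numbers with separators. The patterns and the `scrub_query` name are illustrative, not anything AOL is known to have used, and they do nothing about names, which is exactly the hard part described above.

```python
import re

# ASSUMED formats: 123-45-6789 for SSNs, 555-867-5309 / 555.867.5309 for phones.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
PHONE_RE = re.compile(r"\b\d{3}[-. ]\d{3}[-. ]\d{4}\b")


def scrub_query(query):
    """Replace SSN-like and phone-like substrings with placeholders."""
    query = SSN_RE.sub("[SSN]", query)
    return PHONE_RE.sub("[PHONE]", query)


for raw in ["is 123-45-6789 posted online", "call 555-867-5309 now", "john smith springfield"]:
    print(scrub_query(raw))
```

The last query passes through untouched: a personal name plus a town can identify someone just as well as a number, and no simple pattern catches it.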

        I suppose they could have released a subset of
    • And also this kind of tracking is already happening in other fields, for example the Superstores like walmart et al. I have seen an open space of 20 or more persons, all querying a gigantic database made of each and every sale slip from every shop in the country.

      They produce geographical maps of soda consumption, correlate with average temperature, football games, whatever. And if you pay with the shop's buying card then your personal data is taken into account as well.

      I really doubt this is a legiti

  • by JJJJust ( 908929 ) <JJJJust&gmail,com> on Wednesday August 09, 2006 @11:14PM (#15878300)
    Fact: Google has a Beta Search History feature. It's an opt-in thing, but it's quite handy. Stores all the searches you make. Really handy if you want to find something you found a year ago. I think Google knows what it's doing and how to preserve, protect, and defend its users. Otherwise, I don't think they'd risk offering the service. Now, if only our elected officials could preserve, protect, and defend that little nagging thing called the United States Constitution... and stop nosing in our searches!
  • different approach (Score:3, Interesting)

    by snye ( 819462 ) on Wednesday August 09, 2006 @11:25PM (#15878342)
    Perhaps the solution to this problem is not to keep the data private, but to create a database that is meaningless. During idle time (nighttime, classtime, etc) a computer could run an automated search routine that would create search queries from perhaps, names from, or topics from /. This would bury legitimate search data in a mountain of meaningless data, making the database virtually useless. Of course, it would have the same effect if for every legit search one performs via google he/she then performs three or four bogus searches. Wonder what law that would violate.
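The "three or four bogus searches per legit search" idea can be sketched without touching the network. Everything here is hypothetical: the decoy topic list, the `pad_with_decoys` name, and the default of three decoys are placeholders, and actually submitting the batch to a search engine is deliberately left out.

```python
import random

# HYPOTHETICAL decoy pool; a real tool would draw from a much larger, changing source.
DECOY_TOPICS = [
    "flower gardening",
    "sourdough starter",
    "used sailboat prices",
    "learn german verbs",
    "birdwatching binoculars",
    "chess openings",
]


def pad_with_decoys(real_query, decoys_per_query=3, rng=None):
    """Return the real query hidden among randomly chosen decoys, in random order."""
    rng = rng or random.Random()
    batch = rng.sample(DECOY_TOPICS, decoys_per_query)
    batch.append(real_query)
    rng.shuffle(batch)  # so the real query is not always last
    return batch


print(pad_with_decoys("motorcycle repair manual"))
```

Note the limits of the approach: decoys drawn from a small fixed list are easy to filter back out, so this raises the cost of profiling rather than preventing it.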
  • by schwaang ( 667808 ) on Wednesday August 09, 2006 @11:33PM (#15878363)
    Dear Mr. Schmidt,

    You say you are "alarmed" at what happened at AOL and say "it wasn't a good idea." But please explain what makes you "reasonably satisfied ... that this sort of thing would not happen at Google."

    Are there serious policies in place protecting individual privacy? Is it something actively on the mind of every employee who loads a big pile of search data onto their laptop for some work project? Are there standard tools for scrubbing identifying information?

    I'd like to give Google the benefit of the doubt here, but this is just too important to me.
  • It's like some weird disjointed conversation, almost 10,000 lines and no clicks.

    joe o
    went thru his social sec files
    friday lawn mow
    do they keep prison records
    say no
    prisoners use to call here
    they don't get no social sec
    lists of them
    not social security
    mean no
    joe to ask you
    did he steal some of that money
    i ask you
    stole from us
    government would have caught
    if steal from them
    took from us
    • Strange, as my results for that user are different, they look like she/he/it has typed in the lyrics of a song, especially as if you imagine you are singing them and compare timestamps they correlate pretty well.

      Who knows what people do when they are bored! Practice typing a song into AOL's search bar. M'OK
  • Global Thermal Nuclear War
  • No one has mentioned the Scroogle Scraper yet? []

    Try the Scroogle Scraper. No Google cookie,
    No Google search tied to your IP address.
    No advertisements. While you're there, donate.

  • by tiny69 ( 34486 ) on Thursday August 10, 2006 @12:56AM (#15878588) Homepage Journal
    Although he was alarmed by Slashdot's haphazard release of its users' online replies, CEO CmdrTaco said Wednesday the privacy concerns raised by that breach won't change his website's practice of storing posts made by its users.

    "We are reasonably satisfied ... that this sort of thing would not happen at Slashdot, although you can never say never," CmdrTaco said during an appearance at a major website conference in Walla Walla, WA.

    The security breakdown, disclosed earlier this week, publicly exposed about 19 million replies made by over 1 million Slashdot posters during the three months ended in May. OSTG's Slashdot intended to release the data exclusively to spammers and government spooks, but the information somehow surfaced on the Internet and was widely ignored.

    The lapse provided a glaring example of how the information that people post on the website can provide a window into their embarrassing, or even potentially incriminating, wishes and desires. The replies leaked by Slashdot included condemnations of the current government as well as an infatuation with Natalie Portman and hot grits.
  • Give and Take (Score:4, Interesting)

    by Xeth ( 614132 ) on Thursday August 10, 2006 @01:44AM (#15878686) Journal

    While no doubt many people are clamoring to speak to the evils of storing search queries, it's a very useful process, and blindingly obvious that Google would keep doing it. And we're not just talking about advertising. Advertising is just a section sliced out of a very complex structure approximating the character of a user. Google has shown a consistent goal of trying to categorize and understand all the information on the web. Why would they pass on an opportunity to build a persistent model of a user? With a nice AI, you could dramatically increase the relevance of a user's queries by looking at their past records and keeping a profile.

    While I am well-aware of the potential dangers of trading anonymity and privacy for a little convenience, it may well be worth it in the long run. Those concerned about governmental influence aren't seeing the big picture. If the government is determined, they'll just look at a higher level. Ask the ISP to parse the input to Google (unless you're connecting to Google over an encrypted channel? I wasn't aware any such thing existed, outside of proxying). Or simply get Google to pass along the IPs of anyone making a hot-list query, no storage required.

  • by programmerar ( 915654 ) on Thursday August 10, 2006 @02:24AM (#15878781) Journal
    On a somewhat related note: I'm interested in the way Google set up their registration for Gmail. You have to be "invited" by someone else. This means that if they saved all the links between people, which I'm sure they did, they could see the network of people all around the world. They could see how many steps any person is separated from another.

    Like someone said a few posts above, all the saved searches do amount to a very interesting sample of people's minds. In the same way, Gmail registration data will be an interesting sample of human networking.
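To make the "how many steps any person is separated from another" point concrete, here is a sketch using breadth-first search over invite records. The `(inviter, invitee)` pair format and the names are invented for illustration; nothing here reflects how Gmail actually stores invitations.

```python
from collections import deque


def degrees_of_separation(invites, start, goal):
    """Shortest number of invite links between two people, or None if unconnected."""
    graph = {}
    for inviter, invitee in invites:
        # Treat an invitation as an undirected link between the two people.
        graph.setdefault(inviter, set()).add(invitee)
        graph.setdefault(invitee, set()).add(inviter)
    if start not in graph or goal not in graph:
        return None
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        person, distance = queue.popleft()
        if person == goal:
            return distance
        for neighbour in graph[person]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, distance + 1))
    return None


# HYPOTHETICAL invite records.
invites = [("alice", "bob"), ("bob", "carol"), ("dave", "erin")]
print(degrees_of_separation(invites, "alice", "carol"))
```

Breadth-first search visits people in order of distance, so the first time it reaches the goal it has found the shortest chain.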
  • by Ph33r th3 g(O)at ( 592622 ) on Thursday August 10, 2006 @07:18AM (#15879341)
    These won't keep your searches secret (your ISP can log every request sent in the clear, and you can't trust proxy operators who even if they're good guys are under tremendous pressure from the authorities to log and cooperate--you can be tracked on JAP/TOR if each hop is compromised--think gag order/honeypot/PATRIOT Act/RIP Act/<insert other country's tyrannical law here>), but they will help keep any one search engine from having enough data to create a comprehensive psychosocial dossier of you:

    • Use different search engines--spread the love.
    • Scrub the Google cookie, change IPs early and often if your ISP makes it easy.
    • Use TOR [] or JAP [] when possible. (Don't forget, fresh cookie every time.) They're not perfect, but makes it less likely you'll be in the dragnet unless you're a specific person of interest--good intel isn't exposed chasing small fry.
    • Don't vanity search or search on identifiers for people close to you on a machine you use regularly.
    • Salt your searches with misinformation. Interested in motorcycles? Search for flower gardening. Arabic? Search for German. Search for random stuff now and then.
    • Don't tip search engines off to your plans. Don't do searches containing the words "how" and "to" unless you're looking for HOWTOs. They're common words anyway, and don't really help.
    • Don't use services like Gmail and search at the same time. (The wisdom of providing Gmail with personally identifying information and using it at is questionable given Google's aggressive data gathering.)

    Executive summary:

    Don't assume anything you type into a search form isn't being logged with as much information, including your IP, that they can gather. Search accordingly.
  • A Google Question (Score:2, Interesting)

    by wehup ( 567821 )
    How does Google respond to a subpoena issued as a result of a legal action? Example: Law enforcement obtains the Google cookie ID and requests information from Google in an attempt to prove prior intent for some action. What about the insurance company that wants to prove someone knew of a pre-existing medical condition, but didn't bother to disclose it?

    Does Google simply fork over the information?
    • They won't say how they respond, or how many such requests and subpoenas they receive. And that's enough for me to assume the worst. Eventually, if they're complying, citations will start to leak into court records--but since those are behind sites not generally indexed by search engines, it'll take an involved lawyer or a layman who happens to read the docs on the case throwing a flag.
  • I hope they at least remove USPS/UPS/FedEx/etc tracking numbers before releasing the stored search data. Not much else gives away someone's location better than a tracking number reporting a package was delivered in SPRINGFIELD, KY at 1pm on 13/13/95.
  • Aren't these search queries part of Google's "payment" for providing the free to use search function?

    What is Google's financial motivation for providing its search free of charge to the world? For all the hardware/datacentre/bandwidth to keep those spiders out there working, to rank all those pages and provide the search engine optimised so it gives good results? They don't advertise on the search page so no money there, their benefit comes from knowing what people are searching for and whether they find it or
  • by tom6a ( 871216 ) on Thursday August 10, 2006 @08:59AM (#15879828)
    What information could Google release/lose/etc if the data was not protected? According to their privacy policy Google records the following information in their server logs:

    Here is an example of a typical log entry where the search is for "cars", followed by a breakdown of its parts:

    * - 25/Mar/2003 10:15:32 - [] - Firefox 1.0.7; Windows NT 5.1 - 740674ce2123e969
    * is the Internet Protocol address assigned to the user by the user's ISP; depending on the user's service, a different address may be assigned to the user by their service provider each time they connect to the Internet;
    * 25/Mar/2003 10:15:32 is the date and time of the query;
    * [] is the requested URL, including the search query;
    * Firefox 1.0.7; Windows NT 5.1 is the browser and operating system being used; and
    * 740674ce2123e969 is the unique cookie ID assigned to this particular computer the first time it visited Google. (Cookies can be deleted by users. If the user has deleted the cookie from the computer since the last time s/he visited Google, then it will be the unique cookie ID assigned to the user the next time s/he visits Google from that particular computer).

    See ght=c4171#c4171 []
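To show how much a single line of that shape ties together, here is a sketch that splits an entry into its five fields. The IP address and URL in the sample are placeholders (a documentation-reserved address and a made-up host), since the post above has them stripped; the ` - ` separator and field order follow the breakdown quoted above, and `parse_log_entry` is a name invented for this example.

```python
from urllib.parse import urlparse, parse_qs


def parse_log_entry(line):
    """Split one 'ip - time - url - agent - cookie' log line into named fields."""
    ip, timestamp, url, agent, cookie_id = line.split(" - ")
    # The search terms travel in the URL's query string as the 'q' parameter.
    query = parse_qs(urlparse(url).query).get("q", [""])[0]
    return {
        "ip": ip,
        "timestamp": timestamp,
        "query": query,
        "agent": agent,
        "cookie_id": cookie_id,
    }


# PLACEHOLDER IP and URL; the timestamp, agent, and cookie ID match the quoted example.
SAMPLE = (
    "192.0.2.1 - 25/Mar/2003 10:15:32 - http://search.example/search?q=cars"
    " - Firefox 1.0.7; Windows NT 5.1 - 740674ce2123e969"
)
entry = parse_log_entry(SAMPLE)
print(entry)
```

Grouping entries by `cookie_id` or by `ip` is then a one-line dictionary operation, which is why those two fields matter so much more than the query text on its own.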
  • by sdo1 ( 213835 ) on Thursday August 10, 2006 @09:00AM (#15879839) Journal
    No matter what safeguards are in place, ANY company like this is only one stupid intern away from a similar situation as AOL faces. Even if there's absolutely no malicious intent, information like this tends to have a very low vapor pressure. The information exists, and as the AOL incident points out, people want the information (as witnessed by the incredible number of articles, websites, and discussions about the content of the AOL database).

    Someone will eventually screw up. It's inevitable. It's Murphy's Law... if it can happen, it will... especially given an ample number of opportunities. And there's lots of opportunities for someone to mis-handle this data.

    I'm usually fairly on top of things like this, but to be honest, until this happened, I didn't know that Google Personal Search History existed. And apparently the default is to save the history and have it attached to my gmail account. I've now deleted the history and paused the data collection, but does that mean it's really gone? How do I know... maybe it's just hidden for now and not really gone. And it's a little bothersome that the default is to keep the data. The default should be to not save it attached to any sort of personally identifiable information unless I give explicit, and repeated, permission to do so.

  • I assume the whole fuss is being made because they store the IP address with each query. Otherwise, it would not be a privacy issue. So one has to wonder: why *do* they store the IP address? What is the value to them? And if they are concerned about privacy, why not store a hash instead of the original IP, if some sort of information grouping is wanted?
    So, obviously, *if* they indeed store the IP address the only explanation is that they do not give a damn about privacy and actually use that information.
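The "store a hash instead of the original IP" suggestion could look roughly like this. It is a sketch under assumptions: it uses a keyed hash (HMAC-SHA256) rather than a bare hash, because the IPv4 address space is small enough that an unkeyed hash can be reversed by trying every address; the key value, the 16-character truncation, and the `pseudonymize_ip` name are all illustrative.

```python
import hashlib
import hmac


def pseudonymize_ip(ip, secret_key):
    """Map an IP address to a stable token that still groups queries together."""
    digest = hmac.new(secret_key, ip.encode("ascii"), hashlib.sha256).hexdigest()
    return digest[:16]  # shortened for log readability; an arbitrary choice


# HYPOTHETICAL key; in practice it would be kept out of the logs and rotated.
KEY = b"rotate-me-regularly"
token_a = pseudonymize_ip("192.0.2.1", KEY)
token_b = pseudonymize_ip("192.0.2.2", KEY)
print(token_a, token_b)
```

The same address always yields the same token, so grouping still works, but whoever holds the key can recompute the mapping, and the grouped queries themselves may still identify a person. Discarding the key after each rotation period limits how long queries stay linkable.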

