Tracking Users Via the Browser's Cache

Mukund writes to point us to an article he has written about a method of tracking using the browser cache instead of cookies. A demonstration shows that tracking can remain continuous if you clear only cookies or only the cache, but not both. (Firefox's Clear Private Data tool can be set to clear both when closing the browser.)
This discussion has been archived. No new comments can be posted.

  • Pretty clever.. (Score:5, Informative)

    by CTho9305 ( 264265 ) on Sunday September 17, 2006 @07:59PM (#16127002) Homepage
    For those of you who aren't going to RTFA, basically you send a JS file with a unique ID and tell the browser to cache it... then any page that includes that JS script gets your unique ID... even if you disallow all cookies.
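A rough sketch of the mechanism described above, in Python rather than as a real web server. The endpoint name /track.js, the variable name trackingId, the one-year max-age, and the toy cache class are all assumptions made for illustration; the article's actual demo may differ.

```python
import uuid


def tracking_script_response():
    """Build a (headers, body) pair for a hypothetical /track.js endpoint."""
    visitor_id = uuid.uuid4().hex  # fresh ID for every request that reaches the server
    headers = {
        "Content-Type": "application/javascript",
        # A long lifetime asks the browser to keep reusing its stored copy.
        "Cache-Control": "private, max-age=31536000",
    }
    body = f'var trackingId = "{visitor_id}";\n'
    return headers, body


class ToyBrowserCache:
    """Stand-in for a browser cache: fetch once, then replay the stored copy."""

    def __init__(self):
        self._store = {}

    def get(self, url, fetch):
        if url not in self._store:
            self._store[url] = fetch()
        return self._store[url]


cache = ToyBrowserCache()
first = cache.get("/track.js", tracking_script_response)
second = cache.get("/track.js", tracking_script_response)
# Both loads see the same script body, hence the same ID, with no cookie involved.
```

Any page that includes the script reads the same trackingId until the cached copy is evicted or cleared.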
    • Re: (Score:3, Interesting)

      But what if the user has disabled Javascript? Then this method would be useless, no?
      • Actually, yes.
      • Re:Pretty clever.. (Score:5, Interesting)

        by Breakfast Pants ( 323698 ) on Sunday September 17, 2006 @11:19PM (#16127633) Journal
        Sure, but they could just put a small iframe to foo.html and mark that page as cacheable, on that page have a small image, dynamically generated, to [unique_id].gif and mark the image uncacheable on your server. Now when you visit, your cached copy of foo.html tries to download [unique_id].gif every visit.
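The no-JavaScript variant above could look roughly like this. It is only a sketch: the handler names, the 32-hex-digit ID format and the header values are assumptions, and a real deployment would sit behind an actual web server.

```python
import re
import uuid


def cacheable_page():
    """Response for a hypothetical foo.html, generated once per new visitor."""
    visitor_id = uuid.uuid4().hex
    headers = {"Cache-Control": "max-age=31536000"}  # page itself is cacheable
    body = (
        "<html><body>"
        f'<img src="/{visitor_id}.gif" width="1" height="1">'
        "</body></html>"
    )
    return headers, body


def image_request(path, log):
    """Response for /<id>.gif; marked uncacheable so every view reaches the server."""
    match = re.fullmatch(r"/([0-9a-f]{32})\.gif", path)
    if match is None:
        return 404, {}
    log.append(match.group(1))  # the ID is recovered from the URL alone
    return 200, {"Cache-Control": "no-store", "Content-Type": "image/gif"}


# Simulate a browser that fetched foo.html once and renders the cached copy twice.
_page_headers, html = cacheable_page()
img_path = re.search(r'src="([^"]+)"', html).group(1)
visits = []
for _visit in range(2):
    image_request(img_path, visits)
```

Each render of the cached page requests the same uniquely named image, so the server log shows the same ID on every visit.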
      • Re: (Score:3, Informative)

        by mTor ( 18585 )
        Exactly. That's why I use NoScript [noscript.net]... and everyone else should too! Get it and you'll eliminate all kinds of attacks.
      • Re:Pretty clever.. (Score:4, Informative)

        by TheLink ( 130905 ) on Sunday September 17, 2006 @11:43PM (#16127735) Journal
        It'll be useless.

        But do a search on "Timing attacks on Web privacy".

        ALSO, I don't think you even need to use timing attacks, because a caching browser that already has stuff cached will behave differently from a caching browser that doesn't. Pretty obvious, isn't it?

        There is no way around that except to use a browser that doesn't cache at all - which will affect browsing performance. For slightly less privacy you can use a browser that always starts in the same state for each browsing session.

        AND even if you use such a browser, if you have a distinctive browsing pattern and fingerprint, people could still identify you.

        e.g. you use a noncaching, no-js browser, with a fake User-Agent (says it's IE but behaves like Firefox), and you start browsing with a particular site first at a certain time followed by another site etc - or you load a particular bunch of sites in the morning (opened in tabs). Could get quite distinctive ;).

        But there are far more important things that people should be worried about. What their government is doing for instance.
        • Re: (Score:3, Informative)

          by baadger ( 764884 )
          Another approach to try and prevent this might be to get the browser not to send conditional GET requests *at all* and to just reload silently from cache.

          This however would of course mean that everyone has to make sure their webpages are properly cache [web-caching.com] able [ircache.net] with reasonable (perhaps dynamically generated) expiry dates.

          The nature of HTTP and the web makes it very difficult to remain totally untrackable; all you can really do is prevent the worst of it.
          • by x2A ( 858210 )
            But why bother even visiting a website where you distrust its creators so much as to need to employ such methods against them?

          • by Ed Avis ( 5917 )

            Another approach to try and prevent this might be to get the browser not to send conditional GET requests *at all* and to just reload silently from cache.

            Back when I used a modem I had the wwwoffle [demon.co.uk] proxy server set to always use cached pages whenever possible - the only way to get an updated version from the site was to hit Reload. It was nice and fast, and sometimes useful to be able to still browse a site that had disappeared in the real world, although on hitting Reload your precious page would disappe

          • by TheLink ( 130905 )
            Not sending "conditional gets" won't prevent my proposed method from working since that method involves an item/url that is marked as cacheable that causes the loading of another item/url that is marked as noncacheable.

            When the browser first loads the cacheable item, it will get a unique cacheable item which points to a unique noncacheable item.
            Thereafter if it is "properly behaved" it will keep loading the same unique noncacheable item everytime it is pointed to the cacheable item.

            The trick of course is t
    • by Feyr ( 449684 )
      proxies are going to wreak havoc on this scheme :)
      still a nice trick though
    • Javascript can compromise anonymity! ... Wow. ... What else is new? I mean, even if this particular story hasn't been referenced, I think this could qualify as a dupe ;-)
    • by MarkRose ( 820682 ) on Sunday September 17, 2006 @08:34PM (#16127189) Homepage
      Well if anyone tosses their cookies in my java, I, for one, am sure not going to drink it!
    • by Anonymous Coward
      You don't need to store that unique id in a javascript variable.
      Send some image (webbug), say it should be cached, but "must-revalidate" and "hijack" the Etag/IF-*-Match headers.
      • by baadger ( 764884 )
        For those that don't know, the HTTP "Etag" response header is a unique key (most of the time a hash) that identifies (and verifies) data sent across HTTP. This means the tracking website wouldn't even need to use a 747rhf28r.png garbage filename, meaning such tracking could be accomplished in something as mundane as a website's corporate logo.

        However, this alone wouldn't tell the tracking website anything they couldn't find out from decent analysis of web server logs, essentially just how often you hit a page
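A minimal sketch of the ETag idea from this subthread, written as a plain Python function instead of a real server. The function name, the /logo.png resource and the exact Cache-Control value (max-age=0 added so the copy is always stale and gets revalidated) are assumptions for illustration.

```python
import uuid

# Assumed header value: store the image, but revalidate it on every use.
REVALIDATE = "private, max-age=0, must-revalidate"


def handle_image_request(request_headers, seen):
    """Toy handler for a hypothetical /logo.png that uses the ETag as a visitor ID."""
    etag = request_headers.get("If-None-Match")
    if etag is not None:
        # The browser echoed the validator back: that value is the visitor ID.
        seen.append(etag.strip('"'))
        return 304, {"ETag": etag, "Cache-Control": REVALIDATE}
    # First contact: mint an ID and hand it out as the entity tag.
    new_id = uuid.uuid4().hex
    seen.append(new_id)
    return 200, {"ETag": f'"{new_id}"', "Cache-Control": REVALIDATE}


seen_ids = []
status1, headers1 = handle_image_request({}, seen_ids)
status2, headers2 = handle_image_request({"If-None-Match": headers1["ETag"]}, seen_ids)
```

The second request carries no cookie and no odd filename, yet the server sees the same ID again in If-None-Match.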
    • by fm6 ( 162816 )
      Well, I did RTFA, and I wish I'd been lazy for once. The dude takes 3 or 4 long paragraphs to say what you said in a single sentence. I am so tired of Slashdot stories where TFA is a half-witted rant by some blogger who flunked Freshman English.
  • An interesting idea (Score:3, Interesting)

    by FonzCam ( 841867 ) on Sunday September 17, 2006 @08:00PM (#16127013)
    But seriously most people leave cookies on and those who know to turn them off are probably the sort of people who regularly clear their cache. The percentage of users you could target with this would be very small for the effort required. If tracking user usage is that important to you then just refuse to serve the page with cookies disabled.
    • Re: (Score:3, Informative)

      by shird ( 566377 )
      Except IE6+ has a default setup to block cookies from being set by sites other than the one you are on, cross-domain cookies or whatever they're called, i.e. banner ads that set cookies etc.
    • by jp10558 ( 748604 )
      Wouldn't someone figure out how this works and write a Proxomitron filter, userJS and/or Greasemonkey script to kill it, forget about those who care and run with JS off and turn it on in site-specific prefs, use NoScript or something similar in Proxomitron?
  • by Elgonn ( 921934 ) on Sunday September 17, 2006 @08:00PM (#16127014)
    So it still doesn't work on some of us.
    • by misleb ( 129952 ) on Sunday September 17, 2006 @08:41PM (#16127226)
      That's OK because the browsing habits of the type of people who turn off Javascript are not particularly interesting anyway. So it all works out.


      • by jZnat ( 793348 ) *
        I'm sure you'd find just as many porn sites in their history as users who blindly enable JS for everything.
        • by misleb ( 129952 )
          Keep in mind that we're not talking about raw, unfiltered browsing history or a complete dump of the cache. We're talking about a single thread of tracking. Like you visit one site which puts the said javascript in your cache and later visit another site which references the same item and they know you were at the first site. They can't read your whole history.

        • I'm sure you'd find just as many porn sites in their history as users who blindly enable JS for everything.

          ... but it would be the other kind of porn ;-) Think about why they disabled javascript and cookies in the first place!

          Funny thing: my captcha for this post was backside. I kid you not!

  • Seems a bit paranoid (Score:2, Informative)

    by Anonymous Coward
    Regarding Sourceforge/Google. Did he consider that Google's automated email may have gone to sourceforge alias which was then forwarded to his email address?
    • by chrisd ( 1457 ) * <chrisd@dibona.com> on Sunday September 17, 2006 @08:19PM (#16127124) Homepage
      That is indeed what we do, send the confirmation email to the blah@sourceforge.net alias. We do -not- have the translated email addresses and thus the only information we are using is that which is displayed on the project home on SF.

      • by mukund ( 163654 ) on Sunday September 17, 2006 @08:29PM (#16127172) Homepage
        Hi Chris

        I did receive the email on my sourceforge.net address. My problem was not with which email address I received the mail at. I don't see why I have to be contacted for a Google service, when my subscription is with Sourceforge.net.

        Don't take this the wrong way. I have used Google services for a very long time, but I think this is a bad precedent. Picking up an email address in an automated way from a website and mailing me about your services, when I haven't asked for it is as good as what a spammer would do. And the email suggested you had a table of projects, which made me assume Sourceforge shared this with you. If Sourceforge.net didn't and you can attest that I'll edit out that part of my article (I would not want to blame Sourceforge for something that they didn't do).

        To the parent poster: This may seem paranoid.. some other poster suggested the same to the other Canonical-Debian issue too (on the other blog). When something is not right, it simply needs to be questioned. That's all.

        Kind regards,

        • by rossturk ( 975354 ) * <rturk@ostgTEA.com minus caffeine> on Monday September 18, 2006 @03:06AM (#16128356) Homepage

          We provided Google with a list of registered project names on SourceForge.net to allow future integration between the open-source repositories with minimized namespace conflicts.

          The email you saw, if I am not mistaken, was generated when someone tried to create a project at Google with the same name as a SF.net project you belong to.

          Unless I am very mistaken about Google's intentions (and I don't think I am), your email address was not picked in an automated way. It was a direct result of an action that was relevant to you, specifically. That may or may not make it seem any better to you, but I don't find it particularly nefarious. Rather, I think it's good that Google and SourceForge are working together to protect your interests.

          Ross Turk
        • by Bob Uhl ( 30977 )
          I'm pretty certain that sf.net email addresses/project associations are public.
  • NoScript Extension (Score:5, Informative)

    by rdwald ( 831442 ) on Sunday September 17, 2006 @08:08PM (#16127074)
    Saved by NoScript [noscript.net] again. If you're not using it, you really should; it can block exploits before anyone knows they exist! (Since they may require JavaScript, and this would block them. My statement is strictly true.)
    • Re: (Score:3, Insightful)

      by ShakaZ ( 1002825 )
      I agree that NoScript is a must-have and would by default block this tracking method... However let's imagine it's integrated in a website for which you have enabled javascripts, then you're f@cked... and from my personal experience it looks like every day there are more sites which you can't use correctly with scripts disabled
      • by x2A ( 858210 )
        "However let's imagine it's integrated in a website for which you have enabled javascripts, then you're f@cked"

        Only if you frequent websites that are trying to f@ck you. If you are, perhaps you should look at your own browsing habits.

    • I am trying it now using Mozilla Firefox version 2.0b2 running in my knoppix remaster (see screenshots, below).

      Here is my brief description for those who have not tried it:
      The extension shows a bar at the bottom of the browser when one goes to a website, showing the status of the blocker. Then, if it is something like etrade.com, and you want to work with it, you can easily allow it. One can close the bar when on a page, and the NoScript icon remains at the bottom right of the browser window. If you click o
    • Well, uh. Blocking JavaScript will only block this particular implementation of tracking, but it won't fix the problem. And please stop sending this Firefox propaganda here. It is not informative, as pretty much everyone here is already aware of the extension mentioned. Why are these comments modded up?

  • How often does an average Slash reader close his Firefox window?

    (I ask because I leave my Deer Park and Safari windows opened for months.)
  • In other news (Score:4, Insightful)

    by $RANDOMLUSER ( 804576 ) on Sunday September 17, 2006 @08:24PM (#16127150)
    You can have total anonymity or marginal functionality. Since HTML alone offers almost nothing in the way of functionality (beyond rendering) you need something more (JavaScript, Java, Flash, ActiveX (arguably in ascending order of dangerousness)) to provide even rudimentary functionality. If I'm really so tinfoil-hat that I'm worried about my browser cache betraying what I'm up to, I probably need some medication and/or an air-gap between me and the Internet(s).
    • Re: (Score:3, Insightful)

      by ResidntGeek ( 772730 )
      What? How is it paranoid if a method is demonstrated to allow you to be tracked through your cache? You think people won't use this? Do you think only people with tinfoil hats think advertising companies have been tracking people on the web for over a decade? I'm honestly confused, please explain yourself. By the way, if you clear your cache and cookies often, you CAN have both anonymity and functionality.
      • I didn't say you were paranoid, you must have imagined that.
        • If I'm really so tinfoil-hat that I'm worried about my browser cache betraying what I'm up to, I probably need some medication and/or an air-gap between me and the Internet(s).

          You're right, you didn't say paranoid, you said tinfoil-hat. And my imagination has been systematically sterilized by the American school system, so there's no danger of me imagining things.
          • OK, I tease, and you come up with a cute ("systematically sterilized"/"no danger of imagining") answer.
            Seriously. I live in a state (Illinois) where you have those radio transponder thingies for the toll roads. I can use them, and ponder that the owners (the state) of the system are tracking which (and when) gates I go through, or I can wait longer in the cash-only lines (ignoring the fact that they've got cameras on my license plates anyways), or I can imagine that "they" have "satellites" tracking my ev
            • Wait... I'm really tired, and I can't tell if I'm missing another joke, so forgive me if I am.

              I used to live in Florida, and used the radio transponder thingie for toll roads. I know the owners were tracking me, because I could view all the gates I'd gone through in the past month. I think you're drawing the line between knowledgeable and paranoid a bit off the mark.
      • by x2A ( 858210 )
        "Do you think only people with tinfoil hats think advertising companies have been tracking people on the web for over a decade?"

        No, just that only people with tinfoil hats /care/ that advertising companies are tracking people. Not everyone has the level of self-shame where they feel the need to hide what they're doing.

  • Old news (Score:5, Informative)

    by christo ( 329 ) <.csoghoian. .at. .gmail.com.> on Sunday September 17, 2006 @08:25PM (#16127152) Homepage
    Move on folks, there's nothing to see here.

    This was done last year, by these guys: Browser Recon @ Indiana University [indiana.edu]

    Defenses against this, and other attacks have been created and deployed through two firefox extensions
    put out by Stanford University: Safe History [safehistory.com] and Safe Cache [safecache.com]

    This stuff ain't new.
    • by ExcalHM ( 986447 )
      Yeah... it's pretty old news... although I know very few who this actually works for.
    • Re:Old news (Score:5, Informative)

      by The MAZZTer ( 911996 ) <megazzt&gmail,com> on Sunday September 17, 2006 @08:36PM (#16127199) Homepage

      Wow that's even scarier than this one in the story. Yours only needs CSS.

      It stems from the whole idea of marking links "visited". CSS attributes can be applied to visited links to set them apart from unvisited ones. The page in your example uses CSS to tell the browser to request a page from the server if a link is visited. This page, when loaded, knows that the load means you visited the website in the link.

      The worst thing is that this is a perfectly legitimate use of CSS by current w3 standards. A preventive measure for browser vendors may be to not allow any external resources to be used in :visited CSS.
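To make the :visited trick concrete, here is a small Python sketch that generates the kind of stylesheet being described. The element IDs (probe0, probe1, ...) and the /visited reporting path are invented for illustration, and browsers have since restricted what :visited styles may do, so treat this as a description of the 2006-era behaviour only.

```python
from urllib.parse import quote


def visited_probe_css(probed_urls, report_base="/visited"):
    """Emit one :visited rule per probed link; each rule loads a reporting URL."""
    rules = []
    for index, url in enumerate(probed_urls):
        # The probed URL is percent-encoded into the query string of the report request.
        report = f"{report_base}?u={quote(url, safe='')}"
        rules.append(
            f"#probe{index}:visited {{ background: url('{report}'); }}"
        )
    return "\n".join(rules)


css = visited_probe_css(["http://example.com/", "http://example.org/login"])
```

The page would also need matching links with ids probe0, probe1 and so on; a request for a given report URL tells the server that the corresponding link was rendered as visited.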

      • by jZnat ( 793348 ) *
        Oh my, I never thought of tracking like that. Ouch...
      • by TheLink ( 130905 )
        There was an even older published idea that involved caching of images and timing stuff - I can't find the link at the moment - but it did get mention on a fairly mainstream tech site I think.

        But anyway, I'm not sure why this is such a big deal - this is pretty old and obvious stuff. In general terms if the browser has stuff cached it will behave differently from a browser that doesn't have stuff cached.

        Just a bit of thinking and you can come up with many ways to distinguish between the different browsers t
  • by QuantumFTL ( 197300 ) * on Sunday September 17, 2006 @09:13PM (#16127353)
    I saw this article [asp.net] on Digg a while back, using an ingenious JavaScript that would look at the *rendering* of a link to determine if you'd been there or not (and possibly upload this information to the remote server). That's kinda scary...
  • Thought I'd mention that the parallel IE option seems to be under the "Tools | Internet Options..." dialog, "Advanced" tab, "Security" tree: "Empty Temporary Internet Files folder when browser is closed" (unchecked by default)

    The IE "Security" and "Privacy" tab also contains some options that let you handle cookies and Javascripts different ways for different sites; this is why IE exploits that get around the dividers between different classes of sites are noteworthy.
  • take a look at Firefox' or Mozilla's or Seamonkey's Bookmarks in a plain text editor, it keeps dates about visiting web sites that could be used to track users (that is) if website's servers can access it to look at it. seems like such an unnecessary feature, if i can find a way to shut off the record keeping within bookmarks i would re-write my bookmarks to keep only the name and URL...
  • The author just didn't use the right browser.
    • by x2A ( 858210 )
      If you wanna track your users, then what's important is what browser they use, not what you use yourself, and very few of them use opera.

  • simple solution (Score:3, Interesting)

    by oohshiny ( 998054 ) on Monday September 18, 2006 @12:35AM (#16127923)
    Use separate browsers, accounts, and/or machines for different purposes. I wouldn't dream of using my regular browser for on-line banking, for example.
  • since i do html/actionscript/dHTML stuff, i have my browser cache size set to 0. this would technically prevent the id from being cached, no? ant
  • These two firefox extensions can help block some of those style attacks

    http://www.safecache.com/ [safecache.com]
    http://www.safehistory.com/ [safehistory.com]

    They do this by segmenting your cache and history so that each page only has access to each individual history.

    this page has more info about the method they use,
    http://crypto.stanford.edu/sameorigin/ [stanford.edu]
    and this is a *PDF* on the subject

    http://crypto.stanford.edu/sameorigin/sameorigin.pdf [stanford.edu] **PDF WARNING!**
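The segmenting idea can be sketched as a cache keyed by the embedding site as well as the resource URL. This is only an illustration of the general approach, not the extensions' actual code; the class and function names and the example.* hosts are made up.

```python
from urllib.parse import urlsplit


def origin_of(url):
    """Reduce a URL to scheme://host[:port]."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


class PartitionedCache:
    """Cache entries are keyed by (embedding origin, resource URL)."""

    def __init__(self):
        self._store = {}

    def store(self, embedding_page, resource_url, body):
        self._store[(origin_of(embedding_page), resource_url)] = body

    def lookup(self, embedding_page, resource_url):
        return self._store.get((origin_of(embedding_page), resource_url))


tracker = "http://tracker.example/track.js"
cache = PartitionedCache()
cache.store("http://site-a.example/index.html", tracker, "id=abc")

same_site = cache.lookup("http://site-a.example/other.html", tracker)  # hit
other_site = cache.lookup("http://site-b.example/", tracker)           # miss
```

Because site-b gets a miss, it would fetch its own copy of the script with a different ID, so the two sites can no longer link their visitors through the shared cached file.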
  • by TheStonepedo ( 885845 ) on Monday September 18, 2006 @01:27AM (#16128091) Homepage Journal
    Most people that clear history and caches are doing so to prevent snooping done using the location bar and history toolbars (or analogues) of their browser. You don't want your boss/family to see exactly which non-work-related/porn site you were viewing. While tracking a user may be good for data mining purposes, it's not necessarily a horrible thing for day to day use. I don't like the thought that just about anybody knows my browsing habits, but I don't find it invasive unless those tracking me are going to confront me about it. Let data miners collect their statistics; most folks' machines will not clear their history or cookies or cache. My irregular or perverse browsing habits are but a drop in the statistical pond.
    • by baadger ( 764884 )
      > My irregular or perverse browsing habits are but a drop in the statistical pond.

      I bet that's what AOL Searcher #4417749 [nytimes.com] or #927 [consumerist.com] thought...
      • by x2A ( 858210 )
        You've just posted links to pages that contain information about people, collected through such means... this makes you better how exactly?

        • by baadger ( 764884 )
          "better"? What are you referring to exactly?

          My only point was you're just a statistic until someone broadly analyses those statistics, singles you out (for whatever reason) and then decides to analyse your records specifically. At this point you cease to become a drop in the statistical pond.
    • My irregular or perverse browsing habits are but a drop in the statistical pond.

      Watch out though. If these habits get into the hands of banner ad marketing firms, you might be astonished what kind of ads will show up at the pages you visit, even during mundane browsing.

      Can be pretty embarrassing when looking up some coding technique or whatever together with a colleague, and suddenly "interesting" ads pop up, due to your browsing habits the night before...

      Also, better not use the same amazon account for

  • Couldn't this easily be prevented if the browser had an option to only allow Javascript from the original site? I think a similar option for cookies exists and having it for Javascript would be quite useful and prevent other unwanted things.
  • Stealther [mozilla.org] is a Firefox extension which temporarily blocks history, cookies as well as referrer header.
  • is it even possible to use the internet on a network and not be tracked, are there any tools or ways to not be seen by a network administrator ???
    • by dvmrgn ( 1003562 )
      Exactly. When you are using my machine I reserve the right to track you. There are lots of database driven sites where every page is unique and tracked. I work on one where the server logs are loaded directly to the database and there is an interface that "follows" a visitor through the site 15 to 30 seconds behind. To correlate the data longterm to identify individual users from the pool of visitors would be trivial. Not worth it in our minds, but I am sure others are doing it.
