Last Updated: 2015-02-24 16:41:58 UTC
by Johannes Ullrich (Version: 1)
There are a number of different use cases for tracking users as they use a particular web site. Some of them are more "sinister" than others. For most web applications, some form of session tracking is required to maintain the user's state. This is typically easily done using well-configured cookies (and is not in the scope of this article). Sessions are meant to be ephemeral and will not persist for long.
On the other hand, some tracking methods do attempt to track the user over a long time, and in particular attempt to make it difficult to evade the tracking. This is sometimes done for advertisement purposes, but can also be done to stop certain attacks like brute forcing or to identify attackers that return to a site. In the worst case, from a privacy perspective, the tracking is done to follow a user across various web sites.
Over the years, browsers and plugins have provided a number of ways to restrict this tracking. Here are some of the more common tracking techniques and how the user can prevent (some of) them:
1 - Cookies
Cookies are meant to maintain state between different requests. A browser will send a cookie with each request once it is set for a particular site. From a privacy point of view, the expiration time and the domain of the cookie are the most important settings. Most browsers will reject cookies set on behalf of a different site, unless the user permits these cookies to be set. A proper session cookie should not use an expiration date as it should expire as soon as the browser is closed. Most browsers offer means to review, control and delete cookies. In the past, a "Cookie2" header was proposed for session cookies, but this header has been deprecated and browsers have stopped supporting it.
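To illustrate the difference between a session cookie and a persistent (tracking-friendly) cookie, here is a minimal Python sketch using the standard library's http.cookies module. The function name build_cookie and the cookie names and values are made up for this example; only the presence or absence of a lifetime attribute matters.

```python
from http.cookies import SimpleCookie
from typing import Optional


def build_cookie(name: str, value: str, max_age: Optional[int] = None) -> str:
    """Return the value of a Set-Cookie header (hypothetical helper).

    With max_age=None no lifetime attribute is emitted, so the browser
    treats the cookie as a session cookie and drops it when it closes.
    """
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["path"] = "/"
    cookie[name]["secure"] = True
    cookie[name]["httponly"] = True
    if max_age is not None:
        # A lifetime in seconds turns this into a persistent cookie.
        cookie[name]["max-age"] = max_age
    return cookie[name].OutputString()


session_cookie = build_cookie("sid", "abc123")
persistent_cookie = build_cookie("uid", "xyz789", max_age=31536000)
print(session_cookie)
print(persistent_cookie)
```

The second cookie carries a one year lifetime and is the kind of cookie worth reviewing in the browser's cookie manager.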
2 - Flash Cookies (Local Shared Objects)
Flash has its own persistence mechanism. These "flash cookies" are files that can be left on the client. They cannot be set on behalf of other sites ("Cross-Origin"), but one SWF script can expose the content of an LSO to other scripts, which can be used to implement cross-origin storage. The best way to prevent flash cookies from tracking you is to disable flash. Managing flash cookies is tricky and typically requires special plugins.
3 - IP Address
4 - User Agent
The User-Agent string sent by a browser is hardly ever unique by default, but spyware sometimes modifies the User-Agent to add unique values to it. Many browsers allow adjusting the User-Agent and more recently, browsers started to reduce the information in the User-Agent or even made it somewhat dynamic to match the expected content. Non-Spyware plugins sometimes modify the User-Agent to indicate support for specific features.
5 - Browser Fingerprinting
A web browser is hardly ever one monolithic piece of software. Instead, web browsers interact with various plugins and extensions the user may have installed. Past work has shown that the combination of plugin versions and configuration options selected by the user tends to be amazingly unique and this technique has been used to derive unique identifiers. There is not much you can do to prevent this, other than minimizing the number of plugins you install (but that may be an indicator in itself).
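As a rough sketch of how such an identifier could be derived, the Python snippet below hashes a set of browser attributes into one value. The attribute names and values are invented for illustration, and real fingerprinting scripts collect far more data; this only shows the "combine and hash" idea.

```python
import hashlib
import json


def fingerprint(attributes: dict) -> str:
    """Derive a stable identifier from browser attributes (illustrative only)."""
    # Serialize with sorted keys so the same attributes always hash the same,
    # regardless of the order in which they were collected.
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical attribute set; none of these values come from a real browser.
visitor = {
    "user_agent": "ExampleBrowser/1.0",
    "plugins": ["Flash 16.0", "Java 1.8"],
    "timezone_offset": -300,
    "screen": "1920x1080",
}
print(fingerprint(visitor))
```

Changing a single plugin version yields a completely different hash, which is why even small configuration differences are enough to tell users apart.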
6 - Local Storage
HTML5 offers two new ways to store data on the client: Local Storage and Session Storage. Local Storage is most useful for persistent storage on the client, and with that, user tracking. Access to Local Storage is limited to the site that stored the data. Some browsers implement debug features that allow the user to review the data stored. Session Storage is limited to a particular window and is removed as soon as the window is closed.
7 - Cached Content
8 - Canvas Fingerprinting
9 - Carrier Injected Headers
Verizon recently started injecting specific headers into HTTP requests to identify users. As this is done "in flight", it only works for HTTP and not HTTPS. Each user is assigned a specific ID and the ID is injected into all HTTP requests as an X-UIDH header. Verizon offers a for-pay service that a web site can use to retrieve demographic information about the user. But just by itself, the header can be used to track users as it stays linked to the user for an extended time.
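From the web site's point of view, using the header for tracking takes very little effort. The Python sketch below assumes the request headers are available as a plain dictionary; the function name extract_carrier_id and the sample header value are made up for this example.

```python
from typing import Dict, Optional


def extract_carrier_id(headers: Dict[str, str]) -> Optional[str]:
    """Return the carrier-injected ID from request headers, if present."""
    # HTTP header names are case-insensitive, so compare in lower case.
    for name, value in headers.items():
        if name.lower() == "x-uidh":
            return value
    return None


# Hypothetical request headers; the ID value is a placeholder.
request_headers = {"Host": "example.com", "X-UIDH": "placeholder-id"}
print(extract_carrier_id(request_headers))
```

A request sent over HTTPS would not carry the header, so the function would return None.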
10 - Redirects
This is a bit of a variation on the "cached content" tracking. If a user is redirected using a "301" ("Permanent Redirect") code, then the browser will remember the redirect and pull up the target page right away, not visiting the original page first. So for example, if you click on a link to "isc.sans.edu", I could redirect you to "isc.sans.edu/index.html?id=sometrackingid". Next time you go to "isc.sans.edu", your browser will automatically go directly to the second URL. This technique is less reliable than some of the other techniques as browsers differ in how they cache redirects.
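The server side of this trick can be sketched in a few lines of Python. The function name tracking_redirect is hypothetical, and the explicit Cache-Control header is an assumption added to encourage caching; it is not something the technique strictly requires.

```python
import secrets
from typing import Dict, Optional, Tuple
from urllib.parse import urlencode


def tracking_redirect(target_path: str,
                      tracking_id: Optional[str] = None) -> Tuple[int, Dict[str, str]]:
    """Build a 301 response whose target URL embeds a per-user ID."""
    if tracking_id is None:
        # First visit: mint a random identifier for this browser.
        tracking_id = secrets.token_hex(8)
    location = target_path + "?" + urlencode({"id": tracking_id})
    headers = {
        "Location": location,
        # Assumption: an explicit lifetime asks the browser to keep the redirect.
        "Cache-Control": "max-age=31536000",
    }
    return 301, headers


status, headers = tracking_redirect("/index.html", "sometrackingid")
print(status, headers["Location"])
```

Once the browser has cached this response, later visits go straight to the URL carrying the ID, and the server only needs to read the id parameter.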
11 - Cookie Respawning / Syncing
Some of the methods above have pretty simple countermeasures. In order to make it harder for users to evade tracking, sites often combine different methods and "respawn" cookies. This technique is sometimes referred to as "Evercookie". If the user deletes, for example, the HTTP cookie but not the Flash Cookie, the Flash Cookie is used to re-create the HTTP cookie on the user's next visit.
Any methods I missed? (I am sure there have to be a couple...)
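The respawning logic itself is simple. The Python sketch below models the different storage mechanisms as a dictionary, with None marking a store the user has cleared. The store names and the function respawn_identifier are hypothetical, and real implementations run inside the browser rather than in Python.

```python
import secrets
from typing import Dict, Optional


def respawn_identifier(stores: Dict[str, Optional[str]]) -> Dict[str, str]:
    """Copy any surviving identifier back into every storage mechanism."""
    # Pick the first store that still holds an identifier.
    surviving = next((value for value in stores.values() if value), None)
    # If nothing survived, the user is treated as new and gets a fresh ID.
    identifier = surviving or secrets.token_hex(8)
    return {name: identifier for name in stores}


# The user deleted the HTTP cookie, but the Flash cookie is still there.
stores = {"http_cookie": None, "flash_lso": "abc123", "local_storage": None}
print(respawn_identifier(stores))
```

This also shows why evading the tracking is hard: the user has to clear every store at the same time, since a single surviving copy restores all the others.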