Implementing Do Not Track and the work at W3C
By Karl Dubost (karlcow). Friday, February 10, 2012 7:57:02 PM
On the Opera Desktop Team blog, there is a new experimental build available which includes support for the "Do Not Track" feature. In April 2011, the W3C invited industry and users alike to participate in a workshop on Web Tracking and User Privacy. A few months later, after a very successful workshop, a working group started work on Web tracking with essentially three items in its charter:
- Tracking Preference Expression (Do Not Track): This specification defines the technical mechanisms for expressing a Do Not Track preference, for example as an HTTP header or a DOM property. It may include mechanisms for sites to signal whether and how they honor this preference.
- Tracking Preference Expression Definitions and Compliance: This specification defines the meaning of a Do Not Track preference and sets out practices for Web sites to comply with this preference.
- Tracking Selection Lists: This specification defines a format for interchangeable lists for blocking or allowing Web tracking elements and expected user agent interpretation of this format.
The work is not finished. Since the Working Group began, we have exchanged around 2000 messages. The different stakeholders are all represented: browser vendors, regulators (mainly from the USA and Europe), advertising businesses, privacy businesses, service providers and user advocacy organisations.
What does DNT mean?
Many of you may have heard about the DNT (Do Not Track) HTTP header being implemented in the major browsers: Firefox, Safari, Internet Explorer and now Opera. When the user activates it, the browser sends a signal to the server in the headers of each HTTP request. The current form is:
DNT: 1
This is basically no more than wearing a badge in the street saying that you do not wish to be tracked. This is very important to understand. We do not want to create a false sense of privacy or security for our users. This signal is being defined by the Tracking Preference Expression specification.
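To make the "badge" idea concrete, here is a minimal server-side sketch in Python of how a service could read the signal. The function name wants_no_tracking and the plain dictionary of headers are assumptions made for illustration; what the service should then do with the answer is exactly what the compliance document is still debating.

```python
from typing import Mapping


def wants_no_tracking(headers: Mapping[str, str]) -> bool:
    """Return True when the request carries the header 'DNT: 1'."""
    # HTTP header names are case-insensitive, so compare in lower case.
    for name, value in headers.items():
        if name.lower() == "dnt":
            return value.strip() == "1"
    # No DNT header at all: the user has expressed no preference.
    return False
```

Note that the function only reports the preference. Nothing in the header itself prevents a server from ignoring it.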
As a user you might then say: “So what? That doesn’t protect me.” and you would be right.
The most important document, and currently the most debated, is the Tracking Preference Expression Definitions and Compliance document. We are in the process of defining what a service provider (and its associated advertising business partners) should do when they receive the DNT: 1 signal. This is essential. Plenty of questions are raised during the discussions, such as the definition of tracking, data aggregation, personal information, co-branding, etc. These are very hard questions because they are rarely technical. Some of the decisions could be very disruptive for the Web industry at large. That is why the group is trying to forge a path that all the stakeholders will be able to live with, and also to implement. If the Working Group decides on a meaning for DNT: 1 and nobody is willing to implement it because it is too hard or too disruptive for their business, the users will have lost. There is a sweet spot to reach that will satisfy the advertising industry, the NGOs and the legislators.
The third document is a defence mechanism initially proposed by Microsoft. We found the proposal interesting at Opera and we decided to work on it with Microsoft. It fits in with our already available Site Blocking API. The rationale is simple. If a user activates DNT: 1, but some service providers do not behave according to the meaning of DNT: 1, then there is a mechanism for users to block these sites. This last document has met more resistance than the other two, and we are still working on it to put a concrete proposal in front of the Working Group.
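As a rough illustration of the blocking idea, here is a minimal Python sketch. It is not the Tracking Selection Lists format, which is still being worked on; the function is_blocked and the plain set of domain names are assumptions chosen only to show the principle of refusing requests to listed sites.

```python
from typing import AbstractSet


def is_blocked(host: str, blocked_domains: AbstractSet[str]) -> bool:
    """Return True if the host, or one of its parent domains, is on the list."""
    labels = host.lower().rstrip(".").split(".")
    # Try the full host first, then each parent domain in turn:
    # ads.tracker.example -> tracker.example -> example
    for i in range(len(labels)):
        if ".".join(labels[i:]) in blocked_domains:
            return True
    return False
```

A browser using such a list would call the check before each outgoing request and simply not send the ones that match.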
Why is it important for Opera?
This work is very important for Opera for two reasons: we are both a browser implementer and a service provider. The recently released build will help us to understand the interactions and the issues that might arise when a user activates the DNT: 1 header. We would like to see how implementable the Working Group suggestions are on the server side too. Our social network, My.Opera, and the very useful Opera Mini browser have to be tested against the specification.
Last Working Group Meeting in Brussels in January 2012
Mağruf Çolakoğlu (ZAHEK) # Saturday, February 11, 2012 9:52:44 PM
Gus (citizenofgaya) # Saturday, February 11, 2012 11:32:38 PM
metude # Sunday, February 12, 2012 5:01:48 AM
ooscarr # Sunday, February 12, 2012 3:43:35 PM
Charles Schloss (Chas4) # Sunday, February 12, 2012 11:06:26 PM
Cutting Spoon (hellspork) # Monday, February 13, 2012 2:53:35 AM
Telling servers to explicitly not track you is like telling them to take a closer look at your activities. The best counter is behaving normally and presenting different faces to different domains. Security 101, seriously.
One defense mechanism would be to generate a different ID for each domain visited. Thereby, no two IDs can be matched by different domains controlled by a common third party. Many sites set cookies that contain a Windows username (being creative and unique is therefore a bad thing), but if that was masked by obfuscation and compartmentalization it would be more private.
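The per-domain ID idea could be sketched like this in Python. This is only an assumption about how it might be done: a secret kept inside the browser is combined with the domain name using HMAC, so each domain sees a stable ID that cannot be matched against the ID seen by another domain. The names per_domain_id and browser_secret are hypothetical.

```python
import hashlib
import hmac
import secrets


def per_domain_id(browser_secret: bytes, domain: str) -> str:
    """Derive an ID that is stable for one domain and unrelated across domains."""
    # Keyed hash of the domain name: without the secret, two domains
    # cannot work out that their IDs belong to the same browser.
    digest = hmac.new(browser_secret, domain.lower().encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:32]


# The secret is generated once and never leaves the browser.
browser_secret = secrets.token_bytes(32)
print(per_domain_id(browser_secret, "news.example"))
print(per_domain_id(browser_secret, "shop.example"))
```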
Data such as one's IP address cannot be rotated without significant ongoing effort. Similarly, registration data such as email address and username will create patterns. You may create additional email addresses, but logging in to all of them from the same IP in a brief span of time will create a pattern again. Visiting multiple domains without even setting cookies will still create a movement pattern attached to an IP address.
So while I think the base concept of DNT is a worthy goal to reach for, there are so many practical considerations necessary to defeat user profiling that it just seems like a great big waste of everyone's time. Tracking compliance would be pretty much impossible without putting a magnifying lens over each domain AND its holder AND the entities which control the holder of that domain.
I just cannot see that Do Not Track has an attainable objective, and certainly not the means to enforce that objective if it were somehow technically attainable.
Karl Dubost (karlcow) # Wednesday, February 15, 2012 5:36:54 AM
DNT is not about stopping tracking, which is why the acronym is badly chosen. It is about framing the use of the data that services collect when we visit a Web site. I encourage you to read the compliance document and send comments about it. There have been a lot of discussions about the differences between first-party and third-party tracking. The first party is the Web site you interact with directly, and it will continue to track you. Then there are the partners providing services or ads: the third parties. Should these be allowed to track us only on this site (silo mode), or not track at all when receiving DNT: 1? These are open discussions with all participants.
For sure, there is no agreement yet on what is the reasonable middle ground, but there is a discussion going on.
Karl Dubost (karlcow) # Wednesday, February 15, 2012 4:58:02 PM
Originally posted by metude:
what do you mean?
Jim (toyotabedzrock) # Thursday, February 16, 2012 9:56:07 PM
Originally posted by karlcow:
If you use a silo mode, that implies that when you return to that site they still know it is you. I do not think the everyday user would expect that behavior.
I think a public survey of less technical users is needed. It would help push the ad companies, since they are the least likely to like DNT.
Instead of sending all the HTTP requests to third parties with the DNT header, I would suggest adding an extra attribute to the script tag so the browser can simply not download and run some of the code.
You might want to consider integrating a simplified tracking ability into the browser to help speed up web pages. New advertiser and tracker code could just request a unique ID and a list of domains that can view the ID.
Then the page author could just give a <tracking url=""> tag for the advertisers and analytics packages they wish to use. The end user could be tracked across mobile and desktop browsers if these site preferences are synced, and the user would be given greater control.
Karl Dubost (karlcow) # Friday, February 17, 2012 2:02:36 PM
Originally posted by toyotabedzrock:
The issue with this solution is that "domain names" are different from "business entities". The same business entity can have different domain names. Different business entities can share the same domain name. So everything which relies only on HTTP requests will fail. Now if we take what you are proposing, it also means that the script has to know whether it is running in the context of a third-party business or a first-party business. The same script may have different behaviors depending on the context.
I'm not saying that it is not a solution to explore, but that all of this is really tricky.
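A small Python sketch may help show the difference between domain names and business entities. The DOMAIN_OWNER table and all the domains in it are made up for the example; in reality nobody has such a table, which is part of the problem. The sketch only covers the case of one entity with several domains, not the case of several entities sharing one domain.

```python
from typing import Mapping

# Hypothetical ownership table: which business entity controls which domain.
DOMAIN_OWNER = {
    "shop.example": "Example Corp",
    "cdn-example.test": "Example Corp",
    "ads.example.net": "AdCo",
}


def is_first_party_by_domain(page_domain: str, request_domain: str) -> bool:
    """Naive check: the request is first party only if the domains are identical."""
    return page_domain == request_domain


def is_first_party_by_entity(
    page_domain: str, request_domain: str, owners: Mapping[str, str]
) -> bool:
    """Entity check: first party if both domains belong to the same known owner."""
    page_owner = owners.get(page_domain)
    return page_owner is not None and page_owner == owners.get(request_domain)


# Same business, different domain names: the two checks disagree.
print(is_first_party_by_domain("shop.example", "cdn-example.test"))
print(is_first_party_by_entity("shop.example", "cdn-example.test", DOMAIN_OWNER))
```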
BTW, I think you should read the compliance document and the archives of the mailing list and send your own comments to the Working Group.
Cutting Spoon (hellspork) # Monday, February 20, 2012 3:24:33 AM
Doesn't this also tie into cross-domain resource sharing privileges? Does any browser have sufficient mechanisms to enforce a mixed-access-mode data policy per-frame without introducing a large overhead?
Jim (toyotabedzrock) # Tuesday, February 21, 2012 3:58:47 PM
Originally posted by karlcow:
My solution would require the page authors to add the tag to the script. The advantage for them would be that if they use the special tag the browser could contain the script or functions needed and be able to run faster.