Coming up with a proof-of-concept link tracking mechanism for use on guardian.co.uk led us to HTML5's ability to create custom attributes for storing metadata directly on an element.
Preamble
We had to provide some information about the tracked component somewhere in the page markup, so that some JavaScript could then append that information to the href attribute value as a query string. For non-technical reasons, we had to have this info in the query string, and for SEO reasons we didn’t want it to appear in the raw href, hence the JavaScript.
One way of proceeding would have been to hide the information (for this initial stage, just a unique, numerical ID) somewhere near the link, perhaps like this:
<span class="hidden tracking-code">id-45</span>
<a href="blah.html" class="track-this">This link will be tracked</a>
What then? With JavaScript you could traipse back through the DOM to find the nearest span element with a tracking-code class, get its content, reformat it and then tag it on the end of the anchor’s href, but that’s all pretty clumsy and you run the risk of those values appearing to the end user.
So, we decided it made sense to include the information directly on the element. First we considered the name attribute:
<a href="blah.html" name="id-45">This link will be tracked</a>
This is neater, but the name attribute is deprecated on anchor elements so it didn’t make sense to implement a new feature with it. We flirted briefly with the idea of hijacking the little-used rev attribute just because I was reasonably sure it hadn’t been used across our site for anything else. But this didn’t sit well with us as, little-used or not, this wasn’t an appropriate use of the attribute. We didn’t want to use the class attribute for similar reasons. It didn’t feel quite right, and would have been marginally more complex to parse as we’re already using it heavily for styling purposes.
So, we turned to custom data attributes:
<a href="blah.html" data-track="id-45">This link will be tracked</a>
From there, a simple bit of jQuery extracted that value and appended it as a query string to the href on page load. A downside to this mechanism is that, to the user, all this is visible as cruft at the end of the URL in their browser, but as I said above, there are non-technical reasons for approaching it in this manner that I won’t go into here. So that we didn’t break SEO, we use canonical data on every page.
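As a rough sketch of that step (not the code we actually shipped), something like the following would do the job. The helper name appendTracking and the query parameter name "track" are assumptions made up for illustration; the jQuery wiring only runs where jQuery is loaded.

```javascript
// Hypothetical helper: append a tracking value to an href as a query parameter.
// The parameter name "track" is an assumption, not the name used on the site.
function appendTracking(href, value, paramName) {
  paramName = paramName || "track";
  // Split off any #fragment so the parameter lands before it.
  var hashIndex = href.indexOf("#");
  var fragment = hashIndex === -1 ? "" : href.slice(hashIndex);
  var base = hashIndex === -1 ? href : href.slice(0, hashIndex);
  // Use "&" if a query string already exists, "?" otherwise.
  var separator = base.indexOf("?") === -1 ? "?" : "&";
  return base + separator + encodeURIComponent(paramName) + "=" +
    encodeURIComponent(value) + fragment;
}

// Browser-only wiring: rewrite every tracked link once the DOM is ready.
if (typeof jQuery !== "undefined") {
  jQuery(function ($) {
    $("a[data-track]").each(function () {
      var $link = $(this);
      $link.attr("href", appendTracking($link.attr("href"), $link.attr("data-track")));
    });
  });
}
```

Keeping the URL handling in a plain function means it can be checked without a page: appendTracking("blah.html", "id-45") gives "blah.html?track=id-45".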
Custom data attributes
A custom data attribute is an attribute whose name begins with data-. As described in the spec, they exist “to store custom data private to the page or application, for which there are no more appropriate attributes or elements”, so they were exactly what we needed. They’re easy to access with JavaScript:
var data = document.getElementById("a").getAttribute("data-track");
The HTML5 spec also refers to a dataset object that returns a map of name/value pairs:
var data = document.getElementById("a").dataset.track;
There’s no browser support for that yet, though there is a jQuery dataset plugin.
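If you want to use dataset where it exists and fall back to getAttribute where it doesn’t, a small wrapper is enough. This is a minimal sketch; readData is a hypothetical helper name, not part of the spec or of any library.

```javascript
// Hypothetical helper: read data-<name> from an element, preferring dataset.
function readData(element, name) {
  // dataset keys are camelCased: data-track-id becomes dataset.trackId.
  var key = name.replace(/-([a-z])/g, function (match, letter) {
    return letter.toUpperCase();
  });
  if (element.dataset && element.dataset[key] !== undefined) {
    return element.dataset[key];
  }
  // Fallback for browsers without dataset support.
  return element.getAttribute("data-" + name);
}
```

With the link from earlier, readData(document.getElementById("a"), "track") would return "id-45" either way.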
You can also style them, should you need to:
a[data-track] {
color: red;
}
Is it safe?
Browsers are supposed to ignore unrecognised attributes, so there’s little risk that this information would ever appear on the page in a broken fashion. From a more philosophical view, it’s not ideal mixing up your data with your markup — it feels like it breaks the model of separating concerns, but then we also use the occasional inline style attribute as well (sometimes it’s just the right tool for the job), so I’m happy to be pragmatic about things.
Further reading
Some in favour, some against. All seems to be a bit polarising…
- Dan Webb: Put that data-* attribute away son…you might hurt someone
- Barklund: Datasets in HTML 5 and what they’re good for
- John Resig: HTML5 data- attributes
Update 12 May
The query string will now be appended on the onclick event, not on page load, to further minimise exposure to robots and people. So it’s only visible by examining the URL in the browser’s address bar after the target page has loaded.
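A click-time version might look roughly like this. Again, it is a sketch under assumptions rather than our production code: handleTrackedClick, the "track" parameter name and the data-track-applied marker attribute are all invented for illustration, and #fragments are ignored for brevity.

```javascript
// Hypothetical click handler: rewrite the href at click time, once only.
function handleTrackedClick(link) {
  var value = link.getAttribute("data-track");
  var href = link.getAttribute("href");
  // Skip links with nothing to track or that were already rewritten.
  if (!value || !href || link.getAttribute("data-track-applied")) {
    return;
  }
  // Use "&" if a query string already exists, "?" otherwise.
  var separator = href.indexOf("?") === -1 ? "?" : "&";
  link.setAttribute("href", href + separator + "track=" + encodeURIComponent(value));
  // Marker attribute (an assumption) so a second click does not append twice.
  link.setAttribute("data-track-applied", "true");
}

// Browser-only wiring: one delegated listener covers every tracked link.
if (typeof document !== "undefined") {
  document.addEventListener("click", function (event) {
    var target = event.target;
    var link = target && target.closest ? target.closest("a[data-track]") : null;
    if (link) {
      handleTrackedClick(link);
    }
  });
}
```

Because the href in the markup is untouched until the user clicks, crawlers reading the page source only ever see the clean URL.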