Tom Heath's Displacement Activities

Posts tagged with "semanticcamplondon"

Microformat Authoring Not Necessarily Easy

A couple of weeks ago Danny blogged about an Amaya hack that made it easier to insert microformat class names into an HTML document. It's a neat little trick, but the title of the post ("Easy microformat authoring") only reinforces the received wisdom that microformats are easily implemented, especially relative to something like RDF. Predictably this issue raised its head at SemanticCamp in London and led to a brief intellectual scuffle that sadly fizzled out without any real conclusions being reached. I sensed that Premasagar "got" it - he seems like a pretty smart guy - but there seemed to be a lot of microformats enthusiasts suffering from a kind of weapon focus: someone lunges at you demanding data interoperability, but you don't properly take in their face or fully assess the situation because you're focusing on the microformat they're holding in their hand.

In my experience this view that microformats are easy is a myth. It may be trivial to construct snippets of HTML marked up with microformats, but what I found when implementing hReview in Revyu.com is that adding the appropriate classes to the kind of code that exists in the wild is anything but easy.

In most cases it was not adding the class names themselves that was the problem (although not even the hReview "spec" seems to know what the semantics of "url" actually are). The big issue was getting the structure right. Despite the claim that microformats are for "humans first, machines second", checking that I'd applied the right classes to the right elements within my HTML source required me to think like an HTML parser in order to check that elements were correctly nested and therefore reflected the meaning I intended.

After a couple of hours of peering at the hReview classes in my HTML I was fairly confident that I'd got the structure right, but wanted some validation. So I went in search of a microformats validator. This was quite funny. Apparently nothing of the sort existed then, and nothing does now. The best answer I got was to run my hReview through an XSL transformation and check that the RDF/XML that came out the other side looked OK. Excuse me while I choke on my coffee.
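To make the nesting problem concrete, here is a rough sketch of the sort of structural check I had in mind, written in Python with only the standard library. It is not a real validator and does not implement the hReview rules in full: it assumes, purely for illustration, that a review needs an `item` nested inside the `hreview` root and an `fn` nested inside that `item`. The names `HReviewNestingCheck`, `check_hreview` and `PROPERTIES` are made up for this example.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they are never pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}

# A handful of hReview property names (illustrative subset, not the full list).
PROPERTIES = {"item", "rating", "reviewer", "dtreviewed", "summary", "description"}


class HReviewNestingCheck(HTMLParser):
    """Record where class names sit relative to the 'hreview' root and its 'item'."""

    def __init__(self):
        super().__init__()
        self.stack = []          # (tag, class names) for each currently open element
        self.found_root = False  # did we see class="hreview" at all?
        self.in_review = set()   # class names nested somewhere inside the root
        self.in_item = set()     # class names nested inside the root's 'item'
        self.outside = set()     # class names with no 'hreview' ancestor

    def handle_starttag(self, tag, attrs):
        classes = set((dict(attrs).get("class") or "").split())
        ancestors = set().union(*(c for _, c in self.stack))
        if "hreview" in classes:
            self.found_root = True
        if "hreview" in ancestors:
            self.in_review |= classes
            if "item" in ancestors:
                self.in_item |= classes
        else:
            self.outside |= classes
        if tag not in VOID_TAGS:
            self.stack.append((tag, classes))

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed elements like <p>.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break


def check_hreview(html):
    """Return a list of human-readable nesting problems (empty if none found)."""
    parser = HReviewNestingCheck()
    parser.feed(html)
    parser.close()
    problems = []
    if not parser.found_root:
        problems.append("no element with class 'hreview'")
    elif "item" not in parser.in_review:
        problems.append("no 'item' nested inside the hreview root")
    elif "fn" not in parser.in_item:
        problems.append("'item' has no nested 'fn'")
    for name in sorted(parser.outside & PROPERTIES):
        problems.append("'%s' appears outside the hreview root" % name)
    return problems


good = ('<div class="hreview"><span class="item"><span class="fn">Cafe X</span>'
        '</span> <span class="rating">4</span></div>')
bad = ('<div class="hreview"><span class="item">Cafe X</span></div>'
       '<span class="fn">Cafe X</span> <span class="rating">4</span>')

print(check_hreview(good))
print(check_hreview(bad))
```

Both snippets carry the same class names, and only the nesting differs. That is exactly the kind of mistake that is hard to spot by eye in real-world templates, and a check like this only catches it because it walks the tree the way a parser would.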

Therein lies the issue with microformats. Without an underlying abstract data model, validation becomes a bit like standing back looking at a used car, kicking the tyres, concluding "yeah, looks alright", and then handing over the cash.

Maybe none of this matters. Maybe the Web can handle microformat garbage just like it handles so much other rubbish. What really drives me mad are the claims that microformats are up to the same jobs as RDF, and so much easier to implement.

The "humans first, machines second" claim is perverse. What my little anecdote suggests is that, in spite of these claims, microformats are neither easy for humans to use nor particularly likely to yield much reliable data for machines.

Saturday and Sunday at SemanticCamp

I'm here at SemanticCamp in London. We're getting close to the end now, but it's been a great couple of days. The guys from the Centre for Digital Music at Queen Mary are here in force, as are the BBC guys working on RDF export of BBC Programmes. Yves and the guys from C4DM showed a lot of cool stuff yesterday about music info on the Semantic Web, followed by a talk from the BBC guys. Michael Smethurst and co. have already got the TOTP/Later data online as linked data; arrival of the Programmes data will be a huge milestone.

One of the highlights of the day for me was hearing what Chris Jackson, Lee Denison and Ashok Argent-Katwala are doing with URI Play. They're working really hard at making sense of the architectural options, but their plans for providing easy access to TV shows across different service providers and potentially a hub for linking these together look awesome.

Later in the day Georgi gave a great talk about DBpedia, with a particularly nice slide about the SEMANTIC Web community and the Semantic WEB community. This was followed by an interesting but sobering talk from iand about Open Data Licensing.

I was surprised in the morning intro session by the number of people present who chose Microformats (big M ;) as one of their three tags. On the one hand I'm pleased that the title "SemanticCamp" didn't put this community off; i.e. the event obviously wasn't perceived as a purely Semantic Web show. On the other hand, mid-afternoon yesterday it became obvious that all the Semantic Web people were in one room and all the Microformats community were next door. We haven't had an open flame war yet about upper vs lowercase Semantic Web (thankfully), but there does seem to be a clear divide in where people's loyalties or priorities lie, and I'm a bit sad that the uF community here still seems pretty infatuated with Microformats. Sigh.

It's been good to see some new faces, and catch up with some that are more familiar. Aside from the excellent meal at Memories of India on Gloucester Road, the highlight for me was the lengthy, late-night, beer-fuelled discussion with danbri about information resources, non-information resources, and 303 redirects. More on that later I guess.