Tom Heath's Displacement Activities

Posts tagged with "rdf"

Yes, the Semantic Web does matter, and RDF is a key part of that picture

Paul Miller has a nice new post over at ZDnet, entitled Does the Semantic Web matter? He ultimately concludes 'yes', and I agree, but some of the details raised an eyebrow for me.

"Continuing landgrabs by startups that seek to attract, trap and exploit eyeballs stand unashamedly on the shoulders of Semantic Web promise whilst running counter to its basic tenets of linking and openness. On the other hand, companies 'just' doing perfectly reasonable - and valuable - things with the meanings of words, phrases and documents latch on to the Semantic Web's buzz, whilst being all about Semantics and not at all about the Web."

I have to agree, almost violently, with both these points.

One passage I can't agree with however is this:

"The speed with which 'RDF' or 'OWL' enter any conversation about the Semantic Web is worrying; and must ultimately prove self-defeating as potential adopters retreat from a barrage of terminology and an opaque glut of unnecessary detail."

This may be a fair criticism with regard to OWL, but saying this with regard to RDF is like criticising discussions of the Web in the early 90s for quickly coming down to details of HTML. Yes, we need to focus on what we can do with the technology, but let's not kick back too hard against discussing the technical details.

URIs and the RDF data model are exactly what enables the Semantic Web "proper" to address the issue of linking that Paul rightly criticises many startups for not properly addressing. We can't hope to understand or predict the emergent properties of a Semantic Web without understanding the fundamental components of that Web, and right now RDF is about as fundamental as the components come.

Twine and Linked Data

A little while back I wrote briefly about first impressions of Twine. Now that the recent flurry of Twine-related analysis has died down, and a few more people have had the chance to actually use the system, it's probably a good time to look at what Twine has to offer from a Semantic Web point of view. Given Tim's recent post that emphasises the importance of Linked Data to the Semantic Web concept, and Nova Spivack's follow-up post, the timing is even better.

Speaking briefly as Joe User, my first impression was that Twine doesn't yet offer me any clear benefits over del.icio.us. Yet Another Popularity Arms Race is kind of fun while people build up their number of connections, but this masks a bigger issue that I'll get to in a moment. Despite not planning to ditch del.icio.us any time soon, I'm not going to criticise Twine particularly from a user perspective. Getting these things right is hard, and it is still in private beta. However, the one area where I have to comment (constructively, I hope) is regarding Twine's use (or otherwise) of external data.

For me (and many others) the Semantic Web is all about structured, linked data, and the reuse potential this creates. I don't get the impression that this is at all divergent from the view held at Radar Networks. Unfortunately this principle isn't yet fully embodied in Twine as far as I can tell. That's a real shame, and a missed opportunity to demonstrate the power of non-silos.

This issue struck me from the moment I signed up. There was no option to provide the URI of a FOAF file from which my profile could be populated with people I know, a photo, location data and links to other online accounts I hold. Instead I had to recreate all this information manually, despite much of it being out there on the public Web here, and here, and also here ready for consumption. I even had to upload a photo.

For an application that claims to be Semantic Web enabled this is almost unforgivable. Sure, not everyone has a FOAF file (but how many more have a photo online?), but for those who do and have wondered what to do with it this would be a great payback, and would in turn encourage more people to create one, or sign up with services like MyOpera (hey, that's why I blog here) and Revyu that create FOAF on their behalf.

For me, probably the low point of this signup process from a Linked Data perspective was having to enter my location as a text string. In a world graced by DBpedia and Geonames this really shouldn't be happening. In fact I've since gone back and replaced the textual location with the URI of Birmingham (UK) from DBpedia (http://dbpedia.org/resource/Birmingham) but of course it's not actually a link in either the HTML or RDF output.

Just in case anyone missed it, yes, there's RDF data describing things in Twine. Hurray! Let's not underestimate the significance of this. But, and I'm afraid there is one, Marshall Kirkpatrick's comment about the lack of RSS output is just the tip of the iceberg. I don't just want RSS, or fragments of RDF, I want Linked Data in RDF.

Sticking with the profile theme, when I signed up I added a number of links to Web pages with which I'm associated, such as this blog, my profile page on Revyu and the Platform site at Talis. To Twine's credit these are all exposed in the RDF document about me that is generated from my profile data. Great. Umm, except that they're referenced using the property http://www.radarnetworks.com/2007/09/12/basic#url rather than something in more widespread use, such as http://xmlns.com/foaf/0.1/page. Likewise my "account" is not a sioc:User, and there's no statement here saying that the URI that identifies me (http://www.twine.com/item/1tjtp3mx-185) identifies a thing of type foaf:Person.

One of the key things about creating network effects on the Web of Data is not just reusing those URIs that identify "things" (like the place "Birmingham"), but reusing widely adopted properties and classes from vocabularies/ontologies such as FOAF, that are widely understood by applications. Of course there may be a mapping defined between http://www.radarnetworks.com/2007/09/12/basic#url and foaf:page, but unfortunately I can't tell, as the ontology URI http://www.radarnetworks.com/2007/09/12/basic# just 404s. Linked Data principle number 3: "When someone looks up a URI, provide useful information."
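To make the vocabulary point concrete, here is a minimal sketch of the kind of statements I'd hope to see in the profile RDF, written out as N-Triples. The two Twine URIs are the ones discussed in this post; the helper names `triple` and `describe_person` are invented purely for illustration and say nothing about how Twine generates its RDF.

```python
# Widely used terms: the FOAF namespace and rdf:type.
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def triple(subject, predicate, obj):
    """Serialise one statement as N-Triples, assuming all three terms are URIs."""
    return f"<{subject}> <{predicate}> <{obj}> ."


def describe_person(person_uri, page_uris):
    """Say that person_uri is a foaf:Person and list the pages about them."""
    lines = [triple(person_uri, RDF_TYPE, FOAF + "Person")]
    lines += [triple(person_uri, FOAF + "page", page) for page in page_uris]
    return lines


# The URI that identifies me, and the Twine page that describes me.
me = "http://www.twine.com/item/1tjtp3mx-185"
statements = describe_person(me, ["http://www.twine.com/user/tomheath"])
print("\n".join(statements))
```

Any application that already understands FOAF could consume these statements without needing to know anything about Twine's own ontology.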

It is pleasing to see that Twine has minted a URI for me (http://www.twine.com/item/1tjtp3mx-185) that is distinct from the page on the site that describes me (http://www.twine.com/user/tomheath). This is definitely good. To really play nice in the world of Linked Data, however, there are a couple of other tweaks that are needed. If I dereference the URI that identifies me, I currently get a 302 Found response that redirects me to the page about me at (http://www.twine.com/user/tomheath). The important bits of the headers look like this:

GET /item/1tjtp3mx-185 HTTP/1.1
Host: www.twine.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5

HTTP/1.x 302 Found
Content-Length: 0
Date: Fri, 28 Mar 2008 18:06:21 GMT
Location: http://www.twine.com/user/tomheath
Content-Type: text/plain; charset=UTF-8

This needs to be changed to an "HTTP 303 See Other" redirect in order to be in line with the finding on httpRange-14. There is also some work to be done with the content negotiation on the site. At present, if I dereference my URI and ask for application/rdf+xml, I get a 200 OK response and an RDF document returned. It seems that I am not my homepage, but I am my RDF description.

(The "How to Publish Linked Data..." tutorial has more on these issues.)
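The behaviour I'm describing could look roughly like the sketch below: the URI that identifies me never answers 200 itself, but always responds with a 303 pointing at a document about me, chosen according to the Accept header. This is only an illustration under stated assumptions. The RDF document path `/user/tomheath.rdf` is hypothetical, the function names are made up, and the Accept parsing is deliberately simplistic (wildcards such as `*/*` are ignored).

```python
THING_PATH = "/item/1tjtp3mx-185"  # identifies a person, not a document

# Documents describing the thing. The RDF path is a hypothetical example.
DOCUMENTS = {
    "text/html": "http://www.twine.com/user/tomheath",
    "application/rdf+xml": "http://www.twine.com/user/tomheath.rdf",
}


def preferred_type(accept_header, offered=("text/html", "application/rdf+xml")):
    """Pick the offered media type with the highest q-value; default to HTML."""
    best, best_q = offered[0], 0.0
    for part in accept_header.split(","):
        fields = part.strip().split(";")
        media_type = fields[0].strip()
        q = 1.0  # q defaults to 1 when not given
        for param in fields[1:]:
            name, _, value = param.strip().partition("=")
            if name == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        if media_type in offered and q > best_q:
            best, best_q = media_type, q
    return best


def respond(path, accept_header):
    """Return (status, headers) for a GET on the thing's URI."""
    if path != THING_PATH:
        return 404, {}
    location = DOCUMENTS[preferred_type(accept_header)]
    # 303 See Other: "I can't send you the thing, but here is a document about it."
    return 303, {"Location": location, "Vary": "Accept"}


print(respond(THING_PATH, "application/rdf+xml"))
print(respond(THING_PATH, "text/html;q=0.9,text/plain;q=0.8"))
```

The important design choice is that both branches redirect. Neither the HTML page nor the RDF document is served directly from the URI that identifies the person.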

Weirdly the RDF I get back from this request is different to that from the RDF version of my profile page. This time I am a basic:Person (but not a foaf:Person), there's no sign of my location or links to my other Web pages, but links to all my connections are given.

I imagine that all these sorts of niggles will be ironed out as the site develops further, but in the meantime Nova might like to slightly tame the claims he makes about support for Linked Data in Twine. Despite saying that "You can learn more about Twine's support for Linked Data and see some examples here", the example given does not show Linked Data, but simply an RDF fragment describing the book Jurassic Park. Perhaps the next iteration will have owl:sameAs links to http://dbpedia.org/resource/Jurassic_Park and http://www4.wiwiss.fu-berlin.de/bookmashup/books/0394588169, and say that the author is http://dbpedia.org/resource/Michael_Crichton. Then there'll really be some claims to make :D
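As a rough sketch of what those outbound links might amount to, the snippet below writes them as N-Triples. The subject URI is a placeholder, since I don't know the real Twine URI for the book, and the choice of the Dublin Core `creator` property for the author is my assumption rather than anything Twine has announced. The function name `link_out` is likewise invented for illustration.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
DC_CREATOR = "http://purl.org/dc/terms/creator"  # assumed choice of author property


def link_out(subject, same_as=(), creator=None):
    """Emit N-Triples linking a local item to external URIs for the same thing."""
    lines = [f"<{subject}> <{OWL_SAME_AS}> <{target}> ." for target in same_as]
    if creator:
        lines.append(f"<{subject}> <{DC_CREATOR}> <{creator}> .")
    return lines


# Placeholder URI standing in for Twine's own identifier for the book.
book = "http://example.org/item/jurassic-park"

links = link_out(
    book,
    same_as=[
        "http://dbpedia.org/resource/Jurassic_Park",
        "http://www4.wiwiss.fu-berlin.de/bookmashup/books/0394588169",
    ],
    creator="http://dbpedia.org/resource/Michael_Crichton",
)
print("\n".join(links))
```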

Having picked many holes, and hopefully provided some useful feedback, my final comment is a feature request. I think this is on Radar's, err, radar, but deserves to be aired for the sake of completeness. One of the features I'd most like to see in Twine is greater native handling of different types of things. Right now I can only add one of a finite list of things (audio, book, bookmark, event, person, etc). In order to truly scale I think an open world view needs to be taken on this, where even the "Add Item > Other" menu has an "Other" option, and types can be drawn from data on the Semantic Web at large.

For example, right now "Review" is not an explicitly supported type. Nor is "Cheese". I would like to be able to add a URI such as this http://revyu.com/reviews/e707061801ce1f020d8ca1ed75e50d0e4daeb6e3 to Twine, and have the system then tell me that it's a review, not the other way around. At that point the claim of being able to tie it all together will really hit home.
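The idea of letting the data say what type a thing is can be sketched in a few lines. The snippet below reads the rdf:type statements for a URI out of some N-Triples. The sample data is hand-written for illustration and is not actual Revyu output; the class URI `http://purl.org/stuff/rev#Review` is my assumption about the vocabulary in use, and the regex only copes with triples made entirely of URIs.

```python
import re

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Matches only the simplest N-Triples lines: <s> <p> <o> .
TRIPLE = re.compile(r"^<([^>]*)>\s+<([^>]*)>\s+<([^>]*)>\s*\.\s*$")


def types_of(ntriples, subject):
    """Return the set of rdf:type values stated for subject."""
    found = set()
    for line in ntriples.splitlines():
        match = TRIPLE.match(line.strip())
        if match and match.group(1) == subject and match.group(2) == RDF_TYPE:
            found.add(match.group(3))
    return found


REVIEW = "http://revyu.com/reviews/e707061801ce1f020d8ca1ed75e50d0e4daeb6e3"

# Illustrative data only, standing in for what dereferencing the URI might return.
SAMPLE = f"<{REVIEW}> <{RDF_TYPE}> <http://purl.org/stuff/rev#Review> .\n"

print(types_of(SAMPLE, REVIEW))
```

A system working this way needs no fixed list of supported types. Whatever classes the published data uses become the types it can show.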

Making Links at the BBC

Ian and I spent last Friday at BBC Television Centre in London. For anyone of my generation who grew up in the UK this place probably has an almost mythical status, as the place to send your competition entries or milk bottle tops for the latest Blue Peter appeal. We were there for a workshop on the theme of the Semantic Web, organised by Nicholas Humfrey and Patrick Sinclair from BBC Audio and Music Interactive.

Not only was it a privilege to get a look inside this great institution, it was great to see so many BBC people turn up to hear about the Semantic Web. Nick and Patrick had put together a very nicely structured programme, introducing people to the Semantic Web from the conceptual level of Linked Data (that was my bit), through a talk on DBpedia by Georgi Kobilarov, to the highs and lows of enterprise scale RDF storage as revealed by Steve Harris of Garlik, and finally to interfaces for structured data as presented by Daniel Smith from the University of Southampton. Hope all the slides will be linked to from the BBC Radio Labs blog in due course. In the meantime you can find mine here.

Aside from the inherent pleasure associated with talking to people about the Semantic Web, the highlight of the day for me was getting a sneak preview of the Linked Data work that's going on within the BBC, and will hopefully soon see the light of day on the public Web. The /programmes area of the BBC site will be home to large amounts of RDF data about programmes going out across all channels, and each will be identified by a dereferenceable URI.

This is a huge deal, and testament to the hard work put in by people like Nick, Patrick and Michael Smethurst from the BBC, with input from people like Yves. There is already a public commitment to linked data principles at http://www.bbc.co.uk/programmes/developers, but what impressed me most was the extent to which linking to external data sets seems to be baked into the thinking from day one. Expect to see strong links to Musicbrainz in the first instance, and no doubt to many more data sets over time.

The BBC are well ahead of the game here. They don't have an angry mob of license-fee payers at the gates demanding access to BBC data in RDF, with chants of "give us our data, we've paid for it already" (or hopefully something more poetic). This mob will never materialise. They've seen the willingness of the BBC in this area with previous initiatives such as the Catalogue, and are down the pub dreaming up ways to use this data. With the advent of the current work on Linked Data and /programmes the non-mob have even more to dream about.

Perhaps as a publicly-funded organisation the BBC is obliged (morally or otherwise) to be a good citizen of the Web of Data. However, I don't get the impression that that's what this is about, in the first instance at least. I'm left with the feeling that this is a result of a bunch of guys really getting the Web of Data, and seeing the value that links can bring to their organisation.

Twine First Impression

David Peterson was kind enough to send me a Twine invite (thanks David :smile:). Aside from the obligatory half hour spent making lots of friends (again) and adding a few items to try things out, I haven't really spent enough time with it to form strong impressions. However, the one thing that struck me while I was signing up was: why do I have to upload another photo of myself, that's the same as the one I use on countless other sites and even bother to define in my FOAF file? This isn't very Webby, let alone Semantic Webby. Hmm.

Saturday and Sunday at SemanticCamp

I'm here at SemanticCamp in London. We're getting close to the end now, but it's been a great couple of days. The guys from the Centre for Digital Music at Queen Mary are here in force, as are the BBC guys working on RDF export of BBC Programmes. Yves and the guys from C4DM showed a lot of cool stuff yesterday about music info on the Semantic Web, followed by a talk from the BBC guys. Michael Smethurst and co. have already got the TOTP/Later data online as linked data; arrival of the Programmes data will be a huge milestone.

One of the highlights of the day for me was hearing what Chris Jackson, Lee Denison and Ashok Argent-Katwala are doing with URI Play. They're working really hard at making sense of the architectural options, but their plans for providing easy access to TV shows across different service providers and potentially a hub for linking these together look awesome.

Later in the day Georgi gave a great talk about DBpedia, with a particularly nice slide about the SEMANTIC Web community and the Semantic WEB community. This was followed by an interesting but sobering talk from iand about Open Data Licensing.

I was surprised in the morning intro session by the number of people present who chose Microformats (big M :wink:) as one of their three tags. On the one hand I'm pleased that the title "SemanticCamp" didn't put this community off; i.e. the event obviously wasn't perceived as a purely Semantic Web show. On the other hand, mid-afternoon yesterday it became obvious that all the Semantic Web people were in one room and all the Microformats community were next door. We haven't had an open flame war yet about upper vs lowercase Semantic Web (thankfully), but there does seem to be a clear divide in where people's loyalties or priorities lie, and I'm a bit sad that the uF community here still seems pretty infatuated with Microformats. Sigh.

It's been good to see some new faces, and catch up with some that are more familiar. Aside from the excellent meal at Memories of India on Gloucester Road, the highlight for me was the lengthy, late-night, beer-fuelled discussion with danbri about information resources, non-information resources, and 303 redirects. More on that later I guess.

On the Web, but not *In* the Web

In my recent Talk with Talis podcast, Paul Miller and I got chatting about the conceptual difference between exposing data on the web using Web2.0-style APIs (such as Amazon), and serving up Linked Data (also look here for TimBL's original Design Issues document, which spells out what must rapidly be becoming "the four commandments of Linked Data"). The discussion centers around the "On the Web, but not In the Web" distinction. Kingsley liked the discussion, and suggested it should be blogged for posterity, so here is a transcribed excerpt (starting at 28m41s through the podcast):

Paul Miller: You said that reviews you put into Revyu.com are available on the web as a normal review, and also available on the Semantic Web, to be embedded in other places. Now, how is that different to me doing a review on Amazon, and cutting and pasting it and sticking it into epinions, or my blog, or whatever?

Tom Heath: OK, so, if you do the review in Amazon it will be available on the Web in two ways. It'll be available on the HTML Web for people to browse with their browser, and the review would also be available through the Amazon Web Services API, which means that it is reusable to an extent: I can query the Amazon Web Services API and retrieve that information and do something with it. But this kind of highlights a really key distinction between Web2.0 APIs and the Semantic Web, or the Web of Data, or the Linked Data Web, or however you choose to name it, in that by default if you write a review in Revyu then it's there available, it has a URI, people can make other statements about it, they can reference it in other RDF statements on the Semantic Web, and they can also link to it from the HTML Web.

So, in contrast, if you write a review in Amazon, then the ability to link that review with other bits of information is very limited. You can't necessarily easily say that the review references a certain item or is provided by a certain person, in any way other than embedding this information in XML elements within the results from the Amazon Web Services API. So, this information is available on the Web, but it's not really in the Web, if that distinction makes sense.

It's a distinction that Tim Berners-Lee has, um, well I'm not sure if he's explicitly made the distinction but he always uses the phrase "in the Web" and I never really understood, I never really got why he was using this form of words until recently, when it dawned on me that something being on the Web doesn't really make it in the Web, and I think that's the key distinction between data from Amazon, the Amazon API, or any of the other Web2.0 kind of APIs, that it's there available on the Web but it's not really in the Web, because it's hard to link it together, which is something that RDF does very well, which XML doesn't really do.