In the last couple of years we, as in the Semantic Web community, have learned a great deal about how to publish data on the Web. As we've become more familiar with this process we've got better at knowing where to look to find existing data that could be published online according to Semantic Web and Linked Data principles. What hasn't kept pace with this process is the availability of vocabularies/ontologies for describing this data. I may now be able to get hold of data about changes in polar ice caps and polar bear migration patterns, but I would bet money that there's no vocabulary with which to describe this data. Choose almost any domain and the situation will be the same.
If we're serious about building a Web of Data, then this situation has to change. I see this from my own work, but Peter Mika's experiences at Yahoo!, and the strength of his conviction (conveyed very nicely in this blog post), provide some great confirmation that I'm not alone in this perception. The vocabulary bottleneck has to be eased.
So, tomorrow is a chance for us to start changing that. The solution won't come overnight, but I hope that we can start the ball rolling. VoCampOxford2008, and any VoCamp in fact, is about setting aside dedicated time and space to create and publish vocabularies in domains that interest us. We all have grand ideas while waiting at the bus stop/traffic lights, doing the washing up, wherever, about cool domains we could model and in which we could publish data, but without some ring-fenced time in which to do so these plans can easily come to nothing. VoCamp aims to solve that.
The primary success criterion for the next two days will be the publication of new vocabularies on the Web that increase the availability of Linked Data. That's the main goal, but there are many others. I am confident that this first VoCamp will be an opportunity to share issues, expertise, modeling techniques and design patterns. In doing so we will all become smarter. There is also an opportunity to scope requirements in the wider Semantic Web field that impact upon the availability and reuse of vocabularies. Together we can identify missing pieces of the technical infrastructure required by the Web of Data, and begin to build a social infrastructure that helps us ease the vocabulary bottleneck.
These are grand goals. Even if none of them were to be achieved, there is one other goal which I'm sure will be met: determining whether the VoCamp format works, and if so, how. If the format fails, we'll need to look elsewhere for a solution. If it succeeds, fully or partially, we'll be closer to knowing how to do it even better next time.