Semantic Web at Opera

Posts tagged with "Semantic Web"

Complexities of tag to vocabulary mapping

"Things should be as simple as possible, but not simpler" -- Albert Einstein


I'm happy that quite a few people have mapped their tags, more than I expected! :up: Now, many things are somewhat complex by their very nature, and sometimes you have to deal with complexities to make things that are useful. Even more often, the question is who or what should deal with the complexities. When I created the tag mapper, I realised that there would be conflicting goals, and now I'd like to discuss them with you:

For some terms, you will see that there is a dropdown under "Relation" that contains both "topic" and "depicts". The "depicts" is there when the term is a noun, and the idea is that when a tag is used for a picture, then you can say that the picture depicts a dog, for example. So, it is a way to very directly express that meaning.

Doing it this way would also make for really simple SPARQL queries. Getting all pictures that depict a dog would amount to (ignoring the namespaces):
    SELECT ?pics WHERE { ?pics foaf:depicts <http://www.w3.org/2006/03/wn/wn20/instances/wordsense-dog-noun-1> . }

That's about as simple as these queries get.
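
To spell out what that single triple pattern does, here is a minimal Python sketch that runs the same match over a handful of in-memory triples. It is only an illustration, not how the query engine works: the example.org URIs and the cat term are made up, and foaf:depicts is written out with the FOAF namespace that the query above leaves out.

```python
# Toy illustration of the one triple pattern in the query above.
# Only the dog term URI comes from the post; every other URI here is invented.
FOAF = "http://xmlns.com/foaf/0.1/"
DOG = "http://www.w3.org/2006/03/wn/wn20/instances/wordsense-dog-noun-1"
CAT = "http://www.w3.org/2006/03/wn/wn20/instances/wordsense-cat-noun-1"  # assumed by analogy

triples = [
    ("http://example.org/pics/1", FOAF + "depicts", DOG),
    ("http://example.org/pics/2", FOAF + "depicts", CAT),
    ("http://example.org/blog/7", FOAF + "topic", DOG),
]


def match(triples, predicate, obj):
    """Return the subjects of all triples with the given predicate and object."""
    return [s for (s, p, o) in triples if p == predicate and o == obj]


# Equivalent in spirit to: SELECT ?pics WHERE { ?pics foaf:depicts <dog term> . }
pics = match(triples, FOAF + "depicts", DOG)
print(pics)
```

The blog post about dogs is not returned, because it uses a different relation; that is exactly the distinction the "Relation" dropdown introduces.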

Also, as I mentioned, the plan is to use tags for content labels, and since I'm Opera's representative on W3C's POWDER working group, one of my main concerns is how people can easily tag their content with content labels on sites like Opera Community. Again, the easiest would be if a content label could be directly associated with a picture. For example, we have to live with a certain amount of nudity in the pictures our users upload, since we don't want to exercise censorship. We just don't want to push it on random visitors, and we want to facilitate parental control. That's one of the things content labels will be used for: a standard way to say that a picture contains nudity.


This is made possible by a slightly more complex user interface. Arguably, it would be easier to just map your tags, without also having to decide whether a tag depicts something or points to a content label. If we didn't do this, it would be harder to formulate the SPARQL queries, but more importantly, this would only be the beginning: you're clearly using the same tag not just for pictures but also for blog posts, and you wouldn't say that a blog post depicts something. So, for this to really work, we would need different relation types depending on the type of resource, whether it is a picture or a blog post.
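
A rough Python sketch of what "different relation types depending on the type of resource" could look like. The type names "picture" and "blog_post", the example.org URIs, and the choice of foaf:topic for the "topic" relation are assumptions made for illustration, not the tag mapper's actual behaviour.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
DOG = "http://www.w3.org/2006/03/wn/wn20/instances/wordsense-dog-noun-1"

# Hypothetical mapping from resource type to the relation used for a tag.
RELATION_BY_TYPE = {
    "picture": FOAF + "depicts",  # a picture depicts a dog
    "blog_post": FOAF + "topic",  # a blog post is about dogs
}


def tag_triple(resource_uri, resource_type, term_uri):
    """Build one (subject, predicate, object) triple for a tagged resource.

    Unknown resource types fall back to the weaker "topic" relation.
    """
    predicate = RELATION_BY_TYPE.get(resource_type, FOAF + "topic")
    return (resource_uri, predicate, term_uri)


picture = tag_triple("http://example.org/pics/1", "picture", DOG)
post = tag_triple("http://example.org/blog/7", "blog_post", DOG)
print(picture[1])
print(post[1])
```

Every new resource type adds another row to the table, and every query then has to know which row applies.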

That's where it gets nasty.

So, I'm wondering if the course I've started out on is unworkable. Perhaps the relationship from a tag to a term like the Wordnet terms should be unchangeable? That would make the user interface simpler, but the queries and other uses would be harder. So, it isn't just a question of what's simplest; rather, it is a question of who should deal with the complexities.

Now, it is important that as many people as possible participate in tag mapping. Not everyone needs to write applications or queries that use these data, so those wishing to do so are probably better suited to deal with the complexities than all those seeing the tag setup page. On the other hand, it is quite important to keep the POWDER specification simple too, and this use could add complexity to the specification.

So, your opinions matter here; please bring them forward!

Then, you might ask, dear guinea pigs :psmurf: , why I made a complex user interface to begin with, only to ask if I should make it simpler? Well, clearly, I couldn't start discussing this with you if I didn't show you a complex user interface that worked; it would be much harder to explain what I had in mind. So, I figured I might as well do a little research on you. The experience gained from this will be used to make the right decisions when designing important standards that will be with us for many years to come, and I feel it is important to get this experience now, before those standards are set.

Marrying folksonomies and taxonomies

With the new release, there is a new feature that you can find on your account page:

I'm allowing you to map your tags to a controlled vocabulary. For the time being, it is Wordnet, but I can include a number of such things. The main reasons I used it are that it is already quite widespread, that it is organised in subclasses and superclasses (e.g. it knows that a dog is an animal), and that the W3C is standardising a Semantic Web representation of it.

Until recently, people have classified their stuff into a set of rigidly defined categories, so-called taxonomies. The advantages of having such a rigid set are that the meaning of each category is clearly defined, you avoid problems such as one person using "dog" while another uses "dogs", and you can organise your categories into hierarchies such as Wordnet, which makes it very usable for a number of applications.

The downside is that it is also very difficult to know the whole vocabulary well enough so that you can classify things in the right way. Thus, it has remained the domain of professional librarians mainly.

Enter "folksonomies", or "tags". Tags have become enormously popular, as they are easy to use. You just name a tag for yourself and use it. You can then find your pictures easily, and if people are using the same tag for the same thing, you can also use it to find similar things. But, and it is a big but, people often use the same tag for a lot of different things. You can perhaps find some related items, but it is often not very reliable, and since everyone would need to maintain their own tag hierarchy, you can't figure out from a picture tagged "dog" that it also depicts an animal.

Now, you can say explicitly what you mean by mapping your tag to any of the suggested Wordnet meanings. If you do, we get the advantages of both these approaches.

For the time being, we don't do a lot with it, and it is only mapped to Wordnet, but in the future, we hope to enable exactly the kind of searches described above: "give me all pictures of animals" will include pictures tagged "dog", "cat" etc.
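
As a rough sketch of how such a search could work once tags are mapped, the Python below walks up a small "is a kind of" hierarchy. The hierarchy and file names are invented for the example; a real implementation would take them from Wordnet rather than a hand-written table.

```python
# Hypothetical, hand-written fragment of a hierarchy: term -> more general term.
BROADER = {
    "dog": "canine",
    "canine": "animal",
    "cat": "feline",
    "feline": "animal",
    "car": "vehicle",
}

# Hypothetical pictures, each with the term its tag was mapped to.
pictures = {"pic1.jpg": "dog", "pic2.jpg": "cat", "pic3.jpg": "car"}


def is_a(term, ancestor):
    """True if term equals ancestor or lies below it in the hierarchy."""
    while term is not None:
        if term == ancestor:
            return True
        term = BROADER.get(term)  # None once we run out of broader terms
    return False


def pictures_of(ancestor):
    """All pictures whose mapped term is a kind of the given term."""
    return sorted(name for name, term in pictures.items() if is_a(term, ancestor))


print(pictures_of("animal"))
```

Nobody tagged anything "animal" here; the answer comes entirely from the vocabulary's hierarchy.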

There will also be many more vocabularies. For example, we intend to use tags to allow you to set any Content Label available. We can also let you map a tag to a person when, for example, your picture depicts that person.

With this, we intend to make tags the simple way to annotate things on the Semantic Web. Try it out! At the top of your page, you'll see "My Account", and there is a "Tags" page underneath it.

The SPARQL Engine is back up

There have been many urgent matters to attend to while upgrading the Opera Community, and therefore the SPARQL engine fell into neglect for a while, but that was only temporary! It is now back up, with 15 million triples. Be warned, however, that not all data may be there, even though it has grown substantially. Also, it is now too big to build the way it has been built in the past, so it needs to be rewritten, which will take some time to do. Thus, it remains an experimental service for now.

A new semweb widget...

People ask from time to time what you can do with widgets. Although you can make a "Hello world" widget (the instructions fit on a business card), it seems that this is a little simplistic for most people. Instead they make a clock widget.

Beautiful as some of these are (mine, for example :wink: ) they are not in themselves the thing the world needed most. But every so often something comes along that has real value. I think the first semweb widget was Jibberjim's Widgnaut, a widget version of foafnaut that crawled My.Opera's FOAF data. (Since we added the possibility to link your My.Opera FOAF to external files, it isn't a walled garden, just a starting point. Kjetil++! :smile: ).

Today we released a new widget: the tabulator widget, a widget version of Tim Berners-Lee's tabulator, which is a generic RDF browser. This means that there is now available, for your delectation, an RDF browser running in Opera. I'm waiting to see how it works when widgets are released on mobile browsers, but please, get Opera 9 (or 9.01) if you haven't already, download the widget, and give it a try.

Many thanks to David Håsäther who did the widgetising and debugging to make it work cross-platform, Tim and all the folks who have worked on the tabulator project, JibberJim for building the RDF parser in the first place (way back when) and Gorm for making that work in Opera too.

Sorry, I broke some URIs

Since the launch of the Opera Community in September 2005, all members of the community have had a URI identifying them. Having URIs for everything is important, and even though you can't be retrieved over the network, the URI is useful to identify you as a person.

Furthermore, we know that Cool URIs don't change, but I'm afraid I just changed them... :no: You see, the final, little part of the URI I gave you was very badly designed. It had a lot of unnecessary complexity, and with complexity come bugs. In fact, it turned out that some of the URIs weren't even valid, since I hadn't taken into account that some usernames begin with a number.

It wasn't with a very light heart that I changed the URIs, but as I discussed this with other members of the FOAF community, it became clear that it was better to do it now, and deal with any problems that may occur using a process known as smushing, than to keep dealing with the complexity that I had, especially as more people start using the URIs for real applications.

For most users, this change will not mean anything. The usual FOAF tools continue to work; it is only if someone has been indexing the URIs that it will have any impact. I don't know if anyone has been doing that, but I would be interested in hearing about it! :sherlock:

UPDATE: I was a bit uneasy about this, so I found a way to at least attempt to unbreak most of the URIs, by using the sameAs property from OWL on those valid fragment identifiers from the previous version. That should give us the best of both worlds! :cool:
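
A minimal Python sketch of the idea, under assumptions: the old and new URIs below are invented (the post does not show the real ones), and the smushing step is a simple union-find over owl:sameAs statements rather than what any particular FOAF tool does.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
FOAF = "http://xmlns.com/foaf/0.1/"

# Invented URIs standing in for a member's previous and current identifier.
OLD = "http://example.org/people/alice#old_id"
NEW = "http://example.org/people/alice#me"


def same_as_triple(old_uri, new_uri):
    """Serialise one owl:sameAs statement as an N-Triples line."""
    return "<%s> <%s> <%s> ." % (old_uri, OWL_SAME_AS, new_uri)


def smush(triples):
    """Merge nodes linked by owl:sameAs and rewrite all other triples.

    Simplification: literals and URIs are both plain strings here.
    """
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for s, p, o in triples:
        if p == OWL_SAME_AS:
            parent[find(s)] = find(o)  # the object becomes the canonical node
    return {(find(s), p, find(o)) for s, p, o in triples if p != OWL_SAME_AS}


data = [
    (OLD, OWL_SAME_AS, NEW),
    (OLD, FOAF + "name", "Alice"),  # statement someone indexed under the old URI
    (NEW, FOAF + "nick", "alice"),  # statement published under the new URI
]
merged = smush(data)
print(same_as_triple(OLD, NEW))
print(sorted(merged))
```

After smushing, both statements hang off the new URI, which is why a sameAs link can unbreak data indexed under the old one.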

More on the SPARQL query engine

Some time ago, we announced the SPARQL query engine. Since then, the amount of data that can be accessed through it has grown with the growth of the Opera Community. Also, more data has been added. As we were the first major site to publish such a query engine, we did it both so that the community could experiment with the data and so that we could gain experience on the server side.

The approach I chose was to rely heavily on already available libraries, mainly Redland, a great library written in C, but with bindings to many other languages. One of its features is that you can insert RDF statements into a model, which in turn can be stored in many different types of databases. Also, it can take SPARQL queries and return results. Thus, all that was needed to get this running was to take all the data out of the Opera Community databases, create RDF statements from the data and insert them into the Redland RDF model, and create a system between the web server and Redland's SPARQL query interface.
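
The "create RDF statements from the data" step might look roughly like the Python sketch below. The row layout, the example.org URIs and the helper names are hypothetical, and the sketch stops at producing N-Triples text; loading it into a Redland model is left out rather than guessing at the binding calls.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def user_to_triples(row, base="http://example.org/people/"):
    """Turn one hypothetical database row into (subject, predicate, object) triples."""
    subject = base + row["username"] + "#me"
    triples = [
        (subject, RDF_TYPE, FOAF + "Person"),
        (subject, FOAF + "nick", row["username"]),
    ]
    for friend in row.get("friends", []):
        triples.append((subject, FOAF + "knows", base + friend + "#me"))
    return triples


def to_ntriples(triples):
    """Serialise triples as N-Triples lines.

    Simplification: any object starting with "http://" is treated as a URI,
    everything else as a plain literal.
    """
    lines = []
    for s, p, o in triples:
        if o.startswith("http://"):
            obj = "<%s>" % o
        else:
            obj = '"%s"' % o.replace("\\", "\\\\").replace('"', '\\"')
        lines.append("<%s> <%s> %s ." % (s, p, obj))
    return "\n".join(lines)


rows = [{"username": "alice", "friends": ["bob"]}, {"username": "bob"}]
statements = [t for row in rows for t in user_to_triples(row)]
print(to_ntriples(statements))
```

Doing this for every row in every table, on every rebuild, is where the hours go.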

Given that we are ploughing new ground, it shouldn't come as a surprise that some problems have surfaced, and I've worked on them occasionally. For one thing, it takes about 5 hours and a lot of resources to build the model this way. Therefore, I have only been able to renew the model once a week, much less in real time. Also, not all the data are available for query while rebuilding occurs.

I have now addressed the latter problem, but the 5-hour rebuild persists. I have tried many approaches to resolve that. They haven't yet led me to a solution, but they have provided much more insight into the cause of the problem.

As many have pointed out, the main hurdle in using the SPARQL query engine has been to find out what kind of data is in there; it has been a trial and error thing up to now. With the recent issues ironed out, and with a new version of Redland, I expect to provide an approach to that problem soon.

Welcome to the Semantic Web blog at Opera

This is the opening post of the Semantic Web blog here at Opera. The Semantic Web tries to extend the Web by developing new frameworks for sharing data between applications, and there are a number of people at Opera interested in this development. In this blog, we'll share some of the ideas and keep you posted about the development we do.