menneske.org - work hard. play hard.
Business dev | Social tech | Open Data | UX IxD IA | Collaboration | Innovation | Chocolate | Entrepreneurship | Information Security

Tagging

Since my company employs a large share of knowledge workers, it regularly holds internal conference days where in-house specialists give talks on their areas of expertise. The previous one was held in the middle of December, and several of the day's talks were about using tags for organizing information. While this probably isn't a particularly cutting-edge topic any more, one of the talks, by Filip Van Laenen, stood out: it argued that one should leave hierarchical code repositories behind and instead use various forms of tagging to organize source-code files in a so-called tagarchy. While this is both a novel and quite interesting topic in itself, what really caught my attention was a mention of how a combination of distinct 'hard' and 'soft' tags can be used to good effect in logically organizing files of program code. In the example, a set of 'hard' tags would describe generally unchanging technical aspects of the code in the file, such as the pattern types used or the services provided. A separate set of 'soft' tags would then describe how the code is used, for example whether it is needed by or contains login functionality, or whether it supports one or more particular areas of the business logic.
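To make the hard/soft distinction a bit more concrete, here is a minimal sketch in Python of how files in such a tagarchy might be described and looked up. It is only my own illustration, not anything shown in the talk: the file names, the tag values, and the `SourceFile` and `find_files` names are all made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class SourceFile:
    """A source file described by tags instead of a folder path (hypothetical model)."""
    path: str
    hard_tags: set[str] = field(default_factory=set)  # stable technical aspects
    soft_tags: set[str] = field(default_factory=set)  # usage and business areas


def find_files(files, hard=(), soft=()):
    """Return the paths of files that carry every requested hard and soft tag."""
    hard, soft = set(hard), set(soft)
    return [f.path for f in files if hard <= f.hard_tags and soft <= f.soft_tags]


# Made-up example data; none of these files or tags come from the talk.
files = [
    SourceFile("SessionFactory.java", {"factory", "service"}, {"login"}),
    SourceFile("InvoiceService.java", {"service"}, {"billing"}),
    SourceFile("LoginForm.java", {"view"}, {"login"}),
]

# All services that take part in login functionality.
login_services = find_files(files, hard={"service"}, soft={"login"})
```

The point of keeping the two sets apart is that a query can combine a technical question ("which files provide a service?") with a usage question ("which files are involved in login?") without the two kinds of tags being confused with each other.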

The presentation rapidly convinced me of the potential usefulness of distinguishing between 'hard' and 'soft' tags for semi-structured data like program code, but I sensed that the concept could be put to even better use elsewhere. A rather obvious application would be to improve the currently popular approach of single-level folksonomy or social tagging, as used on YouTube, Flickr and Del.icio.us amongst others. By separating the tags used to describe items on such services into multiple logical groups, one immediately gets an extra level of semantics for searching or filtering the otherwise unstructured data. This should make the tagging systems of such services a lot more powerful and useful than they currently are, especially by providing better findability for items and more descriptive search results.

It is however apparent that a clear limitation to the potential of tag-typing hinges on which selection strategy is used to decide which logical tag-groups to include. A first impulse could be to continue with the successful crowdsourcing used in the original folksonomy tagging, and simply let the users themselves define the tag-groups. While tempting, I believe that this would neither alleviate the current trend of non-semantic tags nor provide any particular advantages, so in this case going towards the other extreme of semantic taxonomies appears to be more suitable. But while semantic taxonomies are generally considered very advantageous over folksonomy tagging, a major downside is that they are often overly complex and thus can be very demanding to work with, especially for amateurs. To alleviate this I instead propose using a professionally selected, limited set of tag-types, and combining these with folksonomy tagging within each type. This way one can get the best of both worlds by obtaining a modicum of semantic meaning from the tag-types, while at the same time keeping the freedom of independent crowdsourced tagging as we already know it.

As for which tag-types to expect, I would suggest that images, for instance, should have separate tag-types to describe their actual contents, their context, any persons depicted, and perhaps their intended usage and any special techniques used to create them. With the addition of such tag-types, I would expect the accuracy of an advanced search on Flickr or iStockPhoto to improve greatly.
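As a rough illustration of the idea, the following Python sketch combines a fixed, limited set of tag-types with free-form tags inside each type. The five tag-types are the ones suggested above for images; everything else (the `TypedTagIndex` class, its methods, the image ids and the tags) is hypothetical and only meant to show the principle, not how any existing service works.

```python
from collections import defaultdict

# The limited, professionally selected set of tag-types (here: those suggested for images).
TAG_TYPES = ("content", "context", "person", "usage", "technique")


class TypedTagIndex:
    """Free-form folksonomy tags, grouped under a fixed set of tag-types."""

    def __init__(self):
        # Maps (tag_type, normalized tag) to the set of item ids carrying it.
        self._index = defaultdict(set)

    def tag(self, item_id, tag_type, tag):
        """Attach a free-form tag to an item under one of the allowed tag-types."""
        if tag_type not in TAG_TYPES:
            raise ValueError(f"unknown tag-type: {tag_type!r}")
        self._index[(tag_type, tag.strip().lower())].add(item_id)

    def search(self, **criteria):
        """Return the items matching every tag_type=tag pair that is given."""
        results = None
        for tag_type, tag in criteria.items():
            matches = self._index.get((tag_type, tag.strip().lower()), set())
            results = set(matches) if results is None else results & matches
        return results or set()


# Made-up example: the same word means different things under different tag-types.
index = TypedTagIndex()
index.tag("img1", "person", "Paris")
index.tag("img1", "technique", "Macro")
index.tag("img2", "context", "Paris")
index.tag("img2", "content", "tower")

portraits = index.search(person="paris")
city_shots = index.search(context="paris")
```

With a single flat tag list, both images would simply be tagged "paris" and a search could not tell them apart. With tag-types, `portraits` only contains the image where Paris is a person, and `city_shots` only the one where Paris is the location, while users are still free to pick whatever words they like within each type.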

The big open question then is whether this is an actually feasible technique, or whether there are reasons why it wouldn't work as I have proposed here. Please enlighten me if you have any thoughts or experiences about this, as I feel that a system such as this could be a suitable next step towards a more semantic web.




Comments (2)

I tend to think that taxonomies, in the strict, hierarchical sense, are evil at all levels; they are not flexible enough as a knowledge organisation system. You need much more flexible organisation. Secondly, I don't think taggers will bother to learn where their tags belong in a greater taxonomy.

I think people should tag as freely as possible but then, to make it useful outside of its original context, one should link each individual tag to an externally maintained ontology.

I implemented something along these lines at my.opera.com, and described it in the semweb blog there: http://my.opera.com/semweb/blog/

Then MOAT came along: http://moat-project.org/
So there has been quite some work in this area.

Taxonomies might well be evil, but that doesn't mean they can't be useful at the same time, and with limited use they also won't limit flexibility much. Secondly, you point out that it will be difficult to get taggers to use the correct tag-types, but I consider this to be a user interface issue rather than a problem with the concept itself. Given that it is possible to design a user interface that ensures mostly correct tagging, I feel that such a combined solution will be superior to the current free-form tagging systems.

MOAT certainly sounds very intriguing, and it too appears to have great potential, but from a quick glance it may suffer even more from the problem of users not bothering to provide correct links for their tags than the system I just suggested would. Again, this is mainly a user interface problem, but one that appears considerably harder to solve in my eyes. If this problem is solved, however, I must absolutely agree that MOAT is a quite superior solution. Thanks for the input :-)
