Tags, Wiki and Folksonomy
- Following JW's suggestion, I'm renaming the page to avoid confusion between tagging and wiki markup. I decided to keep the word folksonomy in the title, though, since what I intend to discuss here is the specific issue of integrating wikis and folksonomy services, not the more general problem of Wiki metadata (I agree that a WikkaMetadata page might become an interesting hub, though, with links to this page, to WikiPing and to other metadata-related discussions) -- DarTar
What is tagging
I'd like to share with you some thoughts on the possibility of integrating a tagging system into WikkaWiki. Tagging is a simple way of labelling nodes and building a basic category system.
Tagging (aka Folksonomy or Social bookmarking) is becoming one of the most pervasive practices in the field of social software. Tags allow users to categorize content: categories emerge from individual users' labelling of URLs. Tags also offer a smart (distributed) ranking system: URLs that receive the same tag more often are likely to emerge as the most relevant and authoritative sources for the topic associated with the tag (an idea similar to the PageRank (PR) strategy adopted by Google). Web services building categories out of users' tags include del.icio.us (the first service to introduce social tagging), Technorati, Simpy, Jots, Flickr (tagging applied to photographs) and CiteULike (tagging applied to scientific literature).
An example of folksonomy web-service
Look at this example of technorati's page for the tag: wiki or just choose one of these tags.
Why tagging
The interest of tagging is twofold.
- On the one hand, tags can be used internally as an alternative to categories: they help organize content by topic in a simple and intuitive way.
- On the other hand, tags can be used to post labelled content to external web services.
The second aspect deserves some attention. It is quite common for bloggers to tag their posts, so that web services like technorati can categorize them by topic. It might be interesting to add such a feature to wikis as well.
Tagging: From Blogs to Wikis
Addressing JavaWoman's concern (see below) about why a Wiki should need a tagging system, I see two main reasons:
First, Wikis -- at least those dedicated to the general public -- are often structured as collections of nodes dealing with single topics; Wikipedia is probably the best-known example. A WikiPage is often a thematic page that is linked from a hub or category page. Given this one-topic/one-page nature of most Wikis, it seems natural to consider tagging as an interesting way of describing wiki content. Internally, hubs can be built automatically by grouping pages that use the same tag. But, most importantly, tagged pages can be published on external web services (see below). JavaWoman argues that such services are tailored to blogs, not to wikis. This is only partially true: blikis - see below - can already be used to broadcast content to such web services. Moreover, the fact that wiki-centered services for aggregating content (like recentchanges) are not common is not a good argument against inventing them :)
Second, Wiki engines can actually be used to power Blogs and the two technologies are likely to merge in the future. Here's a nice quote on the hybrid nature of Wikis with respect to blogs:
The more I think about it, the clearer it is to me that Blogs and Wikis are really instances of the same meta-level idea. They should be unified into a single system. Blogs organize information temporally along a single thread. Wikis organize information spatially around a set of nodes representing ideas. Blogs have no concept of space. Wikis have no concept of time. What we really need is a single framework that enables information to be organized freely in space and time. You can create Nodes that represent ideas and link them to one another just like you do in a Wiki. You can post articles to any Node (or set of Nodes), just like you do in a Blog and they appear sequentially by time. When writing any article you can enter Wiki commands to quickly link to, or create new, Nodes. This is the best of both worlds. You can then filter it by Node name, Time, or both. (From: Integrating Blogs and Wikis -- A Higher Unifying Framework)
There are already many wiki engines that offer blog functionality (Blikis), thus combining the advantages of the two kinds of tool.
A nice example is Rui Carmo's The Tao of Mac, a seamless integration of a wiki engine with blog functionality.
More information about Blikis is available on Wikipedia.
How to broadcast tagged content
Publishing tags requires just a simple modification of the RSS generation script, to add the following lines:
<category>[tagname]</category>
<dc:subject>[tagname]</dc:subject>
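The two elements above can be emitted from a feed generator roughly as follows. This is a minimal Python sketch, not Wikka's actual RSS script (which is PHP): the function name `render_tagged_item` and the example.org URL are hypothetical, and it assumes an RSS 2.0 feed, where `<category>` is a native element and `<dc:subject>` has to be declared through the Dublin Core namespace. Whether a given aggregator reads either element is not something the sketch can confirm.

```python
import xml.etree.ElementTree as ET

# Dublin Core element namespace; needed so <dc:subject> is a declared prefix.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def render_tagged_item(title, link, tags):
    """Return one RSS <item> as a string, with each tag written twice:
    once as <category> and once as <dc:subject>."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "link").text = link
    for tag in tags:
        ET.SubElement(item, "category").text = tag
        ET.SubElement(item, "{%s}subject" % DC_NS).text = tag
    return ET.tostring(item, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical page and tags, for illustration only.
    print(render_tagged_item(
        "TroublesInRome",
        "http://example.org/wiki/TroublesInRome",
        ["rome", "vatican"],
    ))
```

In a full feed the `xmlns:dc` declaration would normally sit on the root element rather than on each item; the sketch serializes a single item only to keep the example short.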
Another way to tag pages consists in adding a rel="tag" attribute to a link. For instance:
<a href="http://apple.com/ipod" rel="tag">iPod</a>
<a href="http://en.wikipedia.org/wiki/Gravity" rel="tag">Gravity</a>
<a href="http://flickr.com/photos/tags/chihuahua" rel="tag">Chihuahua</a>
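To see what a service would find when it fetches a page marked up this way, here is a small Python sketch that collects the rel="tag" links from an HTML fragment. The class and function names are made up for illustration. It assumes the rel-tag convention in which the tag name is taken from the last path segment of the href rather than from the link text, so both are returned and the caller can pick.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse, unquote


class RelTagCollector(HTMLParser):
    """Collects (tag, label, href) for every <a rel="tag" href="..."> link."""

    def __init__(self):
        super().__init__()
        self.found = []
        self._href = None   # href of the tag link currently open, if any
        self._text = []     # text pieces seen inside that link

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        rel_values = (attrs.get("rel") or "").lower().split()
        if "tag" in rel_values and attrs.get("href"):
            self._href = attrs["href"]
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            path = urlparse(self._href).path.rstrip("/")
            name = unquote(path.rsplit("/", 1)[-1])
            label = "".join(self._text).strip()
            self.found.append((name, label, self._href))
            self._href = None


def extract_rel_tags(html):
    collector = RelTagCollector()
    collector.feed(html)
    collector.close()
    return collector.found
```

Feeding it the three example links yields the names `ipod`, `Gravity` and `chihuahua`, each paired with its link text and URL.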
Once links or RSS contain tag information, you can easily publish them by pinging (automatically or manually) web services like technorati. The tagged URL will then show up in the corresponding tag page.
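A ping of this kind can be sketched in a few lines of Python. The sketch assumes the service accepts the common `weblogUpdates.ping` XML-RPC call (site name plus URL); the endpoint is a placeholder, not a real service address, and the helper names are hypothetical. Note that the ping only announces that a page is new or has changed: the tags themselves stay in the page or feed, which the service fetches afterwards.

```python
import xmlrpc.client


def build_ping_payload(site_name, page_url):
    """Return the XML-RPC request body for a weblogUpdates.ping call.
    Useful for inspecting what would be sent without touching the network."""
    return xmlrpc.client.dumps((site_name, page_url),
                               methodname="weblogUpdates.ping")


def send_ping(endpoint, site_name, page_url):
    """Send the ping to `endpoint` and return the service's reply.
    The endpoint URL must be supplied by the caller; none is assumed here."""
    proxy = xmlrpc.client.ServerProxy(endpoint)
    return proxy.weblogUpdates.ping(site_name, page_url)


if __name__ == "__main__":
    # Hypothetical wiki name and page URL; only the payload is printed.
    print(build_ping_payload("My Wiki", "http://example.org/wiki/TroublesInRome"))
```

Pinging a second service, such as a recent-changes aggregator, would be a separate call with whatever payload that service expects.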
- These examples seem to come from a Technorati page; this page specifically explains how to publish "tags" to Technorati. It also explains that Technorati is a service for blogs really, and where it gets its content from (publishing "tags" about your own content is just one way). More importantly, the RSS example is specifically for RSS/ATOM (one of the several RSS standards) - you cannot just add such tags to any RSS file: if the RSS standard you're using doesn't support the <category> and/or <dc:subject> tags you'll end up with an invalid RSS feed that will just not load any longer in many feed readers. I'm not sure what pinging other services than Technorati would do - are there others that actually support these (RSS/ATOM) tags or the rel="tag" attribute in links? --JavaWoman
- Hey, you're raising three different issues: 1) is the tagging system useful only for blogs (my opinion is it isn't: see my comments above); 2) does the data format used for tagging produce invalid RSS; 3) is the data format useful only for technorati or also for other web services. Concerning the two last issues, as you know, I'm definitely against adopting a data format in Wikka that goes against web standards or adopting a feature that results in invalid RSS. Given the success of such tagging systems it seems at least very implausible that they haven't considered this issue (Technorati tags are supported by the following blog services: b2, Blogger, Ecto, MarsEdit, Movable Type, Nucleus, Radio UserLand, TypePad, WordPress -- see Technorati's Standard Support and Ping Configuration pages). Moreover most blog services offer the possibility to ping technorati, del.icio.us as well as many other folksonomy services: I would be very surprised to see different standards (or invalid standards) used to broadcast information from each of these blog services to folksonomy services. -- DarTar
Integrating tags in a wiki
Forthcoming...
- I'm still puzzled about the purpose of this. What do you gain that you don't already have by having a search engine indexing your wiki?
BTW, of the examples of "tagging" services mentioned above, Technorati seems to be the only one that actually supports categorizing your own content; the others are more "collaborative" efforts (people categorizing others' content) - although it's not even really collaboration but rather a compilation of personal categorizations of bookmarks. I'm very doubtful about the value of these efforts though, especially since there are no agreed-upon vocabularies. There are already established standards for adding meta data to content though, such as the Resource Description Framework (RDF) and Dublin Core (RDF also supports describing your own as well as others' content). The W3C's Semantic Web Activity (www.w3.org/2001/sw/) is based on such standards, as are various related activities. --JavaWoman
- Digging a little further I stumbled over this page which nicely describes my misgivings about the totally informal assigning of keywords on "tagging" services and merging "tagging" from different people made for different purposes. What amazes me is that this is a (blog) post from just a month ago and the comments read as if the discovery of these kinds of problems is something new. In fact this is very old news - it seems the people discussing "folksonomy" (or "fauxonomy" which I consider more descriptive :)) have never even heard of RDF or ontologies or all the work that's already been done in this area for years, and are discovering it all over again. Sigh.
Actually, maybe that's the problem. Maybe the metadata/RDF/ontology community hasn't been doing enough 'outreach' promotion of their work and ideas. I really 'discovered' RDF a little over 4 years ago (though I was aware of it before) but I didn't need any outreach for that: I just happened to be right on the spot at a conference where yet another application of RDF was being discussed and I was interested in the application. There are actually many applications of RDF already (and of DC, which also can be used as a part of RDF); one example is the RSS 1.0 standard. It's all about meta data - and long before there was DC, or RDF, or even the web there was meta data and there were ontologies and methods for creating them. It makes me a little sad to see a whole web community seems to be in the process of discovering it all over again, without taking advantage of all the work that's been done before. A lot sad, actually. --JavaWoman
- JavaWoman, as a senior editor for the Open Directory Project, I think I'm quite familiar with ontologies :) I see two different issues here at stake: the first one is about the type of content organization offered by traditional ontologies vs. folksonomies; the second is the standards issue.
I think it is important to keep these issues separate. The first issue addresses your question: What do you gain that you don't already have by having a search engine indexing your wiki?. As you know, there is a huge difference (a cognitive one ;)) between search engines and ontologies. This is not really the point. The real question is the kind of content classification that can be offered by emerging ontologies as opposed to traditional ontologies. There is an evident difference between human-edited taxonomies (like the ODP) that are facing major scalability problems (basically not enough editors to deal with the increasing number of submissions) and self-organizing, sloppy, practical, de-facto ontologies. Folksonomies are not going to replace human-edited directories in the short run, but they are likely to become (as soon as they improve their technology) their most important competitor in the long run (see for instance: Beyond the Folksonomy vs. Ontology Distinction). The second issue concerns the standard used to describe metadata. The tagging system implemented by folksonomy services is ridiculous compared to what full-fledged RDF can do, but maybe this is the reason for its success. Most folksonomy systems have adopted a simple format (a very limited subset of page metadata) that allows a sort of quick-and-dirty labelling of a URL: it certainly lacks the descriptiveness and rigour prescribed by ontologists for metadata standards, but it works in practice. A quote on this topic found at Many-to-many:
This is something the "well-designed metadata" crowd has never understood — just because it's better to have well-designed metadata along one axis does not mean that it is better along all axes, and the axis of cost, in particular, will trump any other advantage as it grows larger. And the cost of tagging large systems rigorously is crippling, so fantasies of using controlled metadata in environments like Flickr are really fantasies of users suddenly deciding to become disciples of information architecture.
- In conclusion, to me the question is to understand whether basic metadata description for wiki pages can be done using (valid) standards that are compatible with folksonomy services and whether we think such functionality is interesting for wiki users or not. -- DarTar
- My two cents: I think it is a good idea. Some things which are standard in the blogging community could be useful for wikis, too (how about trackback for example?). But I don't see tags as a replacement for categories. I rather see it the other way round: use categories to create the tags (automatically). The problems I see: a) Wikka isn't able to ping (or is WikiPing compatible?) b) as JW said, what should be pinged c) there should be an option to turn it off. DarTar, perhaps you should write Technorati an email? They encourage developers to do so and ask questions about how it works. Or see http://www.technorati.com/developers/ --NilsLindenberg
- Nils, thanks for your reply. a) Technorati accepts XML-RPC pings, isn't Wikiping based on the same standard? b) that's the interesting point. IMO Blikis (wikis used as blogs) can just ping ordinary tag aggregators, Wikis might need a dedicated kind of service (but I'm ready to bet it won't take a long time till we see the first one); c) sure, much as WikiPing can be disabled by admins.
Concerning trackback functionality, Pierre Gorissen has modified Wikka to support trackbacks (the page is in Dutch) -- DarTar
- a) I don't know, I have tried to find out but the Technorati page isn't that helpful. But that leads to another question: what would happen if you wikiping with tags? (Trackbacks) Hadn't seen that page but it looks promising. I'll have to read through it. --NilsLindenberg
- Nils, suppose you went to Rome last week and you wrote a page in your Wiki called TroublesInRome. You label this page (using some dedicated markup) with the following tags: "pope, vatican, rome, italy, catholic church". Assuming your wiki is able to ping a folksonomy service, a link to your TroublesInRome page will show up in the relevant tag pages of these services, for instance: http://technorati.com/tag/Vatican, http://del.icio.us/tag/vatican, http://www.flickr.com/photos/tags/vatican/ etc. -- DarTar
- Sorry for the misleading wording: my question was what would happen to WikiPing (for example recentchanges.net). Could it handle being pinged with additional information? Would "pope, vatican, rome, italy, catholic church" simply be unused or would it cause recentchanges.net to fail? --NilsLindenberg
- A ping to Technorati is simplicity itself: all it is is a message to Technorati that "here's a new page" or "this page has changed"; Technorati then goes to fetch the page and find the "tag" links or find the RSS feed with the categories in it. The ping itself does not contain "tags" and has a different content than a ping to RecentChanges.net: if you want to ping both, you have to build two different pings and send them each to their respective server. --JavaWoman
- Well... my basic questions remain:
- What do you gain by tagging that search engines don't already provide? This is an important one. If you talk about "emerging ontologies" you are (also) talking about "emerging metadata". And while (most) search engines have stopped collecting them as coded from the documents they index, the index they build is itself a set of meta data. I agree that "there is a huge difference (a cognitive one ;)) between search engines and ontologies" - as long as we are talking about formal ontologies; but when we are talking about emerging ontologies the difference isn't all that big. Just why, to what end, would you use a "tagging" service to find something and would it really work better than using a search engine to find that information? It's easy to say that tagging "works" or is "practical". It may be so in creating informal ontologies - but is the result any more practical than a search engine? And I just found I'm not alone having these misgivings.
- Just which services are you going to ping? The only one that I've found that is "pingable" (i.e. will listen to what you, yourself, state are the tags that apply to your own content, as opposed to what others say about it) is Technorati - which does not only use this information but also gathers it from other services where people add "tags" to others' content, and which is exclusively for blogs. Even if pinging Technorati from a Wiki would work, the question remains: what other services are there that can be pinged to publicize "tags" about your own content? If they exist, I haven't found them. If Technorati is the only one, your own tags may well get drowned in other people's tags because self-publicized tags aren't the only source that's used. (Back to question 1: how much better will the result be than what a normal search engine is already doing? After all, it's actually pretty hard to write about something without actually using the word for that something - so a search engine will pick up on it anyway.)
- My third issue (not a question) is with the prediction that "Wiki engines can actually be used to power Blogs and the two technologies are likely to merge in the future". While it's true that you can build blogs with wiki engines (pretty primitive blogs though), I certainly don't agree that the two "technologies" are likely to merge any time in the future. If anything, I think it's more likely they will each become more specialized for their respective purposes. Blog engines keep getting more and more features that wikis don't have, or need, and vice versa. CMS systems that support both wikis and blogs normally have two different modules for that (and they are generally more primitive than real wiki engines and real blog engines). The mere fact that it's possible to combine the two does not predict that they will be combined in the future. It's also not true that "Blogs have no concept of space" - there are blogging systems that have very sophisticated category systems, where time data becomes little more than annotation, a historical coincidence.
- Then: "does the data format used for tagging produce invalid RSS"? That's not a question - if the RSS standard used does not support the <category> or <dc:subject> tags, then it will produce invalid RSS. The Technorati spec specifically refers to RSS/ATOM. For a nice horror story about RSS standards, read The myth of RSS compatibility (dive into mark). And don't forget we're actually using two different RSS standards - and poorly at that - the page revision feeds are in a different format than the RecentChanges feed.
- Finally, no I didn't raise the issue whether "the tagging system useful only for blogs" - I merely pointed out that Technorati is only for blogs; the real issue I'm raising is whether a tagging system is useful at all. If it is, it's automatically also useful for blogs. If it isn't, it's automatically not useful for wikis either. Or vice versa. Personally, I don't see any use at all. I'm not suddenly all excited and going to use any of the tagging services to do my searches (I actually tried a few and the results were abominable). I don't see it's "practical" at all - search engines work a whole lot better in finding information: precisely because they index content rather than meta information (apart from specialized engines that are actually built to work with predefined ontologies or predefined structures like DC). Are tagging services "successful"? If you look at the numbers of people that are tagging they probably are. But how successful are people in finding information through them? Finding information is what it's all about. --JavaWoman
- In my experience the value of tagging for actually finding stuff greatly depends on how much information there is to search through and how many users are involved. It's a basic rule that the quality of the social web improves with the number of people using it. And there is always a critical mass of a) information and b) users that you need before it works in the first place. So the question is not really whether a certain platform (wiki or blog) can be improved by using folksonomy, but whether the amount of information offered by a service could be indexed more efficiently if the users do it; and for that you need plenty of users. Furthermore I think it's important that every user can provide personal tags for any item (not that one user can only tag his own content) because usually the power of the many helps eliminate false tags. As for the differences between blogs and wikis, I see some similarities but I don't think they could easily be combined. I would say wikis are more meta-services because you can do a lot more than you can do with blogs when it comes to collaboration. And while blogs are likely to become the successor of the personal homepage (or they already are) I don't think there is anything that is replaced by wikis. --YodaHome
- I got asked about tags in the wiki-part of the "patchwork portal" I created (see my TonZijlstra profile for the link), and like JavaWoman I didn't see the added value of tags in the wiki. But when I start thinking about how tags work for me in blogs and bookmarks I do see that added value. First: Tags in a page might coincide with the content of that page, but that needn't be. It is about the words people would use to describe the content for themselves. These might be different words than the content contains. Then tags augment the free search format. Second: I use tags not so much to track blog content, but to track which people use which words. In delicious and Technorati there is always a link between a tag and the person that used it (and which other tags that person used). So I can look at patterns like "which groupings of words are used to describe this item", "what other tags does person x that tagged this use". I increasingly use tags to scout out communities and individuals that are interested in the same type of thing I am, but use different words to describe that interest. And that is the starting point for creating a shared language with those communities and individuals. So to me the link tag-information item is not interesting in itself, but is interesting as part of the triangle of relationships of meaning "item-tag-person".
- So if we were to integrate tags into the wiki, I'd think of letting people add tags, and keep track of who added which tag and show it in the page. Also I would not use tags as categories or vice versa. Categories are often used to group things by type (projects, my pages), whereas tags are used more often to describe content and often grouping is not the reason for tagging. --TonZijlstra
- I recently integrated tagging into Wikka for a client. I made use of Gordon Luke's FreeTag to do this, which makes it quite simple. My implementation has a set of tags for each page; it does not include individual tag sets for each user, although it could with a few additions. More details about this implementation and downloads can be found at getFlossed. Sorry, I do not have a running demo. If anyone implements this publicly, let me know so I can post a link from my site.
- One more vote for tags here...
- While I can agree with JavaWoman's opinion on the relative uselessness of widespread tags in general, given the service provided by search engines, I do not second this view in the context of wikis, personal and small group information systems. But it is the very principle of debating the "usefulness / uselessness" for everyone that I find questionable: many people find their own way to use tags, and trying to evaluate a priori what can be done with them and whether this is useful or not is a restricting approach IMHO. Human creativity has often defeated predictions and real usages often emerge as unanticipated usages. The short message service (SMS) is a nice example: the designers of the GSM standard debated its value and usefulness, and the service was even about to be dropped from the standard, but there were a few spare bytes available in some payload part of the protocol so it was finally adopted, with very little hope for its value. In the end, it became one of the most popular applications of the mobile phone, despite the horrible ergonomics! Coming back to tagging and wikis, why do I find them nice?
- - they are a very simple and flexible way of grouping, tracking and moving information fragments, in this case wiki pages.
- - and grouping, tracking and moving are user-defined.
- Wikis are very dynamic for the content of pages, but the spatial structure is much more static: as page names are used for references, changing names is rarely done, so wiki structure essentially evolves as a "growing graph", with very little node mobility in the graph.
- Tags offer this opportunity by introducing an overlaid namespace which is flexible. Tags can be used for ontology building, but they are not limited to categorization (a rather static usage): they can be used to group pages by any aspect, static or dynamic. For example, I like to use a wiki for managing tasks. In this usage, assigning and updating "status" information (like "todo", "completed", "urgent") is very useful, as well as supporting quick access to pages matching a given status (like "urgent"). This is just an example of using tags which is not classical categorization.
- One may object that this is bad practice, a hackish way or whatever; it works very well for me, and I don't see any objective reason not to use them the way it helps me, and certainly not in order to comply with some dogmatic principles ;-). We should also remember how the hypertext experts community criticized the web model when it was introduced, how messy it was considered with only one link type to support all referencing needs. The freedom, simplicity and flexibility offered by its linking model is one of the reasons that let the web emerge from other hypertext systems (like gopher, which had a semantically richer linking model).
- Many people like Wikis for the same reason: simplicity and flexibility, resulting in a powerful tool with many possible usages. I think that tags fit well with this principle. In Wikka, the category system seems oriented to support the traditional ontology usage more than "tagging" (Am I wrong?)
- Regarding JW's comment on relying on search engine to support informal tagging, I have two objections:
- - people like to use common words for tags, and as such, relying on search instead of explicit tagging would require the user to use unambiguous tag names (I mean, names that would not collide with the text appearing in the content of the page), such as prefixing with Tag (TagToDo, TagCompleted).
- - As tags can be used for assigning status and tracking, the tagging system has to ensure that listing the pages belonging to a tag is correct (like in "list all pages with tag 'Urgent'"). While a database search would return correct results, an external search engine would not be able to track tag/page associations in a coherent manner.
- --HackArt
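The last two contributions (keeping track of who added which tag, and getting exact results for "list all pages with tag 'Urgent'") can be pictured with a small database sketch. This is only an illustration in Python with an in-memory SQLite database; the table layout, function names, page names and user names are all hypothetical and are not taken from Wikka or from the FreeTag-based implementation mentioned above.

```python
import sqlite3


def open_store():
    """Create an in-memory store of (page, tag, user) associations."""
    db = sqlite3.connect(":memory:")
    db.execute(
        """CREATE TABLE page_tags (
               page TEXT NOT NULL,
               tag  TEXT NOT NULL,
               user TEXT NOT NULL,
               PRIMARY KEY (page, tag, user)
           )"""
    )
    return db


def add_tag(db, page, tag, user):
    # Tags are stored lower-cased so 'Urgent' and 'urgent' are the same tag.
    db.execute("INSERT OR IGNORE INTO page_tags VALUES (?, ?, ?)",
               (page, tag.lower(), user))


def remove_tag(db, page, tag, user):
    db.execute("DELETE FROM page_tags WHERE page = ? AND tag = ? AND user = ?",
               (page, tag.lower(), user))


def pages_with_tag(db, tag):
    """Exact listing of pages carrying a tag, independent of page text."""
    rows = db.execute(
        "SELECT DISTINCT page FROM page_tags WHERE tag = ? ORDER BY page",
        (tag.lower(),))
    return [row[0] for row in rows]


def who_tagged(db, page):
    """The item-tag-person view: every (tag, user) pair for one page."""
    rows = db.execute(
        "SELECT tag, user FROM page_tags WHERE page = ? ORDER BY tag, user",
        (page,))
    return rows.fetchall()
```

With this layout a status change is just a `remove_tag` followed by an `add_tag`, and the listing for a tag never depends on whether the word happens to appear in the page body.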
More on the Folksonomy-Wiki connection
Flat categories vs. taxonomies vs. faceted systems
More on tags (pro / contra / neutral)
- Folksonomies? How about Metadata Ecologies? - Louis Rosenfeld
Wikis and metadata
- Rhizome is a Wiki-like content management and delivery system that exposes the entire site -- content, structure, and metadata as editable RDF.
- Using Wikka as a BookmarkManager tagging framework.
CategoryDevelopmentDiscussion CategoryDevelopmentSyndication