Category System Overhaul


The category system used in Wikka needs improvement, and I'm looking for ideas and comments. -- JsnX


Current system:
User adds a wikiword category tag to each page, such as CategoryDevelopmentArchitecture.
For more information, see WikiCategory.

Issues with current system:


Proposed system:
User selects a category from a dropdown box during page editing.

Issues with this proposal:
Ideas?

Multiple categories per page
You say: this will only allow one category per page
That is not necessarily so. If we want to allow more than one category per page, we could store sets of values (I recently had the very same problem with a user management system in which users can belong to different categories):
  • Make the category field in wikka_pages accept comma separated values (see MySQL's SET type);
  • On page editing/creation, instead of a dropdown box, display a list of checkboxes.
The wikka_category table might then contain two separate fields for each category:
  1. a system-generated unique-key (TINYINT), invisible to the user and handled by the system for cross-referencing the wikka_pages table (whose category field will then contain a simple comma separated list of numerical keys, like: "1, 5, 7")
  2. a human-readable category label (VARCHAR), like "Documentation", or "Development".
Keeping the key and the category label separate might also help cope with i18n issues.
-- DarTar
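A minimal sketch of the comma-separated-keys idea above, for illustration only. It uses Python with an in-memory SQLite database rather than MySQL (so no SET type is involved), and the column names `tag`, `id` and `label`, as well as the sample rows, are assumptions made up for the example, not the actual Wikka schema.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; table layout is illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE wikka_category (
    id    INTEGER PRIMARY KEY,  -- system-generated key, hidden from users
    label TEXT NOT NULL         -- human-readable label, could be translated
);
CREATE TABLE wikka_pages (
    tag      TEXT PRIMARY KEY,
    category TEXT NOT NULL DEFAULT ''  -- comma separated keys, e.g. '1, 5, 7'
);
INSERT INTO wikka_category (id, label) VALUES
    (1, 'Documentation'), (5, 'Development'), (7, 'Architecture');
INSERT INTO wikka_pages (tag, category)
    VALUES ('CategorySystemOverhaul', '1, 5, 7');
""")


def category_labels(conn, tag):
    """Return the labels of all categories listed in a page's category field."""
    row = conn.execute(
        "SELECT category FROM wikka_pages WHERE tag = ?", (tag,)
    ).fetchone()
    if row is None or not row[0].strip():
        return []
    # Split the stored list into numeric keys, tolerating spaces after commas.
    keys = [int(k) for k in row[0].split(",") if k.strip()]
    placeholders = ",".join("?" * len(keys))
    rows = conn.execute(
        f"SELECT label FROM wikka_category WHERE id IN ({placeholders}) ORDER BY id",
        keys,
    )
    return [r[0] for r in rows]


labels = category_labels(conn, "CategorySystemOverhaul")
```

Note that the list has to be split in application code before the labels can be looked up; finding all pages in one category would need string matching on the category field.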


Yet another approach:
  1. categories table:
    • id
    • category
    • parent (may be NULL for top-level category)
    • maybe extra "administrative" fields such as a timestamp
This would take care of a categories hierarchy
  2. pagecats table:
    • page_id
    • cat_id
Now we can store many-to-many relationships between pages and categories; and everything is neatly normalized, too.
When a page is created/edited, a dropdown (rendering the hierarchy!) could allow choosing one or more categories; there should be an extra option to (instead) create a new category and assign it to an existing one. Edits should of course also allow removing a category from a page (while still allowing assigning or creating a new one). Note I'm using page_id, not page_name, in pagecats - on purpose, so the categorization belongs to the page version; a new page version should (initially) inherit all categories, of course.
That might become a tad complicated for an edit dialog; so maybe "categorization" should be a separate dialog (handler). I'm sure there are nice examples around of dialogs for maintaining a hierarchy of categories. (Many CMSs have something like that, I know there are demos around.)
Extra wrinkle: what to do with pages when a category is deleted? (Probably assign to the parent category.)
--JavaWoman
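To make the two-table proposal above concrete, here is a rough sketch, assuming Python with an in-memory SQLite database in place of MySQL. The table and column names follow the proposal; the sample rows and the helper names `delete_category` and `categories_of` are invented for illustration. Deleting a category reassigns its pages and subcategories to the parent, as suggested for the "extra wrinkle".

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; sample data is made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categories (
    id       INTEGER PRIMARY KEY,
    category TEXT NOT NULL,
    parent   INTEGER REFERENCES categories(id)  -- NULL for a top-level category
);
CREATE TABLE pagecats (
    page_id INTEGER NOT NULL,  -- id of one page version, not the page name
    cat_id  INTEGER NOT NULL REFERENCES categories(id),
    PRIMARY KEY (page_id, cat_id)
);
INSERT INTO categories VALUES
    (1, 'Development', NULL), (2, 'Architecture', 1), (3, 'Documentation', NULL);
INSERT INTO pagecats VALUES (10, 2), (10, 3), (11, 2);
""")


def delete_category(conn, cat_id):
    """Delete a category, moving its pages and subcategories to its parent."""
    (parent,) = conn.execute(
        "SELECT parent FROM categories WHERE id = ?", (cat_id,)
    ).fetchone()
    if parent is not None:
        # OR IGNORE skips pages that already belong to the parent category.
        conn.execute(
            "INSERT OR IGNORE INTO pagecats (page_id, cat_id) "
            "SELECT page_id, ? FROM pagecats WHERE cat_id = ?",
            (parent, cat_id),
        )
    conn.execute("DELETE FROM pagecats WHERE cat_id = ?", (cat_id,))
    # Subcategories move up one level (or become top-level if parent is NULL).
    conn.execute(
        "UPDATE categories SET parent = ? WHERE parent = ?", (parent, cat_id)
    )
    conn.execute("DELETE FROM categories WHERE id = ?", (cat_id,))


def categories_of(conn, page_id):
    """Return the category names of one page version, alphabetically."""
    rows = conn.execute(
        "SELECT c.category FROM pagecats pc "
        "JOIN categories c ON c.id = pc.cat_id "
        "WHERE pc.page_id = ? ORDER BY c.category",
        (page_id,),
    )
    return [r[0] for r in rows]


before = categories_of(conn, 10)
delete_category(conn, 2)
after = categories_of(conn, 10)
```

If the deleted category has no parent, its pages simply lose that category in this sketch; whether that is the desired behaviour is an open design question.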

JavaWoman, that sounds good. And a page showing the uncategorized wiki pages could be helpful :)
--NilsLindenberg

I am working on something very much like this, see AdvancedCategorySystem for details.
-- TimoK

JW, I confess I was thinking of a set of horizontal categories, not of a hierarchy of categories. The two approaches have different kinds of use and different pros/cons. The horizontal approach is certainly much easier to implement/maintain, the vertical approach requires some extra work, but I think it is worth the effort. I welcome the idea of a dedicated handler, instead of cluttering the edit page. -- DarTar

Kevin Yank from Sitepoint.com has an example in his book which uses checkboxes rather than a dropdown menu to allow multiple categories to be chosen.
--JamesMcl

Why a hierarchy?
The current system (admittedly not very user-friendly or at least not newbie-friendly) does allow a hierarchy of categories. I wouldn't like to lose that capability. The categories table as I proposed allows such a hierarchy without enforcing it: if the community "decides" they don't need a hierarchy they'd just not assign a new category to a parent - and that's that; it will be as flat as the users make it. (A maintenance handler could enforce a hierarchy, given such a data model, but I don't think that's necessary or even sensible.)

Another issue: conversion for an existing Wikka Wiki. A conversion utility (something definitely needed) should migrate all existing categories; and if that Wiki already has a hierarchy of categories, and the new system doesn't support that - what are you going to do? It's not easy to "flatten" an existing hierarchy into something still meaningful.

Definitely extra work, but are we in a hurry? We do have a system that actually works, even if it is hard to use. And I think categorization (and a user interface to support it) is important enough to give it careful thought.
-- JavaWoman


Perhaps we should first define (or at least think about) which features a category system could (should?) have:
- pages should be able to belong to zero or more categories
- categories should be able to belong to zero or more categories
- it should be easy to add/delete a category to/from a page
- an admin should be able to rename/delete categories
- there could be a page which lists all pages belonging to no category
- an admin should be able to merge two categories or divide a category into two or more
- an action like "nocategory" which prevents adding a category to pages
- an alphabetical index, like the one on the page index, would be useful for large categories
--NilsLindenberg
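Two of the listed features (a list of pages belonging to no category, and an alphabetical index for large categories) can be sketched in a few lines of plain Python. This is only an illustration: the page names, category names and the in-memory dictionary are made-up sample data, and a real implementation would query the database instead.

```python
from itertools import groupby

# Assumption: sample data only; maps each page to the set of its categories.
page_categories = {
    "WikkaInstallation": {"CategoryDocumentation"},
    "SandBox": set(),
    "AdvancedCategorySystem": {"CategoryDevelopment"},
    "WikkaReleaseNotes": {"CategoryDocumentation", "CategoryDevelopment"},
    "HomePage": set(),
}


def uncategorized(pages):
    """Return, sorted by name, all pages that belong to no category."""
    return sorted(name for name, cats in pages.items() if not cats)


def alphabetical_index(pages, category):
    """Group the members of one category under their initial letter."""
    members = sorted(
        (name for name, cats in pages.items() if category in cats),
        key=str.lower,
    )
    return {
        letter: list(group)
        for letter, group in groupby(members, key=lambda name: name[0].upper())
    }


orphans = uncategorized(page_categories)
index = alphabetical_index(page_categories, "CategoryDocumentation")
```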

no categories?
The developer of Comawiki wants to introduce a system he calls "father-and-son-pages" into his Wiki. That means that a page can have pages belonging to it. With this in mind, and after re-reading my list above and the things I thought up for my event-system, I have come up with another approach:

We don't use extra pages for categories, but make it possible for pages to belong to other pages. This would require a table with page_id (of the page) and belong_id (the page to which this page "belongs") [needs better names, I know]. And the original page table needs a field with the number of pages belonging to a page.

Instead of a list of categories, we present a list of pages, in descending order of the number of pages already belonging to them. Pages without pages attached should be shown separately.

example: Category Development would become a page like WikkaDevelopment and would be shown very high in the list, because it would have 47 pages belonging to it.
--NilsLindenberg
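A small sketch of the "pages belong to pages" idea above, assuming Python with in-memory SQLite instead of MySQL. The table name `page_belongs`, the `tag` column and the sample rows are assumptions for illustration. Unlike the proposal, this sketch does not store a counter field in the page table; it computes the number of attached pages with a query, which is a substitution made here to keep the example short.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; names and rows are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pages (id INTEGER PRIMARY KEY, tag TEXT NOT NULL);
CREATE TABLE page_belongs (
    page_id   INTEGER NOT NULL REFERENCES pages(id),  -- the attached page
    belong_id INTEGER NOT NULL REFERENCES pages(id),  -- the page it belongs to
    PRIMARY KEY (page_id, belong_id)
);
INSERT INTO pages VALUES
    (1, 'WikkaDevelopment'), (2, 'CategorySystemOverhaul'),
    (3, 'AdvancedCategorySystem'), (4, 'WikkaDocumentation'), (5, 'SandBox');
INSERT INTO page_belongs VALUES (2, 1), (3, 1), (2, 4);
""")

# Count attached pages per page; LEFT JOIN keeps pages with nothing attached.
rows = conn.execute("""
SELECT p.tag, COUNT(b.page_id) AS children
FROM pages p
LEFT JOIN page_belongs b ON b.belong_id = p.id
GROUP BY p.id
ORDER BY children DESC, p.tag
""").fetchall()

# Pages with attached pages come first; the rest are listed separately.
parents = [(tag, n) for tag, n in rows if n > 0]
childless = [tag for tag, n in rows if n == 0]
```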

Such a "father-and-son-pages" system would mean organising pages as a strict hierarchy (note that this is not the same as organising them in a strict hierarchy!). It's an interesting concept but doesn't allow any kind of classification: in fact, it's completely orthogonal to a categorizing system, which I definitely would not want to give up. One could have both, but I do want at least a good categorizing system. In fact, I think that's indispensable for a Wiki! --JavaWoman

Looks like I should think a little bit about this for myself ;-) --NilsLindenberg

How about using parent_id instead of belong_id? --KickTheDonkey


Idea for Category Support
I moved my suggestion here; it is the better place for it. Thanks DarTar.
The Category Support is currently not perfect. Here is my suggestion:
Add the category to the URL in front of the wiki name:
Example:
mod_rewrite url: http://www.example.com/Categorie-Wikiname (I left out Camelcase here for easier formatting)
real url: http://www.example.com/wikka.php?categorie=Categorie&wakka=Wikiname
Implementation:
Whenever a page is called with a category, override the config's base_url with the URL including the category (plus a minus sign when using mod_rewrite). The browser will translate all URLs on this page to the URL including the current category.
Whenever you want to link to a different category, use InterWiki links - or something similar that distinguishes between Category and InterWiki links.
Advantages:
Whenever a new page is created by clicking on AnonExisting page, the new page will inherit the category from the linking page.
I think it is easy to implement - I have already partially done this on my page.
Disadvantages:
A page can only exist in one single Category.
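The URL scheme above can be sketched roughly as follows, in Python for illustration (the real thing would be a mod_rewrite rule plus PHP). The function names `rewrite` and `base_url_for` are hypothetical; the parameter names `categorie` and `wakka` are taken from the example URLs. The sketch assumes that neither category nor page names contain a minus sign.

```python
from urllib.parse import urlencode


def rewrite(path, script="wikka.php"):
    """Map a pretty path such as /Categorie-Wikiname to the real URL."""
    name = path.lstrip("/")
    # Split at the first minus sign; no minus sign means no category given.
    category, sep, page = name.partition("-")
    if not sep:
        return f"{script}?{urlencode({'wakka': name})}"
    return f"{script}?{urlencode({'categorie': category, 'wakka': page})}"


def base_url_for(site, category=None):
    """Return the base URL to use while rendering a page in a category."""
    # With a category, every generated link starts with "Category-".
    return f"{site}/{category}-" if category else f"{site}/"


real_url = rewrite("/Categorie-Wikiname")
base_url = base_url_for("http://www.example.com", "Categorie")
```

Because every link on the page is built from the overridden base URL, a newly created page ends up in the category of the page that linked to it, which is also why a page can only live in one category under this scheme.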
Open Issues

Yet more suggestions...

Please, please, please don't change the current category implementation! It's exceedingly, wonderfully flexible and usable as it is—it just needs a few refinements. :)

A couple of suggestions (some of which have been raised before elsewhere, I'm sure):

1) There needs to be a way to link to a category page without having the page actually being put into the category. It should be a very simple addition (from the user side) to the regular category link.

2) There needs to be a way to tell the category page how to sort a page in the category list. For example, that way a page about a person (e.g. AlbertEinstein) could sort correctly (under E instead of A).

One wiki that I think handles categories spectacularly well is MediaWiki. I don't like a lot of stuff about MW (I think it's pretty clunky after having tried it out), but I do (mostly) like how categories are implemented.

The things I like:

1) Placing a page in a category is as simple as [[Category:Foo]], and even if there isn't a page created for that category yet, if there are subcategories, it will still display the links to their pages.

2) Linking to the category without categorizing the page doing the linking only changes the link code by one character. [[:Category:Foo]] will make the link to the category page, but will not categorize the page in that category.

3) All pages that use the Category: prefix in the name are automatically listed on the Special:Categories page, which is an automatic list of all categories in the wiki. Also, the category titles are displayed stripped of the Category: that starts the page link so that they're easily readable.

4) The category pages display subcategories and other pages separately, with dividers for each letter of the alphabet (this would be a good option to make configurable, as some people may really like having the pages separate from subcategories and others might not, as well as the letter headers).
5) Sort keys are used to help correctly display pages on the category lists: [[Category:Foo|Einstein, Albert]] tells the wiki to display the link AlbertEinstein on the Category:Foo page in alphabetical order by last name, first name.


The system Wikka uses right now is very flexible, at least on the user end, and I don't think it's hard to figure out at all. One of the main reasons I chose to use Wikka for my site was because the categories were flexible and you can have multiple categories per page, and they also weren't annoyingly anchored in the layouts, which is very important for a site trying to use categories as more than metadata about the pages.

--MovieLady
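For illustration, the MediaWiki-style category links and sort keys described above could be parsed roughly like this. This is a Python sketch, not MediaWiki's or Wikka's actual parser; the regular expression, the function names and the sample pages are assumptions made up for the example.

```python
import re

# Matches [[Category:Foo]], [[Category:Foo|Sort key]] and [[:Category:Foo]].
CATEGORY_LINK = re.compile(r"\[\[(:?)Category:([^\]|]+)(?:\|([^\]]*))?\]\]")


def parse_category_links(page_name, text):
    """Return (category, sort_key) pairs for the categories a page is put in."""
    memberships = []
    for colon, category, sort_key in CATEGORY_LINK.findall(text):
        if colon:
            # A leading colon means: link to the category, do not categorize.
            continue
        memberships.append((category.strip(), (sort_key or page_name).strip()))
    return memberships


def category_listing(pages, category):
    """List the pages in one category, ordered by sort key."""
    entries = []
    for name, text in pages.items():
        for cat, key in parse_category_links(name, text):
            if cat == category:
                entries.append((key.lower(), name))
    return [name for _, name in sorted(entries)]


# Sample data, invented for the example.
pages = {
    "AlbertEinstein": "[[Category:Foo|Einstein, Albert]] see [[:Category:Bar]]",
    "NielsBohr": "[[Category:Foo|Bohr, Niels]]",
    "MaxPlanck": "[[Category:Foo]]",
}
einstein = parse_category_links("AlbertEinstein", pages["AlbertEinstein"])
listing = category_listing(pages, "Foo")
```

A page without an explicit sort key falls back to its own name, so MaxPlanck sorts under M here while AlbertEinstein sorts under E.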



  • I like that idea, it seems very logical to me. (This would be a good thing to be able to set defaults for in the wikka config file.)

  • Do you envision this as also enabling getting rid of the requirement of adding "Category" to the front of the page name? (Though, come to think of it, is there actually a requirement to do so, or is that just common usage? I can't say I've tried to make a page a category without prepending "category" on the page name.) I think that would help immensely by cleaning up the display of pages/categories on category pages by removing the clutter of repeating "category" on every link. --MovieLady


  • JavaWoman, might you point me in the right direction to modify my installation (1.1.6.3) to accomplish this? I just discovered WikkaWiki yesterday, and love it, but I cannot make good use of it while adding a page to a category is accomplished in this way. I am not very proficient with PHP or SQL, but if you have just a few hints as to what the "simple changes" might be, I may be able to figure it out. Thanks very much for any help you could give. --AndCod
    • AndCod, I just brainstormed a bit with NilsLindenberg about this on the #wikka channel; the conclusion is that it would not be very hard to accomplish if you are reasonably proficient with PHP and MySQL (and steal a bit of code now used in trunk for keeping track of links between pages, and leverage it to keep track of categories for pages) but even that idea isn't fully formed in my mind and would need some further data design to keep the possibility of a hierarchy of categories. You'd definitely need a table to keep track of relationships between pages and categories, and likely one or two more tables for relationships between categories themselves. It would need additions as well as "relatively simple changes" (that was a bit optimistic maybe). Still quite doable I think - but only for someone who is reasonably proficient with PHP and MySQL as well as reasonably familiar with Wikka's code. Meanwhile, depending on your type of site, you might get away with setting up a system of categories in advance, combined with templates and cloning so people can easily create pages "in" a certain category. --JavaWoman

Faceted classification
Just a day after I told someone there probably would not be any other possible approaches than those already proposed on this page, I stumbled over another interesting one: FacetedClassification. There's even an (experimental) Wiki engine that uses it. I'm not proposing we do this, actually (it's too different conceptually from what we already have here), but it is worth mentioning as an interesting approach to classification and navigation based on it. --JavaWoman


Categories and FreeMind

One of the next steps towards a better integration of FreeMind in Wikka could be the automatic generation of an XML tree of Wikka's Categories displayed as a map.
-- DarTar
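As a rough idea of what generating such a map could look like, here is a Python sketch that turns a category hierarchy into nested map/node elements of the kind FreeMind stores in its .mm files. The version attribute value, the function name `category_map` and the sample hierarchy are assumptions for illustration; the sketch also assumes the hierarchy has no cycles.

```python
import xml.etree.ElementTree as ET


def category_map(root_label, children_of):
    """Build a FreeMind-style XML string from a category hierarchy."""
    # Assumption: the version string is a placeholder, not a verified value.
    map_el = ET.Element("map", version="1.0.1")

    def add(parent_el, label):
        # One <node TEXT="..."> element per category, nested by hierarchy.
        node = ET.SubElement(parent_el, "node", TEXT=label)
        for child in sorted(children_of.get(label, [])):
            add(node, child)

    add(map_el, root_label)
    return ET.tostring(map_el, encoding="unicode")


# Sample hierarchy, invented for the example: category -> subcategories.
hierarchy = {
    "CategoryWiki": ["CategoryDevelopment", "CategoryDocumentation"],
    "CategoryDevelopment": ["CategoryDevelopmentArchitecture"],
}
xml_text = category_map("CategoryWiki", hierarchy)
```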

CategoryDevelopmentArchitecture