Wikipedia talk:Categorization/Archive 2

From Wikipedia, the free encyclopedia

Describing the relations

I hit much the same problem as Morwen a week or so ago, and backed off from categorization while waiting for the dust to settle. I saw Category:Football (soccer), and Category:Sportspeople, and thought, well, why not Category:Footballers (soccer) with both as supercategories? I decided against it, because a footballer is a sportsperson, but is not a sport. I've come to the conclusion that Categories are currently broken in one serious (but fixable) way: we've got a fantastic set of directed graphs, but none of the arrows is labelled. In other words, we're saying X is related to Y, but if you ask how, I'm going to have to kill you. This makes any kind of semantic inference across those relations impossible (apologies if I'm abusing terminology here but you get my point): we can't say Anyone in a subcategory of People (recursively) is also implicitly a member of People because that's making a huge semantic assumption about the relations, and one that just isn't true in the current state of the wiki.

The fix is to label the arrows: describe the relations. This is, in my limited understanding, what RDF does. That uses the terms subject, predicate, and object. The subject is the thing you're categorizing. The object is the category you're adding it to. And the predicate describes the relation. Predicates allow you to make semantic inferences programmatically. So far in the wiki I've seen two predicates, which I would summarise as Is an example of (John Lennon is an example of a vocalist) and Is, er, related in some way to (Musical groups are, er, related in some way to Music; 251 Menlove Avenue is, er, related in some way to John Lennon). Unless we encode this distinction in the categorization system, we can't make any inferences over the relations. And once we start encoding this distinction a whole new world of possibilities (and problems!) springs up: arbitrary relations.

I want to relate footballers to football. I should be able to use a more specific relation than "is, er, related in some way to", something like "is a participant in". It's the same relation you'd use between basketball players and basketball, but not the same as you'd use between Musical groups and Music, and especially not between 251 Menlove Avenue and John Lennon. In Category:Footballers (soccer) I want to be able to put something like:

[[Relation:Participates (sport)|Category:Football (soccer)]] [[Relation:Example of|Category:Sportspeople]]

You can see the idea. Wikipedia then encodes much more information programmatically. Incidentally, this could also be used with the tricky problem of who belongs in Category:Terrorists - you could use different relations, such as "Alleged to be" (for example!). It also, of course, means that we have a whole new "Relation" namespace to deal with - with RfD, etc, etc.
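A minimal sketch of what labelled arrows would buy, assuming the relations are stored as subject/predicate/object triples. The predicate names (`example_of`, `participates_in`, `related_to`) and the helper `ancestors` are hypothetical illustrations, not an existing MediaWiki feature. Inference follows only the arrows carrying one named predicate, so a footballer is inferred to be a person but never a sport.

```python
from collections import defaultdict

# Hypothetical triples (subject, predicate, object); the predicate names are invented here.
TRIPLES = [
    ("John Lennon", "example_of", "Category:Vocalists"),
    ("Category:Vocalists", "example_of", "Category:People"),
    ("Category:Footballers (soccer)", "example_of", "Category:Sportspeople"),
    ("Category:Footballers (soccer)", "participates_in", "Category:Football (soccer)"),
    ("Category:Sportspeople", "example_of", "Category:People"),
    ("251 Menlove Avenue", "related_to", "Category:John Lennon"),
]


def ancestors(subject, predicate, triples=TRIPLES):
    """Return everything reachable from `subject` by following only `predicate` arrows."""
    edges = defaultdict(set)
    for s, p, o in triples:
        if p == predicate:
            edges[s].add(o)
    seen = set()
    stack = [subject]
    while stack:
        node = stack.pop()
        for parent in edges[node]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


if __name__ == "__main__":
    # Footballers are (recursively) examples of Sportspeople and People, not of Football.
    print(sorted(ancestors("Category:Footballers (soccer)", "example_of")))
```

With unlabelled arrows the same walk would also reach Category:Football (soccer), which is the false inference described above.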

IMHO without an idea like this we can't make any meaningful semantic inferences whatsoever within the category hierarchies, and categories overall become much less useful.

(Oh, in case someone asks: yes, relations should also be related to other relations :-)

-- Avaragado 10:05, 9 Jun 2004 (UTC)

I follow your argument, but I disagree with your solution. I believe that Categories should only be hierarchical, and the title of the category is already enough to tell the reader what kinds of articles it contains. A category that is a discipline, like Sport (i.e. the discipline; Sports would be a listable subcategory of this) or Music, will contain subcategories and unclassifiable articles, while a category that is a concrete noun, like football players or musicians, obviously contains a list of them. Trying to link categories to each other in every way with a relationship code of some kind would not only be too complicated for most editors and almost all readers, it's unnecessary since the links in the articles themselves do this so well in plain English (or whatever language they're in.) Instead of trying to force square pegs into round holes making Categories do everything that Articles and Lists do and more, let Categories be a quick way to reach large, simply defined, alphabetical lists of things. GUllman 00:27, 10 Jun 2004 (UTC)
You said earlier that everything under Category:People should be a person, etc, but I don't believe that's how people's brains work. People can and are making all sorts of valid associations between two things, and are expressing that using the category system. Under Category:Minerals, why shouldn't someone be able to put Category:Mineralogists? It's a valid relation - just not the same relation as the one you would prefer. Currently within Category:Food and drink you'll find Category:Vegetarianism - again, an entirely valid relation. I agree with User:Bryan Derksen that bottom-up is best here: I don't think it will be possible to keep the (over 3000 and growing) categories in strict scientific hierarchies. Therefore, the wiki either implicitly decides that categories are arbitrary unlabelled relations (meaning all talk of building tools to list all people etc, by following relations, is doomed to failure) or tries to label the relations in some way to enable the various hierarchies to live together in peace, harmony, semantic goodness, etc, etc :-) -- Avaragado 08:32, 10 Jun 2004 (UTC)
People are creating categories from the bottom up because that's the easiest way for editors to work -- they put the four Beatles together in a category then lump them together into larger categories, because few people want to attempt to create a list of hundreds, or thousands, of articles. But however it's done, we have to make it easy for encyclopedia users to navigate from the top downward. Vegetarianism is fine within the discipline of Food and drink, as long as it isn't within a subcategory that's a list of foods or a list of drinks. If someone is looking for a famous mineralogist, they wouldn't look in a category called Minerals. Both the list of Category:Minerals and the list of Category:Mineralogists should be in the discipline of Category:Mineralogy (or Category:Geology depending on how many levels of detail you have). That's what other category systems do, because it works, both philosophically and practically. See the Yahoo directory. The Dewey Decimal System also places mineralogists (549.092) and minerals (549.2-549.7) under mineralogy (549), but don't get me started discussing L.C. Subject Headings -- the three terms are only distantly related as far as they're concerned. So, putting Mineralogists inside the Minerals category doesn't help them find it; if a Wikipedia user reached Minerals by navigating down the hierarchy, they have already passed the broader categories of Geology and Mineralogy. GUllman 19:34, 10 Jun 2004 (UTC)
I think we have a philosophical difference here. You rightly talk of the importance of top-down, and of a properly understood and maintained hierarchy. I agree completely. Where we disagree is that I think the wiki can encode much more than that (without breaking the behaviour you would like). It's clear that this is what people are trying to do, but only by breaking the hierarchy in the process. I also take the view that people are more likely to start in the middle of the hierarchy than at the top: they'll google their way to a Wikipedia article, spot the categories, and jump into the tree. Where do they go from there? From a user's perspective, categories are primarily a navigational tool. Frankly it would annoy me intensely if I surfed from a footballer to Category:Football (soccer) players and then had to start from the top of the hierarchy to reach Category:Football (soccer) rather than follow a single link. That link could go in a "Related links" section of the Football players category page, sure. But that's a kludge, I'm afraid, and throws away what could be meaningful data.
I think the approach to take here is to add the ability to have relations in the category. It doesn't remove anything from the system that we have now and may add something. In the first implementation of this, all that should change is that the category page would have multiple lists, one for each relationship, rather than the single list it has now. This shouldn't be too hard to code up, and it adds extra possibilities without any downside (apart from the implementation time). Steven jones 07:40, 11 Jun 2004 (UTC)
Yes, the Wikipedia categories effort is the same as the semantic Web effort, but without the labelled relations provided by RDF. It is an interesting experiment. It will probably fail if the labelling is not made explicit, either because a weak, POV-laden "is vaguely enough related to" implicit labelling is used, or because a too restrictive, strict "is a" implicit labelling is used. Marc Mongenet 14:36, 2004 Jun 30 (UTC)

Categorization of countries?

I am really confused by Category:EU countries. I notice that it contains various countries both as articles and as subcategories, i.e. Denmark and Category:Denmark. However, Category:Denmark, which is a subcat of EU countries, contains Danish culture. Isn't this improper inheritance? Wouldn't this imply that Danish culture is an EU country, or do I misunderstand the point of categorization? - DropDeadGorgias (talk) 20:41, Jun 4, 2004 (UTC)

Hi. That's exactly the same issue as above. Sometimes the categories are an 'is a' relationship, sometimes they mean 'is related to'. Morwen 21:30, 4 Jun 2004 (UTC)
Oh... you're right. This is why I was bad at analogies on the SATs. :( - DropDeadGorgias (talk) 21:31, Jun 4, 2004 (UTC)
As I observed above, the characteristic which seems to determine which relationship applies is the pluralization of the category's title. Otherwise Category:Denmark should have only a single article in it, and no subcategories (since there are no other groupings of Denmarks that I'm aware of :) Bryan 05:28, 6 Jun 2004 (UTC)

More on nontree structure of categories

(moved from main page)

I'm not sure that I can diagram this, so I'm not going to try. I'm thinking about some of the dog topics. For example, dog is a member of pets; dog is also a member of mammals; both mammals and pets are members of animals but neither is a subcategory of the other. Now, how about dog agility? It needs to go under the dog sports category, which needs to be under the dog category, because it's related to dogs. It also needs to go under the sports category, because it's a sport. It probably also needs to go under the hobby category. But dog and sports do not at any higher point in the hierarchy have a common parent; possibly hobby and sports might fall together again under leisure activities (?), but not all sports are hobbies and not all hobbies are sports. Just wanted to give another example of why something might need to be in multiple categories. Thoughts? Elf | Talk 02:51, 4 Jun 2004 (UTC)

This is very clear, and it is not disputed that some articles belong in multiple categories. (Actually, I'd dispute putting pets under animals as I know people who have pet rocks and pet plants.) Anyway, your example pales in comparison to some of the existing category structure, which includes not merely DAGs but full-blown cycles. --ssd 03:47, 4 Jun 2004 (UTC)
Speaking as a public librarian, I hope you appreciate the reason why librarians need to earn a Masters degree in Library Science, or the reason why such a degree is still needed when there's Google and other miraculous computer tools. Many people think that all knowledge is hierarchical, since they see all the library books on the shelf in Dewey Decimal order, but a librarian knows that books about even a simple subject such as dogs may be found in several places. I'm learning a lot about the public's view of knowledge by reading this page as everyone here tries to "reinvent the wheel". Maybe, just maybe, the consensus of a thousand people will come up with a system of rules that the great Doctors of Library Science didn't. GUllman 16:31, 5 Jun 2004 (UTC)
More likely (or hopefully) a few Doctors of Library Science will lend a hand here when we start going astray. --ssd 22:13, 5 Jun 2004 (UTC)
Interesting stuff Gullman. Am I right in thinking that, even though a book can only be in one location various index cards are rather cheap? Is that what index cards were used for? To provide a proxy book in the filing system? --bodnotbod 02:09, Jun 12, 2004 (UTC)
First, many libraries are now computerized, and it is trivial to put a book in multiple categories in the computer. Second, why do you think a book can only be in one location? Most libraries have multiple copies of popular books anyway, and it is not unusual to file them in different locations if they cover diverse topics. That way, people browsing the shelf find them in either place, and someone who knows about the book can check the other location if the first try is checked out. --ssd 16:38, 12 Jun 2004 (UTC)
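The point raised earlier in this section, that the category structure contains not merely DAGs but full-blown cycles, can be checked mechanically. The following is only a sketch: it assumes the parent links are available as a plain dictionary, and the category names are taken loosely from Elf's dog example. The extra "Animals is placed in Dogs" link in the second graph is invented purely to produce a cycle.

```python
def find_cycle(parents):
    """Return one cycle as a list of names, or None if the graph is a DAG.

    `parents` maps each category to the list of categories it is placed in.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {}
    path = []

    def visit(node):
        colour[node] = GREY
        path.append(node)
        for parent in parents.get(node, []):
            state = colour.get(parent, WHITE)
            if state == GREY:
                # We walked back onto the current path: that is a cycle.
                return path[path.index(parent):] + [parent]
            if state == WHITE:
                cycle = visit(parent)
                if cycle:
                    return cycle
        path.pop()
        colour[node] = BLACK
        return None

    for node in list(parents):
        if colour.get(node, WHITE) == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


# Multiple parents, but no cycle: a DAG, as in the dog agility example.
DOG_GRAPH = {
    "Dog agility": ["Dog sports", "Sports", "Hobbies"],
    "Dog sports": ["Dogs"],
    "Dogs": ["Pets", "Mammals"],
    "Pets": ["Animals"],
    "Mammals": ["Animals"],
}

# Same graph plus one invented link that closes a loop.
CYCLIC_GRAPH = dict(DOG_GRAPH, Animals=["Dogs"])

if __name__ == "__main__":
    print(find_cycle(DOG_GRAPH))
    print(find_cycle(CYCLIC_GRAPH))
```

An article sitting in several categories is harmless to any tool that walks the graph; only the cycles need special handling, for example by remembering which categories have already been visited.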

Archive of discussion on placement of category tags in articles.

At the beginning or at the end of articles (with interlanguage links) ?

Category tags should, in all cases, be the very first thing listed on the article. →Raul654 17:27, 30 May 2004 (UTC)
Strictly against this. Categories should be listed last in the article. Why? We had the same discussion with interlanguage links: it's just not pretty for a new user when he clicks edit to see a bunch of categories. Put them at the end, please. --denny vrandečić 20:50, May 30, 2004 (UTC)
They should be above the interlanguage links at the bottom in my opinion. Angela. 21:46, 30 May 2004 (UTC)
Agree. --denny vrandečić 21:52, May 30, 2004 (UTC)
Agree. James F. (talk) 21:56, 30 May 2004 (UTC)
Agree (at the end of articles). -- User:Docu
Agree, end is the place. Jamesday 01:25, 1 Jun 2004 (UTC)
Yep. Dori | Talk 05:28, Jun 1, 2004 (UTC)
Another vote for the bottom, either before or after the w:xx's. Hajor 18:40, 1 Jun 2004 (UTC)
Agree, just the position where I chose to put them myself before coming here. andy 20:18, 1 Jun 2004 (UTC)
Disagree. Category seems more natural at the top, IMHO. Quadell (talk) 17:53, Jun 1, 2004 (UTC)
First. Interlanguage links make more sense at the bottom because they're numerous and intimidating and not going to be edited (or understood) by newer users. Categories are the opposite: not numerous, not intimidating and all users should be able to easily understand and add categories, so they shouldn't be hidden at the bottom. RADICALBENDER 00:22, 2 Jun 2004 (UTC)
They should go at the bottom, after the inter-language links. IMHO, nothing, not a table or an image, should go before the first line(s) of text in the article, just to keep from intimidating a new user. Gentgeen 00:37, 2 Jun 2004 (UTC)
IMO, they belong at the bottom, before the inter-language links (which, IMO, should only be at the bottom, not both at the top and the bottom). But best would be a user preference. Abigail 09:12, 2 Jun 2004 (UTC)
Agree — at the end, before interlanguage links. — Matt 01:01, 4 Jun 2004 (UTC)
We seem to have a strong majority opinion, so refactoring to reflect it. -- Beland 07:15, 5 Jun 2004 (UTC)
Just seeing that a (semi) consensus has been formed I thought I might add a differing opinion. How about we have the software decide where to put them. That way it could be a preference value whether they go at the top or bottom. Additionally, everyone can argue^H^H^H^H^H discuss what the default should be. Steven jones 07:54, 11 Jun 2004 (UTC)
There are two issues here. 1) Where to put the tags on the displayed page (which could be a preference, and can even now be changed in your css). 2) where to put them in the RAW unparsed article text in the editor. The above discussion ONLY talks about this second item, and it would not really make sense to have a preference for that. --ssd 17:04, 12 Jun 2004 (UTC)
I assumed the discussion was about the first issue. If the software was configurable with respect to the first, the second seems a moot point, as the software will display it at the "right" spot regardless of where it was placed. Steven jones 02:05, 13 Jun 2004 (UTC)

Renaming Categories

It is apparently impossible to rename a Category article once it has been created. I am for example unable to move Category:Australian MHRs (which is an incorrect form) to the correct Category:Australian federal MPs, so now I am going to have to delete them all. (This is User:Chuq's fault - I told him MHR was incorrect but he/she went ahead anyway.) Adam

Yes, you did tell me that, so I spent some time researching, found the terms are close to equivalent (and if not, MHR is less ambiguous), and yesterday I added some articles into the category. Today I find that you had deleted the references from all the articles. Interesting, if you were willing to go to the effort of deleting them, why did you not just change them, which would have been a lot less effort (for both you, and for whoever adds the articles to whichever group it is decided is appropriate?) Chuq 05:06, 6 Jun 2004 (UTC)

This problem must be urgently fixed or it will cause endless fights. Adam 15:43, 5 Jun 2004 (UTC)

There's no way to change an article's categorical affiliation without directly editing the article itself (at least, not yet). Even if it were possible to move a category, doing so wouldn't list the articles in that category into the new one. Right now the only solution is to manually change every article linking to the old category to the new one, then put the old category (now an orphan) into Wikipedia:Categories for deletion. -Sean Curtin 17:26, 5 Jun 2004 (UTC)
It's possible to move an article, and that doesn't magically fix any links in other articles. That issue is solved by making the old page a redirect to the new page in case of a move. Now, if redirects for categories would work (I've found to my surprise they didn't), moving a category wouldn't be a problem. Abigail 22:15, 5 Jun 2004 (UTC)
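Until category redirects or moves work, the manual fix described above (edit every article that carries the old tag) amounts to a text substitution. Here is a rough sketch of that substitution, assuming the article wikitext is available as a string. The function name `retag` is hypothetical and nothing here calls a real MediaWiki API. Any sort key after the pipe is kept.

```python
import re


def retag(wikitext, old, new):
    """Replace [[Category:old]] or [[Category:old|sortkey]] with the new category name."""
    pattern = re.compile(
        r"\[\[\s*Category\s*:\s*" + re.escape(old) + r"\s*(\|[^\]]*)?\]\]"
    )
    # A function is used as the replacement so that backslashes in `new` are not
    # interpreted as regex escapes.
    return pattern.sub(
        lambda m: "[[Category:" + new + (m.group(1) or "") + "]]", wikitext
    )


if __name__ == "__main__":
    before = "Some biography text.\n[[Category:Australian MHRs|Surname, Given]]"
    print(retag(before, "Australian MHRs", "Australian federal MPs"))
```

The old category page would still have to be listed for deletion by hand once it is empty.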

A most unsatisfactory state of affairs. This whole scheme will cause as many problems as it solves, maybe more. Adam 17:52, 5 Jun 2004 (UTC)

Oh sh**. I've wasted half a day cos I misnamed a category and added fifty people to it before someone pointed out my error. groan. Now, I am probably diligent enough to sort out my own mess. I wouldn't assume the same of others. --bodnotbod 02:13, Jun 12, 2004 (UTC)

Minor Categories?

Might it be possible to have a "minor category" (or "organizational category" or "invisible category" or whatever you want to call it)? Perhaps put a small "[see more categories]" link in the current Category box, with the link either revealing a hidden box below the current one, or taking you to separate subpage of the article with the full category list.... This way categories that are important to the reader would be visible in the current box, while those that are useful but redundant (or more useful to the editors/organizers than the readers) could be hidden from casual view, but easily reachable by those who are interested.

Even if the minor category idea is unworkable, I suspect a [see more] functionality will be needed eventually, if certain articles end up placed in more than a small handful of categories.

What do you think? --Catherine | talk 22:21, 5 Jun 2004 (UTC)

I'm confused as to what you're suggesting. Could you provide an example? john k 08:06, 6 Jun 2004 (UTC)
I can't tell whether this will be helpful until we see how Categories are going to shake out in general, but I just thought I'd throw the idea out, in case it helped influence the evolution of the discussion.
Say you have an article on Nile Rodgers (just picking it because it's one I did research on and expanded recently). He's American -- African-American, to be precise. He's a record producer. He's a guitarist. He was a member of the band Chic. He has organized a charity foundation. He owns a prominent African-American-run business. There are a couple of different ways you could do this, but I guess I was picturing the main category box like so:
Categories: African-Americans | Chic members | Guitarists | Record Producers (etc)
And a separate, hidden-by-default Minor Categories box like so:
Minor Categories: African-American record producers | African-American guitarists | Biographies of people whose name starts with "R" | People who have worked with David Bowie (etc)
Which categories went where would be a whole 'nother level of editorial judgment (and potential editorial disagreements), but primarily the idea is as above: hide things that are useful but redundant or organizational (i.e., more useful from the category list page than from the article page). And it's always just a click away from someone who's interested in more detail.
Obviously some of the redundancy I thought of at first is taken care of by including articles in only the most specific category; I am sure there are other cases which will not be so clean, however. The other main advantage I see is that it allows the construction of more interesting but obscure or very specific "lists" without cluttering the main Category display. (Of course that might be a disadvantage, as well....) As I said, just a thought I wanted feedback on. Thanks! Catherine | talk 03:58, 10 Jun 2004 (UTC)

Ah, I understand. I agree - this would be really nice if we could do that. Any developers about? john k 04:06, 10 Jun 2004 (UTC)

Rendering on Category Pages

I've seen several complaints about the free-form structure of the list of articles belonging to a category. Take a look at Category:Lists_of_battles, for instance. The top half is the neat tree found in the old-style list. Below in the box is an automatically generated unsorted jumble of articles that belong in the category. These are the solutions I see:

1.) Merge lists of articles into category pages, but manually maintain structured copies in the editable sections. This requires more work, but at least there's only one page (to help maintain consistency between the manual and automatic lists).

2.) Put some tags in the editable section of the category page that tell the system how to create a properly sorted nested list. This is possibly easier for editors, since all the changes they need to make are on a single page, but when new articles are added to the category, they will also need to be put into the sort order by futzing with the category page, or else end up in an unsorted section. This option also makes it really easy to re-factor.

3.) Put tags in the article themselves that allow the system to sort them in a nice way. A bit of syntax that might help might be to stick sort strings after the category name with a / as a separator. So for example:

[[Category:List of battles/Chronological]]

The sort order would have to be defined somehow - it could be automatic, or specified in the category page. But it would be especially annoying to distribute the sort order among the article pages - that would make renumbering really annoying.

(Note from later on - / would be a bad choice of character, since this conflicts with the literal / in URLs.)

4.) A hybrid method, in which editors have a list-making interface similar to what is envisioned in option 2, but when the information is stored, the system goes back and writes tags as in option 3. Easily editable, easily renumbered, but all the category information is kept in one place. Harder to implement in code, though, and more resource-intensive.

-- Beland 10:12, 5 Jun 2004 (UTC)

5) For really long lists of stuff, it would be nice to simply break it up into alphabetized sections with letter headers. The software could be told to do this with a TOC tag of some sort, or it could do this automatically whenever a certain size threshold is reached. Another simple formatting option could include telling the software to put the links in a bulleted list instead of a comma-delimited string. Bryan 02:04, 6 Jun 2004 (UTC)

Layout of the article lists could use some improvement, even where they easily fit on one page, the layout from Special:Allpages/ABC may be of help. -- User:Docu
I agree with Bryan on every point. To make good use of the Category system its implementation should be improved. For example, if I were to have a list with the names of a few people, it would be nice if the years of birth and death could be shown next to each name, as is done on the biography pages, without the years becoming part of the link to the person. A possible way of implementing this: [[Category:Philosopher|Karl Marx|person|(1818-1883)]]. You could extend this system to something like metadata. With this data the system could automatically list all people, 19th century philosophers, philosophers, people with moustaches. --Maarten van Vliet 19:16, 12 Jun 2004 (UTC)

The rendering on category pages has changed to make subcategory and article listings multi-column. That's nice. But it has also added big, bold letters at the start of each letter in the alphabetical listings. This is a big waste of space and looks quite ugly when there is a small number of items. See for instance: Category:Main_page or [[:Category:Wikipedia]]. I think it should be obvious when a list is alphabetical, and navigation in an alphabetical list is fairly trivial - you just go up or down as appropriate (even if it's wrapped into multiple columns).

I would urge disabling the letters, at least for lists with a small number of items (perhaps less than 30 or so), if not all auto-generated lists.
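The two suggestions (letter headers for long lists, no headers for short ones) could be combined roughly as below. This is a sketch only: the function name, the `== A ==` header format and the default threshold of 30 are assumptions taken from the "perhaps less than 30 or so" remark, not a description of how the software behaves.

```python
from itertools import groupby


def render_listing(titles, header_threshold=30):
    """Return the listing as lines; letter headers appear only for long lists.

    Assumes every title is a non-empty string.
    """
    ordered = sorted(titles, key=str.casefold)
    if len(ordered) < header_threshold:
        return ordered  # short list: plain alphabetical order, no headers
    lines = []
    for letter, group in groupby(ordered, key=lambda t: t[0].upper()):
        lines.append("== " + letter + " ==")
        lines.extend(group)
    return lines


if __name__ == "__main__":
    print(render_listing(["Beta", "alpha", "Avocado"]))
    print(render_listing(["Beta", "alpha", "Avocado"], header_threshold=2))
```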

I like the idea of generic metadata tags. Embedding a character into Category tags is a big kludge (and / is a bad choice of special character, too), and the capability would be very useful in many other ways, as described. It also makes creating multiple structured lists inside categories a lot easier.

Given metadata in biography articles that specified birth year, name, and subfield, you could automatically create three different lists (perhaps in different subcategories) that were sorted in three different ways.

I wonder if embedded XML would be a good choice for this. There are WWW-wide XML standards being developed for just this sort of situation, and perhaps it would be good to interoperate with them. If that's too complicated, we could do something like:

[[Metadata:Person.birth_year=1906]]

...and make a list of valid key names and what they should be used for.

We would also need metatags for "Show an auto-generated list of articles/categories with the following matches in their metadata and the following sort order".--Beland 02:14, 13 Jun 2004 (UTC)
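To make the proposal concrete, here is a small sketch of such an auto-generated list, assuming the `[[Metadata:key=value]]` syntax suggested above. Only `Person.birth_year=1906` appears in this discussion; the `Person.subfield` key, the placeholder article titles and the function names are invented for illustration.

```python
import re

# Matches the proposed [[Metadata:key=value]] tag; keys may contain dots.
META_TAG = re.compile(r"\[\[Metadata:([\w.]+)=([^\]]*)\]\]")


def parse_metadata(wikitext):
    """Collect all metadata tags in an article into a dict."""
    return {key: value.strip() for key, value in META_TAG.findall(wikitext)}


def auto_list(articles, match, sort_key):
    """List titles whose metadata matches `match`, ordered by the `sort_key` value.

    `articles` maps title -> wikitext. Values are compared as strings, which
    orders four-digit years correctly but is not a general numeric sort.
    """
    rows = []
    for title, text in articles.items():
        meta = parse_metadata(text)
        if all(meta.get(k) == v for k, v in match.items()) and sort_key in meta:
            rows.append((meta[sort_key], title))
    return [title for _, title in sorted(rows)]


# Placeholder articles, not real biographies.
ARTICLES = {
    "Person A": "[[Metadata:Person.birth_year=1906]] [[Metadata:Person.subfield=logic]]",
    "Person B": "[[Metadata:Person.birth_year=1818]] [[Metadata:Person.subfield=logic]]",
    "Person C": "[[Metadata:Person.birth_year=1850]] [[Metadata:Person.subfield=ethics]]",
}

if __name__ == "__main__":
    print(auto_list(ARTICLES, {"Person.subfield": "logic"}, "Person.birth_year"))
```

Changing `sort_key` or `match` gives the differently sorted lists mentioned above without touching the articles.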

The rendering should say "1 article/subcategory in this category" not articles/subcategories.--Beland 04:59, 13 Jun 2004 (UTC)

XML would certainly be a good option for creating metadata; the only problem I have with it is that we now have Wikitax, and I like Wikitax :) I do think metadata would prove to be a valuable addition to Wikipedia, not only in terms of categorisation but in other fields as well. Some aspects of Wikipedia are very structured: the albums, countries, biographies etc. These aspects all contain more or less the same information. A biography will always have the name of a person, his year of birth and place of birth; if we incorporated this always-present information into metadata, we could create a highly flexible system. For example, the metadata for an album would look something like this.

[[Type:Music album]] [[Title:The White album]] [[Artist:The Beatles]] [[Released:1968]]

With this method of metadata you can get rid of those ugly tables in each album/country entry and instead you can build a template that is filled in by Mediawiki with the given metadata. This allows that at once all album entries can change colour or can even become part of a skin. Of course this would also be very helpful in the perspective of categorisation. With these four tags I mentioned you can already make a list of all albums, albums by the Beatles, all albums in 1968, etc. --Maarten van Vliet 10:47, 13 Jun 2004 (UTC)
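As a rough illustration of the template idea, the four tags from the album example can be read into a dictionary and poured into a single shared layout. Everything apart from the tags themselves is an assumption: the one-line layout, the `render_infobox` name and the use of Python's `string.Template` stand in for whatever Mediawiki would actually do.

```python
import re
from string import Template

# Only the four keys from the album example are recognised in this sketch.
KNOWN_KEYS = ("Type", "Title", "Artist", "Released")
TAG = re.compile(r"\[\[(" + "|".join(KNOWN_KEYS) + r"):([^\]]+)\]\]")

# One shared layout: changing it here would change every album entry at once.
ALBUM_BOX = Template("$Title by $Artist ($Released)")


def render_infobox(wikitext):
    """Fill the album layout from the metadata tags, or return None for other types."""
    fields = dict(TAG.findall(wikitext))
    if fields.get("Type") != "Music album":
        return None
    return ALBUM_BOX.safe_substitute(fields)


if __name__ == "__main__":
    text = "[[Type:Music album]] [[Title:The White album]] [[Artist:The Beatles]] [[Released:1968]]"
    print(render_infobox(text))
```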

Contents of Categories?

What kind of things should be in Category:John Lennon? Right now it contains the actual individual John Lennon, several of his solo albums, and the bands of which he has been a member. Also the Category:Beatles article contains both this as a sub-category and John Lennon is a member of both the John Lennon and Beatles category. Basically, it's a total mess. Do you think that there is content that should exist in this category, or should it be removed? - DropDeadGorgias (talk) 13:32, Jun 8, 2004 (UTC)

The way bands should go, from what I can work out:
Category:The Beatles members contains only people who were members of the Beatles, is a child of Category:The Beatles (and Category:British musicians, etc)
Category:The Beatles albums contains only albums by the Beatles, is a child of Category:The Beatles (and Category:Rock and roll albums, etc)
Category:The Beatles contains the above two groups, and other Beatles-related items such as 251 Menlove Avenue
The Beatles is a member of both Category:The Beatles and Category:British musical groups
(Now here's where I mess up...) - Category:The Beatles is not a member of Category:British musical groups.
The reason for this is that if it was, all the articles under Category:The Beatles such as The White Album would be listed under Category:Musical groups - when it isn't a musical group, it's an album. It should be possible (the feature isn't there yet, but the database should allow for it) to list all articles under Category:Musical groups, or Category:Albums, or Category:People and get only the articles that are groups, albums, or people. Unfortunately it's not that easy to read it, go "oh yes, that makes sense", and do it. I've stuffed it up a couple of times.
In the same way, Category:John Lennon is not a member of Category:The Beatles members (and therefore not a member of Category:British musicians), but the article John Lennon is. Chuq 14:04, 8 Jun 2004 (UTC)
I think you're being too pedantic about the categories. It isn't a tree. Putting category John Lennon under The Beatles members does not put the stuff in that category under The Beatles members. If that were the case, then it would almost never make any sense to put a category into two other categories. I don't think you can carry the relationship beyond one link all the time. --ssd 04:26, 9 Jun 2004 (UTC)
Well, I'm just following the pattern set by ScudLee's example diagram at the top of this page. Everything under "People" should be a person. Everything under "Musical group" should be a musical group. Everything under "Albums" should be an album. The category system is designed to allow this to work, so it would be pointless not to take advantage of it. -- Chuq 06:32, 9 Jun 2004 (UTC)
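The disagreement can be made concrete with a sketch of the "list all articles under a category" feature, which does not exist yet. The dictionaries below are a hand-written toy version of the Beatles layout described above, and `articles_under` is a hypothetical helper. It shows that a recursive listing stays clean only if Category:The Beatles is kept out of Category:British musical groups.

```python
def articles_under(category, subcats, members, _seen=None):
    """Collect every article in `category` and, recursively, in its subcategories."""
    seen = set() if _seen is None else _seen
    if category in seen:  # guard against cycles in the category graph
        return set()
    seen.add(category)
    found = set(members.get(category, ()))
    for child in subcats.get(category, ()):
        found |= articles_under(child, subcats, members, seen)
    return found


# Articles directly tagged with each category (toy data).
MEMBERS = {
    "Category:British musical groups": {"The Beatles"},
    "Category:The Beatles": {"The Beatles", "251 Menlove Avenue"},
    "Category:The Beatles members": {"John Lennon"},
    "Category:The Beatles albums": {"The White Album"},
}

# Layout proposed above: Category:The Beatles is NOT under British musical groups.
STRICT = {
    "Category:The Beatles": ["Category:The Beatles members", "Category:The Beatles albums"],
}

# Same layout with Category:The Beatles added under British musical groups.
LOOSE = dict(STRICT, **{"Category:British musical groups": ["Category:The Beatles"]})

if __name__ == "__main__":
    print(sorted(articles_under("Category:British musical groups", STRICT, MEMBERS)))
    print(sorted(articles_under("Category:British musical groups", LOOSE, MEMBERS)))
```

Under the loose layout the listing of musical groups picks up an album, a person and a house, which is the outcome Chuq wants to avoid and which ssd argues should simply not be inferred beyond one link.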

Category:The Beatles members? Do we really need the article? That's just ungainly. john k 04:38, 9 Jun 2004 (UTC)

Well we could just as easily put each member in Category:The Beatles and Category:British musicians individually. You could change it, if you like, but it wouldn't really achieve anything. John Lennon and Paul McCartney have something more in common with each other than John Lennon and Damon Albarn. -- Chuq 06:32, 9 Jun 2004 (UTC)

Actually, I wasn't referring to the article/category Category:The Beatles members, but to the grammatical article "The". Why not Category:Beatles members? At any rate, as I've repeatedly mentioned before, since all four of the Beatles produced music on their own as not part of the Beatles, they should all be in Category:British musicians separately anyway. john k 08:00, 9 Jun 2004 (UTC)

(John, "The" is part of this particular band's official name -- see Music standards, item 12) Cheers! Catherine | talk

Piped category tags

I had an idea and attempted to configure the category:1975 albums tag at Another Green World so that it would display on the category page as Brian Eno's Another Green World. The text of the link on the category page has not changed (still says Another Green World) but it is now alphabetized under "Brian" instead of "Another". Is this something I did wrong, a bug or a feature? I thought you could use piped links on the article page to change the link text on the category page... Tuf-Kat 20:16, Jun 8, 2004 (UTC)

I think that piping categories is used to change how the page is sorted on the category page, not how the text appears. olderwiser 22:18, 8 Jun 2004 (UTC)
Drat! Foiled again! Tuf-Kat 23:02, Jun 8, 2004 (UTC)
You're not the only one that thinks it's a good idea. -- Cyrius| 23:17, 8 Jun 2004 (UTC)
I think the piped link should be used to change the way it appears (as it would for normal piped links) and as a secondary effect, also be the sort key when listed in their respective categories. RedWolf 03:11, Jun 11, 2004 (UTC)
I think pipe links in categories are fine as is -- changing sort order, not changing headings. However, it would be very interesting if categories had an alternate view that showed the piped name ("sort keys") instead. In some categories, both would be useful. --ssd 16:41, 12 Jun 2004 (UTC)
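For anyone trying to picture the behaviour described in this thread, here is a minimal Python sketch, not the actual MediaWiki code. It assumes a simplified model in which each category member has a page title and an optional sort key taken from the piped text; the function name `category_listing` and the two extra album titles are illustrative only.

```python
# Hypothetical model of a category listing (not MediaWiki's implementation):
# each member has a page title and an optional sort key from the piped tag.
def category_listing(members):
    """Return (heading letter, displayed title) pairs ordered by sort key.

    `members` maps page title -> sort key, or None to sort by the title itself.
    """
    rows = []
    for title, sort_key in members.items():
        key = sort_key if sort_key is not None else title
        rows.append((key.upper(), title))
    rows.sort()
    # The heading letter comes from the sort key; the link text stays the title.
    return [(key[0], title) for key, title in rows]


albums_1975 = {
    "Another Green World": "Brian Eno's Another Green World",  # piped tag
    "Blood on the Tracks": None,  # illustrative member, no sort key
    "Born to Run": None,          # illustrative member, no sort key
}
listing = category_listing(albums_1975)
for letter, title in listing:
    print(letter, title)
```

In this model the piped text only decides where the entry is filed, which matches what Tuf-Kat observed: the link still reads "Another Green World" but appears under "B".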

Questions about categories

My main question is about changing a category. What I've noticed is that if you edit an article and rename the category ie if the category was incorrect or too general, it will create a link in the new category page, but the link remains also in the previous category page, even though that link does not show on the subject article page. For example Jack Nicholson was originally categorised as Category:Actors. I'd read on the categorisation talk page that Paul McCartney should be British musician, but not musician, because British musician would be a subcategory of musician, so I applied the same logic and changed Jack to Category:U.S. actors and actresses where he now appears. But in the Category:Actors page he still appears even though there should be nothing to link him there, and in the Jack Nicholson article page, the only category now visible is Category:U.S. actors and actresses. Does anyone know why that would be? Am I doing something wrong or is there a problem with the database or what?

No, it's nothing you've done wrong - unless someone else has changed it in the meantime, it's just a caching issue - your machine still has the old versions of the page. I've found even forcing a full refresh doesn't help, you just need to wait a while. BTW, I noticed there are a lot of names under Category:Actors that will eventually need to be recategorised into more specific groups (as you have already done) Chuq 13:15, 9 Jun 2004 (UTC)

thanks for the info, and I will wait and see. Although some of the ones that I changed were about 4 days ago and the change still doesn't show. Not a problem though. Yes I agree there's a lot needing to be recategorised, and a lot that haven't been categorised at all. I'm sure it will all happen soon. Rossrs 13:58, 9 Jun 2004 (UTC)

Also another question which is less important but I'm trying to get my head around categories and subcategories. So .. Category:Vocalists and Category:Pop singers. To me, all pop singers are by definition vocalists, so along that line of thinking every person categorised as a pop singer should also be categorised as a vocalist. But is that the intention? Should vocalist just be for a band's vocalist? ie Robert Plant vocalist, but not pop singer. Along the same line I would categorise Belinda Carlisle vocalist (Go Gos) and pop singers, (solo). Britney Spears pop singer, but not vocalist? Would be interested to hear how anyone would interpret this. Thanks

I would think Vocalists==Singers, therefore Pop singers is a sub-category of Vocalists. Then you have the Britney thing. It doesn't matter how NPOV you want to get, or what you think of her, she can't sing live and shouldn't be classified as a vocalist. Category:Musical performers or Category:Entertainers? I think Category:Record company whores would be classified as NPOV :P It IS a tough one though.. Chuq 13:15, 9 Jun 2004 (UTC)

That's exactly the point I think - categorising is sure to create examples that are POV. As for Britney, I'd call her a vocalist only insofar as the sounds she makes, do seem to be coming from her mouth, which is not to say that I don't like her or think she's without value, just without talent. Will be looking forward to reading your Category:Record company whores when you have it up and running. I've actually been considering one of my own, which came about when I started wondering where to place Paris Hilton. So stay tuned for Category:People:slutty bimbos  :-D Rossrs 13:58, 9 Jun 2004 (UTC)

And now I've just discovered that the category pages can't be linked from here. Which is why I've italicised them instead. As if I wasn't confused enough! :-) Rossrs 10:20, 9 Jun 2004 (UTC)

Just type [[:Category:Category name]] (note the extra colon before 'Category'). Chuq 13:15, 9 Jun 2004 (UTC)

Would Hilton be Category:Socialites? Category:Heiresses? john k 23:50, 9 Jun 2004 (UTC)

I've occasionally mused that Category:Public accommodations could be useful for Paris Hilton and Paris Hilton (hotel). - Nunh-huh 05:21, 10 Jun 2004 (UTC)
Category:Accidental amateur porn movie actresses? Category:People who can fellate while they claim to be asleep? :) (Apologies if this has offended anyone) -- Chuq 01:37, 10 Jun 2004 (UTC)

Categories that are "list of" versus categories that are "articles about"

There's been a lot of discussion about what it means for an article to be in a category or in the subcategory of a category, such as whether articles in subcategories "inherit" the parent categories as well. I've been running into this problem myself as I try to figure out what should go where. Perhaps to try bringing a little order to the chaos we could work out some sort of standardized way of indicating that a category's children should or should not inherit it? A set of templates, perhaps, giving guidelines to editors about how the category is to be treated that could be inserted on category pages. Bryan 03:14, 10 Jun 2004 (UTC)

That would be wonderful. Right now, we have no clear direction, and different people are implementing competing (and incompatible) systems. Quadell (talk) 19:58, Jun 10, 2004 (UTC)
Okay, in the spirit of boldness then, I give you Template:subcategories inherit and Template:subcategories don't inherit. The names are hopefully self-explanatory, though perhaps a little too wordy; this is just for starters. Bryan 00:59, 11 Jun 2004 (UTC)

Here's the text that's currently in them:

Subcategories inherit: The subcategories of this category contain articles which are also valid members of this category but which have been divided up into more specific groupings.

Subcategories don't inherit: This category's subcategories are related to this topic, but the articles they contain are not necessarily valid members of this category directly.

This problem would be solved™ by describing the relation in the categorization system: see the Describing the relations discussion above. This is very definitely not a binary inherit/not inherit problem: there are many different ways we might want to relate/are already relating things in a formal way, so that the underlying system can make semantic judgments programmatically (including that Category:Football (soccer) players are people ("inherit") but not sports ("not inherit")). -- Avaragado 09:33, 11 Jun 2004 (UTC)
Yes, but unfortunately I'm not a developer so I'm limited to the tools which are currently available. These templates wouldn't work in all situations, but they will work in some situations so I think they can still be a useful stopgap until support is added for giving subcategories "meaning." I've experimentally placed one of these templates on Category:North American rivers, for example, where it seems to be an accurate description of the situation. Bryan 23:27, 11 Jun 2004 (UTC)

Simplest categories

My understanding is that the purpose of the category system is to allow automatic cross-checking of lists. Wouldn't the best way to do this be to make each category give one fact? Category:American musicians would be automatically generated by cross-checking Category:Musicians and Category:American people. Tuf-Kat 05:14, Jun 10, 2004 (UTC)

Or one could do it by having Category:Musicians and Category:American people both inherit the contents of Category:American musicians. I suspect it'll be hard to reach a consensus on which of the two approaches is the One True Best Way; I can see the appeal of both of them. Bryan 05:18, 10 Jun 2004 (UTC)
It would be neat to search by category like that. But first, all people must be put in their nationality category, and I notice a lot of people editing category stuff don't bother to look for nationality, etc... Has anyone done a bot to partially automate any of this? --ssd 02:39, 11 Jun 2004 (UTC)
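The two approaches in this thread can be sketched with plain Python sets. This is only an illustration under assumed data: the membership sets and the helper name `intersect` are made up, and "Miles Davis" is just an example entry, not something taken from the live category pages.

```python
# Tuf-Kat's approach: each category states one fact, and combined
# categories are computed by cross-checking (set intersection).
musicians = {"John Lennon", "Miles Davis", "Damon Albarn"}      # assumed data
american_people = {"Miles Davis", "Jack Nicholson"}             # assumed data


def intersect(*categories):
    """Return the articles that appear in every one of the given categories."""
    result = set(categories[0])
    for members in categories[1:]:
        result &= members
    return result


american_musicians = intersect(musicians, american_people)

# Bryan's alternative: tag articles with the specific category only and
# let the broader categories inherit (take the union of) its contents.
tagged_american_musicians = {"Miles Davis"}                      # assumed data
tagged_british_musicians = {"John Lennon", "Damon Albarn"}       # assumed data
musicians_by_inheritance = tagged_american_musicians | tagged_british_musicians

print(american_musicians)
print(musicians_by_inheritance == musicians)
```

Either way the same lists can be produced; the difference is whether editors tag the single facts and the software combines them, or editors tag the combination and the software spreads it upward.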

More piped link fun

So, having discovered, much to my dismay, that piped links can not be used to change an article's display on the category page (merely its alphabetic classification for ordering purposes), I decided to do so to organize Category:Albums by artist so that Category:A Tribe Called Quest albums would be located after Category:Toto albums and before Category:Triumph albums. Unfortunately, piped links apparently do not actually cause the computer to treat the link as though it were the piped text (in this case "Tribe Called Quest albums") but actually organize it under a separate letter. This leads to the unique circumstance of there being two separate sections for the letter "T", one with all the bands whose category page begins with that letter, and one for those bands whose category page is piped so that it begins with that letter. Is this also a feature I was unaware of, or is it a bug? Tuf-Kat 22:35, Jun 11, 2004 (UTC)

Yes I'm guessing it is a bug in the sub category listing code only, using the pipe on an article worked for me on Category:Cricket. Note the World cup listings. Steven jones 01:19, 12 Jun 2004 (UTC)

What a mess!

This talk page's project page reads more like a talk page! There appears to be no simple user guide, answering the question "How do I start (or suggest) a category?". Andy Mabbett 22:41, 11 Jun 2004 (UTC)

That's because there's not really any agreement on how to use, start or suggest categories... Things should settle down soon, I think (hope...). (BTW, I'm archiving this talk page because it's huge -- I don't mean to step on anyone's toes by removing discussions) Tuf-Kat 22:48, Jun 11, 2004 (UTC)
There is a user guide, which was at Wikipedia:Category when you asked this, but which is now at m:MediaWiki User's Guide: Using Categories. Both of those are linked in the Useful links section near the top of Wikipedia:Categorization. Wikipedia:Categorization is oriented more toward guidelines for which categories to use. Lucky Wizard 22:50, 21 Jun 2004 (UTC)

Putting articles "into the smallest category only" is wrong

I think that trying to use the category functionality for precise classification, searches like "Poets AND German" and whatnot is futile. It was neither designed for that nor is it suitable for it. It didn't work with the old subpage system and it won't work with something as simple as "ParentCategory=X". To do that in a useful way, we would at least need name-value pairs, if not something even more sophisticated. I don't think that our hardware and software will be up to that any time soon.

OTOH, categories are perfectly useful for bottom-up construction of TOCs: just add [[category:European Countries]], and Luxembourg appears in the list of European countries. Except, the way we do it now, it doesn't, because we put it into "EU Countries", which is a subcategory of "European Countries". So, if I want to find something, I more or less have to know where it is.

The way it really should work, is that an article should be a member of all categories where we want it to show up in the list. So, London should be a member of "Cities of England" and "Cities of the UK" and "Cities of Europe" and "Cities of the World", while Leibnitz should be just a member of "Cities of Austria". Accordingly, Johann Wolfgang Goethe should be a member of "German Poets", "European Poets", "Poets" and "German Scientists".

This leads to every article being put into several categories: the more important the article, the more categories it will belong to.

If and when a full classification and search system is implemented, more categories per article will provide more data to be pumped. Zocky 13:46, 12 Jun 2004 (UTC)

I'm not sure how I feel about that. I agree that splitting Europe into EU and otherwise is probably silly, and there are few enough countries they could all go into Europe. However, Category:Writers would be too huge and Category:People would be worse. I think those should be split as is, and leave the top category for people who we haven't figured out how to categorize.
I think that whether something should go into a larger category should be based on its overall relevance to the category. For example in Category:Writers you would want only the most influential and historically popular writers inside the main category AND the appropriate subcategory, because someone using the category page to browse may very likely be interested in say, Shakespeare. Perhaps a better example is that someone browsing Category:United States history would expect to see the articles of the greatest relevance to US History. So while 1968 Democratic National Convention certainly belongs in the subcategory of Category:United States history - Category:U.S. presidential nominating conventions - Category:Democratic National Conventions (assuming we want to pursue such a hierarchy...), the event itself is of enough import to go into the main category as well. At least that's how I balked the system. Obviously, different categories should have different requirements for head category listing. --Shanoyu 09:57, 8 Jul 2004 (UTC)
But some categories are definitely too deep, even within the People hierarchy. More of a concern to me is that when editors put people in the categories, they are not being thorough. They don't put them into enough categories, and they are not putting the sort tag in, and worse, they might add two categories with the sort tag, but leave the adjacent three or four without a sort tag! Anyway, this will all be sorted out eventually. --ssd 16:31, 12 Jun 2004 (UTC)

What I would like to see is something automatic, where if you go to the page on Category A, you see a list of all articles, not only in A, but in all its subcategories, subcategories of subcategories, etc. Is anyone working on this? -- BRG 15:47, Jun 18, 2004 (UTC)

RfE 964667 is probably the closest task in sourceforge and is currently unassigned. --Zigger 17:20, 2004 Jun 18 (UTC)
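As a rough illustration of what BRG is asking for, here is a small Python sketch of a recursive listing. It is not taken from the MediaWiki code or from the RfE; the dictionaries `subcats` and `articles`, the function name `all_articles`, and the "Switzerland" entry are assumptions made for the example. Because categories form a graph rather than a tree, the sketch keeps track of visited categories so that a loop cannot recurse forever.

```python
def all_articles(category, subcats, articles, _seen=None):
    """Collect articles in `category` and, recursively, in all its subcategories.

    `subcats` maps a category to its direct subcategories and `articles`
    maps a category to its direct member articles (both hypothetical inputs).
    """
    seen = set() if _seen is None else _seen
    if category in seen:
        return set()  # already visited: guards against category loops
    seen.add(category)
    found = set(articles.get(category, ()))
    for child in subcats.get(category, ()):
        found |= all_articles(child, subcats, articles, seen)
    return found


subcats = {"European Countries": ["EU Countries"]}
articles = {
    "European Countries": ["Switzerland"],  # assumed direct member
    "EU Countries": ["Luxembourg"],
}
european = all_articles("European Countries", subcats, articles)
print(sorted(european))
```

With a listing like this, Luxembourg would show up under "European Countries" even though it is only tagged with the smaller category, which is the case Zocky raised above.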


The other problem with only using smallest categories is you end up with examples like the novel Melmoth the Wanderer which is currently only in the category '1820 books'. Now (a) that's completely useless if you're trying to find the novel and can't quite remember the title, because you're very unlikely to know the date of publication and (b) why would someone reading about Melmoth want to know about all other books published in 1820 rather than, say, other novels of the 1820s? or other gothic novels? Or other novels by Irishmen? Articles should be in *all* the categories which people reading the article are likely to be interested in. Some common sense in inventing categories would also be no bad thing. Harry R 11:14, 8 Jul 2004 (UTC)
And, furthermore, it's ridiculous that if you want to get from Bram Stoker to Frankenstein via the category system, and you can't remember who wrote Frankenstein, you have to go up and down layers of subcategories (Horror writers -> Horror -> Horror Novels) because one is a novel and the other is a novelist. Either 'Horror novels' should be a subcategory of 'Horror writers', which is illogical but convenient, or even better perhaps, there should just be one category 'Horror fiction' which has novels and writers in the same category. Harry R 11:23, 8 Jul 2004 (UTC)
I'd also add that I've noticed a sudden rash of classification of TV series into categories solely by which network broadcast them (originally). This is both *very* US-centric, and useless for those who don't know that WB, CBS, whoever were involved. It may be useful for some people, but not to the vast majority. I would go around removing unusable categories like this but I have seen people then break them down even further so stand clear instead. --VampWillow 11:47, 8 Jul 2004 (UTC)

Functional versus taxonomic categories

Moved this here from the main page (but left a summary behind). --Beland 09:40, 13 Jun 2004 (UTC)

I recently built "Category:Commercial item transport and distribution" (CITD), which culls articles related to all aspects of this particular field. Another user was concerned that it might be too scattershot and non-specific, and proposed breaking it up. I think, however, that this new category points up an emerging difference in how categories may be used. Although I didn’t explicitly set out to do so, I created in CITD an example of what could be called a "functional" category, as opposed to many of the examples used on this page, which set forth a fairly straightforward taxonomic approach to pulling articles together.

In categorizing these categories as taxonomies, I mean that it takes very little external context to pull together the articles in the category, just common, low-context knowledge like the alphabet and geography. You could probably send in a bot to check keywords and pull together categories like "Companies that begin with the letter H" or "Companies in Germany."

By contrast, the CITD category itself contains substantial pieces of contextual information within it, namely, that there is such a thing as the commercial transportation and distribution industry and which particular things pertain to it. You wouldn’t pull together disparate articles such as on the companies FedEx and Hapag-Lloyd, the items pier and containerlift, and the concepts materiel and logistics, unless you already knew that they all happen to be pertinent to this particular industry. In other words, the very category itself informs the user somewhat. It would be much harder to use a bot to build a category like this, or at least its search algorithm would have to be more complex, containing a dose of contextual knowledge about what it was looking for.

A functional category like CITD is well-suited to the user who is drilling down from the main page in generalized exploration. It answers the challenge of supplying information to the user who may not even know what they are looking for. By contrast, a user more knowledgeable on a topic, say, ornithology, might be more likely to go straight to a taxonomic category like "All bird names that begin with the letter G."

I see a great use for functional categories for big human nexus events, like "Category:World War II." That category (though big enough that it's through subcategories) can eclectically collect everything from the brownshirts and the Holocaust to the Norden bombsight to Glenn Miller and Rosie the Riveter.

There is a limit to everything, of course, so I can see a functional category could be taken too far, or even used as a form of disguised original research: If you hold the hypothesis that parrots are controlling the minds of chiropractors, you might build a category that includes everything about parrots, mind control, and chiropractors. That would be an abuse of the encyclopedic genre. By contrast, I have been defending CITD on the ground that it is indeed a real and natural grouping. While it is not as clear-cut as "Companies that begin with the letter H", it is sufficiently cohesive, unitary, and "real in the world" that grouping its elements together is not an abuse of the encyclopedic genre. I’m not defending CITD here, and whether CITD itself happens to be a good functional category isn’t really the point here, but rather that functional categories do have a place beside taxonomic categories in Wikipedia.

If there were only one categorization possible for each article and thus we were working on the One Table of Contents, the dispute between functional and taxonomic categories might be more pitched, but fortunately we are not so constrained. The two types of categories may exist side by side, and users will benefit from both; users who are more focused on specific information retrieval might find Hapag-Lloyd under "Companies that begin with the letter H" or "German companies," while those who are just delving into the notion of commercial transportation might come across it while exploring the CITD category. Fortunately, then, it isn’t really a matter of one type of category "versus" the other, because there is room and use for both of them. --Gary D 23:05, 9 Jun 2004 (UTC)

Some time ago, I pulled together a few things into Category:Radio frequency propagation which includes most terms needed to describe this phenomenon. I hesitated to pull in everything and left out quite a few relevant but generic terms such as bandwidth, but perhaps I should have included them too, although they are or will be included under Category:Amateur radio. I had a very short list of these terms collected before categories were implemented with the intent of writing an article on the subject. Now the list is much larger, and it would be difficult to write an intelligible article that included all of them (unless it was quite long), but a subset might work. --ssd
If those other items would be useful to someone wanting to understand RF propagation who was browsing in the category tree, I would include them. I don't think their inclusion in other categories affects their usefulness in this category. --Gary D 19:18, 13 Jun 2004 (UTC)

Hmm...I think there's something to be said for the idea that categories are not the most logical way to deal with what you call "functional categories." List pages, the actual article on commercial item transport and distribution, and so forth, seem to me to be a better way of dealing with these kind of things. Otherwise, categories will quickly become completely out of control. I support restricting categories to what you call "taxonomic." john k 16:06, 13 Jun 2004 (UTC)

My justification for "functional categories" is that they tie together a concept or field of endeavor immediately for someone who is browsing in the category tree, and they locate it within the hierarchical context of all categorized subjects. We could do a similar tie-together with a massive list on an article page, but the user won't get there unless he chooses that article. If the grouping isn't within the tree, the user may miss it if he doesn't apprehend the significance of the one collecting entry, like "commercial item transport and distribution." But this other way, he can't miss it. Still, I may be missing important downsides to this; what is your concern with putting functional categories in the category tree? --Gary D 19:18, 13 Jun 2004 (UTC)
One thing that categories do that lists or overarching articles do not do is provide an easy way for someone to find their way from a small article (like an article on the United Parcel Service, for instance) to a bunch of related articles (say in the CITD field). With lists and articles, you can only go from main concept to supporting concepts, not the other way around. Unless you do lots of linkbacks and cross-references, which is exactly what categories are, except the maintenance is easier. -- Beland 04:01, 14 Jun 2004 (UTC)
Might be possible to utilize the best of both by including links and explanatory text to guide readers to selected high-level lists and articles in the introduction for appropriate categories. However, whether or not maintenance is actually easier for categories over lists is debatable, IMO. Renaming a list is easy, while renaming a category would involve replacing the category everywhere it occurs. Adding an article to a category is somewhat easier, provided one remembers to do it while creating or editing the article--otherwise you have to go into the article for the sole purpose of adding a category, which is not much easier than adding the item to a list (though if adding multiple categories at one time, then cats have the edge). For backlinking, yes, categories do have the advantage. IMO, the main disadvantage to categories is that they convey less contextual information than a well-structured list or overview article. They are nice for browsing, if you are simply curious or if you already have a basic familiarity with the terms used, but if you don't already know how the elements in a category are related to one another, it is not very helpful. olderwiser 17:42, 14 Jun 2004 (UTC)
All extremely good points! I rarely edit an article just to add one category--I try to add several at once. --ssd 03:16, 17 Jun 2004 (UTC)

Expand usefulness of categories and lists

I'm not sure I have an opinion on the issue directly at hand, but I have a suggestion.

List pages of various kinds and the growing number of categories all attempt to organize information in an easy-to-use-and-edit fashion that provides relevant links with appropriate context to articles. I think there's pretty widespread agreement that whether categories or lists, or both, are used, the goal should be as above. Both list pages and category pages could be better at achieving this goal. Let's brainstorm what we need and want in such a system, then pester the developers until it happens.

What kind of information should be categorizable? The fact that a person is a musician, a Canadian, a trumpeter and a jazz musician are clearly relevant. Should our system be able to accommodate less relevant tidbits (that he's left-handed, male, more than 6.5 feet tall and a 1986 Grammy Award winner for Best Jazz Album)? Should it be possible to combine categories using the MediaWiki software in a way that could create a list of Jamaican-British MPs? What about Jamaican-British MPs who voted against joining the EU? If albums are classified by both year and genre, could we automatically see what the earliest hip hop album with an article is? Could we tag more information, and see what the earliest hip hop album by a white man to go gold in Australia is? What about list-ordering? Do we have to classify something like Popes in either alphabetical or chronological order, or could we tag info to generate a list in either way, or by nationality or some other criterion? Could we take tagging a bit farther and automatically generate something like Timeline of trends in music (1970-1979) listing only events relevant to French music, or psychedelic rock? Should it be possible to contain multiple methods of categorizing things? (i.e. a purple background when using the Fladdershnit Method of classifying amphibians, and a yellow when using the Yamm Method?) Do we want to be able to include captions or other explanatory text on category pages? Could we include a caption on the article page and place it on some or all category pages (i.e. at John Lennon, format category link thusly: [[Category:British musicians|Lennon, John: (1940-1980) British lead singer and frontman for popular rock band The Beatles]])?

Anyway, these are just some questions to get started... Seems to me that a very simple system was put into place, and nobody's really satisfied with it, but since the software is constantly evolving, we have the ability to make the category system even more useful than anybody reading this discussion now is likely imagining. Tuf-Kat 19:32, Jun 14, 2004 (UTC)

I think that if a category includes only one or two items, it is probably unnecessary. (However, if there is only one article now, but we anticipate more, that's a different story: I created a category of "Shorthand systems" which has only 2 articles, but more ought to be written on other systems.) - BRG 14:52, Jun 21, 2004 (UTC)



Interesting questions.

So it appears that categories are currently being used for both semantic/taxonomic classification (A is a type of B) and navigational/functional linkages (article C is related to topic D; subcategory E is a subset of topic D, etc.).

This means that direct and indirect membership may not carry a clean semantic meaning. For instance, the articles that are members of Category:Units_of_measure or its subcategories are mostly actual units of measure (foot, volt, kilogram, etc.). But other articles include a list (Scientific units named after people), general articles (Historical weights and measures), and related articles (Dimensional analysis). This messes up queries like "show me all units of measure that begin with the letter H". It may also cause the following query to include unwanted results: "show me all article-descendants of Category:Units_of_measure that share at least one word in their titles with the title of an article-descendant of Category:Scientists".

Even worse, navigational linkages make the recursive lists of subcategories potentially uselessly large. For example, consider the following chains:

Systems of Government -> Monarchy -> Royalty -> Royals

People -> Royals -> Royalty of England -> Queen Elizabeth II

This unfortunately may lead an automated search to conclude that (among other things) the concept "Royals" and the person "Queen Elizabeth II" are examples of governmental systems.

To solve this problem, linkages could be assigned types. For example (using an XML syntax):

<link type="is-an-instance-of"
      source="Queen Elizabeth II"
      target="Category:Royals of England" />
<link type="is-a-subset-of"
      source="Category:Royals of England"
      target="Category:Royals" />
<link type="is-a-type-of"
      source="Monarchy"
      target="Systems of government" />
<link type="is-on-topic"
      source="Category:Royalty"
      target="Category:Monarchy" />

That would seem to solve all the problems so far, at the cost of making the category system more complicated.
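To make the effect of typed links concrete, here is a hedged Python sketch, not a proposal for the actual software. It takes the four links from the example above and adds two hypothetical "is-on-topic" links (marked in the comments) so that the navigational chain is complete. The function `reachable` and the choice of which link types count as "semantic" are assumptions for illustration.

```python
# Links as (source, type, target). The first four come from the example above.
LINKS = [
    ("Queen Elizabeth II", "is-an-instance-of", "Category:Royals of England"),
    ("Category:Royals of England", "is-a-subset-of", "Category:Royals"),
    ("Monarchy", "is-a-type-of", "Systems of government"),
    ("Category:Royalty", "is-on-topic", "Category:Monarchy"),
    # Hypothetical links, added only to complete the navigational chain:
    ("Category:Royals", "is-on-topic", "Category:Royalty"),
    ("Category:Monarchy", "is-on-topic", "Monarchy"),
]

# Assumption: only these link types are safe to use for "is a kind of" inference.
SEMANTIC = {"is-an-instance-of", "is-a-subset-of", "is-a-type-of"}


def reachable(start, links, allowed_types):
    """Return every target reachable from `start` via links of the allowed types."""
    found, frontier = set(), [start]
    while frontier:
        node = frontier.pop()
        for source, link_type, target in links:
            if source == node and link_type in allowed_types and target not in found:
                found.add(target)
                frontier.append(target)
    return found


typed = reachable("Queen Elizabeth II", LINKS, SEMANTIC)
untyped = reachable("Queen Elizabeth II", LINKS, SEMANTIC | {"is-on-topic"})
print(sorted(typed))
print(sorted(untyped))
```

When every link is followed regardless of type, the search ends up at "Systems of government", which is the wrong conclusion described above; restricting the walk to the semantic link types stops at the royals categories.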

But another problem is that not all the information we might like to capture is encoded in the category system. There is information inside articles and inside lists and tables which we would also like to be machine-readable. For instance, to answer the query, "show me all Jamaican-British MPs who voted against joining the EU", we might proceed through the following steps:

1.) Extract a list of British MPs from direct membership in Category:Current_British_Members_of_Parliament. But there might be no such category. Instead, there might be a list of members of the House of Lords and a table of members of the House of Commons that shows name, party affiliation, geographical constituency, and term of office. In order to get the information we want, the entire House of Commons table needs to be machine-readable (so names, party, etc., would have to be embedded in an XML-formatted list or the functional equivalent). This is assuming that the querying mechanism knows that by "British Members of Parliament", we don't mean "people who were born in Britain who are now members of some Parliament" but "people who are members of the British Parliament" and that the British Parliament has two subparts, of which, the sub-members of type "person" are of interest. Given the difficulty of this problem, the interface would likely rely on the human inquisitor to make most of these inferences.

2.) Identify the ethnicity of each British MP. It's rather unwieldy to have a list of all people on the planet who have Jamaican ancestry (or who are left-handed, or who are between 6'4" and 6'4.99" tall). And it's unlikely that there's a table which assigns each MP an ethnicity. Such a feature might be mentioned in an individual biography, in which case all biographies would have to XML-encode or whatever the ethnicity (and lots of other properties) of their subjects in a standardized way. Assuming that ethnicity is important enough to record, which it might not be in all cases. For that matter, all British MPs might not have biographies in the database.

3.) Extract a list of people who voted against joining the EU, or a list of British MPs who voted against joining the EU. (Finding a table of the latter would allow us to skip step 1.) Same problems of making lists and tables machine-readable, and you'd probably have to rely on a human to find this table for you.

4.) Unify the lists we've made so far, hope that the names of all MPs are in the exact-same form in all places they are listed (e.g. Tony Blair, vs. Mr. Anthony Charles Lynton "Tony" Blair) or come up with some clever way to cope with them not being so. Not to mention hoping that we haven't somewhere along the line confused John Smith the former MP with John A. Smith the current MP, no relation.

It would take a lot of work to start making article text machine-readable in any way, and before doing that you'd want to establish various markup schemes.

Making lists machine-readable might be a little easier with a few clever tricks. For example, only looking at article/category links, not the explanatory text that may also be in a list item - that would give you a list of canonicalized nodes (that you could unify with lists of category members) rather than a list of flat strings. You'd still have to rely on a human to say, "Unify list A with Category:B". But that could still be quite powerful.
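The link-only trick above could be sketched roughly as follows. This is only an illustration in Python, not anything MediaWiki provides: the function names, the regular expression, and the sample list text (including the page title "John Smith (MP)") are all made up for the example, and it assumes list items are bulleted with "*" and that the first link in each item is the entry itself.

```python
import re

# Matches [[Target]], [[Target|label]] and [[Target#section|label]];
# only the target page name is captured.
LINK_RE = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def link_targets(list_wikitext):
    """Return the first link target of each bulleted item, ignoring explanatory text."""
    targets = []
    for line in list_wikitext.splitlines():
        if not line.startswith("*"):
            continue  # prose between list items is skipped
        match = LINK_RE.search(line)
        if match:
            targets.append(match.group(1).strip())
    return targets


def unify(list_wikitext, category_members):
    """Split list entries into those already in the category and those missing from it."""
    listed = set(link_targets(list_wikitext))
    members = set(category_members)
    return sorted(listed & members), sorted(listed - members)


# Hypothetical sample data, for illustration only.
sample = """* [[John Smith (MP)|John Smith]], former member
* [[Tony Blair]] (Prime Minister)
Some unrelated prose with [[Another link]]."""

in_both, only_in_list = unify(sample, ["Tony Blair", "Gordon Brown"])
```

Because the comparison is on link targets rather than display text, "John Smith" and "Mr. John Smith" would still match as long as both point at the same article. A human still has to say which list goes with which category.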

Regularized lists could answer some of the sample queries mentioned, like "What is the earliest hip hop album with an article?" though a sophisticated SQL-like query would have to be constructed.

I wonder how much brute-force data entry Wikipedians would be willing to do, and whether or not existing open-content databases could be used as content sources. I'm thinking of FreeDB for music, for example. There are some non-open but public sources for things like movies (e.g. the IMDb) and books (e.g. Amazon). But perhaps this is beyond the scope of the Wikipedia and deserves its own project, WikiDB, or something.

--Beland 04:34, 15 Jun 2004 (UTC)

It's not much of a contribution to the topic, but I might as well mention the idea since it's in mind; you could perhaps keep things relatively simple for the editor by introducing a couple of different category-like "namespaces" such as "Topic:", "Group:", etc. to represent the different relation types in an easily-remembered and easily-written way. Bryan 09:40, 15 Jun 2004 (UTC)

My bias:peepul

I must confess my bias throughout the category discussions has always been to think of the categories as a tool for a human user browsing through the category list, as a sort of de facto table of contents. My boosterism of functional categories has been in support of that skimming, browsing user. Reading the above description, although intriguing, I must confess I have never considered the category tree as supporting that sort of precision data-mining search. Wikipedia strikes me as more of an imprecise, people-to-people exercise in information transfer, like any traditional encyclopedia in that sense. Anarchic editing army that we are, I wouldn't think our willy-nilly articles are sufficiently structured to support data mining once you drilled in and located them, anyway. Am I missing something? Is there more to search power than is meeting my eye? --Gary D 06:53, 15 Jun 2004 (UTC)

That was my feeling - articles should be in the category which readers will be the most likely to want to browse in. I've been putting things in the Medieval Literature category (before reading this discussion) and have included poets (Chaucer), theologians (Aquinas), poems (Roman de la Rose), genres (fabliau) and technical aspects (alliterative verse) in the same category. My argument would be that, although Caedmon and Wordsworth are both English poets, someone reading about Caedmon (an Anglo-Saxon poet) is much more likely to be interested in next reading about Bede (an Anglo-Saxon historian) or Beowulf (an Anglo-Saxon poem) than about Wordsworth. Conversely, someone reading about Wordsworth is as likely to be interested in reading about Goethe or Schiller or the French Revolution as about other English poets, so a broad category 'Romanticism' is at least as relevant as the category 'English Poets' and possibly rather more relevant.
Perhaps the solution is to have two different types of marker (someone has probably suggested this somewhere). 'category' could be for things like 'Romanticism' or 'Globalization' or 'Star Wars'; a different marker could be used to tag articles with facts like 'English person' or 'poet' or 'Jedi Knight' which can be formed into hierarchies. Harry R 07:57, 4 Jul 2004 (UTC)
Continuing my earlier thought, the factual markers would only be useful if Wikipedia had software allowing one to specify a cluster of markers and ask for articles that had all of them - i.e. 'english'+'poet'+'people born in the C18th' would give you a list that would include Blake, Coleridge and Wordsworth. But actually, if that's how you're using the system, it doesn't matter if there's a mix of taxonomic and functional categories - sometimes it would be more productive to search 'Romanticism'+'poet' - which would produce Goethe as a German Romantic poet as well as Wordsworth - or even 'Romanticism'+'poet'-'English'. Harry R 08:46, 4 Jul 2004 (UTC)
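The marker arithmetic described above ('english'+'poet', 'Romanticism'+'poet'-'English') amounts to set operations. Here is a minimal sketch in Python, assuming each article simply carries a flat set of markers; the `query` function and the marker assignments are invented for the example and are not an existing Wikipedia feature.

```python
def query(articles, include, exclude=()):
    """Return titles whose markers contain every `include` marker and no `exclude` marker."""
    include, exclude = set(include), set(exclude)
    return sorted(
        title
        for title, markers in articles.items()
        if include <= markers and not (exclude & markers)
    )


# Illustrative marker assignments only.
articles = {
    "William Blake": {"English", "poet", "people born in the C18th", "Romanticism"},
    "Samuel Taylor Coleridge": {"English", "poet", "people born in the C18th", "Romanticism"},
    "William Wordsworth": {"English", "poet", "people born in the C18th", "Romanticism"},
    "Johann Wolfgang von Goethe": {"German", "poet", "people born in the C18th", "Romanticism"},
    "Bede": {"Anglo-Saxon", "historian"},
}

english_c18_poets = query(articles, ["English", "poet", "people born in the C18th"])
non_english_romantics = query(articles, ["Romanticism", "poet"], exclude=["English"])
```

With this data the first query returns Blake, Coleridge and Wordsworth, and the second returns only Goethe. Note that the function does not care whether a marker is taxonomic ('poet') or functional ('Romanticism'), which is the point made above.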

Ontologies and OWL

Some of the ideas of Beland above I've also been thinking about: see Describing the relations right at the top of this page. Since I wrote that I've done some more investigations on this topic. I think I must have been a librarian in a previous life :-)

There's a lot of effort right now in the W3C to create a semantic web: effectively, to build knowledge into web pages systematically so that computer programs (such as search engines) can make semantic judgements about the content - to distinguish between Queen the musical group and Queen the title, for instance. One outcome of this effort is OWL, Web Ontology Language, an XML application which also uses concepts from RDF.

OWL lets you create an ontology for a domain: usually a hierarchical data structure describing the actual things (people, places, pizzas, camera parts, wines, whatever you want to describe - known as individuals in OWL-speak), the types of things (classes and subclasses, with a strict "is a" relationship from class to superclass), and the properties of those things (for a person, this might include their birth date, gender, place of work, and so on - and crucially, properties can link individuals to other individuals). The properties themselves can also be described with a strict class hierarchy (for wines, "has characteristic" would be the superclass of "has colour" and all those other things wine people talk about).

In OWL, each individual has one or more classes: a pizza might be a member of the "vegetarian pizza" class and the "spicy pizza" class. Part of the power of OWL lies in the ability to describe what a "vegetarian pizza" actually is: it's a pizza with no meat toppings and no fish toppings. You can also make statements about properties such as "this property is transitive" (if A and B are related along a property, and B and C are related along the same property, then with a transitive property you can infer that A and C are also related along the property - for example, "ancestor" is transitive). (This is necessarily a heavily simplified description!)
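To make the transitive-property inference concrete, here is a small sketch in plain Python. It is not OWL and not how an OWL reasoner is implemented; it only shows the inference rule on a handful of made-up "ancestor" facts, with hypothetical names throughout.

```python
def transitive_closure(pairs):
    """Infer every (a, c) implied by chains (a, b), (b, c) of a transitive property."""
    closure = set(pairs)
    while True:
        inferred = {
            (a, d)
            for (a, b) in closure
            for (c, d) in closure
            if b == c
        }
        if inferred <= closure:
            return closure  # nothing new can be inferred
        closure |= inferred


# Hypothetical facts: (x, y) means "x is an ancestor of y".
ancestor_of = {("Alice", "Bob"), ("Bob", "Carol"), ("Carol", "Dave")}
closure = transitive_closure(ancestor_of)
```

From the three stated facts, the closure also contains ("Alice", "Carol"), ("Bob", "Dave") and ("Alice", "Dave"), none of which was entered by hand.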

Anyone who's interested in building a true ontology (that is, something that a program can make semantic judgements about) for Wikipedia content should look at OWL. (If I get a spare few days I may beef up the Wikipedia article on the subject. If not, then read the W3C documents, though they're not for XML newbies.)

My gut tells me that OWLifying Wikipedia would be technically possible but a huge pain to introduce. Each article is an individual; there could be a "meta" tab alongside "talk" for an article's OWL data or its wikitax equivalent. A wiki category is an OWL class (but a strict "is a" hierarchy would need to be enforced here). Properties would live in a separate namespace equivalent to category/class. A category/class would also have a "meta" tab that describes the class (restricting its members semantically, etc).

Of course, there is the big overriding question: does this solve anyone's problem? Maybe not today, maybe not tomorrow...

-- Avaragado 09:24, 15 Jun 2004 (UTC)

Robot to convert list pages

I have no knowledge of how Robots work (or are "commissioned") on Wikipedia, but it would be useful to have one to convert list pages, such as List of ornithologists, to categories, and insert the relevant category link in each of the pages listed. Andy Mabbett 11:47, 14 Jun 2004 (UTC)

I agree wholeheartedly, the same occurred to me. --Woggly 08:58, 15 Jun 2004 (UTC)
The problem I have with that is that I am finding the lists to be inaccurate, both including things they should not and omitting things they should (see Talk:List of fantasy authors). This is true for all the lists I've looked at, but it is especially true for lists that need to be subdivided into subcategories (such as List of cryptography topics). When I do the list entries individually, I typically add multiple categories at the same time by reading the article, using the list only as a guide to help me find articles that need categorization.
Also, at least for people, articles should be added to categories with a pipe link reversing the name to Last, First. Usually this can be automated, but some names even I am not sure of, I'm not sure rules could be made to get a robot to do it right.
A robot, at the very least, should merge as many lists as sane and add as many categories as possible in one edit. If a robot did this, and did items in the highest number of lists first, then at least it would be doing a better job than a human (wider list coverage), rather than a poorer one (less accuracy). However, robot assisted categorization would be very appreciated--if I could feed a robot lists of articles and lists of categories and accept them from multiple (human) sources for merging, it might significantly speed up the work.
In short, directly importing the lists into the categories would introduce a large number of errors into the category system that may be difficult to find later, and some of which would be almost as time consuming to fix as categorizing the articles manually. A more automated way of adding the articles would certainly be useful, however. --ssd 12:22, 15 Jun 2004 (UTC)
I think lists that often contain additional information are useful. So a list of, say, Governors of New York needs to contain dates of service and party. But the category "Governors of New York" ought to simply give you the names, perhaps alphabetically, while the list page would be chronologically organized. So we have to look at lists and categories as two different tools, and it may be hard for a robot to convert one to the other. -- BRG 14:58, Jun 21, 2004 (UTC)
Certainly, some lists contain additional information, but many do not; I would like a robot to convert the latter, to save me the pain of manual conversion, which I have recently experienced, more than once. Andy Mabbett 15:09, 21 Jun 2004 (UTC)
No objection in those cases. I think that in those cases, perhaps the list article need no longer be there, since the category listing provides all the same info. -- BRG 16:17, Jun 21, 2004 (UTC)
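On the "Last, First" pipe link raised earlier in this thread, a cautious robot might look something like the sketch below. It is an illustration in Python under stated assumptions, not an existing bot: the function names are hypothetical, and it only reverses plain two-word names, leaving everything else for a human, since rules for the harder names are exactly what is in doubt above.

```python
def sort_key(title):
    """Guess a "Last, First" sort key; return None when the name needs a human."""
    name = title.split(" (")[0]  # drop a trailing disambiguator such as " (author)"
    parts = name.split()
    if len(parts) != 2:
        return None  # single names, middle initials, particles: do not guess
    first, last = parts
    return f"{last}, {first}"


def category_link(title, category):
    """Build the category tag for an article, adding a sort key only when one is safe."""
    key = sort_key(title)
    if key is None:
        return f"[[Category:{category}]]"
    return f"[[Category:{category}|{key}]]"


easy = category_link("Terry Pratchett", "Fantasy writers")
hard = category_link("Ursula K. Le Guin", "Fantasy writers")
```

Here `easy` gets the key "Pratchett, Terry", while `hard` is emitted without a key so that somebody can fill it in by hand. The untouched cases could be written to a work list for manual review.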

Category naming conventions

Who decides how to name a category? Is it just the first person to come along? Why is one category [[Category:Israeli people]], but another [[Category:People from Luxembourg]]? Why not [[Category:People from Israel]] or [[Category:Luxembourgeois people]]? Or just plain [[Category:Israelis]] or [[Category:Luxembourgeois]]? Why is it [[Category:Israeli actors and actresses]] but, [[Category:Cinema actors]]? What about [[Category:Cinema actresses]]? Or for that matter, who's to stop there being a [[Category:Filmstars]]? Who decided it's [[Category:Children's writers]] as opposed to [[Category:Children's literature writers]] or [[Category:Children's authors]] or [[Category:Young Adult writers]] or any other of a zillion variations?

Are there any stated conventions anywhere? And, how in the name of Dewey does one navigate the special page of Categories, to see which Categories are already in use? --Woggly 09:39, 15 Jun 2004 (UTC)

I've looked for stated conventions but haven't been able to find anything that particularly addresses these points. Navigating the special pages of Categories is a nightmare. I wanted to see how something was categorised for Scotland, and by the time I'd slowly worked my way through the alphabetic list to "S", I'd pretty much lost interest. It was then I decided I didn't care how Zimbabwe had been categorised.

As for who decides how to name a category - it does seem to be that the first into the fray makes their own decision which isn't the most practical way of doing it. I started some of the categories you mentioned - my methodology (right or wrong) was that I read through the existing list of categories and followed the format that was already being used. Of course if the very first person to name a category made a poor choice, I've perpetuated it and I'm not exactly happy with the result. For example the actors by nationality - those few that existed were "such and such actors and actresses" so I've continued with that. The generic terms for actors, seemed to group actors and actresses as "actors" (ie Television actors) so with the ones I created (ie Cinema actors) I followed that pattern. Not a good system. I'm also seeing duplicated categories - American actors and U.S. actors and actresses, for example, which no matter how I look at it, is a single category. US actors and actresses had more names in it than American actors, so I renamed everyone in American actors just to get them into the same category. Not a scientific way of doing it. I noticed a week ago there were 2500 categories. Two days ago there were 3500 categories. I can't imagine too many people wading through that mega-list to make sure they are not duplicating categories by slightly rewording the title, and we're going to have (and already have) a whole bunch of categories that should be deleted because they are better categorised/defined elsewhere. Rossrs 10:31, 16 Jun 2004 (UTC)

Good objections. Let me add Category:Science fiction authors vs. Category:Fantasy writers. Poorly picked, too late to fix 'em all without a robot. *sigh* (see above on robots). It would be nice if there was a good category search system. I have an automated tool that collects categories, and I can grep the list I've collected so far. More helpful, though, is to use the category system itself, browsing similar categories for a common super-category that might contain the category you are looking for... This would be even more useful if there were not so many orphaned categories. --ssd 12:31, 15 Jun 2004 (UTC)

I want to create a very broad category, which would contain all articles related to Cyrillic alphabet, for example Cyrillic alphabet, Saint Cyril, Russian language, etc. Should I name it simply "Category:Cyrillic" or should I go for "Category:Cyrillic topics"? Are there similar categories and how are they named? Nikola 05:21, 16 Jun 2004 (UTC)

Categories with series of books/movies

The articles in Category:Harry Potter movies have recently been adjusted so that their sort keys, rather than being Movie 1, Movie 2, etc, are now merely 1, 2, etc. The user who did it didn't like the way all the movies were sorted under M. Whilst this might look reasonable given the current system for rendering Category articles, I am worried that this might set a bad precedent. Category:Wheel of Time books has a similar set of articles, but there are 10 of them (soon to be 11 when the next book is published); they are sorted as Book 01, Book 02, etc: all therefore appear under B. I know I prefer this system, and not just because I did it (and the original Category:Harry Potter movies sorting also). I am not certain what needs to be done, but I am certain that there needs to be some discussion about it. Possibly one suggestion, following on from earlier discussions, is to suppress the large letters on a Category page if there will be only a single letter; in other words, since all the articles in Category:Wheel of Time books sort under B, don't bother showing the B. --Phil | Talk 11:48, Jun 15, 2004 (UTC)

There has been a feature request somewhere else to suppress large letter headings for any category with fewer items than some threshold (say, 30). This would accomplish the same thing, I think. --ssd 12:35, 15 Jun 2004 (UTC)

Not precisely.

  • Consider the situation of a Category containing 30 articles with Sort Keys Book 01, Book 02, ..., Book 30. These would all appear under B using the current system; this would still hold if the threshold was measured against the number of articles as opposed to the number of "sort buckets". The latter is what I want to measure.
  • The system which has been unilaterally adopted in Category:Harry Potter movies will only work for a series up to 9 items, since an article with Sort Key 10 will appear under 1 and screw up the sorting arrangement. I would prefer to use the system I originally installed, being more scalable, but am unwilling to impose it without some discussion.

--Phil | Talk 13:58, Jun 15, 2004 (UTC)
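The two points above can be checked with a quick sketch in Python. It assumes, as the discussion does, that a category page compares sort keys as plain strings and puts each article under the first character of its key; `padded_key` is a made-up helper, not part of the software.

```python
def padded_key(prefix, number, width=2):
    """Build a zero-padded sort key such as "Book 01"."""
    return f"{prefix} {number:0{width}d}"


# Bare numbers compared as strings: "10" and "11" land between "1" and "2".
unpadded = sorted(str(n) for n in range(1, 12))

# Zero-padded keys keep the series in reading order.
padded = sorted(padded_key("Book", n) for n in range(1, 12))

# Number of distinct heading letters ("sort buckets") the padded keys produce.
buckets = {key[0] for key in padded}
```

The unpadded keys come out as 1, 10, 11, 2, 3, and so on, while the padded ones stay in order from Book 01 to Book 11. All eleven padded keys fall in a single bucket, B, which is why a threshold on the number of buckets behaves differently from a threshold on the number of articles.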

To be honest, I don't think this is what the Category system should be used for. I know nothing about Wheel of Time books, and looking at this category I see nothing to help me understand why all the books are in seemingly random order and all listed under B. It's slightly less obscure with the Harry Potter films, but equally kludgy IMHO. The fact that a given book/movie/TV series/etc logically follows on from another in some way is a property of the book, and should be unrelated to its classification. If the problem you're trying to solve is to show the ordering, that belongs in an article (as the system currently stands). In both the Wheel of Time and Harry Potter cases I'd be tempted to sort by the first "interesting" bit of the title (eg a sort key "Prisoner of Azkaban" or "Eye of the World") rather than the full title. -- Avaragado 15:26, 15 Jun 2004 (UTC)

Sorry, but to me it would appear obvious that the very point of including an optional Sort Key in the Category system is to allow articles to be sorted by something other than just their title. It would also appear reasonable to sort articles about a connected series of Books/Movies in the natural order in which they are supposed to be read/viewed. What would be the point of sorting them alphabetically? This would introduce no new helpful information to the reader. Which IMHO is what Wikipedia is all about. --Phil | Talk 16:18, Jun 15, 2004 (UTC)

However, there is nothing in the current presentation to indicate that they are sorted in any particular order--only that it is not in alphabetical order. Including the words "Movie 1" or "Book 2", etc in the titles might help with that, but IMO, that would be the wrong approach. Personally, I'd rather see a manually inserted sorted list at the top of the category (or a link to an article). And then have the category contain a broader variety of items related to the topic rather than limit it to such narrowly defined content. olderwiser 16:24, 15 Jun 2004 (UTC)

I was the one who changed the category sorting from "Book 1" to "1". Perhaps the best solution would be to sort under "#01", "#02", etc.

This avoids having the clutter of having a separate category heading "1", "2", "3", etc. for every single book, and it also avoids the disconcerting listing of items under "B" when there is no "B" in their name (or "M" for the movies). The items would then appear under a neutral "#" category, at the head of the category list.

-- Curps 18:40, 15 Jun 2004 (UTC)

An even better solution: sort them under " 01", " 02" (with leading blank). This causes them to appear under the heading " " (space), which means it appears to be under no heading at all. I've gone ahead and done this so that people can evaluate the effect and see if it's acceptable:

Category:Harry Potter books
Category:Harry Potter movies

-- Curps 18:57, 15 Jun 2004 (UTC)

Wow, I nominate you today's Captain Kludge :-) Another downside of this sort of thing is that it would break attempts to aggregate categories sensibly. A hypothetical autogenerated "Harry Potter extravaganza" page that showed everything under Category:Harry Potter would appear to sort the results oddly. IMHO it's always a bad move long-term to fudge the raw data to achieve a particular rendering effect. But hey, whatever. -- Avaragado 20:12, 15 Jun 2004 (UTC)

Another thought on sorting categories

Maybe a modifier to the categories themselves would allow for each specific article/category to state, in its [[Category:]] tags, whether or not it's an example/subset of that category or merely related to it. For example, Category:Egyptian cities would say [[CategoryIs:Egypt]] because Egyptian cities are a part of the nation and geography of Egypt, whereas Egypt-related topics like Category:Egyptian mythology or Category:Egyptian people would say [[CategoryLike:Egypt]]. Both would display as regular categories, but a user could opt to only display articles in a category that are (or are not) examples of the category. Non-specified categories would have to display on searches for both "is" and "like". -Sean Curtin 22:10, 15 Jun 2004 (UTC)

I'm not sure I understood the above. I may be suggesting the same thing in different words, or maybe not. Currently, I find the categories to be confused and confusing. Some of the categories seem too narrow to me, and I can't see what purpose they serve. It would be nice if the categories could serve as an automatic index, replacing lists, and making it possible to sort articles by many different parameters. I'm afraid that's not at all what has been happening. For instance, when I go into [[Category:Children's writers]], I'd really like to see a list of all categorized children's writers. Instead I see this has already been divided into national subcategories, and I personally am not interested in sorting authors by country of origin - I'd like to simply view a list of all children's writers. The subcategories render this impossible. Wouldn't it make more sense to label each aspect separately, and then find a way to cross-reference, which would automatically create the subcategory lists? If, as a rule, all people were categorized by nationality, it would be redundant to repeat the nationality as a subdivision of their profession. And vice versa: if all people were categorized by profession, it would be redundant to repeat the profession as a subdivision of their nationality.
For example: I'm all for categorizing Dr. Seuss as a children's writer, and I'm all for categorising Dr. Seuss as an American. I'm not sure it's necessary to categorize him as an American children's writer. Next someone will decide to categorize him as a 20th century American children's writer, or a dead 20th century American children's writer, or a dead male 20th century American children's writer... what I'm getting at, is that once one starts creating subcategories that are conjunctions of two or more different categories, the combinations are endless. Much more useful to use only broad category labels, and then cross-reference different labels. I think categories should be as broad as possible, and the number of categories should be limited. Categories should be like the circles of a Venn diagram, and subcategories would be automatically created by the cross sections of the circles. --Woggly 21:42, 16 Jun 2004 (UTC)
P.S. - I've (probably obviously) not read all the above sections yet, so sorry if I'm inadvertently repeating in my own words what's already been said. Just throwing out ideas in the hopes someone picks up on them! --Woggly 21:47, 16 Jun 2004 (UTC)
There's been a lot of talk about a function that allows the contents of a category to be displayed including the contents of its subcategories. Although ultra-specific categories can be tricky to navigate right now, once that's enabled, it'll be much easier to see the entire contents of a category (including the contents of its subcategories in the process). It'll be much easier to implement it this way than to cross-reference multiple categories, although having the option to do that as well would also be nice. -Sean Curtin 04:36, 17 Jun 2004 (UTC)
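For what it's worth, the "show a category including its subcategories" function mentioned above boils down to a recursive walk. The sketch below is in Python with made-up data structures (two plain dictionaries) and hypothetical subcategory names; it says nothing about how MediaWiki stores categories or how the real feature would be built.

```python
def all_members(category, subcategories, articles, seen=None):
    """Collect the articles in a category and, recursively, in all of its subcategories."""
    seen = set() if seen is None else seen
    if category in seen:
        return set()  # already visited: guards against category loops
    seen.add(category)
    found = set(articles.get(category, ()))
    for child in subcategories.get(category, ()):
        found |= all_members(child, subcategories, articles, seen)
    return found


# Hypothetical category tree and memberships, for illustration only.
subcategories = {
    "Children's writers": ["American children's writers", "British children's writers"],
}
articles = {
    "American children's writers": ["Dr. Seuss"],
    "British children's writers": ["Roald Dahl"],
}

everyone = all_members("Children's writers", subcategories, articles)
```

Here `everyone` contains both authors even though neither is filed directly under Children's writers. The `seen` set matters because nothing stops two categories from being made subcategories of each other.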

Should articles with same title as a category, be part of that category?

I am trying to work out whether there is a convention/standard, and if not, work one out for these cases, because there need to be clear guidelines.

For example, consider the article cell biology and the Category:Cell biology. Should cell biology be a member of the category? I have seen several solutions:

  1. Foo is included in the Category:Foo.
  2. Foo is not included in the category, but the Category:Foo has a link to Foo (in the editable part of the category page) and the Foo article has link to :Category:Foo somewhere prominent in that article.

Related to this is the question of whether the subcategory Category:Foo and the article Foo should both be included in the (say) parent category Category:Bar. If the article is included and convention #1 (above) is also followed, this means that Foo appears in two places in the hierarchy (e.g. cell biology appearing in both Category:Biology and Category:Cell biology, which seems unsatisfactory, and clashes with the guideline about filing in the most specific category).

I'm conflicted about the best way to proceed here, but I think clear guidelines would help everybody, suggestions as to a convention, and rationales? --Lexor|Talk 12:46, 19 Jun 2004 (UTC)

Amazingly, I bumped into the same dilemma: the article Nobel Prize in Literature lists all Nobel Prize in Literature winners, with year and country. The category [[Category:Nobel Prize in Literature winners]] lists all articles categorized as Nobel Prize in Literature winners. So I have two questions:
  1. Why should we maintain these two similar lists?
    Because the category is alphabetized, and the article is sorted by year and has additional information. (Justification for existence of the category not implied by this.) --ssd 16:13, 19 Jun 2004 (UTC)
  2. What should be the Category of Nobel Prize in Literature? (I changed it to [[Category:Nobel Prize in Literature winners]], but now as I think it over I am not sure I did the right thing.)
^^ Dod1 15:30, 19 Jun 2004 (UTC)
I have the same problem over in the medicine area. Should the article Antiarrhythmic agents be in Category:Antiarrhythmic agents or in Category:Pharmacologic agents? My thought is it should go in pharmacologic agents, since it is part of that category. Others think otherwise. :-( Ksheka 15:46, Jun 19, 2004 (UTC)
Putting it in its own category is probably redundant, but see below, it should still be linked. Putting it in the parent category is only sensible, and a separate issue not related to putting it in its own category. --ssd 16:13, 19 Jun 2004 (UTC)

How about these for guidelines...adjust as you see fit. The final version of these should eventually be put in the policy section of this page. --ssd 16:08, 19 Jun 2004 (UTC)

  1. If the category is a list of items (which can often be determined by whether or not the category is plural), the article should only be in the category if it is also a member of the list. If it is merely related to the list or the category named after the list, the article should not be in the category. In general, this means that if a category refers to a set of concrete things, but is in itself an abstract concept, the article on the concept should not be added to the category.
  2. If the category is a collection of related terms, and the article is a related term, it is OK but not required for the article to be in the category.
  3. In all cases, if the category and the article are related, there should be a link in the category to the article. If the article is not in the category directly, a link should be placed in the category description text, and possibly a link to the category in the See Also section of the article or some other obvious place.

I would also throw in the following, but it doesn't directly speak to the question of categories with the same name of articles, but just generally to the question of relevant categorization:

  • Overall relevancy should be taken into account. One must consider relevancy from both the perspectives of the article and of the category. For instance, should the article mammal be included in the category aquatic life, because of cetaceans and sirenians and pinnipeds? The answer depends both on whether the article discusses aquatic mammals and on whether the category contains articles about other classes of animals that straddle the terrestrial/aquatic line. Sometimes, good categorization will inform good edits to article content, so that the categorization will make sense even to readers unacquainted with the more obscure points of the subject matter. In some cases, good categorization choices will inform disambiguation or article titling choices.

--TreyHarris 18:46, 20 Jun 2004 (UTC)


Moving Category pages

I am being told I am unable to move Category:Australian MHRs to the correct form, Category: Australian federal MPs. Is there a rule against moving Category pages? If so, what is one supposed to do with a wrongly-titled page? If not, what is the problem? Adam 13:45, 4 Jun 2004 (UTC)

Try it without the space before "Australian". Also, note you can refer to a category without making the current page a member of it, by putting another colon in front of it, like this Category:Australian MHRs -- Finlay McWalter | Talk
It doesn't work with or without the space. I still want to know whether there is something preventing moving or otherwise renaming Category pages. Adam 07:49, 5 Jun 2004 (UTC)
I think it is a known problem with Categories. You probably can't delete a Category either. Until the problem is fixed you can ask an administrator to move or delete the Category. There is further discussion at Wikipedia_talk:Categorization -- Solipsist 14:47, 5 Jun 2004 (UTC)


Category rendering in history

If you look at the history for Wikipedia:Categories for deletion, you will see that the link for Category:Jewish mythology appears red and links to the "edit" page, as if it didn't exist. However, even when you click on that link, there is data there. Is this a mediawiki bug? - DropDeadGorgias (talk) 20:05, Jun 9, 2004 (UTC)

I think it is a rendering bug. If you say [[Category:somecategory]] in the comment, it seems to show up OK. But if you say [[:Category:somecategory]] like you would when mentioning the category (but not in the category), it shows up like a new article. --ssd 04:35, 17 Jun 2004 (UTC)
Sou desu nee. This is a problem because most of the headers on Wikipedia:Categories for deletion are category names (with the preceding colon). I will mail the list. - DropDeadGorgias (talk) 14:21, Jun 18, 2004 (UTC)


Category weirdness

I can't work out why some categories don't appear to be displaying properly. Take a look at the foot of Avignon and the category Category:Cities, towns and villages of France. Even though it's a populated category, it's displaying as if it was an empty article. Can anyone explain what's going on here? -- ChrisO 15:38, 10 Jun 2004 (UTC)

It seems like it has a ton of articles to me (204). Dori | Talk 15:41, Jun 10, 2004 (UTC)
It does, but try the (much smaller) Category:Landmarks of Paris and you'll see the same behaviour. The category definitely exists but it shows as a bad link. -- ChrisO 16:16, 10 Jun 2004 (UTC)
The category remains red until someone enters some text I believe (if that's what you mean by bad link). Otherwise that also seems fine to me. It shows two towns, and both those town articles show the category. Dori | Talk 16:26, Jun 10, 2004 (UTC)
It is a poor bit of the user interface: the category page is taken to exist only when someone has created a page with some blurb about the category; but meanwhile a user can get to a page which says, in effect, "hello, I don't exist, but look, I've got a bunch of links to lots of articles". And the user could be forgiven for thinking "but you do exist, you crack addled categorisation page, you do, I can see you. And what's this ugly edit box like a carbuncle on your bottom, eh?". In other words, would be better, imo, to have the categorisation page created as a blank page by some process, such that we don't get this UI 'feature' --Tagishsimon
The red category link is useful when trying to spot categories that need description articles. Also, categories with descriptions but no member articles don't show up in Special:categories even if they are members of other categories. More weirdness. --ssd 00:43, 15 Jun 2004 (UTC)
At the very least a red link indicates that the category needs a parent, something that people sometimes forget! Even if it doesn't have a description (often it's self-explanatory), it should at least have a parent... ;-) --Lexor|Talk

It's important to consider that there are two different audiences: readers and editors. Editors can use Category:Orphaned_categories to find categories that need parenting. And that category can be populated in a semi-automated fashion (or fully automated, if someone implements that). There's no need to pollute the readers' experience with either red links or a "this category doesn't exist except that it clearly does" moment. So I think both of these phenomena should be eliminated. -- Beland 23:21, 23 Jun 2004 (UTC)

If people created categories responsibly, there would never be a redlink category. A red linked category does not mean it doesn't have a parent. It means it has no description. (And if it has no description, it can't have a parent, but that's not the point.) There are many criteria that determine if a category "exists"...does it have articles? Does it have a parent? Does it have an article (description)? Only the last of these means anything to the link color. If you hate red categories so much, maybe you'd like to join me in fixing them on Category:Orphaned categories where I have lots of them listed. --ssd 23:59, 23 Jun 2004 (UTC)
Actually that's not quite correct. If a category has a parent, it will thus have been edited at least once (to include it as a subcategory in another category), will have a history, and therefore will not appear red. So as soon as it is edited (either to create a description or add a parent category), it will no longer appear red.--Lexor|Talk 18:14, 24 Jun 2004 (UTC)
I said that already. --ssd 04:14, 25 Jun 2004 (UTC)
No you didn't. You said that the only thing that matters for the color is whether there's a description. In fact, it will turn blue either if it has a parent category or if it has a description. john k 04:28, 25 Jun 2004 (UTC)
You've missed the point. It can't have a parent if it doesn't have a description. The description changes the color. But, since I originally said that, there's no point in going around this again. --ssd 04:07, 6 Jul 2004 (UTC)

More on category piping, alphabetic headers, page organization

OK, I have boldly created Category:Lists of fictional animals and redirected Lists of fictional animals there. To do this, I moved some helpful text (See Alsos and External Links) from the original Lists... page to the Category:Lists... page. With this specific page, I see the need for 3 new category features, two of which people have already stated:

  • Ability to change how the article names display. Because EVERY article in this category starts with "List of fictional", the list is very hard to scan. You really want it to read "*Birds *Cats *Dogs" etc. for ease of use.
  • Ability to turn off the alphabetic headers. At the moment, EVERY article starts with "L" (List of), so having the L repeat is actually kind of confusing. Might be somewhat mitigated by preceding option, but still for short lists like this one, having only the article names would be easier to read than having "***B*** Birds ***C*** Cats ***D*** Dogs ***E*** Elephants" etc.
  • Ability to specify where in the category page the generated list appears. I think it's useful to have See Also and External Link sections in category pages, as in this one. But at the moment they will always appear BEFORE the generated list! Not optimal. Unless someone has a better suggestion on what to do with this material. (And be polite. ;-) ) Would be nice to have a {{GenerateListHere}} kind of markup, which if not specified would default to end of article.

Elf | Talk 16:13, 21 Jun 2004 (UTC)

For the first point, write the categories out as "[[List of fictional whatever|Whatever]]". Right now it won't display "Whatever" as the name on the Category page, but it will file "list of fictional whatever" under W instead of L. -Sean Curtin 17:11, 21 Jun 2004 (UTC)
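As a rough illustration of the sort-key suggestion above: the text after the pipe in a category tag acts as a sort key for the page carrying the tag. The article title "List of fictional birds" is only a stand-in for "whatever", and the sketch assumes the tag goes in that article's own wikitext.

```wikitext
<!-- Placed in the wikitext of the (hypothetical) article "List of fictional birds". -->
<!-- The part after the pipe is the sort key, so the article is filed under B, not L. -->
[[Category:Lists of fictional animals|Birds]]
```

As noted above, the category page would still display the full article title; only the alphabetical filing changes.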
The usual Wikipedia-pages offer most of those features. Those also avoid cross-namespace redirects. -- User:Docu
Not sure I follow; are you saying not to use categories for anything? Elf | Talk 00:53, 22 Jun 2004 (UTC)
There does appear to be a way to make stuff appear after the categories. I haven't bothered to figure out how it works, but see Category:Fantasy writers. --ssd 00:36, 22 Jun 2004 (UTC)
Cool. Seems to be {{CompactTOC}}__NOTOC__ Elf | Talk 00:53, 22 Jun 2004 (UTC)
Actually, I think just the __NOTOC__ does it, but the CompactTOC is helpful if it is appropriate... --ssd 00:02, 24 Jun 2004 (UTC)