Talk:Collation

From Wikipedia, the free encyclopedia

Which is first?

Which is first, bbb or bbba? Where does bbbbbbbbbba fall in? Is there a rule for this kind of thing? I tried to read the article on lexicographical order, but couldn't understand it. Qutorial (talk) 17:30, 10 May 2009 (UTC)
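
A minimal sketch of the usual rule, assuming plain character-by-character (lexicographic) comparison: strings are compared position by position, and a string that runs out first sorts first. The variable names are illustrative only, and real collation software may layer language-specific rules on top of this.

```python
# Plain lexicographic comparison: compare character by character;
# if one string is a prefix of the other, the shorter one comes first.
words = ["bbbbbbbbbba", "bbba", "bbb"]

ordered = sorted(words)
print(ordered)

# "bbb" is a prefix of both other strings, so it sorts first.
# "bbba" and "bbbbbbbbbba" first differ at the fourth character
# ("a" versus "b"), so "bbba" comes before "bbbbbbbbbba".
```

Under that assumption, length only matters when one string is a prefix of the other; otherwise the first differing character decides.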

Spanish "ch" and "ll"

How widely accepted is the sort of Spanish "ch" and "ll" as if they were two letters? The linked page credits the Royal Spanish Academy; do Spanish-speakers outside Spain follow their lead? Vicki Rosenzweig, Thursday, June 20, 2002

Yes, I would say any circumspect dictionary follows suit. It is strange, isn't it? A letter that doesn't have its own lexicographic place. Pallida Mors 22:10, 23 January 2008 (UTC)

Latin alphabet

Note: With regard to collating sequences, there seems to be a duplication of effort between Latin alphabet and Collation. Perhaps it would be best to move everything about the collation of Latin alphabets to "Collation"? -- Egil 20:57 May 5, 2003 (UTC)

Mc

When I briefly did a stint as a library assistant ~1985, I was taught that in the collation of last names 'Mc' is always alphabetized as 'Mac'. Thus 'McCarthy' comes before 'Mary'. Is this practice still standard, and is this the right place to reference it? JRP 16:44, 10 Jul 2004 (UTC)

  • Not for all applications. Windows won't sort filenames that way. But the general practice is considered to be a sub-topic of "tailoring", which adapts the collation to specific needs. Compare the chapter "5.1 Preprocessing" in Unicode Collation Algorithm. Pjacobi 17:05, 10 Jul 2004 (UTC)
In Scotland, where surnames beginning in Mac, Mc or M‘ (the last should use a "turned comma" Unicode U+2018) are common, they are sometimes treated as all one letter (as they are variant representations of the same 'word'), which is deemed to come between L and M. {The poster formerly known as 87.81.230.195} 185.74.232.130 (talk) 11:41, 8 September 2015 (UTC)
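
The "Mc filed as Mac" practice described above can be sketched as a preprocessing step applied to the sort key before comparison. This is only an illustration under stated assumptions: `surname_key` is a hypothetical helper, not part of the Unicode Collation Algorithm or of any library, and it handles only the simple case of a leading "Mc" followed by a capital letter.

```python
import re

def surname_key(name: str) -> str:
    """Hypothetical filing key: expand a leading 'Mc' to 'Mac', then casefold."""
    # Only rewrite 'Mc' when it starts the name and is followed by a capital,
    # so that the key for 'McCarthy' becomes 'maccarthy'.
    expanded = re.sub(r"^Mc(?=[A-Z])", "Mac", name)
    return expanded.casefold()

names = ["Mary", "McCarthy", "MacDonald", "Mason"]

plain = sorted(names)                      # code-point order, no tailoring
tailored = sorted(names, key=surname_key)  # 'Mc' filed as 'Mac'

print(plain)
print(tailored)
```

The names themselves are left untouched; only the key used for comparison changes, which is roughly what a "filed as" rule does.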

Evolution of collating order

How does a collating order evolve for a writing system? I'm slightly surprised that (e.g.) Latin has such a consistent ordering, not varying on a regional basis.

It seems to me that having a fixed order would be terribly useful as soon as you have any sort of bureaucracy, which seems to be one of the main drivers for having writing in the first place. Also, an ordering appears to be useful when teaching writing (see abcdefghijklmnopqrstuvwxyz), which I'd have thought would also be essential pretty early on to the spread of writing.

But these are just my speculations -- it might be nice to have a section summarising what's known about this topic. (I haven't researched it myself beyond a bit of Googling.)

I think you mean alphabetic writing, not syllabic scripts such as Egyptian and Babylonian. But this is indeed a fascinating topic, which I will try to discuss if someone doesn't get there first. It would best go under Alphabet, but see the existing article on Greek alphabet. DGG 08:31, 3 October 2006 (UTC)


JTN 20:34, 2004 Sep 22 (UTC)

Maybe this is a discussion better suited to Alphabetical order, which currently has an inconclusive discussion at the Reference Desk. --Menchi 22:53, 22 Sep 2004 (UTC)
I can't find the discussion you reference, but thanks for pointing out that topic, which I'd missed. IMO it should be merged into collation, and I've marked it thus. JTN 23:51, 2004 Sep 22 (UTC)
Here: Wikipedia:Reference desk#Who invented alphabetical order?. You're right. These 2 articles are quite repetitive of each other. --Menchi 02:19, 23 Sep 2004 (UTC)
(NB discussion now archived at Wikipedia:Reference desk archive/September 2004 II#Who invented alphabetical order? -- JTN 20:07, 2004 Oct 11 (UTC))


Continuous Evolution of Collating Order to include Computers & Robots

The continuous evolution of collating order to include all natural languages, for use also by computers and robots, is about homography at the level of humans' personal voiceprints, natural logic, and mother tongue, as well as any transformation into other languages, so that anyone can capture our intellectual content. This is the importance of collation: serving as a universal standard in harmony with the creation and evolution of our communication and comprehension capabilities by symbolic instruction codes.

Merge with alphabetical order

I merged alphabetical order into this page, as was suggested on the duplicates page. Alphabetical order consisted mostly of examples and there was very little non-duplicate information, so I thought a simple redirect would do. I added the mention of Roman numerals, which was in the alphabetical order article but not in Collation, to the Numerical sorting of strings category. Comments and criticism welcome as usual; I know I tend to be pretty radical with my merges. --Phils 16:27, 22 Oct 2004 (UTC)

This is indeed a bold merge. One potential problem is that the sense of 'collation' used by textual critics (say) has nothing to do with 'alphabetical order' (one of the redirects that leads to this page). But perhaps that would best be solved by creating a new article entitled "Collation (textual criticism)". Something to do. Oldhamlet 18:07, 10 January 2006 (UTC)

Alphabetical order within Wikipedia

This is probably not the place to ask (I am new but keen), but is there a way of converting articles containing long lists into alphabetical order? I am thinking of e.g. the special page magazines in UK, where there is a list of around 1000 magazines in seemingly random order. Abtract 09:06, 14 May 2006 (UTC)

as a librarian

I really think the library and the computer science meanings are not quite the same. There is a common ground, for the characters a library uses must be converted to a form that a computer can sort. But there are many ways of doing that, not necessarily algorithmically (e.g. "filed as"; we'd call that a "filing title"). To decrease the labor involved, librarians, like everyone else, try to accept as much of the computer character set order as possible. Similarly it is the librarians, not the computer scientists, who are particularly concerned with the filing order in other languages, including the uncomfortable fact that some of them have different incompatible rules (whereas librarians do not have to deal with non-alphabetic symbols). My suggestion is that this article should be divided (though there will be some duplicate content), and the different meanings at the top are significant enough for a formal disamb. page. DGG 08:31, 3 October 2006 (UTC)

Seconded. There is no space on the existing page for collation as in sorting printed pages into page order for binding, as it immediately launches into collation orders and algorithms. -- Preceding unsigned comment added by 213.68.15.100 (talk) 13:51, 9 January 2008 (UTC)
One of the wonders of electronic text is that you can also insert material, and in general easily reshuffle or reorganize things. To say that "there is no space on the existing page" since it "immediately launches into" something is quite meaningless. A possible impediment to discussing collation in the article from a librarian's viewpoint is the lack of citable reliable sources; given such, there is no problem. However, for most uses, a more important consideration than convertibility to a form that a computer can sort is what ordering principle is helpful to a human reader in looking up an entry or lemma. Given such an ordering principle, we can instruct librarians and computers alike to sort the material accordingly. --Lambiam 23:57, 9 January 2008 (UTC)

Merge proposal

The article Alphabetical order is largely identical to the section in Collation of the same name, dealing not only with the order of the letters of the alphabet, but also with more general collation issues. It is a waste of energy to maintain two versions of the same information. --LambiamTalk 17:19, 1 May 2007 (UTC)

Foreign words in English lists

I would like to know how foreign words that include various diacritics are treated in English-language lists. There is a List of country names in various languages, and the alphabetical order of various language alternatives of the same name does not seem to follow any rule as far as the diacritics are concerned. What is the correct order, e.g. for Armenia, Armênia, Armènia, Arménia and Armenía? Jan.Kamenicek 11:11, 24 May 2007 (UTC)
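
One possible computer treatment, offered only as a sketch and not as "the correct order": compare the words with diacritics stripped first, and fall back to the accented spelling only to break ties. The helper names `base_letters` and `list_key` are hypothetical, and the tie-break used here is raw code-point order, which is an assumption of this example rather than any published English filing rule.

```python
import unicodedata

def base_letters(word: str) -> str:
    """Strip combining marks so 'Arménia' and 'Armenia' share the same base."""
    decomposed = unicodedata.normalize("NFD", word)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()

def list_key(word: str):
    # Primary: unaccented letters. Secondary (an assumption of this sketch):
    # the composed spelling in code-point order, just to make ties deterministic.
    return (base_letters(word), unicodedata.normalize("NFC", word))

names = ["Armênia", "Arménia", "Armenía", "Armenia", "Armènia"]
ordered = sorted(names, key=list_key)
print(ordered)
```

With this key all five spellings file together under "armenia", with the unaccented form first; how the accented variants should be ranked among themselves depends on the tailoring chosen.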

Merge Collating sequence stub to here

My feeling is that the stubby Collating sequence article is not worthy of an existence on its own, but should be merged into this article (which could use a more informal introductory overview section of the various issues). --LambiamTalk 09:17, 13 June 2007 (UTC)

Well, it could make sense to keep Collating sequence separate, and move Alphabets derived from the Latin: Collating sequences there... FilipeS 23:22, 20 September 2007 (UTC)

Order

What I want to know is why the alphabet is in the order it is. Why is it ABC... and not TGS... or any other order? 222.155.172.13 07:01, 2 December 2007 (UTC)

See Latin alphabet. FilipeS (talk) 23:42, 9 December 2007 (UTC)

Number of Letters in Spanish

Although the article makes reference to "the thirty-letter alphabet of Spanish", with only three more than its 26-letter English counterpart (rr not technically being considered a letter), wouldn't the correct number of letters be 29? See: Spanish_language#Writing_system. Martin.fish (talk) 01:35, 4 April 2008 (UTC)

Social effects of alphabetical order

Would this be the right page to discuss some of the interesting social side-effects of traditional alphabetical order? I'm thinking of things like the rush to name one's company "AAA-1 Construction" or what-have-you so that it is first in the phone book, or the phenomenon of increased votes for political candidates who appear earlier on the ballot. (Harper's Index once published a statistic suggesting that an alarming number of people simply tick the first box on a ballot without reading the whole list.) I'd also be interested in seeing anything that's been published on the psychological effects on schoolchildren who are consistently picked first or last for projects, presentations, and the like. --Hapax (talk) 18:02, 2 September 2009 (UTC)

Chinese

How are dictionaries/rosters ordered in Chinese?

By pinyin (for the PRC). That is, Chinese spelled in Latin letters and then ordered by standard Latin alphabetic order (with a few peculiarities, reflecting the nature of Chinese characters). E.g. 汉语 comes before 黑色, which comes before 去过, because their pinyin forms are hànyǔ, hēisè and qùguò (and H comes before Q, and ha- comes before he-). This also has the effect of listing some characters twice, because they may have more than one pronunciation in different contexts. If you don't know how to pronounce a character, you have to look it up by its radical in an index (usually at the start). Some dictionaries are ordered differently, but in the PRC this is the most common way to do it. - Estoy Aquí (talk) 12:31, 18 March 2011 (UTC)
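
The pinyin ordering described above can be sketched as sorting by a romanized key rather than by the characters themselves. The `PINYIN` table below is a hypothetical, hand-written stand-in for a real character-to-pinyin dictionary, and it ignores tone marks, which is a simplifying assumption of this example.

```python
# Hypothetical, hand-written readings for illustration only; a real
# dictionary would need a full character-to-pinyin table and a rule
# for tones and for characters with more than one reading.
PINYIN = {
    "汉语": "hanyu",
    "黑色": "heise",
    "去过": "quguo",
}

def pinyin_key(word: str) -> str:
    """Sort key: the toneless pinyin spelling of the word."""
    return PINYIN[word]

words = ["去过", "黑色", "汉语"]
ordered = sorted(words, key=pinyin_key)
print(ordered)
```

Because the key is ordinary Latin text, the comparison then follows the usual alphabetical rule: "hanyu" before "heise" before "quguo".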

Re: Alphabetical sorting of numbers

"When numbers are used as names, rather than for their numerical properties, it is common[citation needed] to sort them alphabetically as they would be spelled. For example, the movie 1776 would be between Seve Ballesteros and Severus Snape. If a number is in a foreign term, it is alphabetized as it would be spelled in that language; for example, 24 heures du Mans would be between Vinge's Singularity and Vinh Airport, reflecting the French 'vingt quatre'."

Interesting problem: What do you do with 2001: A Space Odyssey? Is it spelled out as "two thousand one ... "; or "two thousand and one ... "; or "twenty-oh-one ... "; or "twenty-ought-one ... "; or "two zero zero one ... "; and so on? WHPratt (talk) 18:07, 21 June 2010 (UTC)

Or even the example with 1776: why is it sorted like "seventeen..." instead of "one thousand seven hundred..."? Whoever wrote that is full of shit until proven otherwise.
I'm not sure how accurate that is. Here (Ireland) anyway, virtually all retailers "alphabetise" films with numbers at the start or the end, and by their numerical value. - Estoy Aquí (talk) 12:34, 18 March 2011 (UTC)
Yes, that's the way the computer would sort them, and for that reason, it will probably become standard. (Well, almost . . . unless the software is smart enough to line up the digits, it will sort a title like 60 Minutes To Doomsday after 2001: A Space Odyssey, because even though 2,001 is more than 60, it starts by comparing the "2" to the "6".) WHPratt (talk) 15:51, 18 March 2011 (UTC)
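
The "line up the digits" point can be illustrated with a small sketch that compares runs of digits by numeric value instead of character by character. `natural_key` is a hypothetical helper written for this example, not a standard library function; it assumes non-negative integers without separators such as "2,001".

```python
import re

def natural_key(title: str):
    """Split a title into text and digit runs; compare digit runs as integers."""
    key = []
    # re.split with a capturing group puts the digit runs at odd positions.
    for i, chunk in enumerate(re.split(r"(\d+)", title)):
        if i % 2:
            key.append((0, int(chunk), ""))        # numeric chunk
        else:
            key.append((1, 0, chunk.casefold()))   # text chunk
    return key

titles = ["60 Minutes To Doomsday", "2001: A Space Odyssey"]

plain = sorted(titles)                      # compares "2" with "6" first
numeric = sorted(titles, key=natural_key)   # compares 60 with 2001

print(plain)
print(numeric)
```

With the plain sort, 2001 files before 60 exactly as described above; with the numeric key, 60 comes first.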

Radical and stroke ordering with Chinese characters

People familiar with Chinese characters are aware that a person looks up a character with an unknown pronunciation by finding the radical, then counting the number of strokes in the remaining part of the character. However, when two characters have the same radical and the same stroke count, what determines which character is listed first? --69.171.160.211 (talk) 02:43, 12 November 2011 (UTC)

"Sort key" listed at Redirects for discussion

A discussion is taking place to address the redirect Sort key. The discussion will occur at Wikipedia:Redirects for discussion/Log/2020 December 26#Sort key until a consensus is reached, and readers of this page are welcome to contribute to the discussion. 𝟙𝟤𝟯𝟺𝐪𝑤𝒆𝓇𝟷𝟮𝟥𝟜𝓺𝔴𝕖𝖗𝟰 (𝗍𝗮𝘭𝙠) 22:01, 26 December 2020 (UTC)