Talk:Merge sort

In-place merge sort has the same time complexity as the out-of-place merge sort

The article mentions a running time for in-place merge sort which is wrong. It should be O(n log n). — Preceding unsigned comment added by Dhruvbird (talk • contribs) 04:40, 16 March 2015 (UTC)

Mergesort does not require 2n space

Two sorted arrays can be merged in place in O(n) time, using O(1) space (a temporary variable to swap two elements). To implement mergesort, this merge needs to be repeated log(n) times. You can easily implement mergesort bottom-up, stably and in place. Start with n=1, and merge the slices from i*n to (i+1)*n for i=0 until you exhaust the array. For n=1 this sorts pairs of elements. Then multiply n by 2 and restart until n >= size of the array. — Preceding unsigned comment added by 87.198.115.102 (talk) 07:53, 26 June 2013 (UTC)
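A minimal sketch in C of the width-doubling pass structure described above (illustrative names; it merges through a temporary buffer rather than attempting the contested O(1)-space in-place merge):

    #include <stdlib.h>
    #include <string.h>

    /* Merge sorted a[lo..mid-1] and a[mid..hi-1] through tmp, then copy back. */
    static void merge_runs(int a[], int lo, int mid, int hi, int tmp[])
    {
        int i = lo, j = mid, k = lo;
        while (i < mid && j < hi)
            tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
        while (i < mid) tmp[k++] = a[i++];
        while (j < hi)  tmp[k++] = a[j++];
        memcpy(a + lo, tmp + lo, (size_t)(hi - lo) * sizeof a[0]);
    }

    /* Bottom-up merge sort: the run width doubles on each pass. */
    void bottom_up_sort(int a[], int n)
    {
        int *tmp = malloc((size_t)n * sizeof a[0]);  /* error check omitted */
        for (int width = 1; width < n; width *= 2)
            for (int lo = 0; lo + width < n; lo += 2 * width) {
                int mid = lo + width;
                int hi = (lo + 2 * width < n) ? lo + 2 * width : n;
                merge_runs(a, lo, mid, hi, tmp);
            }
        free(tmp);
    }

As the replies below point out, doing the merge step itself in O(1) extra space while staying stable and O(n) is the hard part.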

It is a lot more difficult than you imagine, see [1] for instance. The problem is you start possibly having a number of strings to merge rather than just two or three after moving a few elements around. I don't think one can rightly just call it a mergesort after that, but more of a qualified type of mergesort with a complicated algorithm. By the way, new discussions should be put at the end; just click on 'new section' to start one up. Dmcq (talk) 08:54, 26 June 2013 (UTC)

I would second the first poster's observation. It's nothing more than an afternoon's coding to get the solution. Think about how it works for a linked list and you'll get the inference for static data, just as the first poster observed. Merge sorts are not in a category of algorithms considered hard to implement, even when they are hybrid. — Preceding unsigned comment added by 80.254.148.163 (talk) 17:44, 13 June 2014 (UTC)

Natural Mergesort

I first read about this type of merge sort in 'Scheme and the Art of Programming' by George Springer and Daniel P. Friedman.
They call it a natural mergesort.

Here is a first stab (in Python) at a bottom-up, in-place merge sort:

def count_sorted( items, start=0 ):
  # length of the ascending run that begins at 'start'
  for x in range(start+1, len(items)):
    if items[x-1] > items[x]:
      return x - start
  return len(items) - start

def reinsert( items, val, start, end ):
  # shift items[start+1:end] left one place and drop 'val' into
  # its slot, keeping items[start:end] sorted
  for x in range( start, end-1 ):
    if items[x+1] > val:
      items[x] = val
      return
    items[x] = items[x+1]
  items[end-1] = val

def merge_sublists( items, left, right ):
  # merge the sorted run items[0:left] with the sorted run
  # items[left:left+right], in place
  for x in range(0, left):
    if items[x] > items[left]:
      val = items[x]
      items[x] = items[left]
      reinsert( items, val, left, left+right )

def merge( items ):
  # natural merge sort: repeatedly absorb the next ascending run
  # into the sorted prefix until the whole list is sorted
  left = count_sorted(items)
  while left < len(items):
    right = count_sorted( items, left )
    merge_sublists( items, left, right )
    left = left + right

A possible optimization: when the longest ordered sublist has length one, find the longest reverse-ordered sublist and reverse it.

-David Medlock, Jan 13, 2006

Is that timsort? --Damian Yerrick (talk | stalk) 17:45, 19 December 2011 (UTC)

This is Theta(n^2): there are Theta(n) merges, and reinsert is Theta(n). 188.120.84.157 (talk) 04:42, 26 April 2014 (UTC)

The source of the algorithm in question

It is claimed that the algorithm was invented by von Neumann in 1945. However, the reference is to the Knuth book from 1998 (not the first edition, by the way!!!). A reference to an original work performed by von Neumann and published in 1945 is necessary. If the algorithm presented is described only in the Knuth book from 1998, it's from 1998, not from 1945! Riemann'sZeta (talk) 20:11, 12 April 2011 (UTC)

A primary reference is not required. Knuth is WP:RS and the statement should be WP:V. If you are claiming that Knuth does not say von Neumann invented the algorithm in 1945, then that would be a WP:V attack. It does not sound like that is your dispute; instead you merely want a document with a 1945 date. That is not a WP requirement for a citation. Glrx (talk) 20:30, 12 April 2011 (UTC)
It has been suggested that von Neumann got the idea for merge sort from punched card collators, which date back to 1937. The primary operation of a collator is to merge two sorted decks of punched cards, essentially one pass of a bottom-up merge sort. Rcgldr (talk) 16:18, 3 February 2020 (UTC)

Adding Non-Recursive pseudocode examples

As an algorithm that can be done recursively and non-recursively, it seems to me that adding an example of a non-recursive merge sort algorithm might make this article more complete. I myself am not particularly apt at writing clear and understandable pseudocode, but I feel an example here would be a good idea. OceanicMeerkat (talk) 05:52, 27 January 2014 (UTC)

The example bottom-up pseudocode near the top of the article is non-recursive. Rcgldr (talk) 08:19, 21 February 2014 (UTC)
The non-recursive merge sort is considerably faster than the recursive merge sort (40% faster). However, it derives from tail-end recursion removal and is quite tricky. The system I# for Java has an implementation of the non-recursive merge sort in the Java language. The system I# for C# has the non-recursive merge sort in C#. — Preceding unsigned comment added by 1.129.96.33 (talk) 08:01, 8 July 2016 (UTC)

Top-down implementation is plain wrong

Just try to write that for the real computer. One of the obvious errors is that it splits [0, mid], [mid + 1, high] but then merges [0, mid - 1], [mid, high] — Preceding unsigned comment added by 86.57.194.198 (talk) 06:28, 18 March 2014 (UTC)

Did I fix it? Glrx (talk) 21:43, 26 March 2014 (UTC)
Assuming you added the comment about iEnd being the end of the array and not part of the array, it's fixed. The example started off as actual working code, and most of the changes simply removed the variable type declarations. The split is [0, mid] and [mid, high], with the second index of each pair being the "end" of the sub-array and not part of the sub-array, so the actual split indices would be [0, mid-1], [mid, high-1]. Rcgldr (talk) 18:58, 29 April 2014 (UTC)
My edits. Glrx (talk) 23:43, 29 April 2014 (UTC)

Left half and right half in Top-down merge

Conceptually, if you divide anything into two halves around a (truncated integer) midpoint, then the first half will be "begin" to "midpoint", and the second half will be "midpoint + 1" to "end". For example, begin = 0, end = 1, midpoint is 0, "left half" is 0, "right half" is 1. With the current code "left half" would be 0, "right half" would be 0 and 1, i.e. the entire list.
In Cracking the Coding Interview, p.118, which uses very similar code to the WP example, the first half is "begin to midpoint", the second half is "midpoint + 1 to end", as I would expect.
I changed the right half to midpoint + 1, but my change was reverted by User:Glrx with the comment "inclusive/exclusive indices". Later in the WP code there is a comment: "// left half is A[iBegin :iMiddle-1] // right half is A[iMiddle:iEnd-1 ]." But I don't think that works with truncated integers; in my example of a two-element list, the left half would be from 0 to -1, i.e. you've walked off the array. It would work with integers which round up, but that's not the case with C-style languages, which appears to be the style the code is written in.
Also, even if we were rounding up, the comment implies that the midpoint is only in the right half, whereas with the code as written, it's actually in both halves. Can anyone enlighten me as to what's going on here? --Merlinme (talk) 20:32, 8 December 2014 (UTC)

Hmm. Ok, I get it. If you pass in an "end" of 2 for the size-two list example, then middle is 1, and it does work. I'm not sure it's the most intuitive way of doing things (why do it differently to the commonly understood binary search approach?) but it does work. --Merlinme (talk) 20:48, 8 December 2014 (UTC)
I've just had a look at "Introduction to Algorithms" (CLRS). In my edition MERGE-SORT is p.34, and uses q = ((p+r) / 2), followed by MERGE-SORT(A, p, q), MERGE-SORT(A, q + 1, r). I've already mentioned Cracking the Coding Interview, p. 118. See also: [2], which references " Algorithms in Java, Parts 1-4, 3rd edition by Robert Sedgewick. Addison Wesley, 2003" and "Programming Pearls by Jon Bentley. Addison Wesley, 1986." That web page uses "m = n / 2, sort a[1..m], sort a[m+1..n]".
If there is a standard idiom in the reference books for a particular algorithm then I'm really not convinced Wikipedia should be using an alternative idiom. At best it's confusing. It could even be argued that it's Original Research (what would you reference it to?), although I'm not sure I'd go that far. --Merlinme (talk) 21:03, 8 December 2014 (UTC)
"Inclusive/exclusive indices" seems to be the norm for most languages where the first index of an array is 0. For example, the calls to the C++ STL (Standard Template Library) functions std::sort or std::stable_sort take (first iterator, ending iterator). The C / C++ qsort() parameters are pointer to array, ... , array size. Typical code that loops across an array uses something like for ( i=0; i<size; i+=1). The HP / Microsoft STL <algorithm> stable_sort, which is an almost bottom-up merge sort, splits the array / vector into two parts using (A, start, mid), (A, mid, end). The split is done so that the temp array only needs to be half the size of the original array; otherwise, it's a bottom-up merge sort. Rcgldr (talk) 00:56, 23 November 2015 (UTC)

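To illustrate the half-open convention being described above, a small sketch in C (made-up names, not code from the article):

    /* Half-open range [begin, end): 'end' is one past the last element,
       so the size is end - begin and an empty range has begin == end. */
    int count_less_than(const int a[], int begin, int end, int limit)
    {
        int count = 0;
        for (int i = begin; i < end; i++)   /* visits exactly end - begin items */
            if (a[i] < limit)
                count++;
        return count;
    }

Splitting a half-open range [begin, end) at mid = (begin + end) / 2 yields two half-open halves [begin, mid) and [mid, end) that cover the range exactly, with no overlap and no off-by-one adjustments.
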
Bottom-up implementation using lists

This is a new article section. It's the primary method used to sort linked lists, so it should be included in the article. Implementations of the C++ STL (standard template library) function std::list::sort() use this algorithm. Rcgldr (talk) 00:37, 23 November 2015 (UTC)

My objection is not to including this algorithm, but to the presentation. About half of the pseudocode is spent implementing an optimization, rather than the basic algorithm. QVVERTYVS (hm?) 11:06, 29 November 2015 (UTC)
I shouldn't have reverted outright, though. My apologies for that. QVVERTYVS (hm?) 11:18, 29 November 2015 (UTC)
The article is fine now. Including a second and overly detailed example of merge() wasn't needed, so I removed it, leaving the key part of the algorithm which is what the section should be focused on. Rcgldr (talk) 11:42, 29 November 2015 (UTC)
I agree that the optimized version was inappropriate.
I object to the code churning: editing whose purpose is changing existing style and names to another editor's preferred style.
Glrx (talk) 05:09, 1 December 2015 (UTC)
Is this in reference to changing the zeroes to nil? If so, I'm not sure which would be more clear to the target audience. Rcgldr (talk) 22:32, 3 December 2015 (UTC)
No. Glrx (talk) 05:23, 5 December 2015 (UTC)
Then what part of this section was code churned? It's a new section and follows the style of the top-down merge sort for lists section. Rcgldr (talk) 23:55, 5 December 2015 (UTC)
This section has to be clearly renamed as a C++ implementation, or deleted. It uses "node.next", which makes no sense otherwise. Naming a C++ linked list node as "node" does not make C++ code pseudocode or an example of general use. 109.194.226.45 (talk) 13:31, 14 March 2019 (UTC)

Animation example

Anyone else think the animation example has extraneous comparisons once one of the sides (the left side in all cases) runs out of cells? I thought the point of the whole algorithm is to take advantage of the fact that the partitions being merged are sorted. — Preceding unsigned comment added by 88.192.22.239 (talk) 14:56, 14 April 2016 (UTC)

Also, the animation appears to be wrong. It does not follow the Top-Down example code, because it keeps going back to other, parallel arrays and working on them INSTEAD OF fully sorting the left subarray AND ONLY THEN going back and sorting the right subarray. — Preceding unsigned comment added by 73.157.112.58 (talk) 02:14, 18 December 2021 (UTC)

Could we have a description in English?

The article is completely obscure to those of us who can't read computer code. You break down the list into sub-lists, then you merge them - but how does that work? How do you merge two sorted lists into one? The article just assumes that's easy, as far as the text is concerned. The diagram also shows lists simply being merged and emerging as sorted, without explanation of the merge process. Cyclopaedic (talk) 09:07, 2 August 2016 (UTC)

I strongly second that. The *one* thing this article is supposed to do, it doesn't: explain the merge, as in merge sort. Nearly everything else in this whole long article isn't actually that special to merge sort. The merge step is. — Preceding unsigned comment added by 2003:F3:EF19:EB00:5D7F:F4D5:2433:6888 (talk) 08:51, 9 April 2023 (UTC)
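In plain English, the merge step works like this: look at the first (smallest) element of each of the two sorted lists, move the smaller of the two to the output, and repeat until one list runs out; then copy over whatever remains of the other list. A minimal sketch in C (illustrative names, not the article's pseudocode):

    /* Merge sorted a[0..na-1] and sorted b[0..nb-1] into out[0..na+nb-1]. */
    void merge_arrays(const int a[], int na, const int b[], int nb, int out[])
    {
        int i = 0, j = 0, k = 0;
        while (i < na && j < nb)                 /* both lists still non-empty */
            out[k++] = (a[i] <= b[j]) ? a[i++] : b[j++];  /* take the smaller head */
        while (i < na) out[k++] = a[i++];        /* copy the rest of a */
        while (j < nb) out[k++] = b[j++];        /* copy the rest of b */
    }

Because each comparison moves one element to the output and never looks at it again, merging is linear in the total number of elements; taking from a on ties (a[i] <= b[j]) is what makes the merge, and hence the sort, stable.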

Expression for the midpoint

I've reverted the midpoint formula back to the clearer (a+b)/2. The a+(b-a)/2 form is not obvious to most people.

WP should be describing algorithms. In a pseudocode implementation, the widths of the numbers are not specified because the operations are abstract. The calculations are presumed to not overflow.
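For readers comparing the two forms, a small illustration with fixed-width signed indices (hypothetical helper):

    /* With 32-bit int indices, (lo + hi) / 2 can overflow when lo + hi
       exceeds INT_MAX, even though both lo and hi are valid indices.
       lo + (hi - lo) / 2 computes the same midpoint without overflow,
       because hi - lo never exceeds the valid index range. */
    int midpoint(int lo, int hi)
    {
        return lo + (hi - lo) / 2;
    }

In abstract pseudocode with unbounded integers, the two expressions are equivalent, which is the point being made above.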

Also, the URL https://research.googleblog.com/2006/06/extra-extra-read-all-about-it-nearly.html is a blog. It is not a reliable source.

At Binary search algorithm#Implementation issues, a big point is made of the Java library bug, but that claim is disingenuous. There's a reason the bug was not discovered for years: it would only occur if somebody did something impractical. It would only be a serious issue when one uses 32-bit indices in a 64-bit address space. But indices in 64-bit address spaces should be 64-bit numbers. In a 64-bit address space, I might want to make an array of more than 2^32 records; I should be able to use binary search on that array. If the system is only allowing 32-bit indices, then I have bigger problems.

Now the meat.

The numbers used in the algorithm are indices rather than byte pointers. In rational cases, the size of a record will be at least 4 bytes. So in the 32-bit address space case, the record indices will be 30-bit numbers. Overflow will not be an issue. Same for the 64-bit address space.

The Java bug manufactured an irrational case: it made an array of 1.1 billion single bytes and tried to search that array; the numbers overflowed and generated an array-out-of-bounds reference. Think about that array of bytes for a minute. A binary search of an array with 1.1 billion items that can only have 256 distinct items?

WP should not be emphasizing such didactic corner cases. Glrx (talk) 20:02, 10 October 2017 (UTC)

Variants

This section includes the statements "... producing an in-place merge algorithm that can be combined with a standard (top-down or bottom-up) merge sort ..." and "the notion of "in-place" can be relaxed to mean "taking logarithmic stack space", because standard merge sort requires that amount of space for its own stack usage", but logarithmic stack space is only used by top-down merge sort, not by bottom-up merge sort. Rcgldr (talk) 16:43, 9 December 2017 (UTC)

@Rcgldr: Maybe an in-place stable merge routine itself requires a logarithmic amount of stack? Then combining it with a constant-stack-size bottom-up algorithm would produce a logarithmic-stack-size stable mergesort, wouldn't it? --CiaPan (talk) 17:49, 9 December 2017 (UTC)
@CiaPan: My point is that the article states that "standard merge sort requires that amount of space", not standard merge sort modified to be in place. Rcgldr (talk) 00:06, 21 December 2017 (UTC)

Problem with Top-Down example - result is on arbitrary side?

Hoping I'm not missing something here, but after going through it many times I think there's something missing in the Top-Down example as of the May 24 2018 edit.

If you do the sort with an array of size 16, then the resulting sorted array ends up in the original array (the one that was "A" in TopDownMergeSort at the top).

...but if you do the sort with an array of size 8, then the resulting sorted array ends up in the work array (the one that was "B" in TopDownMergeSort at the top).

There is no piece at the end that figures out if the result lies on the wrong array, and copies it over if that's the case.

Jlkeegan (talk) 03:57, 15 June 2018 (UTC)

I can't see what you mean. Possibly you miss that the actual arrays referenced by symbols A and B swap on each level of recursion, as TopDownSplitMerge(B[], iBegin, iEnd, A[]) invokes itself with
    TopDownSplitMerge(A, iBegin,  iMiddle, B);  // sort the left  run
    TopDownSplitMerge(A, iMiddle,    iEnd, B);  // sort the right run
As a result, A means a source array for the split and a destination array for the merge at the current level of recursion, which can be either the same as the destination array of the whole processing or the working (temporary) copy of the source data made at the beginning. And similarly (but opposite) for B, which is a temporary source array for merging at the current level of recursion. Whether it is this or that way depends on whether it is an even or odd level of recursion. --CiaPan (talk) 08:51, 15 June 2018 (UTC)
Ping added: @Jlkeegan: --CiaPan (talk) 10:52, 15 June 2018 (UTC)
Nope, I saw that. It's tough to see the problem if you're just reading the code - if you walk through it as a real example, and keep track of which actual array is changed at what point, and you pay attention to how many levels deep you go on input of different sizes, you can see the problem.
TopDownMerge (the only method making changes) decides what array to write to based on which array was passed in as its fourth argument. That argument is provided by the first argument of the TopDownSplitMerge call above it, which was provided by the fourth argument of the TopDownSplitMerge above that. The number of TopDownSplitMerges in the stack determines how many times it was switched from one array to another.
Do it out on paper, and keep track of what ACTUAL array things get written to. If you have an array of size 16, the result will indeed reside within the initial "A" array provided to TopDownMergeSort at the beginning. But if you have an array of size 8, the result will instead end up on the initial "B" array provided to TopDownMergeSort at the beginning. (The fact that the parameter names are all A and B is confusing - ignore that and keep track, or do this all out with a deck of cards).
There needs to be some step at the end to check to see if the depth was even or odd, and then in one of those cases to copy the resulting sorted array from B to A (if the final merge had been from A to B). (Again, unless I'm wrong, but I'm 99% sure I'm not). Jlkeegan (talk) 17:38, 15 June 2018 (UTC)
@Jlkeegan: Just DON'T do it on paper; compile the code and run it on a computer. That's where the code is intended to run, after all. But if you insist on manual execution, try shorter arrays first, 2 and 3 items for a beginning. And remember to complete it – do not break at the deepest level of recursion, but go on until the return to where you started.
Anyway, whatever you did on your paper, it doesn't change the code, right? And the code clearly shows that (after completion of all the recursive calls) the top-most instance of TopDownSplitMerge passes its A[] argument as the last parameter to TopDownMerge. As a result, the very last merge puts the merged data into the original destination array. Which is correct. It doesn't matter how long the array is – whether it contains two, three, eight, sixteen or a million items – and how many levels of recursion are involved, the final merge always puts data into the correct array. --CiaPan (talk) 10:43, 18 June 2018 (UTC)
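For the record, a complete trace for a two-element array (assuming the article's usual driver, which copies A into work array B and then calls TopDownSplitMerge(B, 0, n, A); the TopDownSplitMerge body is quoted in the WP:BOLD section below):

    TopDownMergeSort(A, B, 2)
        CopyArray(A, 0, 2, B)              // B becomes a copy of A
        TopDownSplitMerge(B, 0, 2, A)      // sort data from B into A
            iMiddle = 1
            TopDownSplitMerge(A, 0, 1, B)  // run of size 1, returns at once
            TopDownSplitMerge(A, 1, 2, B)  // run of size 1, returns at once
            TopDownMerge(B, 0, 1, 2, A)    // the only data movement: merges B into A

However deep the recursion goes, the outermost TopDownSplitMerge was called with the original array as its last parameter, so the final merge always lands there, as CiaPan says.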

animation

@CiaPan: Copied from my talk page:

Hi, you have removed File:Merge sort animation2.gif here: Special:Diff/887750333.
Why do you think it's not a mergesort depiction? --CiaPan (talk) 3:10 am, Today (UTC−4)

(I removed it in one other place; not sure why I didn't give as detailed an explanation here). My issue is that it doesn't show the actual merging of sublists, which instead appears to happen instantaneously. In contrast, the animation in the infobox does show the merging. –Deacon Vorbis (carbon • videos) 12:22, 15 March 2019 (UTC)

@Deacon Vorbis: Of course the image does not show every detail of the data transformation; it just shows a picture of the whole array at the end of each processing stage (here a processing stage being the merging routine). Similarly, File:Bubble sort animation.gif or File:Insertion sort animation.gif show an array state after each item reaches its destination place in the sorted part, but not after every swap. Additionally, the merging requires an external array, so the same presentation method simply can't show the whole process. Anyway, it still is a presentation of mergesort. --CiaPan (talk) 13:01, 15 March 2019 (UTC)
Well, I might argue that those aren't ideal either. Slightly off-topic, but those have better alternatives available: c:File:Sorting bubblesort anim.gif and c:File:Insertion sort.gif. I realize that a change in presentation would have to be made due to the use of an external array, but it could certainly be done. Ehh, but until someone (possibly me, but let's be realistic) actually does that, if you really think it's better to include, then feel free to put it back; I won't continue to object. –Deacon Vorbis (carbon • videos) 13:27, 15 March 2019 (UTC)

The deleted animation is showing the merge sort of an array of random integers. The caption before I changed it didn't indicate at all what was being sorted. The deleted animation is actually more realistic than the other one in this article. In an actual merge sort, the array to be sorted would be broken up into smaller subarrays and each subarray would be sorted with insertion sort. Then the subarrays would be merged. This is exactly what the animation is showing.

There is no good reason to remove this animation. It just needed a more descriptive caption. It has been in this article since at least 2013. Jrheller1 (talk) 23:49, 19 March 2019 (UTC)

@Jrheller1 and Deacon Vorbis: Not precisely the same GIF file, but the same animation has been there for over 12 years, since the end of November 2006 - see Special:Diff/90680635#Analysis. :) CiaPan (talk) 07:40, 20 March 2019 (UTC)

current animation (not the deleted one) should be updated

The currently still visible animated image conflicts with the algorithms presented in the article. What the animation depicts is a merge sort implemented using a queue instead of a stack (recursion). A queue-based implementation results in a breadth-first repeated splitting of the array until the size of runs is reduced to 1 element per run. The queue approach is not normally done because the queue space complexity is O(n), and the queue approach is not mentioned anywhere in the article. A stack approach is depth-first, left-first: the array is split in two, the left half split in two, the left quarter split in two, ..., and then and only when the leftmost two runs of size 1 are produced does the merge process begin, following the stack chain up and down in a depth-first, left-first order. A simpler fix would be to show a bottom-up approach. In one step, the initial array is separated into n runs of size 1 (this is really not a step, as the array is just treated as n runs of size 1 without any actual data manipulation), then typically even and odd runs are merged in breadth-first order. This would only require removing the intermediate split steps shown in the animation. Rcgldr (talk) 21:42, 21 March 2019 (UTC)

@Rcgldr: I'm afraid you're wrong. Please verify my understanding of your description.
The recursive (stack) implementation makes subarrays sorted in the following order:
          [0  1]
          [0  1][2  3]
          [0  1  2  3]
          [0  1  2  3][4  5]
          [0  1  2  3][4  5][6  7]
          [0  1  2  3][4  5  6  7]
          [0  1  2  3  4  5  6  7]
          [0  1  2  3  4  5  6  7][8  9]
          [0  1  2  3  4  5  6  7][8  9][10  11]
          [0  1  2  3  4  5  6  7][8  9  10  11]
          [0  1  2  3  4  5  6  7][8  9  10  11][12  13]
          [0  1  2  3  4  5  6  7][8  9  10  11][12  13][14  15]
          [0  1  2  3  4  5  6  7][8  9  10  11][12  13  14  15]
          [0  1  2  3  4  5  6  7][8  9  10  11  12  13  14  15]
          [0  1  2  3  4  5  6  7  8  9  10  11  12  13  14  15]
  1. Is that correct?
  2. Does it agree with the animation?
OTOH the iterative (queue) implementation would sort subarrays like this:
          [0  1][2  3][4  5][6  7][8  9][10  11][12  13][14  15]
          [0  1  2  3][4  5  6  7][8  9  10  11][12  13  14  15]
          [0  1  2  3  4  5  6  7][8  9  10  11  12  13  14  15]
          [0  1  2  3  4  5  6  7  8  9  10  11  12  13  14  15]
  1. Is that correct?
  2. Does it agree with the animation?
CiaPan (talk) 22:31, 21 March 2019 (UTC)
Stack or top-down recursive merge sort operations occur in this order (depth-first, left-first), using {} to indicate merged runs.
          [0  1  2  3  4  5  6  7]
          [0  1  2  3]
          [0  1]
          [0][1]
          {0  1}
                [2  3]
                [2][3]
                {2  3}
          {0  1  2  3}
                      [4  5  6  7]
                      [4  5]
                      [4][5]
                      {4  5}
                            [6  7]
                            [6][7]
                            {6  7}
                      {4  5  6  7}
          {0  1  2  3  4  5  6  7}
Bottom up order would be:
          [0  1  2  3  4  5  6  7]
          [0][1][2][3][4][5][6][7]
          {0  1}
                {2  3}
                      {4  5}
                            {6  7}
          {0  1  2  3}
                      {4  5  6  7}
          {0  1  2  3  4  5  6  7}
The animation shows a queue ordering, with some of the operations shown in parallel:
          [0  1  2  3  4  5  6  7]
          [0  1  2  3]
                      [4  5  6  7]
          [0  1]
                [2  3]
                      [4  5]
                            [6  7]
          [0]
             [1]
                [2]
                   [3]
                      [4]
                         [5]
                            [6]
                               [7]
          {0  1}
                {2  3}
                      {4  5}
                            {6  7}
          {0  1  2  3}
                      {4  5  6  7}
          {0  1  2  3  4  5  6  7}
Rcgldr (talk) 23:54, 21 March 2019 (UTC)


(ec) @Rcgldr: Sure, but diving into smaller and smaller sublists does not change the order of the data, so it is invisible in the animation. What we can see are the results of merging sublists, that is, snapshots at returns from recursive calls. So once again, please see the data ordering changes presented in the animation. Is it a depth-first or breadth-first algorithm? --CiaPan (talk) 23:55, 21 March 2019 (UTC)
@CiaPan: The animation shows an array of size 8 split into 2 sub-arrays of size 4, which are then split into 4 sub-arrays of size 2, which are then split into 8 sub-arrays of size 1. These splits are done in parallel, followed by breadth-first merges. This most closely corresponds to a queue-based merge sort, except that queue-based splits are done serially (FIFO), not in parallel, and queue-based merge sort is not discussed anywhere in the article. For a bottom-up merge sort, there is no repeated splitting of an array; instead an array of size n is immediately treated as n runs of size 1 (there are no actual splitting steps). If the animation is to show a bottom-up merge sort, the animation should show the array of 8 split into 8 sub-arrays of size 1 in a single step. Rcgldr (talk) 01:44, 22 March 2019 (UTC)

Rcgldr is discussing the animation still visible in the article. The deleted animation is showing a tiled merge sort. See the section "Optimizing merge sort". It looks like subarrays of about size 16 are sorted with insertion sort and then the sorted subarrays are merged. Maybe the deleted animation should be moved down to this section with a caption indicating that it is a tiled merge sort. Jrheller1 (talk) 01:55, 22 March 2019 (UTC)

@Jrheller1: - I separated this into a new section, since it's discussing a different animation. Rcgldr (talk) 03:02, 22 March 2019 (UTC)

Bottom up implementation - swapping reference to A and B

Consider that C, C++, C#, Java, and Python all handle array names as references, and can implement swap(A,B) so that it just swaps the references. So why not change the article to use swap(A,B), and put in a comment about using a copy for programming languages that wouldn't support such a swap? Rcgldr (talk) 18:22, 10 June 2019 (UTC)

@Rcgldr: In C and its descendants, A and B are not 'references' to objects but rather names of variables. You can't just swap the meaning of the two to make them refer to opposite data. You'd have to replace pure arrays with pointers to arrays, and that would overwhelm the whole code with lots of unnecessary dereference operators (prefix asterisks). As a result – IMHO – the code would be less readable, whilst readability should be one of the main aims of presenting the code in an encyclopedia. Leave minor optimizations to programming handbooks. --CiaPan (talk) 06:10, 11 June 2019 (UTC)
@CiaPan: The top-down example already has a minor optimization: it copies A[] to B[], then reverses the parameters on the recursive calls so that the direction of merge changes with the level of recursion. Rcgldr (talk) 09:51, 15 June 2019 (UTC)
@Rcgldr: Additionally, you'd have to count loops and make another conditional swap at the end of the routine to make sure the sorted data eventually land in the correct array. Or return that array as an explicit result of the routine. --CiaPan (talk) 06:12, 11 June 2019 (UTC)
@CiaPan: I meant when A[] and B[] are passed as parameters to a function. In C or C++, arrays passed as parameters are treated as pointers: myfun(int A[], int B[], ...) is the same as myfun(int *A, int *B, ...). Using sizeof(A) within myfun returns the size of a pointer, not the size of the array, so for C or C++, a third parameter would be needed for the number of elements. Bottom-up merge sort could use a counter for the number of passes and do a copy if the number of passes is odd, or, similar to the top-down example, which always copies at the start, bottom-up could always copy at the end. Rcgldr (talk) 13:40, 11 June 2019 (UTC)
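A minimal sketch in C of the pointer-swap idea under discussion (illustrative names; arrays are handled as pointers, as noted above):

    /* Exchange which buffer plays 'source' and which plays 'destination'. */
    void swap_buffers(int **a, int **b)
    {
        int *tmp = *a;
        *a = *b;
        *b = tmp;
    }

In a bottom-up merge sort, one would call swap_buffers(&src, &dst) after each pass; if the total number of passes is odd, the sorted data ends up in the work buffer, so one final copy (or a conditional swap, as CiaPan says) is still needed to return the result in the original array.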

I may have been a bit too WP:BOLD

Er... I just had a look at this talk page and noticed the extensive discussion of multiple versions of example code, after I just rewrote it.

As the commit message says, the motivation for the rewrite was TopDownSplitMerge(), which was:

// Sort the given run of array A[] using array B[] as a source.
// iBegin is inclusive; iEnd is exclusive (A[iEnd] is not in the set).
void TopDownSplitMerge(B[], iBegin, iEnd, A[])
{
    if(iEnd - iBegin < 2)                       // if run size == 1
        return;                                 //   consider it sorted
    // split the run longer than 1 item into halves
    iMiddle = (iEnd + iBegin) / 2;              // iMiddle = mid point
    // recursively sort both runs from array A[] into B[]
    TopDownSplitMerge(A, iBegin,  iMiddle, B);  // sort the left  run
    TopDownSplitMerge(A, iMiddle,    iEnd, B);  // sort the right run
    // merge the resulting runs from array B[] into A[]
    TopDownMerge(B, iBegin, iMiddle, iEnd, A);
}

This expects the recursive calls to TopDownSplitMerge to take their input from A[] and leave their results in B[], so the call to TopDownMerge() can copy them back into A[]. This implements an in-place sort. But didn't we just say that it needs TopDownSplitMerge to be an out-of-place sort?

Upon even more reflection, it might be that the comment is seriously misleading and the code actually works in an incredibly confusing way: because A[] and B[] are duplicates of each other, we recurse down to the bottom and find the source data in either A or B, depending on the number of levels of recursion, and then things work on the way back up. But if I've stared at it for half an hour and still can't understand it, that's not a good pedagogical example. (And q:Norm Schryer's wisdom applies.)

Almost as significantly, it uses a temporary buffer equal to the input size, which is a completely impractical implementation. We want to avoid confusing micro-optimizations (like special-casing small arrays), but using a temporary buffer of realistic size has a pretty fundamental effect on how the code is organized. You want to give an example whose structure is recognizably similar to a practical implementation.

The list code wasn't in such bad shape, but I didn't like how the top-down and bottom-up list code used different abstractions for the lists. Such gratuitous differences make it difficult for a reader to see the significant differences. One thing I specifically did was use the same core Merge operations in both examples.

I spent a while thinking about it and think I came up with pretty concise and clear implementations which don't leave out anything important (fiddly corner cases like zero-sized input are handled). But given the amount of discussion which has taken place here, I probably should have proposed it here for discussion rather than just editing it into the article. Oops.

One thing I'm not happy with and still needs fixing is the function names; the array and list sort functions currently have the same names. Fortunately, that's a sufficiently minor point that it doesn't do significant harm to the article pending resolution. I'm also not happy with the obscure bit-twiddling in the lsbit() function. I put it in its own function and commented the hell out of it, but I can't find an existing article to link to.
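For reference, the usual bit-twiddling trick for isolating the lowest set bit, which is presumably what the lsbit() function computed (a sketch, not the exact code that was in the article):

    /* In two's complement, -x flips every bit above the lowest set bit
       of x, so x & -x keeps only that lowest set bit. */
    unsigned lsbit(unsigned x)
    {
        return x & -x;    /* e.g. lsbit(12) == 4, since 12 is binary 1100 */
    }

In a bottom-up list sort, this tells you how many of the small sorted-run slots to merge after inserting the k-th node.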

Another problem which needs fixing is that the tape-based merging description refers to the bottom-up example code, but the example code (both before and after my rewrite) uses a depth-first merge order. Tape merging operates breadth-first. I at least added a few words about the difference (cut short before I digressed into cache-aware multi-way merges), but the tape merging reference needs to be fixed. Update: I fixed this.

Anyway, sorry for dropping a bomb into the article. I'm used to editing quiet articles where the talk page is overrun with tumbleweeds and asking for comments there is a months-long process, so it's WP:BOLD or go home. I should have checked this one first. 196.247.24.22 (talk) 00:50, 22 December 2019 (UTC)

Okay, it's been a month with no comment of any sort. I guess I didn't need to worry so much. 196.247.24.22 (talk) 22:42, 22 January 2020 (UTC)
The reason for no comments is probably that it's been about a year since the last change to the pseudocode, and some of the members were not expecting any changes and not monitoring the article. I was away or busy with other projects and missed the changes that were made. Based on prior discussions on this, here and at other members' talk pages, I reverted the examples back to their prior state. I'm not a fan of the mix between the top-down and bottom-up for linked list examples, but prior discussions wanted the top-down left alone. Maybe now is the time to revisit this, but I'll have to somehow contact the prior members involved in this. I did add an explanation for how the top-down merge sort for arrays works. Please sign up for a Wiki account so that you have a talk page providing a means for others to communicate with you. Rcgldr (talk) 16:27, 3 February 2020 (UTC)
Your code (from Special:Permalink/931876827#Top-down implementation) seems incorrect. It looks like if fed with a two-item array {0, 1} it will return {1, 0} as a result. --CiaPan (talk) 10:12, 4 February 2020 (UTC)
@CiaPan: For others reading this, CiaPan is referring to a prior edit by user:196.247.24.22 that was undone, and reverted back to a known working version: Special:Permalink/938979353#Top-down implementation Rcgldr (talk) 15:12, 4 February 2020 (UTC)
That's right, I am sorry for expressing myself in such an ambiguous way.
Thank you, Rcgldr, for explaining things. --CiaPan (talk) 15:51, 4 February 2020 (UTC)

Minor bug correction

I'm leaving a note here since obviously there's been a lot of work done but the change I'm making is rather minor. I've skimmed over the most recent discussion in particular and don't think I've missed any discussion of this but of course I could be mistaken.

In doing an implementation, I noticed an off-by-one error: in checking for size one, the difference of the indices is compared against 2. But the indices are inclusive, I believe, so if we have, for instance, 0 & 1, the difference is 1 but the size is 2. I think this bug slipped in when going from a function which checked length to using array indices from which it's calculated, but I haven't checked the history to confirm that.

So, I'm going to change the value there to 1 from 2 as I believe that's the correct comparison (that is: if b-a < 1, then we have 1 (or fewer) elements and don't need to split/merge).

Écrivan renaissant (talk) 06:28, 17 July 2020 (UTC)

@Écrivan renaissant: As for:
But the indices are inclusive I believe
I'd say it's usually better to know than to believe.
And the comment just above the function TopDownSplitMerge you modified explicitly says:
// Sort the given run of array A[] using array B[] as a source.
// iBegin is inclusive; iEnd is exclusive (A[iEnd] is not in the set).
Please revert your change. --CiaPan (talk) 13:54, 22 July 2020 (UTC)

@Écrivan renaissant: Another editor fixed that: special:diff/968920370. --CiaPan (talk) 20:30, 5 August 2020 (UTC)

Incomplete reference

In sub section Use with tape drives, there seems to be an incomplete reference (currently note 11):

<ref>Selection sort. Knuth's snowplow. Natural merge.</ref>

--Mortense (talk) 21:00, 20 November 2020 (UTC)

@Mortense: That's been added by Glrx in this edit Special:Diff/471918025 on 17 January 2012.
Glrx is not very active these days, so I would not expect a prompt reply from them - but, who knows...? --CiaPan (talk) 00:20, 22 November 2020 (UTC)
@Mortense: Same expression appears in a more detailed citation:
Donald Knuth, The Art of Computer Programming, Sorting and Searching, Volume 3, 1973. The "snowplow" argument. p. 254
in the Tournament sort article, section #Common application. That was also added by Glrx in Special:Diff/341048803 on 31 January 2010 and expanded (by adding the "snowplow") in Special:Diff/426315622 on 28 April 2011. --CiaPan (talk) 00:33, 22 November 2020 (UTC)


I have no access to The Art of Computer Programming at the moment, but I found a quite precise description of the snowplow here:
URL: http://didawiki.cli.di.unipi.it/lib/exe/fetch.php/magistraleinformatica/ir/ir10/notes.pdf
Author: Paolo Ferragina
Title: unknown
Chapter: 4
Chapter title: Sorting Atomic Items
Pages: 4-4 and following
HTH. --CiaPan (talk) 00:53, 22 November 2020 (UTC)[reply]

@Mortense and Glrx: I've added the reference, at last: Special:Diff/1022865342. Better late than never. CiaPan (talk) 23:30, 12 May 2021 (UTC)

Merge sort inspired by punched card collators?

For example, one of the operations of the 1937 IBM 077 punched card collator was to merge two already sorted decks of cards. It could also remove duplicates. However, I haven't been able to find out if merge sort was inspired by the merge operation of such collators. Rcgldr (talk) 18:09, 12 May 2021 (UTC)

@Rcgldr: Some hints on source are given in the How did Von Neumann come up with his Merge Sort algorithm? thread at the 'History of Science and Mathematics' StackExchange site:
https://hsm.stackexchange.com/questions/12549/how-did-von-neumann-come-up-with-his-merge-sort-algorithm
This thread at Quora seems relevant, too (but I didn't read it): How did Von Neumann discover the merge sort? Was there any predecessor to the merge sort?
https://www.quora.com/How-did-Von-Neumann-discover-the-merge-sort-Was-there-any-predecessor-to-the-merge-sort
CiaPan (talk) 21:50, 12 May 2021 (UTC)[reply]

The analysis of the worst-case number of comparisons is incorrect

The analysis of the worst-case number T of comparisons of keys performed by Merge sort is incorrect.

It begins from an incorrect recurrence relation

    T(n) = 2T(n/2) + n

on T. First, the said recurrence relation is different for the worst case, the average case, and the best case. Second, it is formally invalid, except for the cases when n is a power of 2 (because T requires integer arguments and n/2 is not an integer if n is odd). Third, it mishandles the basis case of n = 1. And the reference [1] that is supposed to justify it is invalid as it does not imply it. [This sentence was added by M A Suchenek (talk) 20:14, 8 June 2021 (UTC).]

The correct recurrence relation for the worst-case number of comparisons of keys performed by Merge sort on n distinct elements is

    T(n) = T(⌈n/2⌉) + T(⌊n/2⌋) + n - 1  for n ≥ 2,

and

    T(1) = 0.

As I mentioned before, the n-element array cannot be split on two subarrays of equal size n/2 if n is odd, simply because an array cannot have a size that is not an integer number. For instance, if n = 3 then the array is split on two subarrays of sizes 2 and 1, and not on two arrays of size 1.5.

Also, the worst-case number of comparisons of keys while merging two sorted lists of the total size n is n - 1 and not n. For instance, merging two 1-element lists (thus having a total of 2 elements) requires 1 comparison of keys and not 2 comparisons.

The comment that Knuth's analysis is incorrect because Knuth allegedly used a "slightly suboptimal" version of Merge sort was left unexplained. As a result, it remains unclear what Merge sort is being analyzed in the section "Analysis". The standard version of Merge sort, which satisfies the standard recurrence relation given above, has exactly the worst-case performance (and not one "slightly" better than) given by Knuth. M A Suchenek (talk) 21:18, 1 June 2021 (UTC)

Ok, but we would need a source that meets Wikipedia guidelines to do anything with this. Article preprints (especially ones you have written yourself) don't meet that bar. See also Wikipedia:No original research - MrOllie (talk) 01:40, 2 June 2021 (UTC)
The bar is the truth, particularly if it has been proved, and not how it was published. References to and quotations from arxiv.org are not "original research". On one hand, you removed factual material that has elementary, easy-to-verify proofs in the referenced material, while on the other hand you keep unsubstantiated or false statements (like the claim that Knuth's formula was incorrect) with no reference to any source whatsoever. Perhaps you may consider meeting your own standards? M A Suchenek (talk) 08:56, 2 June 2021 (UTC)
M A Suchenek, No, that is the opposite of how Wikipedia works. See Wikipedia:Verifiability, not truth. I urge you to read WP:NOR and WP:RS. arxiv.org preprints have absolutely no editorial controls. Under the philosophy you are espousing here, anyone could write anything, upload it to arxiv, and then use that as a source. The existence of other improperly sourced content is a reason to fix that content, not to add more improperly sourced content. Especially not when you have a conflict of interest with the source you're adding. - MrOllie (talk) 11:50, 2 June 2021 (UTC)
You are mistaken. See below. Moreover, the references that you inserted into the article, section Analysis, are irrelevant, as they do not support the false claims that were left with no references in its previous version. M A Suchenek (talk) 20:14, 8 June 2021 (UTC)

Self-contradictory argument, absurd assertion, etc.

MrOllie: Your argument (above) is self-contradictory and, therefore, invalid.

First, you removed my insertion of relevant material with a published proof, claiming that the reference was not "reliable". Then you justified the removal with a reference to the Wikipedia:Verifiability, not truth page, which is in itself unreliable and unverifiable and, as a matter of fact, comprises advice and opinions of some individuals of unknown credibility, as the following quote (with emphases added) from the top of that page Wikipedia:Verifiability, not truth clearly indicates:

"This is an essay on the Verifiability policies.
It contains the advice or opinions of one or more Wikipedia contributors. This page is not an encyclopedia article, nor is it one of Wikipedia's policies or guidelines, as it has not been thoroughly vetted by the community. Some essays represent widespread norms; others only represent minority viewpoints."

You continued justifying the removal of my addition by pointing to the Wikipedia page WP:NOR "No original research", while the entire article that I added to has a disclaimer at the very top stating:

"This article possibly contains original research. Please improve it by verifying the claims made and adding inline citations. Statements consisting only of original research should be removed." (Emphasis added.)

Thus you don't pass your own standards that you are trying to impose on me.

Then you go on claiming that the proved truthfulness of my insert doesn't matter because its published proof does not come from a "reliable" reference. This assertion of yours, that the truth is secondary to the credibility of the instrument of its publication, is not just absurd; it goes against the mainstream methodology of modern science and - particularly - mathematics. The exact formula for the best-case performance of Merge sort has a published (in arxiv.org [2]) proof that is verifiable by anyone with sufficient college math preparation and enough IQ. Removing it was an act of suppression of objective (and verifiable) knowledge under the doctrine that it comes from a source not approved by you. Such an action falls into the category of censorship.

Moreover, you apparently use a double standard as to which material fits for inclusion in Wikipedia pages and which does not. On one hand, you removed objectively true material that I posted, despite the fact that it has a published proof[2], but at the same time you let false information in the same section of the article (as I indicated above) remain included, despite the fact that it has no relevant reference to any credible source that would validate it, and despite the direction at the top of the article (quoted above) that such information has to be removed. So when it comes to obviously false statements in the article, like the claim (with a mostly irrelevant reference) that the recurrence relation for Merge sort is:

    T(n) = 2T(n/2) + n

or the comment (with an irrelevant reference) that Knuth's analysis is incorrect because Knuth allegedly used a "slightly suboptimal" version of Merge sort (a statement that contradicts the standard definition of Merge sort, which is characterized by the recurrence relation (see [3])

    T(n) = T(⌈n/2⌉) + T(⌊n/2⌋) + n - 1  for n ≥ 2,

and

    T(1) = 0),

you are oblivious to the fact that they are not only lacking references but are false; yet you give yourself a mandate to remove my contribution because the reference with a verifiable proof that I provided[2] does not meet your absurd standard of source credibility.

Once again, you don't obey the very standards that you expect others to follow. I wonder where your "authority" to impose those standards on others while exempting yourself from them comes from?

Your comment about my "conflict of interest" is not just an unproven allegation; it is utterly absurd. (I suppose you do not understand the legal meaning of the term "conflict of interest".) Your misguided comment implies that I am somehow not allowed (by what authority?) to refer to proofs that I have constructed and published[2]. Based on the clear double standard that you exhibited in this discussion and your obvious bias that favors false statements of your liking over provably true statements that you reject, I suppose that your accusing me of "conflict of interest" is psychological projection.

Your comment about anyone being able to publish anything on arxiv.org is as false as it is irrelevant. Because a proof is a proof no matter how it was published, as long as it was published in a way that allows unrestricted and stable public access to it.

Due to your misguided actions, the article Merge sort, section Analysis, is of substandard quality.

M A Suchenek (talk) 20:14, 8 June 2021 (UTC)

M A Suchenek, Please see WP:COI, which has been linked on your user talk page for about a week now. Legal meanings have nothing to do with it (why would they?). Linking your own preprint is absolutely what Wikipedia's COI policy has in mind. We cannot source things to user generated content. I think you'll get much further on Wikipedia if you read and try to understand the policy pages I have been linking rather than ignoring them and filibustering the talk page with irrelevancies. MrOllie (talk) 20:32, 8 June 2021 (UTC)
You keep repeating your absurd allegations. Here are quotes from WP:COI:
"Conflict of interest (COI) editing involves contributing to Wikipedia about yourself, family, friends, clients, employers, or your financial and other relationships."
and
"Editors with a COI, including paid editors, are expected to disclose it whenever they seek to change an affected article's content. Anyone editing for pay must disclose who is paying them, who the client is, and any other relevant affiliation; this is a requirement of the Wikimedia Foundation."
There is nothing there that would suggest that referring to one's own published proofs of mathematical facts qualifies as "conflict of interest". Got it?
You are certainly entitled to your flawed opinion, but you have no authority to decide what Wikipedia is or is not, or to misinterpret its policies.
On this forum, I have specifically debunked your self-contradictory and otherwise invalid arguments, which you keep repeating, while the article (section "Analysis") still contains misinformation and irrelevant references. If this is something that is too difficult for you to understand then, perhaps, you should find yourself something else to do. M A Suchenek (talk) 17:35, 9 June 2021 (UTC)
@M A Suchenek: Re There is nothing there that would suggest that referring to one's own published proofs of mathematical facts qualifies as "conflict of interest". Got it? – you are wrong (and rude, too). Citing yourself does not always qualify as COI, but it may well be considered as such, and the policy says it explicitly. Please see WP:SELFCITE. It may also qualify as self-promotion, which is one of the things Wikipedia is not.
BTW, you may have forgotten that when you point your finger at someone, your other three fingers point at yourself. Take care not to fall into the scope of your last paragraph: if politely resolving a conflict, according to Wikipedia policies, is too difficult for you, then, perhaps, ................
CiaPan (talk) 22:25, 9 June 2021 (UTC), uninvolved
@CiaPan: Here is a relevant quote from WP:SELFCITE:
"Using material you have written or published is allowed within reason, but only if it is relevant, conforms to the content policies, including WP:SELFPUB [Self-published sources], and is not excessive".
My self-citation was relevant, did not violate content policies, and was not excessive.
The above makes your and MrOllie's insinuation that my edits were violating Wikipedia policy an absurd allegation based on opinion and not supported by facts.


Also, as I have indicated before, the "argument" that was used in order to justify suppression of mathematical truth by means of deletions of my edits was an instance of psychological projection: MrOllie was accusing me of using "unreliable" sources while basing his opinions on what Wikipedia has clearly declared as unreliable sources. Now you project on me his projection - a typical defensive tactic of someone who is wrong on the facts.


The difference between you and me is substantial: I consider mathematical truth the overriding objective, and you apparently don't. And if you consider pointing it out "rude" then so be it. If you are so smart then go ahead and find a mathematical error in the published and readily available elementary material that I quoted (see the section "Disputed material", below) and in my detailed indication of errors in the article (see the beginning of the section "The analysis of the worst-case number of comparisons is incorrect", above), rather than hiding behind your misinterpretations of Wikipedia policy.


The bottom line is that you keep inventing things and resorting to unproven opinions and fallacious arguments, while the article in question still contains misinformation in the section "Analysis" (false statements and an irrelevant reference that does not support them), as I have indicated in detail - but you seem indifferent to it. It appears that your main objective here is not to assure the truthfulness of Wikipedia articles but to win an argument regardless of what the truth is. M A Suchenek (talk) 17:14, 20 June 2021 (UTC)
M A Suchenek, As I've told you several times, the use of arxiv preprints does, in fact, violate content policies. MrOllie (talk) 18:12, 20 June 2021 (UTC)


@MrOllie: You have been wrong on these, as I indicated above. Please show me the specific Wikipedia "content policies" (as opposed to yours or somebody else's opinion) that "the use of arxiv preprints does [...] violate". M A Suchenek (talk) 19:10, 24 June 2021 (UTC)

Disputed material

For the record, here is the body of my insert that was deleted:

The recurrence relation on the best-case number of comparisons of keys performed by Merge sort on n distinct elements is given by:

    B(n) = B(⌊n/2⌋) + B(⌈n/2⌉) + ⌊n/2⌋  for n ≥ 2,

and

    B(1) = 0.

To understand it, it suffices to note that in the best case, each element of the shorter of the two merged lists must be compared to an element of the other list.

Solving the above recurrence relation turns out quite tricky, and is more involved than solving the recurrence relation for the worst case. It has been done, for instance, in [2]: the closed-form solution expresses B(n) as roughly half of the worst-case number of comparisons, corrected by a term involving a continuous real function that oscillates like a zigzag between 0 and a fixed bound (the exact formula is derived in [2]).

In particular, for relatively small n, B(n) is substantially larger than half of the worst-case number of comparisons, as an inequality derived for any n in [2] shows.

As n converges to ∞, the ratio of the above difference to B(n) converges to zero; that is, for large n, the best-case number of comparisons B(n) is roughly equal to, although larger than, half of the worst-case number of comparisons.

References

  1. Cormen et al. (2009, p. 36)
  2. Suchenek, M. A., "Best-case Analysis of MergeSort with an Application to the Sum of Digits Problem", https://arxiv.org/pdf/1607.04604v1.pdf
  3. Suchenek, M. A., "Elementary Yet Precise Worst-case Analysis of MergeSort", https://arxiv.org/pdf/1702.08443.pdf

I just randomly stumbled upon the OR tag. I have to agree with MrOllie. You can't use your own unpublished research as a citation here. I'd cite the relevant wiki policy, but MrOllie did this already. Stix1776 (talk) 10:59, 23 June 2022 (UTC)

possible error

I am new to C++ and Wikipedia editing (I have been losing sleep trying to find these bugs), so I do not have the confidence to make this edit: I think there is an error in TopDownMerge; the first if statement I think should not read

   if (i < iMiddle && (j >= iEnd || A[i] <= A[j])) {

but

   if (i <= iMiddle && (j > iEnd || A[i] <= A[j])) {

Reasoning: If the array has 2 elements, i(0) will be equal to iMiddle(0) and j(1) will be equal to iEnd(1). On the first pass the first statement will now be true, the second will be false. The size of each element now determines which is copied into the other array first.

Rthepb (talk) 14:42, 25 November 2021 (UTC)

@Rthepb: That's wrong. When the array has two elements then iBegin equals 0, iMiddle equals 1 and iEnd equals 2. --CiaPan (talk) 21:50, 26 November 2021 (UTC)

ping pong merge sort

Is this section needed, or should the section note that most implementations of merge sort already incorporate a similar concept? The top-down merge sort example (for arrays) avoids copy-back by changing the direction of merge for each level of recursion (it does do an initial copy, but mutually recursive functions can be used to avoid the initial copy). The bottom-up merge sort example mentions this in a comment that swapping pointers would avoid the shown copy-back loop. The first two passes of a bottom-up merge sort are the equivalent of a ping-pong merge sort. Alternating the direction of merge operations for merge sort dates back to the early 1960's (or earlier) for tape sorts using merge sort or polyphase merge sort. Rcgldr (talk) 18:34, 10 January 2023 (UTC)

A ping-pong merge involves merging 4 segments at once, which in turn allows parallelism on modern architectures. It wouldn't have made sense back in the 60's. Better sources on the subject should become available in the coming years, and I presume they will refer to it as ping-pong merging. Yugodocs (talk) 14:11, 15 January 2023 (UTC)[reply]
@Yugodocs: - parallelism is typically more generic. Say there are 8 cores available: the array to be sorted is split into 8 even (or nearly even) parts, and each of the 8 parts is sorted (assume via merge sort). After all 8 parts are sorted into 8 sorted runs, 4 cores are used to merge 4 pairs of sorted runs into 4 sorted runs, 2 cores to merge those into 2 sorted runs, and 1 core to merge the final pair into the sorted result (a sketch of this scheme follows this comment). Example of a Windows multithreaded merge sort, this one from 2016. My first version was around 2012, when I bought my first 4-core desktop. IBM mainframes dating back to the late 1960s supported true multi-tasking, but I don't know when the first multi-threaded merge sort was implemented on an IBM mainframe. Sort (Unix), first released in 1971, defaults to 16 threads for the initial internal-memory merge sort phase (it reads a group of records, creates and uses a multi-threaded sort to sort an array of pointers to records, then writes the records according to the array to create sorted runs), then defaults to a single-threaded 16-way merge for the external merge phases. The GitHub source code for Sort shows a date of 1988; I don't know where prior versions are archived. Rcgldr (talk) 19:56, 15 January 2023 (UTC)[reply]
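A compact sketch of that scheme, reduced to 4 workers for brevity and using std::async plus std::inplace_merge; all names and structure here are mine, not from any of the implementations mentioned:

#include <algorithm>
#include <future>
#include <vector>

void ParallelMergeSort(std::vector<int>& a)
{
    const size_t n = a.size(), q = n / 4;
    size_t cut[5] = {0, q, 2 * q, 3 * q, n};

    std::vector<std::future<void>> fs;
    for (int i = 0; i < 4; i++)                    // sort 4 slices in parallel
        fs.push_back(std::async(std::launch::async, [&, i] {
            std::sort(a.begin() + cut[i], a.begin() + cut[i + 1]);
        }));
    for (auto& f : fs) f.get();

    auto m1 = std::async(std::launch::async, [&] { // merge 2 pairs in parallel
        std::inplace_merge(a.begin() + cut[0], a.begin() + cut[1],
                           a.begin() + cut[2]);
    });
    std::inplace_merge(a.begin() + cut[2], a.begin() + cut[3],
                       a.begin() + cut[4]);
    m1.get();

    std::inplace_merge(a.begin(), a.begin() + cut[2], a.end()); // final merge
}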
I was referring to Instruction-level parallelism and Memory-level parallelism. The pdqsort author has mentioned the former while the quadsort author has mentioned the latter. It's not related to multi-threading as far as I know. Yugodocs (talk) 01:11, 16 January 2023 (UTC)[reply]
@Yugodocs: Doing a web search for ping pong merge, I found a Microsoft article on improvements to Patience sort, which they call Patience+. The point of that article is how to improve Patience sort, which is based on the UK Patience card game, similar to the US Solitaire card game. Ping-pong merging will merge all (not just 4) sorted runs to an alternate container, then merge back to the original container on the next pass. Merge sorts have been using ping-pong merging since the 1960s. The original implementation of Patience sort was not using ping-pong merging, and that was one of the improvements. The improved Patience sort, called Patience+, is mostly useful for sorting nearly sorted data. I'm not aware of any standard libraries (such as C++ std::sort or std::stable_sort or std::list::sort) that implement it. Rcgldr (talk) 19:35, 17 January 2023 (UTC)[reply]
@Yugodocs: - pdqsort is a pattern-defeating quicksort. quadsort is a 1000+ line (.h + .c) version of merge sort with a lot of optimizations. The author compares it to some version of C++ std::stable_sort(), which usually is not that optimized. The Dinkumware | HP | Microsoft implementation of std::stable_sort is based on 1994 code: it defaults to allocating a temp array 1/2 the size (or smaller) of the source array, always uses a pointer to function for compare, has no optimization to set the size of runs created by insertion sort, ..., all of which affect performance. It uses insertion sort to create sorted runs of 32 elements, then switches to bottom-up merge sort. Getting back to the main point, traditional merge sorts have done the equivalent of ping-pong merging since the 1960s. Rcgldr (talk) 18:43, 18 January 2023 (UTC)[reply]
In that case, should the section mention an early merge sort that incorporates a ping-pong merge?
@Yugodocs:This Wiki article already does: Merge_sort#Use_with_tape_drives ... IBM 729 (late 1950s). This is also mentioned at quadsort github under Credits: "The ping-pong merge had been independently implemented in wikisort prior to quadsort, and likely by others as well." Rcgldr (talk) 13:29, 19 January 2023 (UTC)[reply]
The memory-level parallelism of the quad merge appears notable, as does the paired branchless merge the author mentions in the quadsort article. There's also this article: https://www.researchgate.net/publication/255567817_Branch_Mispredictions_Don't_Affect_Mergesort
There is definitely a close to 100% performance improvement for quadsort over std::stable_sort, which the author attributes to combining a branchless merge with writing to two memory regions in the same loop, and something about loop unrolling. Yugodocs (talk) 19:46, 18 January 2023 (UTC)[reply]
@Yugodocs: - I converted quadsort to compile with Visual Studio (VS doesn't allow 'return' of a void expression) and compared it to my generic hybrid insertion sort | merge sort programs: it runs between 1.5 times (Intel 3770K desktop) and almost 2.0 times as fast (Intel 10510U laptop) with pseudo-random data, but 1.6 times slower in the unusual case of a lot of duplicates. I'm not sure why the duplicate case is an issue; for a typical merge sort it's similar to already sorted data, where the number of compares is about n/2 instead of n-1, with only 1 in n/2 branch mis-predictions. quadsort is very fast with already sorted data, so it probably just needs some additional optimization for duplicates. It's 1000 lines of code with a lot of optimizations, most of which reduce the number of compares, some of which reduce the number of moves. Scandum mentions that branchless parity merges and galloping merges that operate from both ends of an array are the key optimizations, although the rest of the optimizations contribute to the performance (a sketch of the branchless idea follows this comment). Rcgldr (talk) 03:20, 23 January 2023 (UTC)[reply]
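A minimal sketch of what a branchless merge step can look like, with the compare result folded into the index arithmetic; this illustrates the general technique discussed above, not quadsort's actual code:

#include <cstddef>

void BranchlessMerge(const int* a, std::size_t na,
                     const int* b, std::size_t nb, int* dst)
{
    std::size_t i = 0, j = 0, k = 0;
    while (i < na && j < nb) {
        bool takeRight = b[j] < a[i];        // 0 or 1, usable as an increment
        dst[k++] = takeRight ? b[j] : a[i];  // typically compiled to a cmov
        j += takeRight;                      // exactly one index advances,
        i += !takeRight;                     // with no unpredictable branch
    }
    while (i < na) dst[k++] = a[i++];        // drain whichever run remains
    while (j < nb) dst[k++] = b[j++];
}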

Order of Complexity of Merge Sort[edit]

The worst-case time complexity for this algorithm is n*lg(n) according to Introduction to Algorithms, not n*log(n) as noted in the Analysis section. --Siavoshkc (talk) 17:22, 24 September 2023 (UTC)[reply]

@Siavoshkc: Those are the same. Changing a logarithm base is equivalent to multiplying by some constant factor, and constant factors are meaningless in the order of complexity. --CiaPan (talk) 19:06, 24 September 2023 (UTC)[reply]
Thanks! It was bugging my mind for a while. Siavoshkc (talk) 08:11, 7 November 2023 (UTC)[reply]
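For the record, the base-change identity behind the reply above, written out (standard algebra, not specific to the article):

lg n = ln n / ln 2, so n lg n = (1 / ln 2) · n ln n ≈ 1.4427 · n ln n,

and since 1/ln 2 is a constant, n lg n = Θ(n log n) for any fixed logarithm base.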

Recursive Merge Sort for Doubly Linked List - no list scanning to split sub-lists[edit]

Instead of top-down scanning of lists to split them in half, for a doubly linked list a list-size integer value is divided by 2 at each level of recursion, until the base case size == 1, where a pointer | iterator to the next node is returned. The recursive merge sort updates (pass by reference) a pointer | iterator to the beginning node and returns a pointer | iterator to the ending node of a sorted run. Visual Studio 2022 switched to this method for std::list::sort. The following is a C++ example using iterators. Iterator names: li = left run iterator, ri = right run iterator = end left run iterator, ei = end right run iterator. Merge splices (moves and inserts) sub-lists (multiple nodes) from the right run to the left run wherever right-run values are less than the current left-run value; otherwise it just advances the left run iterator. Splice requires a doubly linked list.

#include <iterator>
#include <list>

template <typename T>
typename std::list<T>::iterator Merge(std::list<T> &ll,
                             typename std::list<T>::iterator li,
                             typename std::list<T>::iterator ri,
                             typename std::list<T>::iterator ei)
{
    typename std::list<T>::iterator si;   // end of the right-run prefix to splice
    typename std::list<T>::iterator ni;   // beginning node of the merged run
    ni = (*ri < *li) ? ri : li;
    while(1){
        if(*ri < *li){
            // advance si past right-run nodes that belong before *li
            for(si = std::next(ri); si != ei && *si < *li; si++);
            ll.splice(li, ll, ri, si);    // move [ri, si) in front of li
            ri = si;
            if(ri == ei)                  // right run exhausted
                return ni;
        } else {
            if(++li == ri)                // left run exhausted
                return ni;
        }
    }
}

template <typename T>
typename std::list<T>::iterator SortListR(std::list<T> &ll,
                                 typename std::list<T>::iterator &li,
                                          size_t size)
{
    if (size == 1)                          // a one-node run is already sorted
        return std::next(li);
    typename std::list<T>::iterator ri;
    typename std::list<T>::iterator ei;
    ri = SortListR(ll, li, size-size/2);    // sort left run;  li = its beginning
    ei = SortListR(ll, ri,      size/2);    // sort right run; ri = its beginning
    li = Merge(ll, li, ri, ei);             // merge runs; li = beginning of result
    return ei;
}

template <typename T>
void SortList(std::list<T> &ll)
{
    if (ll.size() < 2)                  // return if nothing to do
        return;
    auto li = ll.begin();               // named lvalue: SortListR takes the iterator by reference
    SortListR(ll, li, ll.size());
}

Rcgldr (talk) 02:14, 24 April 2024 (UTC)[reply]
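A minimal usage sketch, assuming the templates above are in the same translation unit:

#include <iostream>
#include <list>

int main()
{
    std::list<int> ll{5, 1, 4, 2, 3};
    SortList(ll);
    for (int v : ll) std::cout << v << ' ';   // prints: 1 2 3 4 5
    std::cout << '\n';
}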