Simple Merge Sort
Hello,
I am currently working on an assignment for one my courses at school that involves analyzing the different complexities of sorting algorithms. There is a piece of my assignment that is giving me a good deal of grief, specifically the merge sort algorithm.
I've searched around the web but the examples I have found are regarding collections of objects, or are modified from the original merge sort to be faster (and thus harder to understand). Can someone point me in the direction of an example of a simple merge sort for an array of integers?
[571 byte] By [
xWagona] at [2007-9-29 12:23:36]

I forgot to add, thanks in advance for your time and attention. It is much appreciated
You can write this in about a dozen lines. Here's some
pseudocode.
copy the first half of the array,
copy the second half,
sort each half (use recursion)
merge the sorted halves
Each line needs a new method (although they could be
written inline, but that's not as clear).
You don't have to use recursion, but it's easier.
Because the size of the array is halved at each step,
the depth of recursion is only about lg array.length
and you won't run out of stack space.
In addition to TerryMoores advise I would suggest that you limit your
merge sort such that it only handles data whose number of elements is
a power of two. That should be all you need to analyse the performance
of the algorithm and it means you can avoid all the extra little bits
that are needed to handle data of arbitrary length.
Remember all whole powers of two can be divided by two a whole
number of times. obvious ain't it :)
matfud
Thank you kindly, this has been helpfull
No need to copy the array - you can merge in-place. I think.
> No need to copy the array - you can merge in-place. I
> think.
I'd love to know how. A usually quoted advantage of quicksort
is that it sorts in place. I think it's possible to only copy
half of the array. First I'd copy the front half to another
array. If the halves were already sorted, then the halves
could be merged back into the original array. Well, you can
do the same recursively for each half. The net effect is to
copy the whole array (except for the very last element), but
only once over all recursions. I'm sure the same could be
done iteratively too.
I haven't tried this though.
> > No need to copy the array - you can merge in-place. I
> > think.
> I'd love to know how.
As you wish.
package pmurray_at_bigpond_dot_com.javaforums;
import java.util.Random;
/**
* Merge routine.
* @author Paul Murray
* @version 1.0
*/
public class Main6
{
// STATICS //////////
static final Random rnd = new Random();
public static void main(String[] args) {
new Main6(args).go();
}
// INSTANCE //////////
final String[] args;
Main6(String[] args) {
this.args = args;
}
void go() {
int[] ary = new int[100];
check:
for (int j = 0; j < 10000; j++) {
// make up some random data
int v;
v = 0;
for (int i = 0; i < 50; i++) {
v += rnd.nextInt(10);
ary[i] = v;
}
v = 0;
for (int i = 50; i < 100; i++) {
v += rnd.nextInt(10);
ary[i] = v;
}
// perform the merge
merge_in_place(ary, 0, 50, 100);
for (int i = 0; i < 99; i++) {
if (ary[i] > ary[i + 1]) {
System.out.println("Array is not sorted");
break check;
}
}
}
}
void merge_in_place(int[] a, int start, int mid, int end) {
while (start < mid && mid < end) {
if (a[start] <= a[mid]) {
start++;
continue;
}
// ok. we have a value at the start of the first block
// which is greater than the value at the start of the second
// block. How many items in the second block are greater than
// that first value?
int item_ct;
for (item_ct = 0; mid + item_ct < end; item_ct++) {
if (a[start] <= a[mid + item_ct]) {
break;
}
}
// we have 2 situations. Either item_ct is longer than the
// entire size of the first block, or it is not.
if (item_ct <= mid - start) {
// allright! lets swap the items out of the second block
// into the first block;
for (int i = 0; i < item_ct; i++) {
int xx = a[start + i];
a[start + i] = a[mid + i];
a[mid + i] = xx;
}
// but now that we have done this, the second block is no longer
// sorted, right? It has two blocks, one from the start of the
// first block, and one left there previously. So let's merge them
merge_in_place(a, mid, mid + item_ct, end);
// and now we make note of the fact that we have put some items
// where they are supposed to go
start += item_ct;
}
else {
// hmm, the corrct block from the second half is longer than the
// entire first half. We need to roll that block up to the start location.
// Seeing as that takes care of the entire remaining chunk of the start
// block, we handle the remaining merging by tail recursion.
rollLeft(a, start, mid + item_ct, mid - start);
start = mid;
mid += item_ct;
}
}
}
// yes, there are much beter ways to do this. Why not roll your
// own? You didn't ask for a rolling algorithim.
void rollLeft(int[] a, int start, int end, int ct) {
while (ct > 0) {
int zz = a[start];
System.arraycopy(a, start + 1, a, start, end - start - 1);
a[end - 1] = zz;
ct--;
}
}
}
> > > No need to copy the array - you can merge
> in-place. I
> > > think.
>
> > I'd love to know how.
>
> As you wish.
> ...
Thanks, that's quite clever. Whether it's worth the trouble is
another question--even after doing the rollLeft a bit better.
I didn't doubt that the arrays could be merged in situ, but I
thought it would need to insert the second half in the first
element by element, probably decreasing the efficiency to
insertion sort levels.
BTW, there's a bug. You forgot to sort the two halves first.
With this small change the test works fine.
Naah, that recursive algorithm does too much work. Try this:
void merge_in_place2(int[] a, int start, int mid, int end) {
while (start < mid && mid < end) {
if (a[start] <= a[mid]) {
start++;
}
else if (mid - start == 1 && end - mid == 1) {
// a simple swap
int zz = a[start];
a[start] = a[mid];
a[mid] = zz;
mid = end;
start = mid;
}
else if (false && mid - start == 1) {
int zz = a[start];
/**
* @todo binary search the second block and insert zz
*/
}
else if (false && end - mid == 1) {
int zz = a[mid];
/**
* @todo binary search the first block and insert zz
*/
}
else {
int zz = a[mid];
System.arraycopy(a, start, a, start + 1, mid - start);
a[start] = zz;
start++;
mid++;
}
}
}
> BTW, there's a bug. You forgot to sort the two halves first.
> With this small change the test works fine.
It's not a bug: the method merges in place an array that has two sorted blocks in it. ie: it works according to spec. How you go about getting an array to look like that in the first place is a different question. The main routine generates test data appropriate for what the method does.
> > BTW, there's a bug. You forgot to sort the two
> halves first.
> > With this small change the test works fine.
>
> It's not a bug: the method merges in place an array
> that has two sorted blocks in it. ie: it works
> according to spec. How you go about getting an array
> to look like that in the first place is a different
> question. The main routine generates test data
> appropriate for what the method does.
When I ran it, it didn't work until I sorted the two halves
first. It's possible that I made an error in reformatting
(for some reason my computer can't always copy so I saved
the page and had serious reformatting to do).
I think xWagon is beginning to think that merge sort is
complicated--which it isn't. It's just merging in place
that can be tricky.
Here's the basic method:
public void sort(int[] a, start, end)
{ int mid = (start+end)/2;
sort(start, mid); sort(mid, end);
merge(a, start, mid, end);
}
where we have been discussing how to do the merge.
The more I think about it the more I believe that it's
going to be much easier to copy the elements from start
to mid (System.arraycopy is the easiest way) to make
room for elements from the second block. Say this
creates an array b.
Then merge is simply comparing the elements at the
head of b and the second block of a, and copying the
smaller into the start of a. Then keep working down
both arrays in the same way.
void merge(int[] a, start, mid, end)
{ int[] b = new int[mid-start);
System.arraycopy(a, start, b, 0, mid);
for(int i = start, j = mid, k = 0; j < end; i++)
{ a[i] < b[k]? a[i] = a[j++]: a[i] = b[k++];
}
}
I think that should work but you might have to tweak
the indices. The indices in each array or subarray
are strictly less than the upper bounds given.
Now is merging in place worth the trouble?
Certainly not if you have enough free memory.
Let's look at how much work is involved in merging.
If the array consists of distinct values in random
order, there is an even chance that the first
value is already in position, and similarly for
the last value. There is one chance in eight that
three elements at one end are already in position,
and one chance in four that three or more are in
position. Likewise for the other end. All values
between the maximum number already in position at
each end need to move--and that means that, in a
large array, almost all must be moved, so the best
we can hope for is order n. The method above does
2n assignments. It moves some values already in
position unecessarily, but this number is rarely
large except when merging small chunks. Small
chunks do get merged much more often than large
chunks, though. In that case what savings are
possible. In the extreme case, merging two arrays
of length 1, we need do no work if the values are
already in order, otherwise swapping requires 3
assignments. I do three every time. Its possible,
then that I double the necessary work.
It's not clear to me that swapping in place will
help here. But there is a way to save this work
with copying--just start and end beyond the parts
that are already in place. The in-place algorithms
previously given try to do just this, but copying
does it more easily.
Having posted a buggy first draft, I should make amends
by posting a correct version.
This uses a modified version of Paull Murray's test code.
import java.util.Random;
/**
* Merge sort.
* @author Paul Murray
* @author Terry Moore
* @version 1.0
*/
public class MergeSort
{ /* STATICS */
static final Random rnd = new Random();
public static void main(String[] args)
{ (new MergeSort(args)).go(100); }
/* INSTANCE */
final String[] args;
MergeSort(String[] args) { this.args = args; }
void go(int len)
{ int[] ary = new int[len];
check:
for (int j = 0; j < 10000; j++)
{ /* make up some random data */
for(int i = 0; i < len; i++) ary[i] = rnd.nextInt(1000);
/* perform the sort */
mergeSort(ary, 0, len);
for (int i = 0; i < len-1; i++)
{ if(ary[i] > ary[i + 1])
{ System.out.println("Array is not sorted");
break check;
} } } }
void mergeSort(int[] a, int start, int end)
{ if(start < end-1) // can't merge one element
{ int mid = (start + end)/2;
mergeSort(a, start, mid); mergeSort(a, mid, end);
merge(a, start, mid, end);
} }
void merge(int[] a, int start, int mid, int end)
{ int len = mid-start; int[] b = new int[len];
System.arraycopy(a, start, b, 0, len);
int i, j, k;
for(i = start, j = mid, k = 0; j < end && k < len; i++)
a[i] = (a[j] < b[k])? a[j++]: b[k++];
// copy remainder of b if some left over
if(k < len) System.arraycopy(b, k, a, i, len-k);
}
}
Like a bulldog, I haven't let go of this--perhaps it's time to do so.
Here's my last word.
I think I can prove that merging in situ is not worth while. The proof
is not rigorous--I don't even define "worth while". I doubt if there's
much point in questioning received wisdom when it's the result of years
of experience. But, heck! where would we be if no-one ever questioned
anything?
If you add the following two lines at the start of the merge method,
it could improve efficiency a little by not moving elements that are
already in place (idea suggested by Paul Murray's merge_in_place
method). However, in a test, it actually made the 7% method slower
for 100 items, and 3% slower for 1000 items.
for(int m = a[mid];start < mid && a[start] <= m; start++);
for(int m = a[mid-1]; --end >= mid && a[end]>= m;); end++;
A further improvement is to switch to an insertion sort when
(end - start) is small enough. This seems to be about 40% faster when "small enough" means between 20 and 60.
With the two lines above added, all elements between start and end must
move. To move the first element of the second block to the front of the
first we must move the first element out of the way. By copying all
the first block out we make as much room as necessary. The first block
has to be copied twice, the rest copied once (i.e. 1.5 copies per value).
We could save memory (on average) by lazy copying rather than eager
copying, i.e. only copying an element just before we have another to put
in its place. This would require a queue of removed values, and the queue would rarely be as long as the first block if the elements start in random order.
On the other hand, if we try to merge in place, the removed elements have to be put back into the array. The vacant spaces aren't the final resting places, so the elements need to be moved more than once. Paul's
clever idea is to merge again to move them closer to their final positions. I suspect they will ripple along the array like a bubble
sort as each merge takes place on a smaller subarry (but not as inefficiently, perhaps more like insertion sort). As it's likely that only short blocks will remain contiguous, subsequent merges are likely to merge very unequal blocks, and that is inefficient.
i think this would help!
public static void mergesort(Comparable [] row) {
Comparable [] helprow = new Comparable [row.length];
mergesort(row, helprow, 0, row.length-1);
}
private static void mergesort(Comparable [] row,
Comparable [] helprow,
int left, int right) {
if (left < right) {
int midden = (left + right) / 2;
mergesort(row, helprij, left, mid);
mergesort(row, helprow, mid+1, right);
merge(row, helprow, left, mid, right);
}
}
private static void merge(Comparable [] row,
Comparable [] hulprow,
int left, int mid,
int right) {
int w1 = left, w2 = mid + 1, hw = left;
while (w1 <= mid && w2 <= right)
helprow[hw++] = row[w1].compareTo(row[w2]) < 0 ?
row[w1++] : row[w2++];
if (w1 > mid)
while (w2 <= right)
helprow[hw++] = row[w2++];
else
while (w1 <= mid)
helprow[hw++] = row[w1++];
for (hw = left; hw <= right; hw++)
row[hw] = helprow[hw];
}
>I think I can prove that merging in situ is not worth while. The proof
>is not rigorous--I don't even define "worth while"
I suspect that it really depends on what your data looks like - whether it is fully random, already in a rough sort of order, or made up of a number of blocks. If your data looks like this:
A: 1 2 3 12 13 14 27 28 29
B: 5 6 7 8 19 20 21 50 51 52
Then a method that moves around blocks at a time may perform well. Liek a lot of things, the answer is "it depends".
> >I think I can prove that merging in situ is not worth
> while. The proof
> >is not rigorous--I don't even define "worth while"
>
> I suspect that it really depends on what your data
> looks like - whether it is fully random, already in a
> rough sort of order, or made up of a number of blocks.
> If your data looks like this:
>
> A: 1 2 3 12 13 14 27 28 29
> B: 5 6 7 8 19 20 21 50 51 52
>
> Then a method that moves around blocks at a time may
> perform well. Liek a lot of things, the answer is "it
> depends".
I agree, I was referring to data starting in random order.