java.lang.StackOverflowError on Quick Sort Algorithm

i was given the following quick sort algoritm as part of project. We have to Analysis its performance against other sorting algoritms.

it works fine for randomly sorted arrays.

However when i run a reverse array i get a java.lang.StackOverflowError.

I've heard this can be caused by infinite recursion but don't see whats going wrong with this algoritm. i believe it was taken from some book or other so its probably well known. Any help would be cool.

staticvoid quickSort(int[] a,int left,int right)

{

// Sort a[left卹ight] into ascending order.

if (left < right)

{

int p = partition(a, left, right);

quickSort(a, left, p-1);

quickSort(a, p+1, right);

}

}

staticint partition(int[] a,int left,int right)

{

// Partition a[left卹ight] such that

// a[left卲-1] are all less than or equal to a[p], and

// a[p+1卹ight] are all greater than or equal to a[p].

// Return p.

int pivot = a[left];

int p = left;

for (int r = left+1; r <= right; r++)

{

int comp = compareTo(a[r],pivot);

if (comp < 0)

{

a[p] = a[r];

a[r] = a[p+1];

a[p+1] = pivot;

p++;

}

}

return p;

}

staticint compareTo(int num1,int num2)

{

if (num1>num2)

{

return 1;

}

elseif (num2>num1)

{

return -1;

}

else

{

return 0;

}

}

[3117 byte] By [mpento02a] at [2007-10-1 8:41:42]
# 1

> i was given the following quick sort algoritm as part

> of project. We have to Analysis its performance

> against other sorting algoritms.

> it works fine for randomly sorted arrays.

> However when i run a reverse array i get a

> java.lang.StackOverflowError.

> I've heard this can be caused by infinite recursion

> but don't see whats going wrong with this algoritm. i

> believe it was taken from some book or other so its

> probably well known. Any help would be cool.

>

>

> static void quickSort(int[] a, int left, int right)

> {

> // Sort a[left卹ight] into ascending order.

> if (left < right)

> {

> int p = partition(a, left, right);

> quickSort(a, left, p-1);

> quickSort(a, p+1, right);

> }

> }

>

> static int partition(int[] a, int left, int right)

> {

> // Partition a[left卹ight] such that

> // a[left卲-1] are all less than or equal to a[p],

> ], and

> // a[p+1卹ight] are all greater than or equal to

> to a[p].

> // Return p.

> int pivot = a[left];

> int p = left;

> for (int r = left+1; r <= right; r++)

> {

> int comp = compareTo(a[r],pivot);

> if (comp < 0)

> {

> a[p] = a[r];

> a[r] = a[p+1];

> a[p+1] = pivot;

> p++;

> }

> }

> return p;

> }

>

> static int compareTo(int num1,int num2)

> {

>

> if (num1>num2)

> {

> return 1;

> }

> else if (num2>num1)

> {

> return -1;

> }

> else

> {

> return 0;

> }

>

> }

yeah, because you are user recursion... you can increase the stack size when you start your JVM. But I suggest change your algorithm not to use recusion.

yue42a at 2007-7-9 22:56:43 > top of Java-index,Other Topics,Algorithms...
# 2

> quickSort(a, left, p-1);

> quickSort(a, p+1, right);

The reason you more likely to get StackOverflow error when sorting a completely reversed data set is because each iteration of your method is garranteed to be a "head"-recursion (rather than tail-recursion), the quickSort(a, p+1, right) is always pushed onto stack... thus your stack overflows.

yue42a at 2007-7-9 22:56:43 > top of Java-index,Other Topics,Algorithms...
# 3

The problem is your pivot.

Worst case scenario of quicksort is O(n*n), and in this scenario you have a recursion that goes n levels deep (requiring a huge stack size). The worst case in your implementation is simply a reverse sorted array.

You could solve the problem by selecting a medium three pivot, and thereby make it virtually impossible for the worst case to happen.

Ragnvald_id2a at 2007-7-9 22:56:43 > top of Java-index,Other Topics,Algorithms...
# 4

One thing I've noticed about quicksort is that it doesn't do well with sorting large arrays of similar numbers. If you have 100,000 integers ranging from 1-10, quicksort, in my experience, is likely to give you a StackOverflowError because so many of the numbers are similar (you'll have thousands of 7s). One thing I do to avoid this is to add a "fixed length random integer string" to the end of each integer before you sort. Then remove it after the quicksort. For example, lets say one of the numbers in our array is 7, add a random 3 digit number to it, such as 519, and the number becomes 7519. Do this for every single integer in your array and quicksort most likely won't give you a StackOverflowError, within reason of course. Just don't forget to go through the array and remove the 3 digit numbers. I'm sure there are other sorting methods out there that could handle large arrays of similar numbers better than this addition to quicksort, but when you've already written a quicksort algorithm and don't feel like writing a more efficient one, it can come in handy.

Smile,

Snotty :)

snottitudea at 2007-7-9 22:56:43 > top of Java-index,Other Topics,Algorithms...
# 5

> You could solve the problem by selecting a medium

> three pivot, and thereby make it virtually impossible

> for the worst case to happen.

While this suggestion is a fix, and will make it virtually impossible, it is not THE correct fix.

The real reason that you blow out the stack is because you have too many subproblems on the stack. Yes this was partially caused by choice of a bad pivot, but that can happen regardless of the method of choice of the pivot.

The stack blow up problem occurs when you defer the short problem and tackle the long problem first. So if you have an N long array. and by bad choice of pivot you split into a 1 long problem and an N-1 long problem. If you did the 1 long problem first, you would be done with it and back to the N-1 problem without making the stact too deep.

However if you hang onto the 1 long sub problem and wade into the N-1 long problem first, you could just make the same mistake again and again, carving off a 1 long problem, pushing it onto the stack and wading into the big problem.

In it's worst case, Not only is the runtime N*N but the depth of the stack was N.

On the other hand, if you always do the smallest subproblem first. The smallest problem MUST be less that half the size of the array you started with (a perfect split would be exactly half the size) and as a result the stack depth is BOUNDED by the log base 2 of the original array size.

The reason that they smack you with this problem in school is because they want you to learn that this is ALWAYS the case for recursive problems, namely you will be less consumptive of the stack if you can arrange your recursion to solve the smallest subproblem first.

So choice of piviot is very important. In practice this will solve your problem from a probablilistic standpoint but it does NOT make the problem impossible and if I were grading your papers, I would not give you full credit if you discovered the pivot trick but did not discover the rearrangement of subproblems trick.

This important consideration has been listed in every presentation of quicksort in algorithm texts at least since Sedgwick if not in Knuth.

marlin314a at 2007-7-9 22:56:43 > top of Java-index,Other Topics,Algorithms...
# 6

I do not quite get your point marlin...

> The stack blow up problem occurs when you defer the short problem and tackle the long problem first.

I cant see how. The stack gets blown up if you handle the short problem first as well. I ran my test program with array sizes 6000, 7000 and 8000. With array size 6000 both forwards and backwards sorted arrays works just fine, and with size 8000 both cause a stack overflow. (But I admit, with 7000, only the backwards sorted array cause an overflow.)

public static void main(String[] args)

{

int[] ia = new int[Integer.parseInt(args[0])];

try

{

System.out.println("backwards...");

assign(ia, false);

quickSort(ia, 0, ia.length-1);

System.out.println(" ...ok");

}

catch(Throwable th)

{

System.out.println(th.toString());

}

try

{

System.out.println("forwards...");

assign(ia, true);

quickSort(ia, 0, ia.length-1);

System.out.println(" ...ok");

}

catch(Throwable th)

{

System.out.println(th.toString());

}

}

public static void assign(int [] ia, boolean fBackwards)

{

for(int i=0; i< ia.length; i++)

{

if(fBackwards)

{

ia[i] = ia.length - i;

}

else

{

ia[i] = i;

}

}

}

Ragnvald_id2a at 2007-7-9 22:56:43 > top of Java-index,Other Topics,Algorithms...
# 7

I will post the Sedgwick algorithm. It is not recursive, it explicitly uses a stack (actually just an integer array) to hold the subproblems and you can see where it examines the size of the subproblems.

This will bound the stack usage to lg N, however you should still do the median of 3 trick as the other poster suggested to also eliminate the systematic run time degradation that you will see on sorted files.

The median of 3 trick is to look at 3 values, the left most, the right most and the middle element of the array, toss out the extremes and use the one that is left as the pivot. This give the pivot a much better chance of splitting the list into two nearly equal subproblems. In the particular special case of an already sorted array this trick causes a perfect split.

However when I re-read your original post, you say that you were given that quicksort program to analyze. If your problem really is analysis rather than constructing a decent algorithm, well you have already done the important part. You discovered that it is a piece of junk.

The reason is that the bad choice of pivot (which happens at every single stage for a sorted list) combined with a blind recursion, leads to both an order N^2 sort time and an order N stack consumption. The order N stack consumption is what leads to the crash.

Quicksort is a simple algorithm, but it is NOT trivial. It is easy to get it wrong as the one that you were handed plainly shows.

Sedgwick's non-recursive qsort

void qSort(Comparable[] foo){ // from Sedgwick's Algorithms

int[] a = new int[50];

int l = 0; int r = foo.length; int p = 2;

while(p != 0){

if(r>l){

int i = partition(foo,l,r); // finds pivot and moves elts

// now we have two sub problems - push the long one

if((i-l)>(r-i)){

a[p] = l; a[p+1] = i-1; l = i+1;

} else {

a[p] = i+1; a[p+1] = r; r = i-1;

}

p += 2;

} else { // finished so do a subproblem

p -= 2; l = a[p]; r = a[p+1];

}

}

}

marlin314a at 2007-7-9 22:56:43 > top of Java-index,Other Topics,Algorithms...
# 8
Could you post your partition method as well? (I cant get your code to work....)
Ragnvald_id2a at 2007-7-9 22:56:43 > top of Java-index,Other Topics,Algorithms...
# 9

Here is an example of an adequate partition. Adequate in that it is optimized for clarity rather than speed.

static int partition(Comparable[] foo, int l, int r){

// first move median elt to front

int m = (l+r)/2;

order(foo,l,m);

order(foo,m,r);

order(foo,m,l);

Comparable pivot = foo[l];

int knownLow = l;

int unknown = l+1;

while(unknown <= r){

if(foo[unknown].compareTo(pivot) < 0){

knownLow++;

swap(foo,knownLow,unknown);

}

unknown++;

}

swap(foo,l,knownLow); // put pivot in place

return knownLow;

}

static void order(Comparable[] foo, int a, int b){

if(foo[a].compareTo(foo[b]) > 0) swap(foo,a,b);

}

static void swap(Comparable[] foo, int a, int b){

Comparable t = foo[a];

foo[a] = foo[b];

foo[b] = t;

}

marlin314a at 2007-7-9 22:56:43 > top of Java-index,Other Topics,Algorithms...
# 10

Nice code. But it just does not work.

First I got array index out of bounds exception, because your qSort calls partition with an exclusive r, and the partition() expects an inclusive.

Then I changed your qSort to call partition(foo, l, r-1), but that didnt help much. Hand 4,3,2,1 to your qSort, and it sorts it to 2,1,3,4.

I still find what you write interesting, so I would be glad if you are able to provide working code to illustrate it (no median three needed, its your neat little qSort that interests me, I think it has a bug in it....)

Ragnvald_id2a at 2007-7-9 22:56:43 > top of Java-index,Other Topics,Algorithms...
# 11

Yes, the initial algorithm had a bug. I initialized r to foo.length I should have initialized it to foo.length-1. r was supposed to point to the last element.

You are wrong in your assertion that "no median three needed" With out that step in there the execution time will be quadratic on a sorted list. Sorted lists are common enough that it is foolish to have a version of quicksort that does not deal appropriately with sorted lists.

marlin314a at 2007-7-9 22:56:43 > top of Java-index,Other Topics,Algorithms...
# 12
You misunderstood me. ;-) No median three needed... - Because I want to test how pushing the long problem combined with a really bad choice of pivot works. For a real world usage, I am the first to recommend median three.
Ragnvald_id2a at 2007-7-9 22:56:43 > top of Java-index,Other Topics,Algorithms...
# 13
And thx for the bugfix. Now I got it right!
Ragnvald_id2a at 2007-7-9 22:56:43 > top of Java-index,Other Topics,Algorithms...
# 14

Not that this has anything to do with Java, but my dad, a retired electrical engineer, has been trying for some years to get me to learn J, just as I have been trying to get him to learn Java. (J is the successor to APL - www.jsoftware.com ) Neither one of us has succeeded in convincing the other yet.

None the less, out of the blue he sends me the following mail that included quicksort written in J. I thought I'd pass it on so you could see the beauty of some of the alternatively languages that you could be learning.

-

qsort =: ]`(($:@:((}.<:{.)#}.)),{.,($:@:((}.>{.)#}.)))@.(*@#)

While J is terse, it is so logical that the meaning of any expression is clear at a glance.

Carpe papillam, OM

--

Well there you have it. I don't see the conditional where he selects the median of 3 so I will have to rag him about that.

I think while I'm at it, I'll also tell him that he should have written it in brainfuck, which is an even simpler if less succinct language. (Google it and check it out!)

Enjoy!

marlin314a at 2007-7-9 22:56:43 > top of Java-index,Other Topics,Algorithms...
# 15
can send or post ur full program to me, coz i want to know the full program and wat is the output for? i am interest with this title. Thanks
zendica at 2007-7-9 22:56:45 > top of Java-index,Other Topics,Algorithms...