Distribution Counting Sort
Hi all,
I've been given an algorithm and am supposed to code it using Java. The thing is I'm not quite sure how to go about implementing the algorithm itself and would really appreciate it if somebody could extend their help.
ALGORITHM DistributionCounting(A[0...n-1],L,u)
//Sorts an array of integers from a limited range by distribution counting
//Input: An array A[0....n-1] of integers between L and u ( L<= u)
//Output: Array S[0...n-1] of A's elements sorted in nondecreasing order
for j = 0 to u-1 do D[]j] = 0//initialize frequencies
for i = 0 to n-1 do D[A - L] = D[A -L]+1//compute frequencies
for j =1 to u-L do D[j] = D[j-1] + D [j]
for 1=n - 1 downto 0 do
j = A -l
S[D[j] - 1] = A
D[j] = D[j] - 1
return S
Cheers
I think you need the Java Tutorial. http://java.sun.com/docs/books/tutorial/index.html
int n = A.length;
// count frequencies
for(int i = 0; i<n; i++ ){
D[A[i]-L]++;
}
int k = 0; //index into D
for(int i = 0; i><n; i++){
while(D[k] == 0){k++;} // locate value in D
S[i] = k+L; // restore value to original range and add to S
D[k]--; // remove value from D
}
>
Thanks for the reply marlin314, but there's are 4 for loops in the algorithm. Sorry to be such a bother... but I don't quite understand the algorithm which is why i wasn't able to do it. Is there any difference between ur coding and the algorithms?
well, for one thing I didn't bother to include the step of initializing the array D to zero. I also didn't bother to create the D array. I kind of assumed that you could maybe do at least that much.
Also the algorithm that you listed in psuedo code is missing information in places, particularly when it mentions the array A since it leave off subscripts. I don't know who's fault that is but I assumed that that was part of the source of confusion.
So I didn't actually try to replicate your algorithm instead I wrote some code to do as you asked.
The basic idea is not very hard. Suppose you have a long list that just happens to have only a few elements like say 5's 6's and 7's. Sure you could do a standard sort algorithm but, when you are all finished the final sorted list is nothing but a bunch of 5's followed by a bunch of 6s followed by a bunch of 7s.
How many of each number? Well you just count. You rip through the original list and when ever you see a 5 you bump up D[5] and whenever you see a 6 you bump up D[6]. By the time you are done D sub 5,6 and 7 are the counts of the number of 5s 6's and 7s.
Well, that is not exactly true, if you did it exactly that way you would have wasted slots 0, 1, 2, 3 and so on. So to avoid the waste. the count of 5's was put in slot 0, the count of 6's was put in slot 1 and the count of 7 was in slot 2. You take the number that comes in, like 6, subtract L from it, which in my example is 5 and that tells you to increase the count in slot 1. The numbers were biased downwards by the lowest number, but basically other than a shift of L, the index of the D array is the number that you were counting. The Value in the D array is the count of how many of that number you saw.
Once you have slots that count how many of each number you need to emit, You need to start another loop to emit the numbers. I start off with k set to the first slot index. If you add L to k that is the first number that you need to emit.(So in my example k is 0 L is 5 the first number you will be emitting is k+L which is 5) How many do you need to emit? You need to emit D[k] of them. Every time you emit a number you decrease the count in in D[k] when the count goes to zero, you are done with that number, move K up to the next number and start emitting a bunch of those.
That is what a distribution count sort does.
I can't imagine why a pofessor would assign a distribution count sort to someone that is new to programming. It is not very practical. In fact in my 35+ years of programming I have never once had to do it. I think the only reason that it even has a name is that it is used in analysis of algorithms to show that the standard result that sorting takes O(n*log(n)) is based on having numbers that basically distinct or require log(n) bits to represent. If your numbers are NOT essentially distinct (as in the case where they were only the numbers 5,6 and 7) then you can do what we just did here, count in linear time and then output in linear time and thus get a linear sort algorithm.
Don't be the least concerned if you have no clue as to what I am talking about, I am merely commenting for any other folks that happen along, see this post and wonder if this distribution count sort is some fancy new kind of better sort.
It is not.
Hope that helps.
ALGORITHM DistributionCounting(A[0...n-1],L,u)
//Sorts an array of integers from a limited range by distribution counting
//Input: An array A[0....n-1] of integers between L and u ( L<= u)
//Output: Array S[0...n-1] of A's elements sorted in nondecreasing order
for j = 0 to u-1 do D[]j] = 0 //initialize frequencies
for i = 0 to n-1 do D[A - L] = D[A -L]+1 //compute frequencies
for j =1 to u-L do D[j] = D[j-1] + D [j] //reuse for distribution *this was the part that I missed out on.. sorry*
for 1=n - 1 downto 0 do
j = A -l
S[D[j] - 1] = A
D[j] = D[j] - 1
return S
We're supposed to analyse the algorithm and I'm piled up in books trying to look for the solution. Thanks for your help marlin134. I really appreciate it.
OK - here is my simplistic analysis
You are told
> //Input: An array A[0....n-1] of integers between L
> and u ( L<= u)
Step 1:
> for i = 0 to n-1 do D[A - L] = D[A -L]+1 //compute
> frequencies
Since A is defined to be an array when you get to step 1 you are adding an array A to an integer L. This will not compile. Furtheremore you are in a loop where i is your loop variable and yet the formula for has no dependence on the variable i. Step 1 is missig a subscript.
Yes, I know perfectly well what is missing and where, BUT my analysis is that the supplied junk is NOT an algorithm and will not run as is.
I'm so so so sorry. I was the one who messed up the algorithm. This is the correct one. I'm so sorry. I can be a real knucklehead at times. Sorry.
ALGORITHM DistributionCounting(A[0...n-1],L,u)
//Sorts an array of integers from a limited range by distribution counting
//Input: An array A[0....n-1] of integers between L and u ( L<= u)
//Output: Array S[0...n-1] of A's elements sorted in nondecreasing order
for j = 0 to u-1 do D[]j] = 0 //initialize frequencies
for i = 0 to n-1 do D[A - L] = D[A -L]+1 //compute frequencies
for j =1 to u-L do D[j] = D[j-1] + D [j]
for 1=n - 1 downto 0 do
j = A -L
S[D[j] - 1] = A
D[j] = D[j] - 1
return
ALGORITHM DistributionCounting(A[0...n-1],L,u)
//Sorts an array of integers from a limited range by distribution counting
//Input: An array A[0....n-1] of integers between L and u ( L<= u)
//Output: Array S[0...n-1] of A's elements sorted in nondecreasing order
for j = 0 to u-1 do D[]j] = 0 //initialize frequencies
for i = 0 to n-1 do D[A - L] = D[A -L]+1 //compute frequencies
for j =1 to u-L do D[j] = D[j-1] + D [j]
for 1=n - 1 downto 0 do
j = A -L
S[D[j] - 1] = A
D[j] = D[j] - 1
return
I'm not sure why the proper one isn't coming out.. btu there should be a at the end of the A...
ok..i found out why.... [ i ].... sorry.....
ALGORITHM DistributionCounting(A[0...n-1],L,u)
//Sorts an array of integers from a limited range by distribution counting
//Input: An array A[0....n-1] of integers between L and u ( L<= u)
//Output: Array S[0...n-1] of A's elements sorted in nondecreasing order
for j = 0 to u-L do D[]j] = 0 //initialize frequencies
for i = 0 to n-1 do D[A[ i ] - L] = D[A[ i ] -L]+1 //compute frequencies
for j =1 to u-L do D[j] = D[j-1] + D [j]
for 1=n - 1 downto 0 do
j = A[ i ] -L
S[D[j] - 1] = A[ i ]
D[j] = D[j] - 1
return S
This is the correct algorithm.... sorry......the [ i ] changed everything to italic. i'm sorry.. can somebody now please help me out with this algorithm?
Message was edited by:
JuppieJux
Message was edited by:
JuppieJux