standard deviation

Hi,

I have an application that calculates the standard deviation of an image.

The image is in RGB space, and I calculate the standard deviation for each channel separately.

This is the code:

for (int i = 0; i < 256; i++)
{
    DeviationtempR += Math.pow(((IstoRed[i] * i) - MediaR), 2);
    DeviationtempG += Math.pow(((IstoGreen[i] * i) - MediaG), 2);
    DeviationtempB += Math.pow(((IstoBlue[i] * i) - MediaB), 2);
}

StandardDeviationR = Math.sqrt(DeviationtempR / 256);
StandardDeviationG = Math.sqrt(DeviationtempG / 256);
StandardDeviationB = Math.sqrt(DeviationtempB / 256);

But the result is wrong. For example:

R: 586.0909853745022

G: 726.5127611796387

B: 620.566268917065

Why?

Thanks in advance.

[889 byte] By [Fabio_Aa] at [2007-10-2 21:51:06]
# 1
Why are you multiplying by i?
RadcliffePikea at 2007-7-14 1:06:56 > top of Java-index,Other Topics,Algorithms...
# 2
Because the standard deviation is the square root of the sum of the squared differences between (amount x level) and the average. However, I tried it without the *i and the resulting values are larger still.
Fabio_Aa at 2007-7-14 1:06:56
# 3

> because the deviation standard is:

> the sqrt of sum of square of differences between

> amount x level and average...

But you have not calculated the average!

>

> however i try without *i and the result are value

> still larger....

If you have N values indicated by x[i] then

mean = sum(x[i])/N

variance = sum(sqr(x[i]-mean))/(N-1)

STD = sqrt(variance).
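
The three formulas above can be sketched as a plain two-pass computation in Java. This is only an illustration: the class name TwoPassStats, its method names, and the sample data in main are made up for the example and do not come from the thread.

```java
// Hypothetical helper class; names are illustrative, not from the thread.
class TwoPassStats {

    // mean = sum(x[i]) / N
    static double mean(double[] x) {
        double sum = 0.0;
        for (double v : x) {
            sum += v;
        }
        return sum / x.length;
    }

    // variance = sum(sqr(x[i] - mean)) / (N - 1); needs at least two values
    static double sampleVariance(double[] x) {
        if (x.length < 2) {
            throw new IllegalArgumentException("need at least two values");
        }
        double m = mean(x);          // first pass: the mean
        double sumSq = 0.0;
        for (double v : x) {         // second pass: squared deviations
            double d = v - m;
            sumSq += d * d;
        }
        return sumSq / (x.length - 1);
    }

    // STD = sqrt(variance)
    static double sampleStdDev(double[] x) {
        return Math.sqrt(sampleVariance(x));
    }

    public static void main(String[] args) {
        double[] x = {2, 4, 4, 4, 5, 5, 7, 9}; // made-up sample data
        System.out.println("mean = " + mean(x));
        System.out.println("std  = " + sampleStdDev(x));
    }
}
```

Note that the mean is computed in full before any squared difference is taken, which is the step missing from the loop in the original question.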

Message was edited by:

sabre150

sabre150a at 2007-7-14 1:06:56
# 4

What results are you expecting? How do you know the results you get are wrong?

Here's some code I wrote that's known to work, you can use it to test your assumptions.

public class OnePassMean {

    double mean = 0.0;
    double var = 0.0;
    int n = 0;

    public void addObservation(long obs) {
        addObservation((double) obs);
    }

    public synchronized void addObservation(double obs) {
        double delta = obs - mean;
        ++n;
        var += ((double) n - 1) / ((double) n) * delta * delta;
        mean += delta / (double) n;
    }

    public synchronized void reset() {
        mean = 0.0;
        var = 0.0;
        n = 0;
    }

    public synchronized double getMean() {
        return mean;
    }

    public synchronized double getVariance() {
        if(n != 0) {
            return var / (double) n;
        }
        else {
            return 0.0;
        }
    }

    public synchronized double getStandardDeviation() {
        return Math.sqrt(getVariance());
    }

    public static void main(String[] args) {
        if(args.length == 0) {
            System.out.println("Usage: OnePassMean n0 [n1 n2 ...]");
            System.exit(0);
        }

        OnePassMean opm = new OnePassMean();

        for(int i = 0; i < args.length; ++i) {
            try {
                opm.addObservation(Double.parseDouble(args[i]));
            }
            catch(NumberFormatException e) {
                System.out.println("Ignored: " + args[i]);
            }
        }

        System.out.println("Avg: " + opm.getMean() + " Var: " + opm.getVariance());
    }

}

Note this calculates the population variance, not the sample variance as in sabre's formula.

RadcliffePikea at 2007-7-14 1:06:56
# 5
Hi, I used your test and got 725... as the result. But is 725 plausible as a standard deviation? In GIMP or Photoshop the standard deviation is always between 40 and 100, so 725 is too large.
Fabio_Aa at 2007-7-14 1:06:56
# 6

What does the phrase "known to work" mean for the posted code?

You can compute a running mean, as you are doing. But in order to compute your variance, don't you need to know the mean of the entire sample when you compute your squared deltas? The way you are doing it now, you don't know the final mean at the time you are calculating each new term for the variance. You are adjusting your mean as you go along and taking variances from a moving target.

Not knowing much statistics, I don't know what a "population variance" is and to be fair, it may well be that this calculation you are doing here is correctly computing that particular statistic, but I fail to see the use of this calculation which certainly seems to be dependent on the order of the inputs.

Am I mistaken?

marlin314a at 2007-7-14 1:06:56
# 7

> What does the phrase "known to work" mean for the

> posted code?

Well, the algorithm is provably correct, is numerically stable, and the code has been tested fairly well, though no warranty express or implied...

>

> You can compute a running mean, as you are doing. But

> in order to compute your variance, don't you need to

> know the mean of the entire sample when you computer

> your squared deltas? The way you are doing it now

> you don't know the final mean at the time you are

> calculating each new term for the variance. You are

> adjusting your mean as you go along and taking

> variances from a moving target.

Without doing all the algebra out, it is possible to derive the formula for a moving variance just like for a moving mean. The variance doesn't so much depend on the final mean of the entire sample; it's really just dependent on the data points, just like the mean. The algorithm in the code doesn't actually recalculate the variance at each step; rather, it accumulates (Var * N). This is to avoid potential "catastrophic cancellation" when subtracting nearly equal quantities in floating point arithmetic. This method is only a little less numerically accurate than the standard two-pass algorithm where you calculate the mean first, then the variance.
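
One way to check the claim, rather than take it on faith, is to run the same update as addObservation in the posted OnePassMean next to a two-pass calculation, and to feed the data in two different orders. The sketch below does that; the class name RunningVarianceCheck, its method names, and the sample data are made up for the example.

```java
// Hypothetical check class; names and data are illustrative only.
class RunningVarianceCheck {

    // Same update rule as addObservation in the posted OnePassMean:
    // m2 accumulates (Var * N), and the result divides by n at the end.
    static double onePassPopulationVariance(double[] x) {
        double mean = 0.0;
        double m2 = 0.0;
        int n = 0;
        for (double obs : x) {
            double delta = obs - mean;   // deviation from the mean so far
            ++n;
            m2 += ((double) n - 1) / n * delta * delta;
            mean += delta / n;
        }
        return n == 0 ? 0.0 : m2 / n;
    }

    // Reference: compute the full mean first, then the squared deviations.
    static double twoPassPopulationVariance(double[] x) {
        if (x.length == 0) {
            return 0.0;
        }
        double sum = 0.0;
        for (double v : x) {
            sum += v;
        }
        double mean = sum / x.length;
        double sumSq = 0.0;
        for (double v : x) {
            sumSq += (v - mean) * (v - mean);
        }
        return sumSq / x.length;
    }

    public static void main(String[] args) {
        double[] forward = {2, 4, 4, 4, 5, 5, 7, 9};
        double[] reversed = {9, 7, 5, 5, 4, 4, 4, 2};
        System.out.println("one pass, forward:  " + onePassPopulationVariance(forward));
        System.out.println("one pass, reversed: " + onePassPopulationVariance(reversed));
        System.out.println("two pass:           " + twoPassPopulationVariance(forward));
    }
}
```

The three printed values should agree up to floating point rounding, so the result does not depend on the input order even though each term is computed against a mean that is still moving.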

> Not knowing much statistics, I don't know what a

> "population variance" is and to be fair, it may well

> be that this calculation you are doing here is

> correctly computing that particular statistic, but I

> fail to see the use of this calculation which

> certainly seems to be dependent on the order of the

> inputs.

I always get the two confused, but the sample variance divides by (N-1) instead of N. This is supposed to compensate for the fact that you're using a sample to estimate the variance for a population. I don't know the exact idea behind that.

> Am I mistaken?

Well, ever since Sun added the ability to edit your posts, nobody ever makes mistakes ;-)

RadcliffePikea at 2007-7-14 1:06:56
# 8

> I always get the two confused, but the sample

> variance divides by (N-1) instead of N. This is

> supposed to compensate for the fact that your using a

> sample to estimate the variance for a population. I

> don't know the exact idea behind that.

>

Nearly right. If the true mean of the theoretical distribution were used then one would divide by N, but by using the estimated mean you reduce the number of degrees of freedom associated with the sum of squares by 1. You can think of this as the uncertainty introduced by using the estimated mean, rather than the true mean, adding to the uncertainty of the variance estimate.
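
The lost degree of freedom can be made concrete with a short calculation. The symbols here are introduced for the sketch and are not from the thread: \mu and \sigma^2 are the true mean and variance, \bar{x} is the estimated mean, and the x_i are assumed independent with the same distribution.

```latex
\[
\begin{aligned}
\sum_{i=1}^{N} (x_i - \mu)^2
  &= \sum_{i=1}^{N} (x_i - \bar{x})^2 + N(\bar{x} - \mu)^2 \\
E\Big[\sum_{i=1}^{N} (x_i - \mu)^2\Big] &= N\sigma^2,
\qquad
E\big[N(\bar{x} - \mu)^2\big] = N \cdot \frac{\sigma^2}{N} = \sigma^2 \\
E\Big[\sum_{i=1}^{N} (x_i - \bar{x})^2\Big] &= N\sigma^2 - \sigma^2 = (N-1)\,\sigma^2
\end{aligned}
\]
```

So the sum of squares about the estimated mean averages (N-1) times the true variance, which is why dividing it by (N-1) rather than N gives an unbiased estimate.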

Of course all of this relies on having independent sample values.

sabre150a at 2007-7-14 1:06:56
# 9
Once again, why are you multiplying by i?
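
One reading of the question: if IstoRed[i] holds the number of pixels whose red value is i (an assumption, since the thread never shows how the arrays are filled), then i is the value and IstoRed[i] is how many times it occurs. Under that reading the difference is taken on the level, (i - mean), and weighted by the count, and the divisor is the total pixel count rather than 256. A minimal sketch under those assumptions, with a made-up class name HistogramStats and a made-up histogram in main:

```java
// Hypothetical helper; assumes hist[i] = number of pixels with channel value i.
class HistogramStats {

    // Weighted mean: sum(i * hist[i]) / total pixel count.
    static double mean(int[] hist) {
        long count = 0;
        double weightedSum = 0.0;
        for (int i = 0; i < hist.length; i++) {
            count += hist[i];
            weightedSum += (double) i * hist[i];
        }
        if (count == 0) {
            throw new IllegalArgumentException("empty histogram");
        }
        return weightedSum / count;
    }

    // Population standard deviation: sqrt(sum(hist[i] * (i - mean)^2) / count).
    static double populationStdDev(int[] hist) {
        double m = mean(hist);
        long count = 0;
        double sumSq = 0.0;
        for (int i = 0; i < hist.length; i++) {
            double d = i - m;            // deviation of the level, not of the count
            sumSq += hist[i] * d * d;    // weighted by how many pixels have level i
            count += hist[i];
        }
        return Math.sqrt(sumSq / count);
    }

    public static void main(String[] args) {
        int[] hist = new int[256];
        hist[0] = 50;      // made-up image: half the pixels at 0,
        hist[255] = 50;    // the other half at 255
        System.out.println("mean = " + mean(hist));
        System.out.println("std  = " + populationStdDev(hist));
    }
}
```

The made-up histogram in main is the most spread-out case possible for 8-bit data, and its standard deviation is 127.5. No image with values in 0..255 can exceed that, so a result such as 725 points at the formula rather than at the image. Whether an image editor divides by N or by N-1 is not stated in the thread; for the pixel counts of a typical image the difference is negligible.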
ktm5124a at 2007-7-14 1:06:56
# 10

>

> > Am I mistaken?

>

> Well, ever since Sun added the ability to edit your

> posts, nobody ever makes mistakes ;-)

The algebra is very cute! I did the induction and now agree that your math was fine. Sorry for doubting.

I had never seen that rearrangement of the calculation before.
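
For anyone who wants to see the step, here is a sketch of the induction. The notation is introduced for the sketch and is not from the thread: \mu_n is the mean of the first n values, M_n = \sum_{i \le n} (x_i - \mu_n)^2 is what the code accumulates in var, and \delta = x_n - \mu_{n-1} is the code's delta.

```latex
\[
\begin{aligned}
\mu_n &= \mu_{n-1} + \frac{\delta}{n} \\
M_n &= \sum_{i=1}^{n-1} \Big( (x_i - \mu_{n-1}) - \frac{\delta}{n} \Big)^2
       + \Big( \delta - \frac{\delta}{n} \Big)^2 \\
    &= M_{n-1}
       - \frac{2\delta}{n} \sum_{i=1}^{n-1} (x_i - \mu_{n-1})
       + \frac{(n-1)\,\delta^2}{n^2}
       + \frac{(n-1)^2\,\delta^2}{n^2} \\
    &= M_{n-1} + \frac{n-1}{n}\,\delta^2
\end{aligned}
\]
```

The middle sum vanishes because deviations from a mean add up to zero, and the two remaining terms combine to (n-1)\delta^2 / n, which is the increment applied in addObservation.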

I was mistaken, and it appears that your only mistake was in your claim that "nobody ever makes mistakes".

Oh well, we can't all be perfect all of the time.

marlin314a at 2007-7-14 1:06:56