Returning a ratio

Hey all

I am trying to calculate the "popularity" of an item. Items can be rated using integers from 0 to 5. Right now I only calculate the average of all the ratings for an item. This is of course not enough, because the number of times an item has been rated should also be an important factor when deciding an item's popularity.

(For example, an item with an average rating of 5 that has been rated 20 times should be more popular than an item with an average rating of 5 that has been rated 10 times.)

What is the best way to return the avg rating / n ratings ratio as a Double? I can always return the ratio as a String, but I don't know if that's the smoothest way of doing it. I am thinking of doing: avg rating * n ratings / n ratings. Is this a good way of returning the ratio as a Double, or are there more accurate algorithms for doing it?

Code so far (will return Double when I've figured out which algorithm to use)

public Integer calculatePopularity(List<Integer> values){
    if(values.size() == 0){
        return 0;
    }
    try{
        Integer result = 0;
        for(Integer i : values){
            result = result + i;
        }
        return result * values.size()/values.size();
    }catch (Exception e){
        e.printStackTrace();
        return 0;
    }
}

[1873 byte] By [syncroa] at [2007-10-2 17:34:35]
# 1
Multiplying by a number and then dividing by the same number just cancels out, so your method returns the plain sum of the ratings.
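
To illustrate the point, here is a minimal sketch of computing the plain average as a double instead. The class name `AverageRating` and the method name `average` are made up for this example and are not from the original code; `List.of` assumes Java 9 or later.

```java
import java.util.List;

class AverageRating {
    // Plain arithmetic mean of the ratings; returns 0.0 for an empty list.
    static double average(List<Integer> values) {
        if (values.isEmpty()) {
            return 0.0;
        }
        long sum = 0;
        for (int v : values) {
            sum += v;
        }
        // Cast before dividing so this is not integer division.
        return (double) sum / values.size();
    }

    public static void main(String[] args) {
        List<Integer> ratings = List.of(5, 4, 4);
        int sum = 5 + 4 + 4;
        // Multiplying and dividing by the same count gives the sum back.
        System.out.println(sum * ratings.size() / ratings.size()); // prints 13
        System.out.println(average(ratings));
    }
}
```

Note that this only gives the average; it still says nothing about how many ratings stand behind it.
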
rkippena at 2007-7-13 18:51:48 > top of Java-index,Other Topics,Algorithms...
# 2

You need to come up with some weighting function yourself, and this function would take an average rating and put a 'reliability skew' (between 0 and 1) on it. I'm sure there is material for this online, because I'm certain that this paradigm must come up all the time, what with the wide variety of sites that support some sort of user rating or another.

Just thinking out loud, this might possibly be some function on the domain [0, +infinity) that is zero at x=0, then tends exponentially to 1 as x -> +infinity. The initial steepness of the function would depend on you: how quickly you see a rating getting reliable as more and more people add to it.

Something like

rating = (1-e^(-(steepness)*raters)) * average rating
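
As a rough Java sketch of the formula above (the class and method names are invented for illustration, and the steepness value of 0.1 in `main` is an arbitrary assumption you would have to tune):

```java
class WeightedRating {
    // Reliability factor: 0 with no raters, approaching 1 as raters grows.
    static double reliability(int raters, double steepness) {
        return 1.0 - Math.exp(-steepness * raters);
    }

    // rating = (1 - e^(-steepness * raters)) * average rating
    static double weightedRating(double averageRating, int raters, double steepness) {
        return reliability(raters, steepness) * averageRating;
    }

    public static void main(String[] args) {
        double steepness = 0.1; // assumed value, chosen only for this example
        System.out.println(weightedRating(5.0, 10, steepness));
        System.out.println(weightedRating(5.0, 20, steepness));
    }
}
```

With these assumed numbers, an average of 5 from 10 raters comes out at roughly 3.16 and the same average from 20 raters at roughly 4.32, so the more frequently rated item ranks higher.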

Of course, there are plenty of other functions you could use, or you could modify this exponential. You can, for example, round out the curve that this relation gives by adding a coefficient to the exponential term, and then adjusting the -(steepness)*raters exponent with a +k to move the zero back to x=0.

Not being an applied mathematician or a sociologist, I don't know how to quantify and model 'reliability', but, as I say, there is no doubt material on this on the web.

fragorla at 2007-7-13 18:51:48
# 3

Just had another think. You could also factor in the standard deviation of your results, penalising the 'reliability' function if the standard deviation is high. Take, for example, the two data sets {3,3,3,3,3} and {1,5,2,4,3}. Both have the same N and the same average. However, the first set shows more homogeneity, which is a positive sign for reliability; in other words, users are agreeing about the rating. This in turn should send the reliability closer to one, and the overall rating closer to 3. The second set shows a level of disagreeance, which is a negative sign for reliability; given the low number of ratings and the large SD, the reliability of the average should be quite low, and consequently the overall rating should be lower than that of the first set. (Of course, if the two sets maintain their averages as the number of ratings increases, both should tend close to 3).
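
A sketch of how this could look in Java, using the two data sets above. The penalty shown here (dividing the number of raters by 1 + SD before applying an exponential reliability curve) is only one hypothetical choice among many, and all class and method names are invented for the example.

```java
class SpreadAwareRating {
    // Arithmetic mean; 0.0 for an empty array.
    static double mean(int[] ratings) {
        if (ratings.length == 0) {
            return 0.0;
        }
        double sum = 0.0;
        for (int r : ratings) {
            sum += r;
        }
        return sum / ratings.length;
    }

    // Population standard deviation; 0.0 for an empty array.
    static double standardDeviation(int[] ratings) {
        if (ratings.length == 0) {
            return 0.0;
        }
        double m = mean(ratings);
        double squares = 0.0;
        for (int r : ratings) {
            squares += (r - m) * (r - m);
        }
        return Math.sqrt(squares / ratings.length);
    }

    // Hypothetical penalty: a larger SD shrinks the effective number of raters,
    // so reliability grows more slowly but still tends to 1 as ratings accumulate.
    static double reliability(int[] ratings, double steepness) {
        double effectiveRaters = ratings.length / (1.0 + standardDeviation(ratings));
        return 1.0 - Math.exp(-steepness * effectiveRaters);
    }

    static double score(int[] ratings, double steepness) {
        return reliability(ratings, steepness) * mean(ratings);
    }

    public static void main(String[] args) {
        double steepness = 0.5; // assumed value for illustration
        System.out.println(score(new int[] {3, 3, 3, 3, 3}, steepness));
        System.out.println(score(new int[] {1, 5, 2, 4, 3}, steepness));
    }
}
```

Both sets have a mean of 3, but the SDs are 0 and about 1.41, so with this assumed penalty the first set scores roughly 2.75 and the second roughly 1.93.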

fragorla at 2007-7-13 18:51:48
# 4
* disagreement (not disagreeance!)
* tend closer to 3
fragorla at 2007-7-13 18:51:48
# 5

Thanks fragorl, your insight on the "homogeneity" of the ratings is something I have to consider. I have made a separate weight parser which reads a weighting scheme from an XML file. With the help of the weights, I will try to take into account both the skewness of the ratings and their reliability.

syncroa at 2007-7-13 18:51:48