Simple K-means implementation

I have looked around and found nothing.

I am just trying to come up with a good way of starting off the centroid centers for a k-means algorithm I am working on. Giving them random double values doesn't seem to be producing good results.

I am trying to find a simple implementation of the algorithm.

[319 byte] By [mm_treoa] at [2007-10-2 5:47:22]
# 1

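Choose the first k elements of the data, or a random set of k elements from the data. This cures the problem of centroids that are abandoned from the very first pass, because every centroid starts on top of an actual data point.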
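A minimal sketch of the "random set of k elements" option, in Java. The class and method names (RandomSeeding, pickFromData) are made up for illustration, and the sketch assumes the data is an array of equal-length double[] points.

```java
import java.util.Random;

// Hypothetical helper, not from this thread: seeds k-means with k distinct data points.
final class RandomSeeding {
    // Returns copies of k distinct rows of data, chosen uniformly at random.
    static double[][] pickFromData(double[][] data, int k, Random rng) {
        if (k < 1 || k > data.length) {
            throw new IllegalArgumentException("k must be between 1 and the number of points");
        }
        // Partial Fisher-Yates shuffle over the row indices, so no row is picked twice.
        int[] idx = new int[data.length];
        for (int i = 0; i < idx.length; i++) {
            idx[i] = i;
        }
        double[][] centroids = new double[k][];
        for (int i = 0; i < k; i++) {
            int j = i + rng.nextInt(idx.length - i);
            int tmp = idx[i];
            idx[i] = idx[j];
            idx[j] = tmp;
            // Clone so that later centroid updates do not modify the data itself.
            centroids[i] = data[idx[i]].clone();
        }
        return centroids;
    }
}
```

Calling RandomSeeding.pickFromData(data, k, new Random()) gives k starting centroids. Note that the rows are distinct but their values need not be: if the data contains duplicate points, two centroids can still start in the same place.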

Of course, K-means doesn't always give good results. The means can get abandoned, they can collapse, your choice of K can be inappropriate, and of course your data might not have any clusters.
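Since a simple implementation was asked for, here is a rough sketch of the basic assign-and-update loop in Java, with one possible remedy for an abandoned mean. The names (SimpleKMeans, cluster) are invented for this example, and moving an abandoned centroid onto the point farthest from its current centroid is just one common choice, not something prescribed above. It does nothing about collapse or a poor choice of K.

```java
import java.util.Arrays;

// Hypothetical minimal k-means, written for illustration only.
final class SimpleKMeans {
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int d = 0; d < a.length; d++) {
            double diff = a[d] - b[d];
            sum += diff * diff;
        }
        return sum;
    }

    // Updates centroids in place and returns the cluster index of each point.
    static int[] cluster(double[][] data, double[][] centroids, int maxIterations) {
        int n = data.length, k = centroids.length, dim = data[0].length;
        int[] labels = new int[n];
        Arrays.fill(labels, -1);
        for (int iter = 0; iter < maxIterations; iter++) {
            // Assignment step: each point goes to its nearest centroid.
            boolean changed = false;
            for (int i = 0; i < n; i++) {
                int best = 0;
                for (int c = 1; c < k; c++) {
                    if (squaredDistance(data[i], centroids[c])
                            < squaredDistance(data[i], centroids[best])) {
                        best = c;
                    }
                }
                if (labels[i] != best) {
                    labels[i] = best;
                    changed = true;
                }
            }
            if (!changed) break; // assignments are stable, so we are done

            // Update step: move each non-empty centroid to the mean of its points.
            double[][] sums = new double[k][dim];
            int[] counts = new int[k];
            for (int i = 0; i < n; i++) {
                counts[labels[i]]++;
                for (int d = 0; d < dim; d++) sums[labels[i]][d] += data[i][d];
            }
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) continue;
                for (int d = 0; d < dim; d++) centroids[c][d] = sums[c][d] / counts[c];
            }
            // Abandoned centroids: re-seed on the point farthest from its own centroid.
            for (int c = 0; c < k; c++) {
                if (counts[c] != 0) continue;
                int far = 0;
                for (int i = 1; i < n; i++) {
                    if (squaredDistance(data[i], centroids[labels[i]])
                            > squaredDistance(data[far], centroids[labels[far]])) {
                        far = i;
                    }
                }
                centroids[c] = data[far].clone();
            }
        }
        return labels;
    }
}
```

The centroids array passed in is the starting guess, so any seeding scheme can be plugged in before calling cluster(data, centroids, 100).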

marlin314a at 2007-7-16 1:57:00 > top of Java-index,Other Topics,Algorithms...
# 2

> I am just trying to come up with a good way of
> starting off the centroid centers for a k-means
> algorithm

1) Draw n random candidate values and choose the one that is farthest from the previously chosen centroid. Repeat until you have k centers.

2) Run the algorithm a few times with different random starting centroids, evaluate the cluster formation for each run, and keep the best one.
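Both suggestions can be sketched in Java roughly as follows. This is only one reading of them: the candidates are assumed to be drawn from the data points themselves, "farthest" is taken to mean farthest from the nearest centroid chosen so far, and the within-cluster sum of squared distances is used as the score for comparing runs. The names SpreadSeeding, seed and cost are made up for this example.

```java
import java.util.Random;

// Hypothetical helpers for illustration: spread-out seeding plus a score for comparing runs.
final class SpreadSeeding {
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int d = 0; d < a.length; d++) {
            double diff = a[d] - b[d];
            sum += diff * diff;
        }
        return sum;
    }

    // Squared distance from p to the nearest of the first `count` centroids.
    static double nearest(double[] p, double[][] centroids, int count) {
        double best = Double.POSITIVE_INFINITY;
        for (int c = 0; c < count; c++) {
            best = Math.min(best, squaredDistance(p, centroids[c]));
        }
        return best;
    }

    // Picks the first centroid at random, then for each further centroid tries
    // `candidates` random data points and keeps the one farthest from those already chosen.
    static double[][] seed(double[][] data, int k, int candidates, Random rng) {
        if (k < 1 || candidates < 1) {
            throw new IllegalArgumentException("k and candidates must be at least 1");
        }
        double[][] centroids = new double[k][];
        centroids[0] = data[rng.nextInt(data.length)].clone();
        for (int c = 1; c < k; c++) {
            double[] bestPoint = null;
            double bestDist = -1.0;
            for (int t = 0; t < candidates; t++) {
                double[] p = data[rng.nextInt(data.length)];
                double dist = nearest(p, centroids, c);
                if (dist > bestDist) {
                    bestDist = dist;
                    bestPoint = p;
                }
            }
            centroids[c] = bestPoint.clone();
        }
        return centroids;
    }

    // Within-cluster sum of squared distances: lower means tighter clusters.
    // Only meaningful for comparing runs that use the same k on the same data.
    static double cost(double[][] data, double[][] centroids) {
        double total = 0.0;
        for (double[] p : data) {
            total += nearest(p, centroids, centroids.length);
        }
        return total;
    }
}
```

For the second suggestion, one would seed and run k-means several times, call cost(data, centroids) on each result, and keep the run with the lowest value. A spread-out seeding like this tends to pick outliers, so it is worth comparing against plain random seeding on your own data.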

Learnablea at 2007-7-16 1:57:00
# 3

I think the problem of finding a good starting point for k-means is just as difficult as the clustering problem itself. Whatever technique you find, there will be a data set that makes your technique perform very poorly. Running k-means with a few different random starting points and combining the results may be a better approach.

RadcliffePikea at 2007-7-16 1:57:00