K-means with MATLAB

Posts
1
Registration date
Monday 1 December 2008
Status
Member
Last activity
1 December 2008
-
cs_Bkarim
Posts
1
Registration date
Wednesday 15 January 2003
Status
Member
Last activity
9 June 2011
-
I need to implement k-means in MATLAB. Please help! Thanks.

4 replies

Posts
2
Registration date
Sunday 21 May 2006
Status
Member
Last activity
16 March 2009

You need to follow this diagram.
Posts
2
Registration date
Sunday 21 May 2006
Status
Member
Last activity
16 March 2009

Step 1. Decide on the value of k, the number of clusters.



Step 2. Choose any initial partition that classifies the data into k clusters. You may assign the training samples randomly, or systematically as follows:


<ol>
<li>Take the first k training samples as single-element clusters.
</li>
<li>Assign each of the remaining (N-k) training samples to the cluster with the nearest centroid. After each assignment, recompute the centroid of the gaining cluster. </li>
</ol>
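The systematic initialization in Step 2 could be sketched in MATLAB roughly as follows. This is only an illustration under assumptions the post does not state: the data matrix `X` is made-up toy data with one sample per row, the distance is squared Euclidean, and the variable names (`labels`, `centroids`, `counts`) are hypothetical.

```matlab
% Hypothetical toy data: N = 6 samples (rows), 2 features (columns).
X = [1 1; 5 5; 1 2; 2 1; 5 6; 6 5];
k = 2;                            % number of clusters (Step 1)
N = size(X, 1);

% Sub-step 1: the first k samples become single-element clusters.
labels      = zeros(N, 1);
labels(1:k) = (1:k)';
centroids   = X(1:k, :);          % each centroid starts at its only member
counts      = ones(k, 1);         % number of samples in each cluster

% Sub-step 2: assign each remaining sample to the nearest centroid,
% then recompute the centroid of the cluster that gained the sample.
for i = k+1:N
    diffs = centroids - repmat(X(i, :), k, 1);
    d2 = sum(diffs .^ 2, 2);      % squared Euclidean distance (assumed metric)
    [~, c] = min(d2);             % index of the nearest centroid
    labels(i) = c;
    counts(c) = counts(c) + 1;
    % Incremental mean update of the gaining cluster's centroid.
    centroids(c, :) = centroids(c, :) + (X(i, :) - centroids(c, :)) / counts(c);
end

disp(labels');
disp(centroids);
```

The incremental update gives the same result as recomputing the mean of all members of the gaining cluster, but avoids rescanning them after every assignment.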

Step 3. Take each sample in sequence and compute its distance from the centroid of each cluster. If a sample is not currently in the cluster with the closest centroid, switch it to that cluster and update the centroids of both the cluster gaining the sample and the cluster losing it.



Step 4. Repeat step 3 until convergence is achieved, that is, until a pass through the training samples causes no new assignments.
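Steps 3 and 4 might look something like the MATLAB sketch below. The toy data, the arbitrary starting partition in `labels`, and the squared Euclidean distance are all assumptions made for illustration. The guard that refuses to move the last remaining member out of a cluster is also an added assumption: the steps above do not say what to do when a cluster would become empty.

```matlab
% Hypothetical toy data and an arbitrary initial partition (Step 2).
X = [1 1; 5 5; 1 2; 2 1; 5 6; 6 5];
labels = [1; 2; 2; 1; 1; 2];
k = 2;
N = size(X, 1);

% Centroid of each initial cluster.
centroids = zeros(k, size(X, 2));
for c = 1:k
    centroids(c, :) = mean(X(labels == c, :), 1);
end

changed = true;
while changed                          % Step 4: repeat until a pass changes nothing
    changed = false;
    for i = 1:N                        % Step 3: take each sample in sequence
        diffs = centroids - repmat(X(i, :), k, 1);
        [~, best] = min(sum(diffs .^ 2, 2));
        old = labels(i);
        % Assumption: never remove the last member of a cluster.
        if best ~= old && sum(labels == old) > 1
            labels(i) = best;
            % Update both the gaining and the losing cluster.
            centroids(best, :) = mean(X(labels == best, :), 1);
            centroids(old, :)  = mean(X(labels == old, :), 1);
            changed = true;
        end
    end
end

disp(labels');
disp(centroids);
```

If the Statistics and Machine Learning Toolbox is available, the built-in `kmeans` function is probably the more practical choice; the loop above is only meant to mirror the steps as written.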


 


If the number of data points is less than the number of clusters, we make each data point the centroid of its own cluster, and each centroid gets a cluster number. If the number of data points is greater than the number of clusters, then for each data point we calculate the distance to every centroid and take the minimum. The data point is said to belong to the cluster whose centroid is at that minimum distance.
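As a rough illustration of this assignment rule, the MATLAB sketch below builds an N-by-k matrix of distances and picks the nearest centroid for every row. The data, the centroid values, and the choice of Euclidean distance are assumptions made for the example.

```matlab
X = [1 1; 5 5; 1 2; 6 5];        % hypothetical data, one sample per row
k = 2;                           % requested number of clusters
N = size(X, 1);

if N < k
    % Fewer samples than clusters: every sample is its own centroid.
    centroids = X;
    labels = (1:N)';
else
    centroids = [1 1.5; 5.5 5];  % hypothetical current centroids
    D = zeros(N, k);             % D(i, c) = distance from sample i to centroid c
    for c = 1:k
        diffs = X - repmat(centroids(c, :), N, 1);
        D(:, c) = sqrt(sum(diffs .^ 2, 2));
    end
    % For each sample, the minimum distance and the cluster that achieves it.
    [minDist, labels] = min(D, [], 2);
end

disp(labels');
```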





Since we are not sure about the location of the centroids, we need to adjust them based on the data currently assigned to each cluster. Then we reassign all the data to the new centroids. This process is repeated until no data point moves to another cluster. Mathematically, this loop can be proved to converge. Convergence always occurs if the following conditions are satisfied:


<ol>
<li>Each switch in step 3 decreases the sum of squared distances from each training sample to the centroid of that sample's group.
</li>
<li>There are only finitely many partitions of the training examples into k clusters. </li>
</ol>
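One way to write this argument down, assuming the distance is squared Euclidean (the post does not name the metric) and using symbols introduced here only for illustration ($C_j$ for cluster $j$, $\mu_j$ for its centroid, $x_i$ for a sample):

```latex
% Objective: total squared distance from each sample to its cluster centroid.
J(C_1, \dots, C_k) = \sum_{j=1}^{k} \sum_{x_i \in C_j} \lVert x_i - \mu_j \rVert^2 ,
\qquad
\mu_j = \frac{1}{\lvert C_j \rvert} \sum_{x_i \in C_j} x_i .

% Condition 1: every switch strictly decreases J.
J^{(t+1)} < J^{(t)} \quad \text{whenever a sample changes cluster at step } t .

% Condition 2: N samples can be split into k labelled clusters in at most k^N ways.
\#\{\text{partitions}\} \le k^{N} < \infty .
```

Because $J$ strictly decreases with every switch, no partition can occur twice, and since there are only finitely many partitions, the loop has to stop after finitely many switches. This only shows that the procedure terminates; the final partition may be a local minimum of $J$ rather than the best possible one.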
Posts
3
Registration date
Thursday 3 March 2011
Status
Member
Last activity
23 November 2012

Posts
1
Registration date
Wednesday 15 January 2003
Status
Member
Last activity
9 June 2011

Thank you very much, you are doing great work that helps us more and more.