5.5.3. The Ward’s algorithm

 

One major drawback of the K-Means algorithm is that you have to guess how many segments your dataset contains, which is difficult. This is why we use Ward's algorithm to help choose the number of segments. Ward's algorithm is run after the K-Means algorithm. It is an iterative algorithm that works this way:
 

1. Start with the segments found by the K-Means algorithm.

2. Find the two closest segments, A and B, amongst all the segments still available.

3. Delete the segments A and B and replace them by a new segment C that is the union of the segments A and B (this implies computing the position of the center of segment C).

4. If more than one segment remains, go back to step 2.
 

 
Thus, at each iteration of Ward's algorithm, the number of segments decreases by one: starting from k segments, n iterations leave k - n segments. We can therefore select how many segments we want by choosing how many iterations of Ward's algorithm to run.
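
The merge loop described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the StarDust implementation: the names `ward_cost` and `ward_merge` are hypothetical, each segment is assumed to be summarized by its center and its number of rows, and "closest" is assumed to mean the usual Ward criterion (the pair whose merge increases the within-segment variance the least).

```python
from itertools import combinations


def ward_cost(center_a, size_a, center_b, size_b):
    """Assumed distance: increase in within-segment variance if A and B merge."""
    squared_dist = sum((x - y) ** 2 for x, y in zip(center_a, center_b))
    return size_a * size_b / (size_a + size_b) * squared_dist


def ward_merge(segments, n_iterations):
    """Run n_iterations merge steps on a list of (center, size) pairs."""
    segments = list(segments)  # step 1: start from the K-Means segments
    for _ in range(n_iterations):
        if len(segments) < 2:
            break  # nothing left to merge
        # Step 2: find the two closest segments A and B (indices i < j).
        i, j = min(
            combinations(range(len(segments)), 2),
            key=lambda pair: ward_cost(*segments[pair[0]], *segments[pair[1]]),
        )
        (center_a, size_a), (center_b, size_b) = segments[i], segments[j]
        # Step 3: replace A and B by C; its center is the size-weighted mean.
        size_c = size_a + size_b
        center_c = tuple(
            (size_a * x + size_b * y) / size_c
            for x, y in zip(center_a, center_b)
        )
        del segments[j]  # delete j first so that index i stays valid
        del segments[i]
        segments.append((center_c, size_c))
    return segments


# Hypothetical K-Means output: four segments given as (center, row count).
kmeans_segments = [
    ((0.0, 0.0), 10),
    ((1.0, 0.0), 10),
    ((10.0, 10.0), 5),
    ((11.0, 10.0), 5),
]
# Two iterations reduce four segments to two.
result = ward_merge(kmeans_segments, n_iterations=2)
```

With these made-up inputs, the first iteration merges the two small segments near (10, 10) and the second merges the two segments near the origin, so `result` holds two segments.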