5.5.3. The Ward’s algorithm
One major drawback of the K-Means algorithm is that you have to guess for yourself how many segments your dataset contains, which is difficult. This is why we use Ward’s algorithm to find an appropriate number of segments. Ward’s algorithm is run after the K-Means algorithm. It is an iterative algorithm that works this way:
1. Start with the segments found by the K-Means algorithm.
2. Find the two closest segments among all the segments still available.
3. Delete these two segments and replace them with a new segment that is the sum (union) of the two (this implies computing the position of the center of the new segment).
4. If at least two segments remain, go back to step 2.
Thus, at each iteration of Ward’s algorithm, the number of segments decreases by one. We can therefore easily select how many segments we want by choosing how many iterations of Ward’s algorithm to run.
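The merging procedure described above can be illustrated with a minimal Python sketch. It is not the StarDust implementation. The function names (`ward_cost`, `merge_segments`) and the toy data are hypothetical. The text does not define "closest", so the sketch assumes the usual Ward criterion: the pair to merge is the one whose union increases the within-segment sum of squares the least. It also assumes that the center of the new segment is the size-weighted mean of the two merged centers.

```python
from itertools import combinations


def ward_cost(center_a, size_a, center_b, size_b):
    """Increase in within-segment sum of squares if the two segments merge
    (assumed definition of "closest": the standard Ward criterion)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(center_a, center_b))
    return size_a * size_b / (size_a + size_b) * sq_dist


def merge_segments(centers, sizes, target):
    """Merge segments one pair at a time until `target` segments remain.

    Each returned segment is (center, size, ids of the original segments).
    """
    # Step 1: start from the segments found by K-Means.
    segments = [(tuple(c), n, (i,))
                for i, (c, n) in enumerate(zip(centers, sizes))]
    while len(segments) > max(target, 1):
        # Step 2: find the pair of segments with the smallest merge cost.
        i, j = min(
            combinations(range(len(segments)), 2),
            key=lambda p: ward_cost(segments[p[0]][0], segments[p[0]][1],
                                    segments[p[1]][0], segments[p[1]][1]),
        )
        (ca, na, ma), (cb, nb, mb) = segments[i], segments[j]
        # Step 3: replace the pair by their union; the new center is the
        # size-weighted mean of the two old centers.
        new_center = tuple((na * a + nb * b) / (na + nb)
                           for a, b in zip(ca, cb))
        segments = [s for k, s in enumerate(segments) if k not in (i, j)]
        segments.append((new_center, na + nb, ma + mb))
        # Step 4: the loop condition sends us back to step 2.
    return segments


# Toy example: four K-Means segments forming two obvious groups.
kmeans_centers = [(0, 0), (0, 1), (10, 10), (10, 11)]
kmeans_sizes = [10, 10, 10, 10]

# Going from 4 segments down to 2 takes two iterations.
for center, size, members in merge_segments(kmeans_centers, kmeans_sizes, 2):
    print(center, size, members)
```

Each pass through the loop removes exactly one segment, so asking for `target` segments from `k` initial segments amounts to running `k - target` iterations.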