ABSTRACT
Clustering groups objects so that those within the same cluster are as homogeneous as possible while those in different clusters are as distinct as possible. This paper uses a two-phase clustering methodology that integrates the self-organizing maps (SOM) algorithm in the first phase with either the k-means algorithm or minimum spanning tree-based (MST-based) clustering in the second phase. MST-based clustering is used because it is efficient at solving tree-type problems and tends to be less sensitive to the geometric shape of the data. Two data transformations, min-max normalization and z-score normalization, are employed to handle the situation in which the magnitudes of real-life data differ sharply. We compare clustering results in terms of the Davies-Bouldin (DB) value and Wilks' lambda value. Using data from Wire Bond machines at a Taiwanese IC packaging foundry, we find that applying the k-means algorithm in the second phase to data with min-max normalization performs better when the DB value and Wilks' lambda value are considered jointly. Although applying MST-based clustering in the second phase does not outperform the k-means algorithm, we find that it is better at detecting outliers, especially when normalized data are used.
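
As an illustrative sketch only (not the authors' implementation), the two data transformations named in the abstract can be written as follows. It assumes that each feature is normalized independently and that the z-score uses the population standard deviation; the function names and the sample values are hypothetical and are not taken from the foundry data.

```python
from statistics import mean, pstdev


def min_max_normalize(values):
    """Rescale a feature linearly so that its values lie in [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no spread; map every value to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def z_score_normalize(values):
    """Center a feature at zero mean and scale it to unit standard deviation.

    Assumption: the population standard deviation (pstdev) is used.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant feature has zero deviation; map every value to 0.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


if __name__ == "__main__":
    # Hypothetical features whose magnitudes differ sharply.
    feature_a = [150.0, 155.0, 160.0, 165.0, 170.0]
    feature_b = [0.20, 0.25, 0.30, 0.35, 0.40]

    for name, column in (("feature_a", feature_a), ("feature_b", feature_b)):
        print(name, "min-max:", min_max_normalize(column))
        print(name, "z-score:", z_score_normalize(column))
```

After either transformation the two hypothetical features lie on comparable scales, so a distance-based method such as k-means is not dominated by the feature with the larger raw magnitude.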