Читать книгу Data Analytics in Bioinformatics - Группа авторов - Страница 40

2.3.3.2 Cluster Center Initialization Algorithm (CCIA)

In traditional k-means algorithm the centroids of clusters are randomly selected initially, yielding low quality clusters. As an attempt in producing high quality clustering CCIA is introduced. This algorithm can be referred as an enhancement to k-means algorithm in which centroids (k) are accurately selected which are distant from the outliers. This selection process is done by computing mean and standard deviation on the data elements. CCIA now examines most nearest set of elements from the dataset (D) and fabricates into a new dataset (X) after discarding the outliers. Later those set of elements are eliminated from the dataset (D). This process is iterated until the dataset (X) equals with (k).

Drawback—This algorithm is incapable in dealing with closely connected clusters and cannot deal with high dimensional data [14, 15].

Подняться наверх