Initialization of cluster refinement algorithms: A review and comparative study

被引:32
作者
He, J [1 ]
Lan, M [1 ]
Tan, CL [1 ]
Sung, SY [1 ]
Low, HB [1 ]
机构
[1] Natl Univ Singapore, Sch Comp, Singapore 117543, Singapore
来源
2004 IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-4, PROCEEDINGS | 2004年
关键词
D O I
10.1109/IJCNN.2004.1379917
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Various iterative refinement clustering methods are dependent on the initial state of the model and are capable of obtaining one of their local optima only. Since the task of identifying the global optimization is NP-hard, the study of the initialization method towards a sub-optimization is of great value. This paper reviews the various cluster initialization methods in the literature by categorizing them into three major families, namely random sampling methods, distance optimization methods, and density estimation methods. In addition, using a set of quantitative measures, we assess their performance on a number of synthetic and real-life data sets. Our controlled benchmark identifies two distance optimization methods, namely SCS and KKZ, as complements of the K-Means learning characteristics towards a better cluster separation in the output solution.
引用
收藏
页码:297 / 302
页数:6
相关论文
共 23 条
[1]  
ALDAOUD M, 1994, 9434 U LEEDS SCH COM
[2]   A NEAR-OPTIMAL INITIAL SEED VALUE SELECTION IN K-MEANS ALGORITHM USING A GENETIC ALGORITHM [J].
BABU, GP ;
MURTY, MN .
PATTERN RECOGNITION LETTERS, 1993, 14 (10) :763-769
[3]  
BRADLEY PS, 1998, P 15 INT C MACH LEAR, P91
[4]   ART-2 - SELF-ORGANIZATION OF STABLE CATEGORY RECOGNITION CODES FOR ANALOG INPUT PATTERNS [J].
CARPENTER, GA ;
GROSSBERG, S .
APPLIED OPTICS, 1987, 26 (23) :4919-4930
[5]   MAXIMUM LIKELIHOOD FROM INCOMPLETE DATA VIA EM ALGORITHM [J].
DEMPSTER, AP ;
LAIRD, NM ;
RUBIN, DB .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-METHODOLOGICAL, 1977, 39 (01) :1-38
[6]  
FORGY E, 1965, WNAR M U CAL RIV
[7]   Tabu search algorithm for codebook generation in vector quantization [J].
Franti, P ;
Kivijarvi, J ;
Nevalainen, O .
PATTERN RECOGNITION, 1998, 31 (08) :1139-1148
[8]   Randomised Local Search algorithm for the clustering problem [J].
Fränti, P ;
Kivijärvi, J .
PATTERN ANALYSIS AND APPLICATIONS, 2000, 3 (04) :358-369
[9]   THE COMPLEXITY OF THE GENERALIZED LLOYD MAX PROBLEM [J].
GAREY, MR ;
JOHNSON, DS ;
WITSENHAUSEN, HS .
IEEE TRANSACTIONS ON INFORMATION THEORY, 1982, 28 (02) :255-256
[10]   ART-C: A neural architecture for self-organization under constraints [J].
He, J ;
Tan, AH ;
Tan, CL .
PROCEEDING OF THE 2002 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-3, 2002, :2550-2555