Abstract:
A technique that uses a weighted divide-and-conquer approach for clustering a set S of n data points to find k final centers. The technique comprises 1) partitioning the set S into P disjoint pieces S1, ..., SP; 2) for each piece Si, determining a set Di of k intermediate centers; 3) assigning each data point in each piece Si to the nearest one of the k intermediate centers in Di; 4) weighting each of the k intermediate centers in each set Di by the number of points in the corresponding piece Si assigned to that center; and 5) clustering the weighted intermediate centers together to find said k final centers, the clustering performed using a specific error metric and a clustering method A.
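The five steps above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the claimed implementation: the abstract leaves the clustering method A and the error metric open, so the sketch assumes a weighted Lloyd-style k-means as A and squared Euclidean distance as the metric. The function names (`weighted_kmeans`, `weighted_divide_and_conquer`), the strided partitioning, and the `iters` and `seed` parameters are hypothetical choices made for the example.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance (assumed error metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centers):
    """Index of the center closest to point."""
    return min(range(len(centers)), key=lambda j: sq_dist(point, centers[j]))


def weighted_kmeans(points, weights, k, iters=20, seed=0):
    """Assumed stand-in for clustering method A: weighted Lloyd iterations."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]  # needs len(points) >= k
    dim = len(points[0])
    for _ in range(iters):
        sums = [[0.0] * dim for _ in range(k)]
        totals = [0.0] * k
        for p, w in zip(points, weights):
            j = nearest(p, centers)
            totals[j] += w
            for d, x in enumerate(p):
                sums[j][d] += w * x
        for j in range(k):
            if totals[j] > 0:  # keep the old center if nothing was assigned
                centers[j] = [s / totals[j] for s in sums[j]]
    return centers


def weighted_divide_and_conquer(S, k, P, seed=0):
    # Step 1: partition S into P disjoint pieces (strided split, an assumption).
    pieces = [S[i::P] for i in range(P)]
    inter_centers, inter_weights = [], []
    for piece in pieces:
        # Step 2: find k intermediate centers Di for this piece.
        D = weighted_kmeans(piece, [1.0] * len(piece), k, seed=seed)
        # Steps 3 and 4: assign each point to its nearest intermediate center
        # and weight each center by the number of points assigned to it.
        counts = [0] * k
        for p in piece:
            counts[nearest(p, D)] += 1
        inter_centers.extend(D)
        inter_weights.extend(counts)
    # Step 5: cluster the P*k weighted intermediate centers into k final centers.
    final = weighted_kmeans(inter_centers, inter_weights, k, seed=seed)
    return final, inter_centers, inter_weights


if __name__ == "__main__":
    rng = random.Random(1)
    data = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(60)]
    data += [(rng.gauss(10, 1), rng.gauss(10, 1)) for _ in range(60)]
    final, _, weights = weighted_divide_and_conquer(data, k=2, P=4)
    print(final, sum(weights))
```

Because every point in a piece is assigned to exactly one intermediate center, the weights always sum to n. In this sketch the final step therefore clusters only P*k weighted points instead of all n.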