首页 /研究 /Clustering With Constraints: Feasibility Issues and the <i>k</i>-Means Algorithm
OTHER

Clustering With Constraints: Feasibility Issues and the <i>k</i>-Means Algorithm

Ian Davidson, S. S. Ravi

发表年份
2005
引用次数
279

摘要

Recent work has looked at extending the k-Means algorithm to incorporate background information in the form of instance level must-link and cannot-link constraints. We introduce two ways of specifying additional background information in the form of δ and ∊ constraints that operate on all instances but which can be interpreted as conjunctions or disjunctions of instance level constraints and hence are easy to implement. We present complexity results for the feasibility of clustering under each type of constraint individually and several types together. A key finding is that determining whether there is a feasible solution satisfying all constraints is, in general, NP-complete. Thus, an iterative algorithm such as k-Means should not try to find a feasible partitioning at each iteration. This motivates our derivation of a new version of the k-Means algorithm that minimizes the constrained vector quantization error but at each iteration does not attempt to satisfy all constraints. Using standard UCI datasets, we find that using constraints improves accuracy as others have reported, but we also show that our algorithm reduces the number of iterations until convergence. Finally, we illustrate these benefits and our new constraint types on a complex real world object identification problem using the infra-red detector on an Aibo robot.

关键词

Cluster analysisComputer scienceConstraint (computer-aided design)AlgorithmConvergence (economics)Key (lock)Vector quantizationMathematical optimizationMathematicsArtificial intelligence

相关论文

查看 OTHER 分类全部论文