The Importance of Good Starting Solutions in the Minimum Sum of Squares Clustering Problem
This work addresses the clustering problem for machine learning and operations research, but it is incremental as it focuses on improving starting solutions rather than introducing a new paradigm.
The authors tackled the challenge of finding good starting solutions for the minimum sum of squares clustering problem, proposing three algorithms that achieved five new best known solutions and matched the best known for 18 out of 19 medium to large instances.
The clustering problem has many applications in Machine Learning, Operations Research, and Statistics. We propose three algorithms to create starting solutions for improvement algorithms for this problem. We test the algorithms on 72 instances that were investigated in the literature. Forty eight of them are relatively easy to solve and we found the best known solution many times for all of them. Twenty four medium and large size instances are more challenging. We found five new best known solutions and matched the best known solution for 18 of the remaining 19 instances.