  • Determining number of clusters in K-clustering analysis - Silhouette scores

    Hi all,

    Over the past couple of months I have discovered and used Stata's k-clustering commands.
    As I have become more familiar with cluster analysis, I have learned that there are several methods for estimating the optimal number of clusters for a given dataset.

    One suggested method is to calculate silhouette scores (or coefficients). Others are the elbow method and the gap statistic.

    I have searched several times but could not find Stata commands to calculate any of these measures.
    Does anyone know whether anything is available?

    Many thanks
    Nicola
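
    For reference, the silhouette coefficient of an observation is usually defined as s = (b - a) / max(a, b), where a is its mean distance to the other members of its own cluster and b is the smallest mean distance to the members of any other cluster. The block below is a minimal Python sketch of that calculation, not a Stata command. It assumes Euclidean distance, and the function names (`euclidean`, `silhouette_score`) are illustrative only.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length coordinate tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (assumes >= 2 clusters)."""
    # Group point indices by cluster label.
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            # Common convention: a singleton cluster gets a score of 0.
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(euclidean(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(euclidean(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)
```

    The mean score lies between -1 and 1. One would typically compute it for each candidate number of clusters and favour the solution with the highest value.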

  • #2
    In version 18, the Methods and formulas section for kmeans and kmedians cluster analysis cites: Makles, A. 2012. Stata tip 110: How to get the optimal k-means cluster solution. Stata Journal 12: 347–351. This appears to describe the "elbow method", but I have limited experience with these routines.
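
    As a rough illustration of the elbow idea in general (this is not the code from the Stata tip), the within-cluster sum of squares (WSS) can be computed for a range of k and inspected for the point where further increases in k stop paying off. The Python sketch below assumes a plain Lloyd-style k-means with random starts; all function names and the `restarts` parameter are hypothetical. In Stata, the analogous loop would run `cluster kmeans` once per candidate k.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two coordinate tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(pts):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(pts)
    return tuple(sum(c) / n for c in zip(*pts))


def kmeans(points, k, iters=100, seed=0):
    """Basic Lloyd iterations from k randomly chosen starting points."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = None
    for _ in range(iters):
        # Assign each point to its nearest center.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        if new_labels == labels:
            break  # assignments stable: converged
        labels = new_labels
        # Move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster empties
                centers[c] = mean_point(members)
    return labels, centers


def wss(points, labels, centers):
    """Within-cluster sum of squared distances to the assigned centers."""
    return sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))


def elbow_table(points, max_k, restarts=10):
    """Return [(k, best WSS over several random starts)] for k = 1..max_k."""
    table = []
    for k in range(1, max_k + 1):
        best = min(wss(points, *kmeans(points, k, seed=s)) for s in range(restarts))
        table.append((k, best))
    return table
```

    Plotting WSS against k and looking for the kink is a judgment call, and k-means depends on its starting values, which is why the sketch keeps the best of several random starts.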

    • #3
      Thank you very much John,

      I will have a look. I did read something about the elbow method. From my reading, silhouette scores might be preferred, but I will double-check.
      Thanks for the reference!

      Best,
      Nicola