Hi,
I'm doing a sequence analysis with optimal matching and cluster analysis (Ward's method).
To determine the optimal number of clusters, I use the command "clustermat stop" with the Calinski-Harabasz index. The index is highest for 4 clusters, so I use the command "cluster generate" to build 4 clusters.
The problem is that the sequences in the four clusters do not match the dendrogram. The program seems to assign sequences to clusters at random: the sequences within one cluster are not similar to each other. What can I do to get the correct clusters, i.e. clusters that contain similar sequences and agree with the dendrogram?
My commands for the cluster analysis:
sqclusterdat
clustermat wardslinkage SQdist, name(wards) add
cluster tree wards
clustermat stop wards, variables(wards*)
cluster generate g4wards = groups(4), name(wards)
sort g4wards ID
tabulate g4wards
sqclusterdat, return keep(wards* g4wards)
Thanks for your help in advance,
Johanna