Error running a kmeans cluster analysis - error message: factor variables and time-series operators not allowed

Marco Giansoldati

Join Date: May 2014

Posts: 79
#1

14 Oct 2019, 10:46

Dear Members,

I am trying to learn how to perform a cluster analysis. I wish to apply it to the level of agreement with a set of statements, measured on a Likert scale that goes from 1 to 4.

I have 20 variables each indicating how a certain feature of electric cars is perceived as a barrier to their purchase. These 20 variables can take only the following values: 1, 2, 3, 4. With respect to the proposed statement, 1 indicates that the individual completely disagrees with it, 2 that she partially disagrees, 3 that she partially agrees, and 4 that she totally agrees. They are stored in my database in the following fashion (the image indicates one of the 20 variables).

My idea is to run a cluster analysis and check whether I can group individuals into meaningful clusters.

I tried to run the following command

Code:

cluster k planning anxiety k(3)

but I got the following error message:

factor variables and time-series operators not allowed
r(101);

I am stuck at this point and I got no results.

This is probably a trivial issue, and I do apologise if this is the case. Searching this forum and online, I was not able to find a solution to this problem.

I would be very grateful if any of you could provide me with an insight.

Marco

Last edited by Marco Giansoldati; 14 Oct 2019, 10:51.
Rich Goldstein

Join Date: Mar 2014

Posts: 4462
#2

14 Oct 2019, 10:50

The "k(3)" is an option and needs to be preceded by a comma.
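A minimal sketch of the difference, using the variable names from the original post. The explanation in the comments is an assumption about how Stata parses the line: without a comma, everything after the command name is most likely read as the variable list, and the parentheses in k(3) then look like operator syntax, which would explain the wording of the error message.

```stata
* Without the comma, k(3) is read as part of the variable list and fails:
*   cluster k planning anxiety k(3)

* With the comma, k(3) is parsed as an option requesting three groups:
cluster kmeans planning anxiety, k(3)
```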
Comment
FernandoRios

Join Date: Apr 2014

Posts: 2469
#3

14 Oct 2019, 10:51

I think you should do something like

Code:

cluster kmeans planning anxiety, k(3) name(cl1)
cluster kmedians planning anxiety, k(3) name(cl2)

HTH
Comment
Nick Cox

Join Date: Mar 2014

Posts: 35697
#4

14 Oct 2019, 11:15

If I follow this correctly, then your two variables fall into a 4 x 4 classification. A table of cross-combinations showing up to 16 frequencies -- or any graph showing the same -- is going to keep all the information and is likely to be far more informative than a cluster analysis.
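A table of that kind could be produced along the following lines. This is only a sketch using the two variable names from the original command; the options chosen are one reasonable choice, not a recommendation from the post above.

```stata
* Two-way table of the 4 x 4 combinations, showing cell frequencies
tabulate planning anxiety

* Same table, also counting missing values and adding cell percentages
tabulate planning anxiety, missing cell
```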
Comment
Marco Giansoldati

Join Date: May 2014

Posts: 79
#5

14 Oct 2019, 11:27

Thank you very much Rich Goldstein. My apologies for the mistake. Many thanks FernandoRios for the precious help.
Comment
Marco Giansoldati

Join Date: May 2014

Posts: 79
#6

14 Oct 2019, 11:45

Dear Nick Cox, thank you very much for your post.

I have 20 "barriers" to the purchase of an electric car (from practicality to driving pleasure), each of which can take the value 1, 2, 3, or 4. My idea was to run a cluster analysis, for example in the following fashion

Code:

cluster k practicality-driv_pleas, k(3) name(cluster1)

and then look at the socio-economic characteristics of the respondents. These encompass, for example, gender (two levels), education (3 levels), occupation (3 levels), self-declared level of expertise with electric cars (2 levels), electric car's driving experience (two levels), etc.

The numbers of levels given in brackets reflect the fact that I had to merge some levels in order to perform a Chi-square test with sufficient cell counts.

So far I have performed a series of cross-tabulations, but I am not completely sure I have understood your kind suggestion.

Do you think it would be useful to perform the following command:

Code:

tabstat practicality-driv_pleas, by(cluster1)

?

I would actually be interested in some graphical analysis of the clusters, to see more clearly whether they carry a message.

Many thanks Nick.
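One way the description of the clusters might be sketched, assuming cluster1 was created by the cluster command above. The variable name gender is hypothetical and stands in for any of the socio-economic characteristics mentioned; the bar chart uses only two of the 20 barriers to keep the legend readable.

```stata
* Cluster sizes
tabulate cluster1

* Mean, standard deviation and count of each barrier within each cluster
tabstat practicality-driv_pleas, by(cluster1) statistics(mean sd n)

* Cross-tabulate clusters against a respondent characteristic
* ("gender" is a hypothetical variable name)
tabulate cluster1 gender, row chi2

* Bar chart of mean scores by cluster for two of the barriers
graph bar (mean) practicality driv_pleas, over(cluster1)
```

Because the barriers take only the values 1 to 4, means by cluster are a rough summary; tabulating each barrier against cluster1 keeps the full distribution.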
Comment
