Hello Statalisters,
I wish to obtain an enhanced version of the dataset produced by the - describe, replace - command. In this enhanced dataset, I want to record whether each variable is continuous or categorical.
I know these adjectives mean nothing to Stata, which is why I made a set of assumptions to define whether a variable is continuous or categorical.
A variable should be considered continuous if and only if:
- It has more than 20 categories (distinct values)
- It is not a string
- There is no constant step between its consecutive distinct values, whatever the size of that step
So now, suppose I have the following dataset:
Code:
sysuse auto, clear
describe, replace
I would like one more binary variable called "cat_var", equal to 1 if the corresponding variable in the original (pre-describe) dataset meets the 3 conditions above.
I have no idea whether what I'm asking is feasible, but in any case I'd appreciate any lead on this matter! If you have suggestions for other criteria to better distinguish continuous from categorical variables (even if the result will always be more or less imperfect), please feel free to share.
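For illustration, the three rules above could be sketched roughly as follows. This is only one possible reading, not a tested solution: the names `results`, `flags` and `gap` are hypothetical, the 1e-6 relative tolerance used to decide whether the step is "constant" is an assumption, and cat_var is set to 1 when all three conditions hold, following the wording of the post.

```stata
sysuse auto, clear

* Hypothetical names: results is a postfile handle, flags a temporary file
tempname results
tempfile flags
postfile `results' str32 name byte cat_var using "`flags'"

foreach v of varlist _all {
    local flag = 0
    * Condition 2: the variable is not a string
    capture confirm numeric variable `v'
    if _rc == 0 {
        preserve
        * Keep one observation per distinct nonmissing value, sorted
        quietly keep if !missing(`v')
        quietly bysort `v': keep if _n == 1
        * Condition 1: more than 20 distinct values
        if _N > 20 {
            * Condition 3: steps between consecutive values are not all equal
            tempvar gap
            quietly generate double `gap' = `v' - `v'[_n-1]
            quietly summarize `gap'
            * Assumed tolerance to absorb floating-point noise
            if r(max) - r(min) > 1e-6 * abs(r(max)) local flag = 1
        }
        restore
    }
    post `results' ("`v'") (`flag')
}
postclose `results'

* Build the describe dataset and attach the flag by variable name
describe, replace clear
merge 1:1 name using "`flags'", nogenerate
sort position
list name type cat_var
```

The flags are collected with postfile before running describe, replace because the original data are no longer in memory once describe has replaced them.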
