Dear All,
I am relatively new to Stata (taking an 'Introduction to Stata' course) and am wondering how to judge when it is or is not acceptable to convert a categorical variable into a continuous one for the purpose of performing a multiple regression analysis.
I have a categorical variable (ordinal, I think) which measures the respondent's satisfaction with their 'work-life balance'.
How satisfied are you with your work-life balance?
1 Satisfied
2 Somewhat satisfied
3 Somewhat dissatisfied
4 Dissatisfied
(9 No response)
*Note: the response options did originally have numbers beside them, but so do all questions in the survey, even those that are nominal categorical variables.
For the purpose of seeing what factors impact their level of satisfaction with work-life balance, I wish to perform a regression analysis. However, I am unsure whether it is acceptable to convert this categorical variable into a continuous one, since there isn't a quantifiable difference between each 'level' of satisfaction.
In a previous assignment, I was asked to use the responses to the chart below to perform a regression analysis. We had to re-code the data so that the more signs of depression the respondent experienced, the higher the number (so a, b, d, and f would need to be re-coded), and then add the results for each of the 6 questions together to create a 'scale' of depression, i.e. a continuous variable.
This website (https://statistics.laerd.com/statist...f-variable.php) leads me to believe that if I had 7 or more categories it would be acceptable to convert them into continuous variables, but that 4 categories is not appropriate.
Item | Always experience | Experience most of the time | Sometimes experience | Experience infrequently | Do not experience at all |
a. Feeling anxious | 1 | 2 | 3 | 4 | 5 |
b. Feeling incredibly depressed | 1 | 2 | 3 | 4 | 5 |
c. Feeling happy and relaxed | 1 | 2 | 3 | 4 | 5 |
d. Feeling somewhat down | 1 | 2 | 3 | 4 | 5 |
e. Feeling you enjoy life | 1 | 2 | 3 | 4 | 5 |
f. Not able to enjoy life because of health issues | 1 | 2 | 3 | 4 | 5 |
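For reference, the re-coding and summing described for the chart above could be sketched in Stata roughly as follows. This is only a minimal sketch: the variable names a to f and the three rows of toy data are my own assumptions for illustration, not the actual assignment data.

```stata
* Hypothetical toy data: one row per respondent, items a-f coded 1-5 as in the chart
clear
input a b c d e f
1 1 5 1 5 1
5 5 1 5 1 5
3 . 3 3 3 3
end

* Reverse-code the negatively worded items (a, b, d, f) so that
* a higher number always means more signs of depression
recode a b d f (1=5) (2=4) (4=2) (5=1)

* Add the six items together; a respondent with any missing item
* gets a missing total
generate depression = a + b + c + d + e + f

list
```

With complete answers the total runs from 6 (fewest signs of depression) to 30 (most). Using plain addition rather than egen's rowtotal() is a deliberate choice here, because rowtotal() counts a missing item as zero unless the missing option is specified.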
The Laerd page linked above says: "In some cases, the measurement scale for data is ordinal, but the variable is treated as continuous. For example, a Likert scale that contains five values - strongly agree, agree, neither agree nor disagree, disagree, and strongly disagree - is ordinal. However, where a Likert scale contains seven or more value - strongly agree, moderately agree, agree, neither agree nor disagree, disagree, moderately disagree, and strongly disagree - the underlying scale is sometimes treated as continuous (although where you should do this is a cause of great dispute)."