How do I transform categorical variables to continuous scores?

Kobi Ajayi

Join Date: Oct 2020
Posts: 2

15 Oct 2020, 23:25

Hello Statalist,

I am working with the Health Information National Trends Survey (HINTS), which has 7 variables measuring the functions of patient-centered communication (output below).

I want to create an overall patient-centered communication score, which is my outcome variable, but I am having a hard time doing this. So far I know I should average the scores of the individual variables and then transform the average to a 0–100 scale, but I'm not sure how. I am using Stata 14.

Thank you in advance.

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input double(chanceaskquestions feelingsaddressed involveddecisions understoodnextsteps explainedclearly spentenoughtime helpuncertainty)
 1  1  1  1  1  1  1
-1 -1 -1 -1 -1 -1 -1
 1  1  1  1  1  1  2
 1  1  1  1  1  1  1
 3  3  3  2  2  2  3
 1  1  1  1  1  1  1
 2  3  2  1  1  2  2
 2  2  2  2  2  2  2
 4  4  4  3  3  4  4
 2  2  3  2  2  2  4
 1  2  1  1  1  1  1
 2  2  2  3  2  3  2
-1 -1 -1 -1 -1 -1 -1
 2  2  1  1  1  2  2
 3  4  3  2  2  2  3
-1 -1 -1 -1 -1 -1 -1
 2  4  3  2  4  4  3
 4  4  1  3  1  4  4
 1  3  2  2  2  2  2
 3  3  3  3  2  3  3
 1  1  1  1  1  1  1
 1  1  1  1  1  1  1
 2  2  2  1  2  2  2
 1  2  2  1  2  2  2
 3  4  4  3  3  3  4
 1  1  1  1  1  1  1
 1  1  1  1  1  1  1
 1  1  1  1  1  1  1
 1  1  1  1  1  1  1
 1  1  1  1  1  1  1
-1 -1 -1 -1 -1 -1 -1
-1 -1 -1 -1 -1 -1 -1
 2  1  2  2  2  1  2
 1  1  2  1  1  1  2
 1  1  1  1  1  1  1
 1  2  2  1  1  2  2
 1  1  1  1  1  1  1
 1  2  1  1  1  2  2
 1  2  1  1  1  3  2
 1  2  1  1  1  1  1
-2 -2 -2 -2 -2 -2 -2
-1 -1 -1 -1 -1 -1 -1
-1 -1 -1 -1 -1 -1 -1
 1  1  1  1  1  1  1
 1  2  2  1  1  2  3
 1  1  1  1  1  1  1
 2  2  2  2  2  2  2
 1  4  1  1  1  1  1
 3  2  3  3  3  3  3
 1  1  1  1  1  1  1
 1  1  1  1  1  1  1
 1  1  1  1  1  1  1
 1  1  1  1  1  1  1
 3  4  3  3  2  2  4
 1  2  1  1  1  2  1
 3  3  3  3  3  3  3
 1  1  1  1  1  1  1
 2  2  2  2  2  2  2
 1  1  1  1  1  1  2
 2  2  2  2  2  2  3
-9 -9 -9 -9 -9 -9 -9
 1  2  2  1  2  2  2
-1 -1 -1 -1 -1 -1 -1
 1  1  1  1  1  1  1
 1  1  1  1  1  1  1
-2 -2 -2 -2 -2 -2 -2
 2  2  1  1  1  2  4
 1  3  1  1  1  2  2
 1  1  2  2  1  1  2
 1  1  1  1  1  1  1
 1  1  1  1  1  1  1
 2  2  2  1  1  1  1
 1  1  1  1  1  1  1
 1  2  1  1  1  1  2
 1  1  1  1  1  1  1
 1  1  1  1  1  1  2
 1  1  1  1  1  1  1
 1  3  3  1  1  2  3
 2  2  1  1  1  2  2
 1  1  1  1  1  1  1
 2  2  2  1  2  2  2
 1  1  1  1  1  1  1
 2  3  3  3  3  4  3
 1  1  1  1  1  1  1
 1  1  1  1  1  2  1
 1  1  1  1  1  1  1
 1  1  1  1  1  1  1
-1 -1 -1 -1 -1 -1 -1
 1  1  1  1  1  1  1
 1  1  1  1  1  1  1
 2  1  1  1  1  1  2
 2  2  2  2  2  2  4
 1  1  1  1  1  1  1
 1  1  1  1  1  1  1
 2  2  2  2  2  2  2
 1  1  1  1  1  1  1
 1  1  1  1  1  1  4
 2  1  1  1  1  1  1
 2  2  2  3  3  3  2
 1  2  2  1  1  2  2
end
label values chanceaskquestions chanceaskquestions
label def chanceaskquestions -9 "Missing data (Not Ascertained)", modify
label def chanceaskquestions -2 "Question answered in error (Commission Error)", modify
label def chanceaskquestions -1 "Inapplicable, coded 0 in FreqGoProvider", modify
label def chanceaskquestions 1 "Always", modify
label def chanceaskquestions 2 "Usually", modify
label def chanceaskquestions 3 "Sometimes", modify
label def chanceaskquestions 4 "Never", modify
label values feelingsaddressed feelingsaddressed
label def feelingsaddressed -9 "Missing data (Not Ascertained)", modify
label def feelingsaddressed -2 "Question answered in error (Commission Error)", modify
label def feelingsaddressed -1 "Inapplicable, coded 0 in FreqGoProvider", modify
label def feelingsaddressed 1 "Always", modify
label def feelingsaddressed 2 "Usually", modify
label def feelingsaddressed 3 "Sometimes", modify
label def feelingsaddressed 4 "Never", modify
label values involveddecisions involveddecisions
label def involveddecisions -9 "Missing data (Not Ascertained)", modify
label def involveddecisions -2 "Question answered in error (Commission Error)", modify
label def involveddecisions -1 "Inapplicable, coded 0 in FreqGoProvider", modify
label def involveddecisions 1 "Always", modify
label def involveddecisions 2 "Usually", modify
label def involveddecisions 3 "Sometimes", modify
label def involveddecisions 4 "Never", modify
label values understoodnextsteps understoodnextsteps
label def understoodnextsteps -9 "Missing data (Not Ascertained)", modify
label def understoodnextsteps -2 "Question answered in error (Commission Error)", modify
label def understoodnextsteps -1 "Inapplicable, coded 0 in FreqGoProvider", modify
label def understoodnextsteps 1 "Always", modify
label def understoodnextsteps 2 "Usually", modify
label def understoodnextsteps 3 "Sometimes", modify
label values explainedclearly explainedclearly
label def explainedclearly -9 "Missing data (Not Ascertained)", modify
label def explainedclearly -2 "Question answered in error (Commission Error)", modify
label def explainedclearly -1 "Inapplicable, coded 0 in FreqGoProvider", modify
label def explainedclearly 1 "Always", modify
label def explainedclearly 2 "Usually", modify
label def explainedclearly 3 "Sometimes", modify
label def explainedclearly 4 "Never", modify
label values spentenoughtime spentenoughtime
label def spentenoughtime -9 "Missing data (Not Ascertained)", modify
label def spentenoughtime -2 "Question answered in error (Commission Error)", modify
label def spentenoughtime -1 "Inapplicable, coded 0 in FreqGoProvider", modify
label def spentenoughtime 1 "Always", modify
label def spentenoughtime 2 "Usually", modify
label def spentenoughtime 3 "Sometimes", modify
label def spentenoughtime 4 "Never", modify
label values helpuncertainty helpuncertainty
label def helpuncertainty -9 "Missing data (Not Ascertained)", modify
label def helpuncertainty -2 "Question answered in error (Commission Error)", modify
label def helpuncertainty -1 "Inapplicable, coded 0 in FreqGoProvider", modify
label def helpuncertainty 1 "Always", modify
label def helpuncertainty 2 "Usually", modify
label def helpuncertainty 3 "Sometimes", modify
label def helpuncertainty 4 "Never", modify

Felix Iglhaut

Join Date: Oct 2020

Posts: 4
#2

16 Oct 2020, 00:19

Disclaimer: I am not an expert on the topic, so other users may have more knowledge to share. In any case, what you are describing is data reduction, and I recommend having a closer look at factor analysis

Code:

help factor

or principal component analysis.

Code:

help pca

In both cases I strongly recommend having a look at the complete PDF manual entry, where you will find comprehensive explanations and examples.
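As a rough sketch of what those two commands might look like on the seven items from #1 (assuming that example dataset is already in memory; the new variable names `f1` and `pc1` are illustrative only), the negative codes would need to be set to missing first, or they would be treated as real values. Note also that both commands treat the 1-4 responses as numeric, which is itself an assumption about these ordinal items.

```stata
* Assumes the -dataex- example from #1 is already in memory.
* Negative codes (-9, -2, -1) are not substantive answers; set them to missing.
mvdecode chanceaskquestions-helpuncertainty, mv(-9 -2 -1)

* Factor analysis (principal-factor method by default), then a predicted
* score for the first factor. f1 is a hypothetical variable name.
factor chanceaskquestions-helpuncertainty
predict double f1

* Principal component analysis, then the score on the first component.
* pc1 is a hypothetical variable name.
pca chanceaskquestions-helpuncertainty
predict double pc1, score
```

Both commands use only observations with complete data on all seven items, so respondents with any missing item get no predicted score.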
Joseph Coveney

Join Date: Apr 2014

Posts: 4374
#3

16 Oct 2020, 01:46

Originally posted by Kobi Ajayi

. . . I know I should average scores of the individual variables and then transform them to a 0–100 scale, but I'm not sure how. I am using Stata 14.

Code:

mvdecode chanceaskquestions-helpuncertainty, mv(-9=.m \ -2=.e \ -1=.n)
egen double sco = rowmean(chanceaskquestions-helpuncertainty)
replace sco = (sco - 1) / (4 - 1) * 100
assert inrange(sco, 0, 100) if !mi(sco)

I don't recall whether gsem was available at the time of Release 14.2, but Felix's suggestion for factor analysis and by implication SEM is worth pursuing in lieu of the sumscore approach that you've been instructed to follow.
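
To make the arithmetic of the rescaling step in the code above concrete, here is a small check on made-up row means (the values and the variable names `sco100` and `sco_rev` are illustrative, not from HINTS). The formula maps the scale minimum of 1 to 0 and the maximum of 4 to 100. Because the items are coded 1 = "Always" through 4 = "Never", a higher score means less patient-centered communication; the last line reverses the direction, on the assumption that you would prefer higher to mean better.

```stata
* Hypothetical toy data: three row means on the original 1-4 scale
clear
input double sco
1
2.5
4
end

* Min-max rescaling from [1, 4] to [0, 100]
generate double sco100 = ((sco - 1) / (4 - 1)) * 100

* Optional: flip the direction so that 100 = "Always" on every item
generate double sco_rev = 100 - sco100

list sco sco100 sco_rev, noobs
```

Also note that `egen`'s `rowmean()` averages over whichever items are nonmissing, so respondents who answered only some of the seven items still receive a score.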
Kobi Ajayi

Join Date: Oct 2020

Posts: 2
#4

16 Oct 2020, 22:00

Thank you, Felix and Joseph. I will run with your advice and code and then take it from there.
Clyde Schechter

Join Date: Apr 2014

Posts: 29956
#5

16 Oct 2020, 23:44

For clarity, I would code

Code:

replace sco = (sco - 1) / (4 - 1) * 100

as

Code:

replace sco = ((sco-1)/(4-1))*100

Both will produce the same result in Stata. That is because in Stata's order of operations division precedes multiplication (see -help operator-). But there are other programming languages in which multiplication and division have equal precedence, and in some languages they might be performed left-to-right (which would be equivalent) or right-to-left (which would give a different result). Also, in some languages multiplication takes precedence over division, and that is commonly the way ordinary algebraic notation is interpreted as well (which would give the wrong result here). The code I am suggesting leaves nothing to chance or the imagination: it will never be misinterpreted by a human reader, nor by any language interpreter or compiler.

If you make a point of programming in ways that do not depend on remembering how your particular language orders multiplication and division, you won't have to make a point of remembering how each language you work in handles this, and you may avoid a kind of mistake that is particularly difficult to spot but could have bad consequences if it goes unnoticed.
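
A quick way to see how far apart the two possible groupings are is to evaluate both with explicit parentheses, here for an illustrative row mean of 2.5 (a made-up value):

```stata
* Intended grouping: divide first, then multiply by 100
display ((2.5 - 1) / (4 - 1)) * 100

* Unintended grouping: multiply the denominator by 100 first
display (2.5 - 1) / ((4 - 1) * 100)
```

The first expression evaluates to 50 and the second to .005, so a misread grouping would not be a subtle rounding difference.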

Last edited by Clyde Schechter; 16 Oct 2020, 23:50.
Joseph Coveney

Join Date: Apr 2014

Posts: 4374
#6

17 Oct 2020, 00:42

Good point. Or maybe in two steps. (In order to help avoid getting lost among the nested parentheses.)

Code:

replace sco = (sco - 1) / (4 - 1)
replace sco = 100 * sco