CES-D Score Cutoff Stata

Theresia Verena

Join Date: Sep 2021
Posts: 27


13 Nov 2021, 05:25

Hello,
I am a little confused about how to create a dummy variable for mental disability from the CES-D questionnaire. Can someone help me with the Stata commands?
Here is my dataex output.

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input str8 hhid14 double(kp02A kp02B kp02C kp02D kp02E kp02F kp02G kp02H kp02I kp02J)
"0010600" 1 1 1 1 3 1 1 4 3 3
"0010651" 1 2 2 2 2 1 2 2 1 1
"0010600" 1 3 3 2 3 2 2 3 2 2
"0010600" 3 3 2 1 3 1 1 2 1 1
"0010651" 2 2 1 1 2 1 1 2 2 1
"0010851" 1 4 4 4 4 4 4 4 4 1
"0010800" 2 2 2 2 2 2 2 2 1 1
"0010851" 1 1 1 1 3 1 1 3 1 1
"0012200" 2 3 1 2 2 2 3 2 3 1
"0012200" 2 3 3 1 1 1 2 1 1 1
"0012241" 2 2 2 3 2 2 1 2 3 2
"0012200" 2 2 1 4 4 1 1 4 1 1
"0012242" 3 2 2 1 4 2 2 4 3 1
"0012251" 3 3 2 2 3 1 2 3 2 4
"0012241" 2 2 1 1 3 1 1 2 1 1
"0012241" 2 2 2 2 2 2 2 2 2 2
"0012400" 1 1 1 1 1 1 2 1 4 1
"0012400" 1 4 1 1 3 1 1 2 1 4
"0012400" 3 2 3 2 3 1 1 3 2 3
"0012451" 3 4 1 3 4 1 3 3 1 3
"0012400" 1 2 2 2 2 3 2 2 2 2
"0012452" 3 1 1 1 1 1 1 1 1 4
"0012400" 2 3 2 3 4 1 1 3 3 2
"0012400" 3 2 2 3 4 3 1 4 3 3
"0012400" 1 1 1 2 2 1 1 3 2 1
"0012400" 1 1 1 1 1 1 1 1 3 1
"0012400" 1 2 1 2 4 3 2 3 1 1
"0012400" 2 2 2 2 3 1 2 1 3 2
"0012451" 1 4 1 4 2 1 4 1 1 1
"0012452" 1 2 1 2 4 1 1 2 2 2
"0012500" 1 2 3 1 2 2 1 4 3 3
"0012500" 1 3 1 2 4 1 3 2 1 1
"0012900" 1 1 1 1 1 1 4 1 4 4
"0012951" 1 1 1 4 4 1 1 2 1 1
"0012952" 1 1 1 4 4 1 2 1 1 1
"0012953" 3 3 3 3 3 3 3 3 3 3
"0012900" 1 2 2 2 1 2 2 2 2 3
"0012951" 3 3 3 3 3 3 4 4 1 1
"0020100" 2 1 1 1 1 1 1 4 1 1
"0020100" 1 1 3 1 3 3 3 3 1 1
"0020100" 2 2 1 2 3 1 1 3 1 1
"0020100" 1 1 1 1 2 1 1 4 1 1
"0020151" 1 1 1 2 4 1 1 4 1 1
"0020152" 1 1 3 1 4 1 1 1 1 1
"0020100" 1 1 1 2 3 3 1 3 3 2
"0020151" 2 2 2 3 3 2 1 1 1 2
"0020200" 1 1 1 1 1 1 1 2 1 1
"0020200" 2 1 1 1 4 4 2 3 3 1
"0020400" 3 3 3 2 3 4 4 3 3 3
"0020400" 1 3 3 4 4 3 1 3 1 1
"0020441" 1 1 1 1 4 1 1 1 1 1
"0020400" 2 3 4 4 3 2 4 3 1 1
"0020451" 1 1 1 1 4 1 2 2 1 1
"0020400" 1 3 1 4 4 2 1 2 1 1
"0020500" 1 1 1 1 2 1 1 4 1 1
"0020600" 1 1 1 1 1 1 1 4 1 1
"0020651" 1 1 1 2 4 1 1 4 1 3
"0020600" 1 1 2 4 4 2 4 4 1 1
"0020600" 2 4 4 2 4 2 2 2 2 2
"0020651" 2 1 2 4 4 2 1 2 1 3
"0021143" 4 3 1 4 4 3 2 1 1 2
"0021143" 1 4 4 1 1 1 4 1 1 4
"0021143" 1 1 2 1 2 1 1 2 1 2
"0021500" 1 3 4 4 2 1 4 1 2 4
"0021500" 4 2 1 2 4 1 1 2 2 4
"0021800" 1 4 3 3 2 1 1 1 1 1
"0021800" 4 3 1 1 4 1 1 4 1 4
"0021800" 3 3 1 1 3 1 1 3 3 3
"0021841" 2 2 2 2 2 1 1 2 1 1
"0021800" 1 1 1 1 4 1 2 3 1 1
"0021831" 1 2 1 1 1 1 1 1 1 1
"0021900" 1 1 2 1 1 1 1 1 1 1
"0021900" 1 1 1 1 1 2 1 2 1 1
"0021900" 1 2 1 1 2 1 1 4 1 3
"0021951" 3 1 1 3 2 2 3 3 1 4
"0021900" 2 2 1 1 2 1 1 2 1 1
"0021900" 3 4 4 3 4 3 3 3 3 3
"0021900" 3 2 2 2 2 2 1 1 1 1
"0022000" 3 3 3 3 3 2 2 3 2 4
"0022000" 4 4 2 3 2 4 4 4 3 3
"0022000" 1 2 3 1 4 1 1 2 1 1
"0022000" 1 2 2 2 2 2 2 2 2 2
"0022100" 1 1 1 1 1 1 4 3 1 4
"0022100" 1 1 3 3 4 1 1 2 1 4
"0022200" 2 3 2 3 2 1 1 4 1 3
"0022200" 1 1 1 1 1 1 4 4 1 1
"0022200" 1 1 1 1 3 2 2 4 1 1
"0022500" 1 2 2 2 4 1 4 3 4 4
"0022500" 3 2 2 2 3 2 2 3 1 1
"0022500" 2 1 1 4 4 1 2 3 1 1
"0022541" 3 3 3 3 4 3 3 3 3 3
"0022500" 3 1 1 2 4 1 1 4 1 1
"0022600" 1 1 1 1 3 1 4 1 1 1
"0022600" 4 4 4 4 1 4 2 1 2 1
"0022600" 3 3 3 1 2 3 3 3 3 3
"0022600" 1 1 1 1 4 1 3 1 1 1
"0022750" 1 1 1 1 4 3 3 3 3 1
"0022942" 1 1 1 1 4 1 1 3 1 1
"0022942" 3 3 3 4 3 2 2 3 3 3
"0022942" 2 3 2 2 2 1 2 2 2 1
end
label values kp02A kp02
label values kp02B kp02
label values kp02C kp02
label values kp02D kp02
label values kp02E kp02
label values kp02F kp02
label values kp02G kp02
label values kp02H kp02
label values kp02I kp02
label values kp02J kp02
label def kp02 1 "1:Rarely or none (<= 1 day)", modify
label def kp02 2 "2:Some days (1-2 days)", modify
label def kp02 3 "3:Occasionally (3-4 days)", modify
label def kp02 4 "4:Most of the time (5-7 days)", modify

kp02E and kp02H are the positively worded items; they use the same response scale as the others.
The cutoff score is 16.

Thank you in advance!


Rich Goldstein

Join Date: Mar 2014

Posts: 4439
#2

13 Nov 2021, 05:42

There are a number of things about your data that are not clear to me: (1) You have 41 distinct values of hhid14 among your 100 observations. Do you want an indicator variable for each observation or for each household? (2) Do we sum the 10 "kp*" variables and compare the sum to the cutoff? If not, what do we compare the cutoff of 16 to? (3) What is the significance of mentioning just the E and H variables in the note at the bottom of your message?
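For point (1), a minimal sketch of how one might count households and rows per household. The toy data are the first six rows of the dataex listing with only one item kept, and the names hh_tag and n_in_hh are made up for illustration.

```stata
* Toy data: first six rows of the posted example, one item only
clear
input str8 hhid14 double(kp02A)
"0010600" 1
"0010651" 1
"0010600" 1
"0010600" 3
"0010651" 2
"0010851" 1
end

* Number of rows sharing each household ID
bysort hhid14: generate n_in_hh = _N

* Flag exactly one row per household, then count the flagged rows
egen hh_tag = tag(hhid14)
list hhid14 n_in_hh hh_tag
count if hh_tag
```

If the count is smaller than the number of observations, hhid14 alone does not identify a person, and an indicator built per row is not the same thing as an indicator built per household.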
Weiwen Ng

Join Date: Jun 2015

Posts: 1241
#3

13 Nov 2021, 09:45

The Center for Epidemiologic Studies Depression Scale (CES-D) is a depression symptom questionnaire. It has 20 Likert items, each scored 0 to 3, so the full instrument scores 0 to 60.

So, assuming each row is one person and you want an indicator for whether that person reports significant depressive symptoms, you would sum all the items to create a sum score, look up the accepted cutoff, and then generate the indicator. The American Psychological Association link I gave above says the generally accepted cutoff is 16. I'll assume you want to use that cutoff; people sometimes propose modified cutoffs for particular populations, but let's assume that's not in play. Also, the APA link says the questions are all scored 0 to 3, but I only see 1s through 4s in the data above, so I'm going to subtract 1 from each item first. I generally prefer my sum scores to start at 0. Not everyone does this.

Code:

* v already holds the full variable name (kp02A ... kp02J)
foreach v of varlist kp02? {
    gen `v'_mod = `v' - 1
}
egen cesd_sum = rowtotal(kp02?_mod)
gen symptomatic = cesd_sum >= 16
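One caveat about the code above: egen's rowtotal() counts a missing item as 0, and in Stata a missing value compares as larger than any number, so missing data can quietly distort both the sum and the indicator. Here is a hedged sketch of one way to guard against that. The toy rows, the decision to set the score to missing when any item is missing, and the names cesd_nmiss and cutoff are my assumptions, not part of the posted data or method.

```stata
* Toy data: the second respondent has one missing item (made up for illustration)
clear
input double(kp02A kp02B kp02C kp02D kp02E kp02F kp02G kp02H kp02I kp02J)
1 1 1 1 3 1 1 4 3 3
1 2 2 . 2 1 2 2 1 1
3 3 3 3 3 3 3 3 3 3
end

* Shift each item from 1-4 to 0-3
foreach v of varlist kp02? {
    generate `v'_mod = `v' - 1
}

* rowtotal() treats a missing item as 0, so also count the missing items
egen cesd_sum = rowtotal(kp02?_mod)
egen cesd_nmiss = rowmiss(kp02?_mod)

* Assumption: an incomplete questionnaire gets no score at all
replace cesd_sum = . if cesd_nmiss > 0

* Without the if qualifier, a missing sum would satisfy >= and be coded 1
local cutoff 16
generate symptomatic = cesd_sum >= `cutoff' if !missing(cesd_sum)
list cesd_sum cesd_nmiss symptomatic
```

Other rules for incomplete questionnaires exist (for example, prorating the score); which one is appropriate depends on the scoring instructions for the instrument you are using.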

Side notes:
"Mental disability" reads to me like what we might now call a developmental disability. I don't even know whether "developmental disability" is the preferred term, but if that is the type of disability you mean, "mental disability" is certainly not preferred.

I know this may come across as overly particular, but I'd recommend putting some thought into terminology before presenting to an audience.

I assume you are talking about generating an indicator for significant depression symptoms. Strictly speaking, depression screening instruments identify people at risk of a current major depressive episode, albeit at pretty high risk (i.e. the positive predictive value).

Also, as Rich noted, the data have a household ID but not an individual ID. I assume you just missed importing the individual ID.

Last edited by Weiwen Ng; 13 Nov 2021, 10:29.

Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

When presenting code or results, please use the code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.
Weiwen Ng

Join Date: Jun 2015

Posts: 1241
#4

13 Nov 2021, 15:55

"Also, the APA link says the questions are all scored 0 to 3, but I only see 1s through 4s in the data above, so I'm going to subtract 1 from each item first. I generally prefer my sum scores to start at 0. Not everyone does this."

Let's clarify this. It's not mandatory in general to start your score at 0. I do prefer it, because if you start the scores at 1, I have to know how many questions there are to know how many points means no symptoms at all. And in this case, the cutoff for the 20-item CES-D is defined for items scored 0-3, so you do actually need to rescore the questions.
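To make the arithmetic concrete, here is a small sketch with two made-up respondents, one answering 1 on every item and one answering 4 on every item. Reverse coding is ignored here; the names raw_sum, zero_sum, and nitems are only for illustration.

```stata
* Toy data: lowest and highest possible response on all 10 items
clear
input double(kp02A kp02B kp02C kp02D kp02E kp02F kp02G kp02H kp02I kp02J)
1 1 1 1 1 1 1 1 1 1
4 4 4 4 4 4 4 4 4 4
end

* Sum of the items as stored (1-4): the floor equals the number of items
egen raw_sum = rowtotal(kp02?)

* Subtracting 1 per item is the same as subtracting the item count from the sum
local nitems 10
generate zero_sum = raw_sum - `nitems'
list raw_sum zero_sum
```

The raw sums run from 10 to 40 and the rescored sums from 0 to 30, so a cutoff defined for 0-3 scoring sits exactly nitems points higher on the raw 1-4 sum. Comparing a raw sum to the published cutoff without that shift would flag far too many people.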

Speaking of how many questions: sharp-eyed readers may have noticed that the lettering stops at J, and J is the 10th letter of the alphabet. If the questions are scored 0 to 3, how can the max score be 60 points? Well, we can assume that the OP is using the shorter 10-item CES-D, whose cutoff is 10 points. And there's another thing I forgot: questions 5 and 8 are reverse coded, or at least they are in the original wording. Whoever provided the data might or might not have un-reversed the items. If not, after you deduct 1 from each question, you could type:

Code:

recode kp02E_mod kp02H_mod (0 = 3) (1 = 2) (2 = 1) (3 = 0)
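On a 0-3 scale the same reversal can be written as 3 minus the item. The sketch below checks that the two approaches agree on made-up values, and it writes the results to new variables so the originals are kept. The _rev and _rc suffixes are hypothetical names of my own.

```stata
* Toy data: every possible value of a 0-3 item
clear
input double(kp02E_mod kp02H_mod)
0 3
1 2
2 1
3 0
end

* Reverse by arithmetic: 3 - x maps 0->3, 1->2, 2->1, 3->0
generate kp02E_rev = 3 - kp02E_mod
generate kp02H_rev = 3 - kp02H_mod

* Same mapping via recode, using generate() to leave the originals untouched
recode kp02E_mod kp02H_mod (0 = 3) (1 = 2) (2 = 1) (3 = 0), generate(kp02E_rc kp02H_rc)

* The two versions should match on every row
assert kp02E_rev == kp02E_rc
assert kp02H_rev == kp02H_rc
list
```

Keeping the original variables makes it harder to reverse an item twice by accident when a do-file is rerun.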

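If you don't know whether the questions have already been un-reversed, you can tabulate those two questions along with a few of the normally coded ones. Generally, you should have most people with zero or low depressive symptoms, some with medium, and a few with high. If the two positively worded items are still in their original direction, their responses will tend to pile up at the high end instead.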
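A sketch of that check, using the first eight rows of the posted example and only four of the items. I would expect items that are still reversed to show higher values than the others and to correlate weakly or negatively with them, but that is a heuristic, not a rule, and eight rows are far too few to conclude anything.

```stata
* Toy data: items A, B (normally coded) and E, H (positively worded), first eight rows
clear
input double(kp02A kp02B kp02E kp02H)
1 1 3 4
1 2 2 2
1 3 3 3
3 3 3 2
2 2 2 2
1 4 4 4
2 2 2 2
1 1 3 3
end

* One-way frequency table for each item
tab1 kp02A kp02B kp02E kp02H

* Means side by side, then the correlations among the items
summarize kp02A kp02B kp02E kp02H
correlate kp02A kp02B kp02E kp02H
```

On the full data set the same three commands can be run on all ten items. The questionnaire documentation remains the authoritative source on whether the stored values were already reversed.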

Last edited by Weiwen Ng; 13 Nov 2021, 15:58.
