Survey Longitudinal Datasets

Bill Bryant

Join Date: Apr 2020

Posts: 14
#1

25 Apr 2020, 15:02

Hi everyone,

I have a dataset of about 1,000 individuals who answered the same questions once a week for 4 weeks (i.e. 4 waves). Some of these questions are wellbeing scores on a 1-10 scale (e.g. "How happy did you feel last week, 1-10?"). I would now like to plot some graphs and run some regressions to estimate the factors that led some respondents to report a reduction in their wellbeing score over time.

Question 1) I would like to check if I am following the right procedure in setting up the dataset
Question 2) I would welcome any advice on econometric techniques (I am using Stata 16).

1) Setting up the dataset

To keep things simple, I renamed the variables in a self-explanatory way. My dataset includes data from the first and second waves, with the second-wave data appended below the first-wave data.

The first thing I do is tell Stata that this is survey data, using svyset. The variables below were already included in the dataset by the survey firm.

Code:

svyset VPSU [pweight = WEIGHT], singleunit(certainty) strata(VSTRAT)

Then I tell Stata that this is a panel dataset:

Code:

xtset RESPONDENT_ID WAVE

The above variables are the unique ID of each individual and the wave indicator; "WAVE" takes the values 1 or 2. This means that I now have 2 observations (one per wave) for each person.
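As a quick check on this setup, something like the following could be run after xtset. It is only a sketch and assumes the variable names used above (RESPONDENT_ID, WAVE).

Code:

* Exit with an error unless RESPONDENT_ID and WAVE jointly identify each row
isid RESPONDENT_ID WAVE

* Show how many respondents appear in each pattern of waves
xtdescribe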

I now want to see if their wellbeing_scale (their self-reported 1-10 happiness score) has changed between the two waves, so I generate the new variable:

Code:

bysort RESPONDENT_ID (WAVE): gen wellbeing_change = wellbeing_scale - wellbeing_scale[_n-1]

This works, and I now have one "wellbeing_change" value per individual (ranging from -9 to 9). That is, every individual has two rows: one row holds the wellbeing_change value and the other row is missing.
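One way to confirm this layout is sketched below, assuming the wave indicator is the WAVE variable passed to xtset.

Code:

* Number of rows that carry a change score
count if !missing(wellbeing_change)

* Waves on which the change score is missing
tabulate WAVE if missing(wellbeing_change)

* Waves on which the change score is present
tabulate WAVE if !missing(wellbeing_change)

With two waves, the change score would be expected to be missing on every wave-1 row, because there is no earlier row to subtract from.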

However, every time I try to tabulate this variable against other characteristics (e.g. gender), Stata just gives me the message "no observations". How can I overcome this? Below is what I get.

. tab wellbeing_change Income_group

no observations

I fear I have to tell Stata to ignore the missing values in the "wellbeing_change" variable, but none of the "subpop" options I have tried work.
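A two-way tabulate only uses rows where both variables are non-missing, so one way to investigate the message is to look at how the missing values of the two variables line up. The following is a sketch under the assumption that Income_group is numeric and that the wave indicator is WAVE.

Code:

* Rows where both variables are non-missing; a count of 0 would explain "no observations"
count if !missing(wellbeing_change) & !missing(Income_group)

* Missing-value counts for each variable (numeric variables only)
misstable summarize wellbeing_change Income_group

* Waves on which Income_group is recorded
tabulate WAVE if !missing(Income_group)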

2) Regressions and graphs

I know that svy does not support many estimation commands, but if I understood correctly I can still use svy: melogit. I could transform my variables into binary ones (e.g. whether wellbeing got worse or not) and then introduce independent variables (e.g. gender), but this does not seem to work either. All I get is the same "no observations" message as above.

I found this command in the Stata manual ( https://www.stata.com/manuals/svysvyset.pdf#svysvyset ):

Code:

svy: melogit y x || psu: || ssu:

but I don't know what my psu or ssu are.
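In that manual example, psu and ssu appear to stand for variables identifying the sampling units at each stage; in this dataset the only sampling unit declared in svyset is VPSU. With two waves there is just one change score per respondent, so a single-level model on the rows that carry the change score may be a simpler starting point. The sketch below is an illustration, not a recommendation of a final model: wellbeing_worse is a hypothetical variable name, and Income_group is assumed to be a numeric categorical variable that is non-missing on the wave-2 rows.

Code:

* 1 if wellbeing fell between waves, 0 otherwise, missing where the change is missing
generate byte wellbeing_worse = wellbeing_change < 0 if !missing(wellbeing_change)

* Survey-weighted logit restricted to the wave-2 subpopulation
svy, subpop(if WAVE == 2): logit wellbeing_worse i.Income_group

Using subpop() rather than dropping the wave-1 rows keeps the full design information available for the variance estimates.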

Thank you!

William Lisowski

Join Date: Dec 2014

Posts: 10150
#2

25 Apr 2020, 19:56

Your wellbeing_change variable will have values in the second wave and be missing in the first wave. Is it possible that Income_group is only reported in the first wave? In that case, it will be missing in every observation where wellbeing_change is not missing, so there are "no observations" where both variables are non-missing.

Bill Bryant

Join Date: Apr 2020

Posts: 14
#3

26 Apr 2020, 09:40

Thanks William,

Yes, you are right. I have a set of variables that change value every wave (e.g. wellbeing) and other static ones (e.g. gender, income, etc.). My approach so far has been that of carrying forward the static variables. So for every individual I have two rows, and the static values are repeated twice. I was hoping that xtset would make Stata account for this, but none of the -xt- models run with svy, so I am a bit stuck on what the best approach is here :/

William Lisowski

Join Date: Dec 2014

Posts: 10150
#4

26 Apr 2020, 10:59

"My approach so far has been that of carrying forward the static variables. So for every individual I have two rows, and the static values are repeated twice."

That's the usual approach for dealing with static variables.
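A minimal sketch of carrying a static variable forward, assuming it is recorded on each respondent's wave-1 row and using the variable names from this thread:

Code:

* Within each respondent, sorted by wave, fill missing values from the first row
bysort RESPONDENT_ID (WAVE): replace Income_group = Income_group[1] if missing(Income_group)

* The cross-tabulation now has rows where both variables are non-missing
tabulate wellbeing_change Income_group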