Creating a change variable, panel data

Philipp Hoffmann

Join Date: Jun 2019

Posts: 12
#1

24 Jun 2019, 10:03

Hi everyone,

I have a panel data set and want to create a variable that shows how each respondent's answer changed between two (or more) variables. The two variables are:
1. A variable that shows how satisfied people are with the government, ranging from 0 (not very satisfied) to 1 (very satisfied). It was measured in September 2013.
2. The same question, again ranging from 0 (not very satisfied) to 1 (very satisfied), measured in December 2013.

So now I want to create a new variable that holds the change from the first to the second variable. For example, if a respondent answered the first question with 0 and the second with 1, that is a change of one, and I want exactly this value in the new variable. Unfortunately, I don't have a variable which shows the wave or a time-series variable. I have tried a lot, but nothing worked. Is there a way to calculate this in Stata (I'm working with Stata 15)?

Maybe someone knows a solution.
Thanks in advance
Clyde Schechter

Join Date: Apr 2014

Posts: 30101
#2

24 Jun 2019, 11:11

Unfortunately, I don't have a variable which shows the wave or a time-series variable.

Well, if there is no way to know which response is from September 2013 and which is from December, then it is not possible to calculate the change, neither in Stata nor in any other software.
Philipp Hoffmann

Join Date: Jun 2019

Posts: 12
#3

24 Jun 2019, 12:53

Well, the variable names are kp1_1130 and kp2_1130. kp1 means the question was asked in September 2013 and kp2 means it was asked in December 2013, so the number after kp identifies the wave. Could this be a way? (Maybe someone can show me example code and then I will try it.)
Clyde Schechter

Join Date: Apr 2014

Posts: 30101
#4

24 Jun 2019, 12:56

So then I think all you need is:

Code:

gen diff = kp2_1130 - kp1_1130
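
One way to check that this command does what was asked in #1 is to run it on a few made-up rows. The sketch below assumes the wide layout (one row per respondent, one variable per wave); the three data rows are invented for illustration and are not from the poster's data set.

```stata
* Toy data in wide layout: one row per respondent, one variable per wave.
* The values are made up for illustration only.
clear
input byte(kp1_1130 kp2_1130)
0 1
1 1
1 0
end

* Change from September (kp1) to December (kp2)
gen diff = kp2_1130 - kp1_1130

* Show both answers next to the computed change
list kp1_1130 kp2_1130 diff
```

A respondent who moves from 0 to 1 gets a change of 1, an unchanged answer gives 0, and a move from 1 to 0 gives -1.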
Philipp Hoffmann

Join Date: Jun 2019

Posts: 12
#5

24 Jun 2019, 13:51

OK, thanks. Before I read your answer I created a new variable, wave, with 1 = conducted in September and 2 = conducted in December. Can I also do it with this new variable? (I'm not quite sure whether your command gives me what I want.)
Clyde Schechter

Join Date: Apr 2014

Posts: 30101
#6

24 Jun 2019, 14:32

I think you need to show an example of your data set, using the -dataex- command. There is no coherent arrangement of this data that has two separate variables and also has a wave variable such as the one you describe in #5. There is wide layout (one observation per person with data from both waves, each wave having its own variables) and long layout (two observations per person, with each wave having its own observation, the two observations having a common value in a person id variable). Anything that looks like a hybrid of these is likely to get you into trouble with analysis, and in that case before calculating a change variable you would have to transform your data layout into something more workable. So show us what you have.
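
To make the two layouts concrete, here is a minimal sketch that starts from wide data, converts it to long, and computes the change there. The variable names id, sat1, sat2 and the data values are hypothetical; a wave variable only makes sense in the long layout, where it is created by -reshape-.

```stata
* Hypothetical wide data: id is the person identifier,
* sat1 and sat2 hold the same question from wave 1 and wave 2.
clear
input int id byte(sat1 sat2)
1 8 6
2 6 6
3 7 9
end

* Wide to long: one observation per person and wave.
* -reshape- creates the variable wave from the suffixes 1 and 2.
reshape long sat, i(id) j(wave)

* Declare the panel structure, then use the difference operator.
* D.sat is sat minus its value in the previous wave for the same id.
xtset id wave
gen diff = D.sat

list, sepby(id)
```

In the long layout the change is missing in wave 1 (there is no earlier wave) and appears on the wave 2 row. In the wide layout the same number comes from subtracting one variable from the other.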

If you are running version 15.1 or a fully updated version 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

When asking for help with code, always show example data. When showing example data, always use -dataex-.
Philipp Hoffmann

Join Date: Jun 2019

Posts: 12
#7

25 Jun 2019, 00:36

Ok, so here are the two variables with the command dataex (I shortened it because it would be too much to show); (left: kp1_1130, right: kp2_1130)
8 6
6 6
9 6
1 1
7 9
6 6
7 9
1 1
8 9
8 6
4 4
6 4
5 5
8 3
5 5
7 7
end
label values kp1_730 kp1_730
label def kp1_730 1 "-5 voellig unzufrieden", modify
label def kp1_730 2 "-4", modify
label def kp1_730 3 "-3", modify
label def kp1_730 4 "-2", modify
label def kp1_730 5 "-1", modify
label def kp1_730 6 "0", modify
label def kp1_730 7 "+1", modify
label def kp1_730 8 "+2", modify
label def kp1_730 9 "+3", modify
label def kp1_730 10 "+4", modify
label def kp1_730 11 "+5 voellig zufrieden", modify
label values kp2_730 kp2_730
label def kp2_730 -95 "nicht teilgenommen", modify
label def kp2_730 1 "-5 voellig unzufrieden", modify
label def kp2_730 2 "-4", modify
label def kp2_730 3 "-3", modify
label def kp2_730 4 "-2", modify
label def kp2_730 5 "-1", modify
label def kp2_730 6 "0", modify
label def kp2_730 7 "+1", modify
label def kp2_730 8 "+2", modify
label def kp2_730 9 "+3", modify
label def kp2_730 10 "+4", modify
label def kp2_730 11 "+5 voellig zufrieden", modify

I deleted the wave variable I had created because I think it doesn't make much sense. I'm quite sure that my data has the long layout (two observations per person). From these two variables I want to create a new variable that shows the change between them for each observation (for example, if I answered 0 in kp1_1130 and 1 in kp2_1130, the new variable should show 1 for my case). I hope everyone understands what I mean.

But many thanks. I just tried your command from #4 and I'm quite sure that's right. Many thanks for that

Last edited by Philipp Hoffmann; 25 Jun 2019, 00:44.
Jorrit Gosens

Join Date: Jan 2015

Posts: 1019
#8

25 Jun 2019, 01:17

It's not a good idea to manually edit your dataex output.
You can make sure that only the relevant variables are included by doing e.g.:

Code:

dataex person_id wave_number kp1_1130 kp2_1130

And if you want to limit the number of observations you can do:

Code:

dataex person_id wave_number kp1_1130 kp2_1130 in 1/20

As it is explained now, the earlier answer

Code:

gen diff = kp2_1130 - kp1_1130

still seems the best answer.

Whether your data is long or not isn't clear from the example you provide.
Philipp Hoffmann

Join Date: Jun 2019

Posts: 12
#9

25 Jun 2019, 05:08

Ok, so here is what I did with dataex and the limitation of observations:

Code:

input int lfdn byte(kp1_730 kp2_730)
1 8 6
2 6 6
3 9 6
4 1 1
5 7 9
6 6 6
7 7 9
8 1 1
9 8 9
10 8 6
11 4 4
12 6 4
13 5 5
14 8 3
15 5 5
16 7 7
17 1 1
18 5 4
19 2 3
20 3 6
end
label values kp1_730 kp1_730
label def kp1_730 1 "-5 voellig unzufrieden", modify
label def kp1_730 2 "-4", modify
label def kp1_730 3 "-3", modify
label def kp1_730 4 "-2", modify
label def kp1_730 5 "-1", modify
label def kp1_730 6 "0", modify
label def kp1_730 7 "+1", modify
label def kp1_730 8 "+2", modify
label def kp1_730 9 "+3", modify
label values kp2_730 kp2_730
label def kp2_730 1 "-5 voellig unzufrieden", modify
label def kp2_730 3 "-3", modify
label def kp2_730 4 "-2", modify
label def kp2_730 5 "-1", modify
label def kp2_730 6 "0", modify
label def kp2_730 7 "+1", modify
label def kp2_730 9 "+3", modify

Maybe this will help?
Jorrit Gosens

Join Date: Jan 2015

Posts: 1019
#10

25 Jun 2019, 05:36

Alright. So assuming the variable lfdn is your person id, then the original suggestion still stands.

Code:

gen diff = kp2_730 - kp1_730

This data layout would be wide rather than long, but whether there is a need to reshape it depends on your analysis, really.

Lastly, you might want to use another approach to labeling your values. Consider replacing rather than labeling, or make a copy of your kp variables to another variable, as strings. There is likely going to be some confusion at some point about values of a numeric variable labelled with other numeric values.
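
As a rough sketch of the "replace rather than label" idea: the value labels in #9 map the stored codes 1 to 9 onto "-5" to "+3", and the fuller label set in #7 runs up to 11 = "+5", so subtracting 6 gives numeric copies whose values match the labels. The new variable names ending in _c are hypothetical, and the four data rows are the first four from #9.

```stata
* First four observations from the dataex output in #9
clear
input int lfdn byte(kp1_730 kp2_730)
1 8 6
2 6 6
3 9 6
4 1 1
end

* Recentered copies: stored code 6 is labelled "0", so subtract 6.
* inrange() leaves any code outside 1..11 (e.g. special codes) missing.
gen kp1_730_c = kp1_730 - 6 if inrange(kp1_730, 1, 11)
gen kp2_730_c = kp2_730 - 6 if inrange(kp2_730, 1, 11)

* The change is the same whether computed from the codes or the copies
gen diff = kp2_730_c - kp1_730_c
list
```

Because both waves are shifted by the same constant, the change variable itself is not affected by the recentering; only the levels become easier to read.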
William Lisowski

Join Date: Dec 2014

Posts: 10150
#11

25 Jun 2019, 05:56

Note also from the data extract in post #7

Code:

label def kp2_730 -95 "nicht teilgenommen", modify

which suggests that interviewees not present in wave 2 are coded as -95. These should be recoded as Stata missing values, or else the code in post #10 will produce a misleading answer.
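
A minimal sketch of that recoding, assuming -95 is the only special code and occurs only in kp2_730 (other variables or other negative codes would need the same treatment). The second data row is made up to show the effect.

```stata
* Small example; respondent 2 did not take part in wave 2 (made-up row)
clear
input int lfdn byte(kp1_730 kp2_730)
1 8 6
2 6 -95
3 7 9
end

* Without this step respondent 2 would get diff = -95 - 6 = -101
mvdecode kp2_730, mv(-95)

* Any arithmetic involving a missing value yields missing
gen diff = kp2_730 - kp1_730
list
```

After the recoding, respondents who did not take part in the second wave have a missing change instead of a large negative number.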