  • Creating a change variable, panel data

    Hi everyone,

    I have a panel data set and want to create a variable that shows the change in individual responses between two (or more) variables. The two variables are:
    1. A variable that shows how satisfied people are with the government, ranging from 0 (not very satisfied) to 1 (very satisfied). It was measured in September 2013.
    2. The same question, again ranging from 0 (not very satisfied) to 1 (very satisfied), measured in December 2013.

    I want to create a new variable that holds the change from the first to the second variable. For example, if a respondent answered the first question with 0 and the second with 1, that is a change of one, and the new variable should contain exactly that value. Unfortunately, I don't have a variable which shows the wave or a time-series variable. I tried a lot but it didn't work. Is there a way to calculate this in Stata (I'm working with Stata 15)?

    Maybe someone knows a solution.
    Thanks in advance

  • #2
    Unfortunately, I don't have a variable which shows the wave or a time-series variable.
    Well, if there is no way to know which response is from September 2013 and which is from December 2013, then it is not possible to calculate the change, in Stata or in any other software.


    • #3
      Well, the variable names are kp1_1130 and kp2_1130. kp1 means the question was asked in September 2013 and kp2 means it was asked in December 2013, so the number after kp shows when the respondents were asked. Could this be a way? (Maybe someone can show me some example code and then I will try it.)


      • #4
        So then I think all you need is:

        Code:
        gen diff = kp2_1130 - kp1_1130


        • #5
          Ok, thanks. Before I read your answer I created a new variable, wave, with 1 = asked in September and 2 = asked in December. Can I also do it with this new variable? (I'm not quite sure I get what I want with your command.)


          • #6
            I think you need to show an example of your data set, using the dataex command. There is no coherent arrangement of this data that has two separate variables and also has a wave variable such as the one you describe in #5. There is wide layout (one observation per person with data from both waves, each wave having its own variables) and long layout (two observations per person, one for each wave, linked by a common value in a person id variable). Anything that looks like a hybrid of these is likely to get you into trouble with analysis, and in that case, before calculating a change variable, you would have to transform your data layout into something more workable. So show us what you have.
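
            The two layouts described above can be sketched on toy data. The variable names id, sat1, sat2 and the values are hypothetical and are not taken from the poster's data set; this is only an illustration of how the change would be computed in each layout, not a statement about how the real data are arranged.

            ```stata
            * Hypothetical toy data in wide layout: one observation per person
            clear
            input int id byte(sat1 sat2)
            1 8 6
            2 6 6
            3 7 9
            end

            * Wide layout: the change is the difference between the two wave variables
            generate diff_wide = sat2 - sat1

            * Long layout: one observation per person and wave
            reshape long sat, i(id) j(wave)

            * Within each person, subtract the previous wave's value;
            * the result is missing in wave 1 and holds the change in wave 2
            bysort id (wave): generate diff_long = sat - sat[_n-1]

            list, sepby(id)
            ```

            In the wave 2 observations, diff_long should match diff_wide, which is one way to check that both routes give the same change.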

            If you are running version 15.1 or a fully updated version 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

            When asking for help with code, always show example data. When showing example data, always use -dataex-.


            • #7
              Ok, so here are the two variables with the command dataex (I shortened it because it would be too much to show); (left: kp1_1130, right: kp2_1130)
              8 6
              6 6
              9 6
              1 1
              7 9
              6 6
              7 9
              1 1
              8 9
              8 6
              4 4
              6 4
              5 5
              8 3
              5 5
              7 7
              end
              label values kp1_730 kp1_730
              label def kp1_730 1 "-5 voellig unzufrieden", modify
              label def kp1_730 2 "-4", modify
              label def kp1_730 3 "-3", modify
              label def kp1_730 4 "-2", modify
              label def kp1_730 5 "-1", modify
              label def kp1_730 6 "0", modify
              label def kp1_730 7 "+1", modify
              label def kp1_730 8 "+2", modify
              label def kp1_730 9 "+3", modify
              label def kp1_730 10 "+4", modify
              label def kp1_730 11 "+5 voellig zufrieden", modify
              label values kp2_730 kp2_730
              label def kp2_730 -95 "nicht teilgenommen", modify
              label def kp2_730 1 "-5 voellig unzufrieden", modify
              label def kp2_730 2 "-4", modify
              label def kp2_730 3 "-3", modify
              label def kp2_730 4 "-2", modify
              label def kp2_730 5 "-1", modify
              label def kp2_730 6 "0", modify
              label def kp2_730 7 "+1", modify
              label def kp2_730 8 "+2", modify
              label def kp2_730 9 "+3", modify
              label def kp2_730 10 "+4", modify
              label def kp2_730 11 "+5 voellig zufrieden", modify

              I deleted the wave variable I had created because I think it doesn't make much sense. I'm quite sure that my data has the long layout (two observations per person). From these two variables I want to create a new variable that shows the change between the two variables for each observation (for example, if I answered 0 in kp1_1130 and 1 in kp2_1130, then this new variable should show a 1 for my case). I hope everyone understands what I mean.

              But many thanks. I just tried your command from #4 and I'm quite sure that's right. Many thanks for that
              Last edited by Philipp Hoffmann; 25 Jun 2019, 00:44.


              • #8
                It's not a good idea to manually edit your dataex output.
                You can make sure that only the relevant variables are included by doing e.g.:
                Code:
                dataex person_id wave_number kp1_1130 kp2_1130
                And if you want to limit the number of observations you can do:
                Code:
                dataex person_id wave_number kp1_1130 kp2_1130 in 1/20
                As it is explained now, the earlier answer
                Code:
                gen diff = kp2_1130 - kp1_1130
                still seems the best answer.

                Whether your data is long or not isn't clear from the example you provide.


                • #9
                  Ok, so here is what I did with dataex and the limitation of observations:

                  Code:
                  input int lfdn byte(kp1_730 kp2_730)
                  1 8 6
                  2 6 6
                  3 9 6
                  4 1 1
                  5 7 9
                  6 6 6
                  7 7 9
                  8 1 1
                  9 8 9
                  10 8 6
                  11 4 4
                  12 6 4
                  13 5 5
                  14 8 3
                  15 5 5
                  16 7 7
                  17 1 1
                  18 5 4
                  19 2 3
                  20 3 6
                  end
                  label values kp1_730 kp1_730
                  label def kp1_730 1 "-5 voellig unzufrieden", modify
                  label def kp1_730 2 "-4", modify
                  label def kp1_730 3 "-3", modify
                  label def kp1_730 4 "-2", modify
                  label def kp1_730 5 "-1", modify
                  label def kp1_730 6 "0", modify
                  label def kp1_730 7 "+1", modify
                  label def kp1_730 8 "+2", modify
                  label def kp1_730 9 "+3", modify
                  label values kp2_730 kp2_730
                  label def kp2_730 1 "-5 voellig unzufrieden", modify
                  label def kp2_730 3 "-3", modify
                  label def kp2_730 4 "-2", modify
                  label def kp2_730 5 "-1", modify
                  label def kp2_730 6 "0", modify
                  label def kp2_730 7 "+1", modify
                  label def kp2_730 9 "+3", modify

                  Maybe this will help?


                  • #10
                    Alright. So assuming the variable lfdn is your person id, then the original suggestion still stands.
                    Code:
                    gen diff = kp2_730 - kp1_730
                    This data layout would be wide rather than long, but whether there is a need to reshape it depends on your analysis, really.

                    Lastly, you might want to use another approach to labeling your values. Consider replacing the stored codes with the actual scale values rather than labeling them, or copy your kp variables to new string variables. Otherwise there is likely to be confusion at some point, because the values of a numeric variable are labeled with other numeric values (for example, a stored 6 is labeled "0").
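
                    One possible reading of that suggestion, as a sketch only: the label lists in #7 map stored codes 1 to 11 onto scale points -5 to +5, so subtracting 6 gives values that match their labels. The new variable names kp1_c, kp2_c and kp1_txt are hypothetical, and only a few rows and labels from the extract are reproduced here.

                    ```stata
                    * A few rows and labels copied from the extract in this thread
                    clear
                    input int lfdn byte(kp1_730 kp2_730)
                    1 8 6
                    2 1 1
                    3 7 9
                    end
                    label define kp1_730 1 "-5 voellig unzufrieden" 6 "0" 7 "+1" 8 "+2", modify
                    label values kp1_730 kp1_730

                    * Stored codes 1..11 correspond to scale points -5..+5, so subtract 6
                    * (kp1_c and kp2_c are hypothetical names)
                    generate byte kp1_c = kp1_730 - 6
                    generate byte kp2_c = kp2_730 - 6

                    * Optional: keep a string copy of the label text for reference
                    decode kp1_730, generate(kp1_txt)

                    list
                    ```

                    Because both variables are shifted by the same constant, the difference kp2_c - kp1_c equals kp2_730 - kp1_730, so the change variable itself is not affected by the recentering.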


                    • #11
                      Note also from the data extract in post #7
                      Code:
                      label def kp2_730 -95 "nicht teilgenommen", modify
                      which suggests that interviewees not present in wave 2 are coded as -95. These should be recoded as Stata missing values, or else the code in post #10 will produce a misleading answer.
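
                      A minimal sketch of that recoding, assuming -95 is the only special code in play. The second data row is hypothetical and was added only to show a respondent who did not take part in wave 2; the choice of the extended missing value .a is also an assumption.

                      ```stata
                      * Toy data: row 2 is a hypothetical wave 2 non-participant
                      clear
                      input int lfdn byte(kp1_730 kp2_730)
                      1 8 6
                      2 6 -95
                      3 9 6
                      end

                      * Turn -95 ("nicht teilgenommen") into the extended missing value .a
                      mvdecode kp2_730, mv(-95=.a)

                      * The difference is now missing, not a spurious large negative
                      * number, for anyone who did not take part in wave 2
                      generate diff = kp2_730 - kp1_730

                      list
                      ```

                      If other negative codes exist in the real data, they would need the same treatment before computing the difference.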
