
  • using egen to generate a variable based on a list of potential existing variables

    Hi all,

    I am relatively new to Stata and have a question regarding the generation of a new variable. Basically I have a list of observations that could be simply illustrated as follows:


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(year_event c2015 c2014 c2013 c2012 c2011)
    2015 7 2 3 4 7
    2014 6 8 8 8 8
    2013 4 9 9 4 9
    2014 5 7 4 7 5
    2016 8 5 7 3 1
    end
    I would like to generate two variables, cyear_1 and cyear_2, equal to the value of the c-variable for the year one year before and two years before year_event, respectively. So it would look like:


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(year_event c2015 c2014 c2013 c2012 c2011 cyear_1 cyear_2)
    2015 7 2 3 4 7 2 3
    2014 6 8 8 8 8 8 8
    2013 4 9 9 4 9 4 9
    2014 5 7 4 7 5 4 7
    2016 8 5 7 3 1 8 5
    end
    Since the list of variables to choose from is much longer than in this example (it runs from 1997 to 2017), I am trying to find a way to link my egen command to the years in the variable names, but I have not yet found how to do this on this forum. Could anyone help me with this issue?

    Kind regards

  • #2
    I don't know what egen command you're referring to or why it offers a solution here. But this works for your example:


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(year_event c2015 c2014 c2013 c2012 c2011)
    2015 7 2 3 4 7
    2014 6 8 8 8 8
    2013 4 9 9 4 9
    2014 5 7 4 7 5
    2016 8 5 7 3 1
    end
    
    quietly forval j = 1/2 { 
        gen c_year`j' = . 
        forval y = 2013/2016 { 
            local k = `y' - `j' 
            replace c_year`j' = c`k' if year == `y' 
        }
    } 
    
    list 
    
         +----------------------------------------------------------------------+
         | year_e~t   c2015   c2014   c2013   c2012   c2011   c_year1   c_year2 |
         |----------------------------------------------------------------------|
      1. |     2015       7       2       3       4       7         2         3 |
      2. |     2014       6       8       8       8       8         8         8 |
      3. |     2013       4       9       9       4       9         4         9 |
      4. |     2014       5       7       4       7       5         4         7 |
      5. |     2016       8       5       7       3       1         8         5 |
         +----------------------------------------------------------------------+
    Whether this is a good layout for your wider problems is hard to say.
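
On the layout question, one possible alternative is a long layout. The following is only a sketch under that assumption, using the example data above. The identifier `id` is a hypothetical variable created for the illustration, and the `egen ... max(cond())` idiom is swapped in for the loop; it is not something proposed elsewhere in this thread.

```stata
clear
input float(year_event c2015 c2014 c2013 c2012 c2011)
2015 7 2 3 4 7
2014 6 8 8 8 8
2013 4 9 9 4 9
2014 5 7 4 7 5
2016 8 5 7 3 1
end

* hypothetical identifier, needed only so that reshape has an i() variable
gen long id = _n

* one row per id and calendar year; c holds the value for that year
reshape long c, i(id) j(year)

forval j = 1/2 {
    * within each id, pick the value of c from the year j years before the event;
    * max() ignores missings, so the result is missing if no such year exists
    bysort id: egen c_year`j' = max(cond(year == year_event - `j', c, .))
}

* back to the original wide layout, now with c_year1 and c_year2 attached
reshape wide c, i(id) j(year)
sort id
list
```

In the long layout no year range is hard-coded, so the same lines should carry over to a longer run of years, provided the variables follow the same stub-plus-year naming.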



    • #3
      Thanks a lot Nick, that's exactly what I wanted.



      • #4
        I have an additional question that builds on this code. The code above applies to one variable, but in my actual dataset I have a list of 30 variables that each run from 2011 to 2015. So I would have c2011 through c2015, x2011 through x2015, z2011 through z2015, etc.

        Now I am trying to find an efficient way to apply the code above to each variable that follows this pattern. For this I tried the following code, but it returns "invalid syntax":

        Code:
        quietly forval j = 1/2 & forval y = 2011\2015 {
        foreach var of varlist `var'`y' {
        gen `var'_year`j' = .
        local k = `y' - `j'
        replace `var'_year`j' = `var'`k' if year == `y'
        }
        }
        Last edited by Gianni Spolver; 15 Nov 2019, 09:23.



        • #5
          You're guessing wildly there, not a good strategy. Nothing in the documentation suggests that anything similar is supported by way of combining loops. But your extension implies adding another loop, which should be soluble. However, nothing in your code loops over the prefixes such as c x z -- and that's what you need to arrange.

          Here is an untested guess at what you want. I don't know why a loop 2013/2016 was the answer in #2 but you are writing one in terms of 2011/2015 in #4 (NB the backslash is wrong in any case).

          Code:
          * collect every variable ending in 2015, then strip "2015" to leave the prefixes
          unab prefixes : *2015
          local prefixes : subinstr local prefixes "2015" "", all

          foreach pre of local prefixes {
              forval j = 1/2 {
                  gen `pre'_year`j' = .
                  forval y = 2013/2016 {
                      local k = `y' - `j'
                      replace `pre'_year`j' = `pre'`k' if year == `y'
                  }
              }
          }
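
As a small self-contained check of the nested-loop idea above, here is a sketch on made-up data with two prefixes, c and x (all values are invented for illustration). It adds one assumption that is not in the code above: a `capture confirm` guard that skips any year for which no source variable exists, since the toy data stop at 2013.

```stata
clear
input float(year_event c2015 c2014 c2013 x2015 x2014 x2013)
2015 7 2 3 10 20 30
2014 6 8 9 40 50 60
end

* prefixes are whatever precedes "2015" in the variable names: here c and x
unab prefixes : *2015
local prefixes : subinstr local prefixes "2015" "", all

foreach pre of local prefixes {
    forval j = 1/2 {
        gen `pre'_year`j' = .
        forval y = 2013/2016 {
            local k = `y' - `j'
            * assumption: skip silently when the source variable is absent
            capture confirm numeric variable `pre'`k'
            if _rc == 0 {
                quietly replace `pre'_year`j' = `pre'`k' if year_event == `y'
            }
        }
    }
}

list
```

Without the guard, `replace` would stop with an error at the first year for which `pre'`k' does not exist, so the range of the inner loop would have to match the years actually present in the data.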
