  • Content stored in Local depending on a condition

    Good morning Stata-friends,

    I am puzzling over a rather complicated setup and could not find a matching thread after an hour of searching.

    So my issue is the following:
    • I have time series data of stock returns and numerous control variables
    • I want to conduct a number of regressions
    • However, I would like to split the sample, i.e. run my code for different sample periods
    • Within each sample period, I then run several regressions with different control variables (so I have a nested loop)
    • For this, I reload my data file (use file.dta, clear) in every iteration of the sample-period loop, drop the periods I do not need, and run the regressions on the remaining period.

    - This would not be too difficult, BUT there is one big problem: unfortunately, one of the control variables is missing for all observations within sample1.
    So if I run my loop, it always breaks when it reaches the incomplete time series (e.g. volume_5), because Stata cannot run a regression when that volume variable has no non-missing observations.

    - What I would like to do, is to exclude the missing variable from any regression in sample1.
    - But as all my control variables are stored in a local "controlvars" (and I would like to avoid deleting entries manually), I do not know how to exclude the missing variable volume_5 from this local only when I run sample1.

    1st level of loop: Sample splits (5year sample 1, 5year sample 2, sample 3)
    2nd level of loop: Different sets of control variables (some control variables occur in all regressions [e.g. "lagged_return"], some other variables only occur once per sample period [e.g. "volume_1" - "volume_20"])

    So the regression looks like this.
    Code:
    reg return lagged_return other_vars volumei
    The overall code structure is similar to this (the real code is far more complicated and has another level of nested loop, but that does not matter here):
    Code:
    use data.dta, clear
    
    *1st level of loop - conduct all regressions for various sample periods
    forvalues nsample = 1/3 {
    
    *Reload the full data so that each sample can be created from scratch
    use data.dta, clear
    
    *Generate dummy for sample
    gen sample1     = 1 if date < td(01jan2010)
    gen sample2     = 1 if date >= td(01jan2010)
    gen sample3     = 1  // full sample
    
    *Only use sample period
    drop if sample`nsample' != 1
    
    *Define Control variables (store in locals, as there are many variables)
    *Store Low-level alternations of volume
    local volume_1 volume_1a volume_1b volume_1c
    local volume_2 volume_2a volume_2b volume_2c
    *Store higher level volume variable set
    local volume `volume_1' `volume_2' `volume_3'-`volume_20'
    local other_vars any_varlist
    *2nd level of loop - regress return on a set of variables and one variable volume(i) per iteration
    foreach volume_i of local volume {
        local controlvars `volume_i' `other_vars' lagged_return
        reg return `controlvars'
    }
    }
    This works perfectly fine if I only include sample2 or sample3.


    But using sample1 is impossible, as volume_5 is completely missing in sample1. So during the fifth regression, my code always breaks [error r(2000)].
    So I tried everything I could imagine to get rid of volume_5 if nsample is 1.

    So far, nothing worked.
    Those are my attempts, hopefully someone can come up with something that works. I also tried them at different locations within the code, without success.

    Code:
    *Tried to overwrite the local for volume_5 with blanks if I am in sample1
     if "`nsample'" == 1 local volume_5   // type mismatch r(109)
     if  `nsample'  == 1 local volume_5   // r(2000)
    
    local volume_5   if "`nsample'" == "1"    // r(2000)
    local volume_5   if "`nsample'" == 1    // r(2000)
    
    *Tried to drop the variables related to volume_5 if I am in sample1
    drop volume_5 if `nsample' = 1   // variable volume_5 not found  r(111)
    Is there anything I could do?
    I am getting a little desperate about this.

    Would highly appreciate any feedback.

    Best regards,
    Carlos

  • #2
    Sorry guys, I forgot to clarify that I am doing this local definition for all volume_1-volume_5, not only for 1 & 2:
    Code:
    *Define Control variables (store in locals, as there are many variables)
    *Store Low-level alternations of volume
    local volume_1 volume_1a volume_1b volume_1c
    local volume_2 volume_2a volume_2b volume_2c
    Is there any way to condition the content of a local on the current value of iteration of a forvalues-command?


    • #3
      The macrolist functions will help you - see help macrolists for details. Here is sample code accomplishing something similar to what you seek.
      Code:
      local all x1 x2 x3 x4 x5
      local some x2 x4
      forvalues v=1/3 {
          local vars `all'
          if `v'==2 {
              local vars : list all - some
              }
          display "`v' - `vars'"
          }
      Output:
      1 - x1 x2 x3 x4 x5
      2 - x1 x3 x5
      3 - x1 x2 x3 x4 x5
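
      Applied to the setup in #1, the same list subtraction might look like the sketch below. It is not from the thread and rests on several assumptions: the toy data stand in for data.dta, the names ret, volume_1a and volume_5a are made up, and preserve/restore replaces reloading the file. Instead of hard-coding that sample 1 lacks volume_5, it counts the non-missing observations of each candidate and subtracts the empty ones from the list, so nothing has to be deleted by hand.

```stata
* Toy data standing in for data.dta (all names and values are hypothetical)
clear
set seed 12345
set obs 200
gen date          = td(01jan2008) + 7 * _n      // weekly dates, 2008 to 2011
format date %td
gen ret           = rnormal()                   // stand-in for "return"
gen lagged_return = rnormal()
gen volume_1a     = rnormal()
gen volume_5a     = rnormal() if date >= td(01jan2010)  // empty before 2010

* Candidate volume variables, one regression per variable
local volume volume_1a volume_5a

forvalues nsample = 1/2 {
    preserve
    if `nsample' == 1 {
        keep if date < td(01jan2010)
    }
    if `nsample' == 2 {
        keep if date >= td(01jan2010)
    }

    * Collect the candidates that have no non-missing observation here
    local empty
    foreach v of local volume {
        quietly count if !missing(`v')
        if r(N) == 0 {
            local empty `empty' `v'
        }
    }

    * Macro list subtraction: keep only the usable candidates
    local usable : list volume - empty

    foreach v of local usable {
        quietly regress ret `v' lagged_return
        display "sample `nsample': `v' used, N = " e(N)
    }
    restore
}
```

      In this sketch volume_5a is skipped in sample 1 and used in sample 2. If you would rather name the excluded variables explicitly, the hard-coded version from the sample code above (if `nsample'==1 followed by a list subtraction) works the same way.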


      • #4
        Dear William,

        This is awesome stuff! Thank you so much.
        I actually checked -help macrolists- before, but I did not quite understand how I could use it. This will definitely help me in other applications too. Great!