Content stored in Local depending on a condition

Carlos Teigimiz

Join Date: May 2016

Posts: 26
#1

11 Jun 2018, 04:11

Good morning Stata-friends,

I am currently puzzling over a rather complicated setup, and an hour of searching did not turn up a matching thread.

So my issue is the following:
I have time series data of stock returns and numerous control variables.

I want to run a number of regressions.

However, I would like to split the sample, i.e. run my code for different sample periods.

Within each sample period, I then run several regressions with different control variables (so I have a nested loop).

For this, I reload my data file (= use file.dta, clear) in every iteration of the sample-period loop, drop the observations outside the period, and use only the remaining period for the regressions.

- This would not be too difficult, BUT there is one big problem: unfortunately, one of the control variables is missing for all observations within sample1.
So if I run my loop, it always breaks when reaching the incomplete time series (e.g. volume_5), because Stata does not run a regression if not a single observation of that volume variable exists.

- What I would like to do is exclude the missing variable from every regression in sample1.
- But all my control variables are stored in a local "controlvars" (and I would like to avoid deleting it manually), so I do not know how to exclude the missing variable volume_5 from this local only when I run sample1.

1st level of loop: Sample splits (5year sample 1, 5year sample 2, sample 3)
2nd level of loop: Different sets of control variables (some control variables occur in all regressions [e.g. "lagged_return"], some other variables only occur once per sample period [e.g. "volume_1" - "volume_20"])

So the regression looks like this.

Code:

reg return lagged_return other_vars volumei

The overall code structure is similar to this one (but far more complicated, in reality I got another level of nested loop, but this does not matter here):

Code:

use data.dta, clear

*1st level of loop - conduct all regressions for various sample periods
forvalues nsample = 1/3 {

    *Enables create different samples
    use data.dta, clear

    *Generate dummy for sample
    gen sample1 = 1 if date < td(01jan2010)
    gen sample2 = 1 if date >= td(01jan2010)
    gen sample3 = 1 // full sample

    *Only use sample period
    drop if sample`nsample' != 1

    *Define Control variables (store in locals, as there are many variables)
    *Store Low-level alternations of volume
    local volume_1 volume_1a volume_1b volume_1c
    local volume_2 volume_2a volume_2b volume_2c

    *Store higher level volume variable set
    local volume `volume_1' `volume_2' `volume_3'-`volume_20'
    local other_vars any_varlist

    *2nd level of loop - regress return on a set of variables and one variable volume(i) per iteration
    foreach volume_i of local volume {
        local controlvars `volume_i' `other_vars' lagged_return
        reg return `controlvars'
    }
}

This works perfectly fine if I only include sample2 or sample3.

But using sample1 is impossible, as volume_5 is completely missing in sample1. So during the fifth regression, my code always breaks [error r(2000)].
So I tried everything I could imagine to get rid of volume_5 if nsample is 1.

So far, nothing worked.
Those are my attempts, hopefully someone can come up with something that works. I also tried them at different locations within the code, without success.

Code:

*Tried to overwrite the local for volume_5 with blanks if I am in sample1
if "`nsample'" == 1 local volume_5      // type mismatch r(109)
if `nsample' == 1 local volume_5        // r(2000)
local volume_5 if "`nsample'" == "1"    // r(2000)
local volume_5 if "`nsample'" == 1      // r(2000)

*Tried to drop the variables related to volume_5 if I am in sample1
drop volume5 if `nsample' = 1           // variable volume_5 not found r(111)

Is there anything I could do?
I am getting a little desperate about this.

Would highly appreciate any feedback.

Best regards,
Carlos
Carlos Teigimiz

Join Date: May 2016

Posts: 26
#2

12 Jun 2018, 02:18

Sorry guys, I forgot to clarify that I am doing this local definition for all volume_1-volume_5, not only for 1 & 2:

Code:

*Define Control variables (store in locals, as there are many variables)
*Store Low-level alternations of volume
local volume_1 volume_1a volume_1b volume_1c
local volume_2 volume_2a volume_2b volume_2c

Is there any way to make the content of a local depend on the current iteration value of a forvalues loop?
William Lisowski

Join Date: Dec 2014

Posts: 10150
#3

12 Jun 2018, 04:26

The macrolist functions will help you - see help macrolists for details. Here is sample code accomplishing something similar to what you seek.

Code:

local all x1 x2 x3 x4 x5
local some x2 x4
forvalues v=1/3 {
    local vars `all'
    if `v'==2 {
        local vars : list all - some
    }
    display "`v' - `vars'"
}

Code:

1 - x1 x2 x3 x4 x5
2 - x1 x3 x5
3 - x1 x2 x3 x4 x5
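
Applied to the setting in #1, the same list subtraction might look roughly like the sketch below. This is only an illustration under stated assumptions: the data are synthetic and generated on the spot, only two volume sets (volume_1 and volume_5, with two hypothetical variables each) are created, and preserve/restore stands in for reloading data.dta in every iteration. The local name use_volume is likewise made up for this example.

```stata
* Synthetic stand-in data (assumption: not the original data.dta)
clear
set seed 12345
set obs 200
gen date = td(01jan2008) + 7*_n        // weekly dates spanning 2008-2011
format date %td
gen return = rnormal()
gen lagged_return = return[_n-1]
foreach v in volume_1a volume_1b volume_5a volume_5b {
    gen `v' = rnormal()
}

* Mimic the problem: the volume_5 variables are entirely missing before 2010
replace volume_5a = . if date < td(01jan2010)
replace volume_5b = . if date < td(01jan2010)

* Low-level and higher-level locals, as in #1 (shortened to two sets)
local volume_1 volume_1a volume_1b
local volume_5 volume_5a volume_5b
local volume `volume_1' `volume_5'

forvalues nsample = 1/3 {
    preserve
    local use_volume `volume'          // default: all volume variables
    if `nsample' == 1 {
        keep if date < td(01jan2010)
        * remove the members of volume_5 from the list for sample 1 only
        local use_volume : list volume - volume_5
    }
    else if `nsample' == 2 {
        keep if date >= td(01jan2010)
    }
    * nsample == 3 keeps the full sample
    foreach volume_i of local use_volume {
        regress return `volume_i' lagged_return
    }
    restore
}
```

Note that the list operator takes macro names (volume, volume_5), not their expanded contents, so no quoting with ` and ' is needed on the right-hand side. Without the subtraction, the regressions on volume_5a and volume_5b in sample 1 would have no usable observations and would stop with r(2000).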
Carlos Teigimiz

Join Date: May 2016

Posts: 26
#4

12 Jun 2018, 06:20

Dear William,

This is awesome stuff! Thank you so much.
I actually checked -help macrolists- before, but I did not quite get how I could use it. This will definitely also help me in further applications. Great!