Hi all,
A dataex extract of my data is at the end of this post.
I have a panel dataset that was generated by sending surveys to firms every two months (e.g. March 2021, May 2021, July 2021, etc.). The goal is to perform two-way fixed-effects estimation with these data.
The surveys ran from November 2020 to March 2022. Respondents were not obliged to respond, so most have an erratic response pattern, which makes the panel very unbalanced. In addition, some respondents joined in later months than others. A short example of what I mean:
- Respondent A responds in March 2021, May 2021, July 2021 and September 2021.
- Respondent B responds in March 2021 and May 2021.
- Respondent C first responds in March 2021, skips May and July 2021, and then responds again in September 2021.
I would like to find out whether the time of exit from our sample (e.g. September 2021 for respondent A, May 2021 for respondent B and March 2021 for respondent C) is random with respect to the dependent variable, conditional on all the other regressors and fixed effects. For this, I would like to run a BGLW (Becketti et al., 1988) test.
I have run a two-way fixed-effects regression like
Code:
xtreg y x1 x2 x3 i.time, fe cluster(ID)
and have saved the estimation sample as full_est_sample (a binary variable indicating whether each observation was used in the estimation).
I would like to create a dummy capturing whether a respondent later attrites. I am not sure whether the strategy or the code is correct (comments are more than welcome!):
1. Code a dummy called exit, equal to 1 if a respondent is observed in the current wave (period t) but not in the following wave (period t+1), and 0 otherwise.
I tried the following code, which did not give me what I wanted: it does not equal 1 every time a respondent leaves the sample in the next wave:
Code:
bys ID: g attrition=cond(full_est_sample[_n]==1 & full_est_sample[_n+1]==0,1,0)
Could someone please point out the mistake I have made and how to correct it? The data are xtset, so I also tried replacing [_n+1] with F., but I had the same issue, so I must have made a mistake.
2. Regress the initial value of y on the initial values of the covariates and on the dummy above. Should time fixed effects be included in this regression? Is this the correct method for the BGLW test?
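To make the two steps above concrete, here is a minimal sketch of one way they could be coded. It is not a verified solution. Two features of the attempt above may explain the problem: bys ID: does not order the rows by time within ID, and at a respondent's last row full_est_sample[_n+1] (or F.full_est_sample) is missing rather than 0, so the condition is never true there. The sketch assumes the variable names ID, yearmonth, y, x1-x3 and full_est_sample from this post; wave, exit_next, first_obs and ever_exit are hypothetical names introduced only for illustration. It also assumes that a "wave" is any period in which at least one respondent is in the estimation sample, and that the step 2 regression includes the attrition dummy interacted with the covariates, which is my understanding of the usual form of the BGLW test.

```stata
* Sketch only. Assumed names from the post: ID, yearmonth, y, x1-x3, full_est_sample.
* New, hypothetical names: wave, exit_next, first_obs, ever_exit.

* Number the survey waves 1, 2, ... using only periods present in the estimation sample
egen wave = group(yearmonth) if full_est_sample == 1
summarize wave, meanonly
local lastwave = r(max)

* Within each respondent's estimation-sample rows, ordered by wave:
* exit_next = 1 if the respondent's next such row is not the very next wave.
* At the respondent's last row wave[_n+1] is missing, which also counts as an exit.
bysort ID full_est_sample (wave): gen byte exit_next = (wave[_n+1] != wave + 1) if full_est_sample == 1

* The final wave has no following wave, so exit cannot be observed there
replace exit_next = . if wave == `lastwave'

* Step 2: flag each respondent's first estimation-sample row and whether they ever exit
bysort ID full_est_sample (wave): gen byte first_obs = (_n == 1) if full_est_sample == 1
bysort ID: egen byte ever_exit = max(exit_next)

* Baseline outcome on baseline covariates, the attrition dummy and their interactions.
* i.yearmonth is shown only as one option, because respondents enter in different waves.
regress y i.ever_exit##c.x1 i.ever_exit##c.x2 i.ever_exit##c.x3 i.yearmonth if first_obs == 1, vce(robust)

* Joint test that the attrition dummy and its interactions are all zero
testparm i.ever_exit i.ever_exit#c.x1 i.ever_exit#c.x2 i.ever_exit#c.x3
```

Note that exit_next also equals 1 for temporary exits such as respondent C's, so ever_exit mixes respondents who return with those who leave for good. If only permanent exit matters, the dummy would need to be defined from each respondent's last observed wave instead.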
Code:
input float(ID yearmonth y x1 x2 x3 full_est_sample_snskills)
 4 744  0 0 0 0 1
 4 745  . 0 0 0 0
 8 735  . 0 0 0 0
 8 738  0 0 0 0 1
 8 739  . 0 0 0 0
 8 740  0 0 0 0 1
 8 741  . 0 0 0 0
 8 742  0 0 0 0 1
15 729  0 0 0 0 0
15 731  0 0 0 0 0
16 733  . 1 0 0 0
16 734 -1 0 0 0 1
16 737  . 1 0 0 0
16 740  0 0 0 0 1
20 733  . 1 0 0 0
20 736  0 .8068182 0 0 1
20 739  . .9302326 0 0 0
20 742 -1 1 0 0 1
20 747  . 0 0 0 0
26 730  0 0 0 0 1
27 725  . 0 0 0 0
27 726  . 0 0 0 0
27 728  . 0 0 0 0
27 733  . 0 0 0 0
27 736  0 0 0 0 1
27 738  0 0 0 0 1
27 739  . 0 0 0 0
27 741  . 0 0 0 0
27 747  . 0 0 0 0
35 734  0 .8333333 0 0 1
47 735  . 0 0 0 0
47 738  0 0 0 0 1
47 745  . 0 0 0 0
47 746 -1 0 0 0 1
47 747  . 0 0 0 0
48 744  0 0 0 0 1
50 746 -1 1 0 0 1
58 744 -1 0 0 0 0
66 734  1 .125 0 0 1
66 735  . .125 0 0 0
66 736  1 0 0 0 1
66 737  . .13333334 0 0 0
66 739  . 0 0 0 0
66 740  1 .14285715 0 0 1
66 744  0 0 0 0 1
68 726  . 0 0 0 0
68 733  . 0 0 0 0
70 744  0 0 0 0 0
70 745  . 0 0 0 0
71 734  0 0 0 0 1
71 736  0 0 0 0 1
73 734 -1 1 0 0 1
73 735  . 1 0 0 0
73 736 -1 1 0 0 1
73 737  . 1 0 0 0
73 738 -1 0 0 0 1
73 739  . 0 0 0 0
73 740 -1 0 0 0 1
73 741  . 0 0 0 0
73 742 -1 0 0 0 1
73 743  . 0 0 0 0
73 744 -1 0 0 0 1
73 745  . 0 0 0 0
73 746 -1 0 0 0 1
73 747  . 0 0 0 0
74 744  0 0 0 0 1
75 734 -1 1 0 0 1
75 735  . 1 0 0 0
77 728  . 0 0 0 0
77 730  0 0 0 0 1
77 731  0 0 0 0 1
77 732  0 0 0 0 1
77 733  . 0 0 0 0
77 734 -1 0 0 0 1
77 735  . 0 0 0 0
77 736  0 0 0 0 1
77 737  . 0 0 0 0
77 738  0 0 0 0 1
77 739  . 0 0 0 0
77 740  0 0 0 0 1
77 742 -1 0 0 0 1
77 744  0 0 0 0 1
77 745  . 0 0 0 0
77 746 -1 0 0 0 1
77 747  . 0 0 0 0
79 726  . 0 0 0 0
79 744  0 0 0 0 1
83 745  . 0 0 0 0
86 734  0 0 0 0 1
91 726  . 0 0 0 0
91 728  . 0 0 0 0
91 736  0 0 0 0 1
92 744  0 0 0 0 1
92 745  . 0 0 0 0
94 734 -1 1 0 0 1
95 726  . 0 0 0 0
96 726  . 1 0 0 0
97 733  . 1 0 0 0
97 737  . 1 0 0 0
98 736  0 .3333333 0 0 1
end