Hi everyone,
[TLDR: I'm unsure how to interpret my results. Is the GDiD coefficient I get for my log-transformed variable "cumulative"? Any tips for meaningful interpretation?]
I am Tobi, currently a Master's student in economics, working on end-of-term papers at the moment. I have already spent countless hours on this forum reading and researching, and have found much helpful information and many suggestions. Still, I don't consider myself a genius with either econometrics or Stata, so a lot of what I learn is trial and error, and my question may reveal a lack of basic understanding of what I am dealing with. That's how I feel, anyway.
To my question: For a research task I put together a panel dataset (about 200 countries with 11 years each: 2009-2019) with a bunch of macroeconomic indicators to investigate the "effects" of a policy measure, the participation of states in the Belt and Road Initiative (BRI). I will give you just the core info, as I assume the question is much more general:
I pursued a GDiD approach because the countries in my dataset entered treatment, i.e. signed some BRI participation agreement, in different years from 2013 onwards. I also have log-transformed dependent variables.
BRIS: Dummy variable marking measurements during years of BRI participation
TREAT: Dummy variable indicating a country's status as a BRI participant at some point in time (fun fact: no country has discontinued its participation so far, good for me)
GDPPCP: GDP per capita PPP in current US$
LGDPPCP: = log(GDPPCP), since GDPPCP has a skewed distribution in my dataset and the results look better with the log. I transform it back using exp(x)-1
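To make the back-transformation concrete, here is a minimal sketch in Python (plain arithmetic, nothing Stata-specific). It assumes that exp(x) recovers a level from a logged value, while exp(b)-1 is the usual way to read a coefficient on a logged outcome as a proportional change. The function names and the example numbers are made up for illustration and are not taken from my data.

```python
import math

def log_level_to_level(x):
    """Undo a log transform of a level, e.g. log(GDP per capita) -> GDP per capita."""
    return math.exp(x)

def log_coef_to_pct(beta):
    """Percent change in the level outcome implied by a coefficient on a logged outcome."""
    return (math.exp(beta) - 1.0) * 100.0

# Illustrative values only (assumptions, not estimates from the dataset)
example_log_gdp = math.log(10_000.0)  # a logged level
example_coef = 0.05                   # a coefficient on a dummy regressor

print(f"level:  {log_level_to_level(example_log_gdp):.0f}")
print(f"effect: {log_coef_to_pct(example_coef):.2f}%")
```

For small coefficients the percent change is close to 100 times the coefficient itself, so the two readings only diverge noticeably for larger estimates.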
With that I did:
xtset Ccode YEAR, yearly
xtreg LGDPPCP i.BRIS i.TREAT i.YEAR , fe robust
And I got:
VARIABLES | LGDPPCP
1.BRIS | 0.038**
 | (0.016)
1o.TREAT | -
2010.YEAR | 0.040***
 | (0.003)
2011.YEAR | 0.084***
 | (0.007)
2012.YEAR | 0.114***
 | (0.009)
2013.YEAR | 0.151***
 | (0.011)
2014.YEAR | 0.182***
 | (0.013)
2015.YEAR | 0.184***
 | (0.016)
2016.YEAR | 0.220***
 | (0.017)
2017.YEAR | 0.260***
 | (0.018)
2018.YEAR | 0.291***
 | (0.020)
2019.YEAR | 0.325***
 | (0.021)
Constant | 9.129***
 | (0.011)
Observations | 2,139
Number of Ccode | 198
R-squared | 0.498
Adj R-squared | 0.495
F-test | 64.80
Prob > F | 0
Robust standard errors in parentheses
So, I noticed that basically the longer my pre-treatment period, the bigger the coefficient (I use inrange(YEAR, a, b) and some other tricks for testing), and the YEAR coefficients seem to "accumulate" as well. I was not aware that this could or should happen, if it is indeed the case and I didn't make some other foolish mistake. I am really confused at the moment, as it makes total sense and no sense at all to me at the same time, and causal effects in this particular example are dubious anyway... I would be really glad for any help.
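One way I can think of to check the "accumulation" in the YEAR coefficients is to difference them. This is a rough sketch in Python, assuming that Stata's i.YEAR omits the lowest year (2009, which is indeed missing from the output) as the base, so that each YEAR coefficient is measured relative to 2009. The coefficients are copied from the output above; the helper name yearly_increments is made up for illustration.

```python
import math

# YEAR coefficients from the xtreg output; 2009 is the assumed omitted base year (0 by construction)
year_coefs = {
    2009: 0.000, 2010: 0.040, 2011: 0.084, 2012: 0.114,
    2013: 0.151, 2014: 0.182, 2015: 0.184, 2016: 0.220,
    2017: 0.260, 2018: 0.291, 2019: 0.325,
}

def yearly_increments(coefs):
    """Difference between each year's coefficient and the previous year's."""
    years = sorted(coefs)
    return {year: coefs[year] - coefs[prev] for prev, year in zip(years, years[1:])}

increments = yearly_increments(year_coefs)
for year, inc in increments.items():
    # Increment in log points, and the same step expressed as a percent change
    print(year, f"{inc:+.3f}", f"{(math.exp(inc) - 1) * 100:+.1f}%")
```

If that assumption holds, the increments add up to the 2019 coefficient, which would mean the YEAR coefficients are levels relative to 2009 rather than year-on-year changes. This says nothing yet about how to read the BRIS coefficient.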
Questions:
1) Does my approach make sense?
2) Is the coefficient indeed cumulative, and should it be?
3) How should I interpret this? Can I just divide it by the number of years investigated to get something like a yearly treatment effect?
Again, apologies if this is actually really simple and I should have understood all this before bothering with such models...
Regards
Tobi