Hi all,
I'm writing a paper that looks at the effect of fires on recreational visits to national parks and national forests. I have a panel dataset consisting of ~500 IDs (geographical units) observed over 10 years, and my dependent variable is a count, so I'm using an FE Poisson model.
I have two questions about forced negative correlation between independent variables of interest, and would be grateful for your advice.
1) A fire can burn different parts of a geographical unit at very low, low, medium, or high severity. For example, a unit may be burned 3% at low severity, 5% at medium severity, and so on. One of my regressions looks at the effect of the proportion of the unit burned at each severity. The problem is that each unit only has one fire, so if 80% of a fire is medium severity, then very low, low, and high severity together can only make up the remaining 20% of the fire, i.e., the severity shares are mechanically negatively correlated. I'm concerned that when I look at the effect of the whole unit burning at medium severity relative to none of it burning at medium severity, the coefficient I get is actually picking up the effect of the unit not having a high, low, or very low severity fire in that area, since the shares are inherently negatively correlated. I'm assuming this is a common problem, but I haven't been able to find anything on it online, possibly because I'm not using the right search terms. Do you have suggestions on what people tend to do in these situations, or on what I should look up?
The code I'm using for this regression right now is:
xtpoisson pud c.propburn_all_rows#c.propsev1_all_rows c.propburn_all_rows#c.propsev2_all_rows c.propburn_all_rows#c.propsev3_all_rows c.propburn_all_rows#c.propsev4_all_rows pop60 pop120 prec tmean tmax i.year, fe vce(robust)
where propburn_all_rows is a continuous variable ranging from 0 to 1 that gives the proportion of the geographical unit burned by the fire, propsev1_all_rows is the proportion of the fire burned at severity 1 (very low severity), propsev2_all_rows is the proportion of the fire burned at severity 2 (low severity), and so on.
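To make the constraint I'm describing concrete, here is a minimal sketch of how it could be checked in the data. It assumes the panel is already in memory with the variable names above, and that the four severity shares are exhaustive (which is my understanding, but I haven't confirmed it for every observation). sev_total is just a hypothetical name for a helper variable.

```stata
* Assumes the panel is in memory; sev_total is a hypothetical helper variable.
* Sum the four severity shares within each observation.
egen double sev_total = rowtotal(propsev1_all_rows propsev2_all_rows propsev3_all_rows propsev4_all_rows)

* If the shares are exhaustive, sev_total should be 1 (up to rounding)
* in every observation where some part of the unit burned.
summarize sev_total if propburn_all_rows > 0

* Pairwise correlations among the shares in burned observations,
* to see how strong the negative dependence actually is.
correlate propsev1_all_rows propsev2_all_rows propsev3_all_rows propsev4_all_rows if propburn_all_rows > 0
```

If sev_total is always 1 in burned observations, then the four interaction terms in the regression add up to propburn_all_rows itself, which is the adding-up relationship I'm worried about.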
2) One of my regressions looks at the effect of fires of different ages on visits to national parks and forests. I created a categorical variable called single_column_year_groups that is coded 1 for observations 1-3 years after a fire in the unit, 2 for observations 4-6 years after a fire, etc. I'm then running the following regression:
xtpoisson pud i.single_column_year_groups pop60 pop120 prec tmean tmax i.year, fe vce(robust)
I'm concerned about forced negative correlation between the different year groups (i.e., if an observation is in years 1-3 after a fire, it's obviously not in any of the other year groups). I'm trying to understand whether this is even a problem and/or whether Stata adjusts for it, since the same is true of basically any categorical variable: if you're in one category, you're not in another. I'd love any insight on how Stata handles this.
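For reference, this is a rough sketch of how the mutual exclusivity shows up in the data and how to see which level the i. prefix leaves out as the base category. It assumes the panel is in memory; yg_ and n_yg are hypothetical names I made up for the generated indicators and their row sum.

```stata
* Assumes the panel is in memory; yg_ and n_yg are hypothetical names.
* Create one 0/1 indicator per level of the categorical variable.
tabulate single_column_year_groups, generate(yg_)

* Every observation with a non-missing group has exactly one indicator
* equal to 1, so the row sum of the indicators is 1 for those observations.
egen byte n_yg = rowtotal(yg_*)
tabulate n_yg

* Expand the factor-variable term; the level flagged with "b" is the
* base category that is omitted from estimation.
fvexpand i.single_column_year_groups
display "`r(varlist)'"
```

The row sum being constant is exactly the "if you're in one category, you're not in another" property, and the fvexpand output shows which category the other coefficients would be measured against.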
Thank you!