
  • Interaction terms between a dummy and continuous variable

    Hi all,

    I am running a fixed effects model on a data set of 18 states from 1970 to 2018.
    State-wise agricultural GDP is my dependent variable. My independent variables include a dummy that equals 1 if a heat wave occurred and 0 otherwise, and irrigation as a further regressor.
    I want to see how effective irrigation is in reducing the adverse effects of heat waves on agricultural GDP, so I run the following model.

    AGRI_GDP = B0 + B1*HEAT_WAVE + B2*IRRIGATION + B12*HEAT_WAVE*IRRIGATION + B3*OTHER_CONTROLS + e
    I see that some research papers (for example Dell, M., Jones, B.F. and Olken, B.A., 2012. Temperature Shocks and Economic Growth: Evidence from the Last Half Century. American Economic Journal: Macroeconomics, 4(3), 66-95 https://scholar.harvard.edu/files/de...emperature.pdf) do not include the base dummy term (B1*HEAT_WAVE in my case) when running similar models. Instead they include only the base term of the continuous variable (IRRIGATION in my case) and the interaction between the continuous and dummy variables (HEAT_WAVE*IRRIGATION).

    Which of the two approaches is correct if the objective is to see how effective irrigation is in reducing the adverse effects of heat waves on agricultural GDP?


    Thank you for your time.

  • #2
    This is just a general reply to your query, not related to the research field itself.

    When thinking about adding interaction terms, we may consider (at least) two important things: a) is the added interaction term statistically significant? and b) even more relevant, does the model improve (i.e., does it have a lower AIC or BIC, a significant LR test, etc.)?

    Hopefully that helps.
    Best regards,

    Marcos
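The comparison suggested above can be sketched in a few lines. This is a minimal illustration, assuming a Gaussian linear model fitted by least squares; the residual sums of squares, sample size, and the helper names `gaussian_aic` and `lr_statistic` are all hypothetical and are not taken from the thread. It ignores panel complications such as clustered errors.

```python
import math

def gaussian_aic(rss, n, k):
    """AIC of a Gaussian linear model, up to an additive constant that is
    identical for all models fitted to the same n observations."""
    return n * math.log(rss / n) + 2 * k

def lr_statistic(rss_restricted, rss_full, n):
    """Likelihood-ratio statistic for nested Gaussian linear models."""
    return n * math.log(rss_restricted / rss_full)

# Hypothetical numbers, for illustration only.
n = 100
rss_full, k_full = 50.0, 4              # model with HEAT_WAVE base term
rss_restricted, k_restricted = 60.0, 3  # model without the base term

aic_full = gaussian_aic(rss_full, n, k_full)
aic_restricted = gaussian_aic(rss_restricted, n, k_restricted)
lr = lr_statistic(rss_restricted, rss_full, n)

# 5% critical value of a chi-squared distribution with 1 degree of freedom
# (one restriction: B1 = 0).
CHI2_1DF_5PCT = 3.841
prefer_full = aic_full < aic_restricted and lr > CHI2_1DF_5PCT
```

With these made-up inputs the full model has the lower AIC and the LR statistic exceeds the critical value, so both criteria would favour keeping the base term.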



    • #3
      If you drop the base term, you allow irrigation to have different effects with and without a heat wave (the two slopes can differ), but you assume that under zero irrigation heat waves have no effect. If we use your formulation, which includes the base term, we can test those assumptions with the model parameters.

      I would prefer to see your formulation unless the literature has already established that under zero irrigation heat waves make no difference in agricultural production.
      Doug Hemken
      SSCC, Univ. of Wisc.-Madison
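To make the restriction concrete, the heat-wave effect implied by each specification can be written out. This is a sketch using the coefficient names from post #1; the symbol \Delta is notation introduced here for the change in expected AGRI_GDP when HEAT_WAVE switches from 0 to 1, holding IRRIGATION and the other controls fixed.

```latex
\begin{align*}
\text{With base term:}\quad
  \Delta(\text{IRRIGATION}) &= B_1 + B_{12}\,\text{IRRIGATION} \\
\text{Without base term:}\quad
  \Delta(\text{IRRIGATION}) &= B_{12}\,\text{IRRIGATION}
\end{align*}
```

At IRRIGATION = 0 the second form forces the effect to zero, so a test of B1 = 0 in the full model is a test of that restriction. If B1 < 0 and B12 > 0, the full model implies that the heat-wave effect is fully offset at IRRIGATION = -B1/B12.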



      • #4
        Thank you Marcos Almeida for the helpful points that you have mentioned.

        Thank you Doug Hemken for your very concise answer.
        It helps me understand why some studies use the base variable and others don't: it is related to the assumption that they make about the impact of the variable and how it behaves in addition to what they want to test.

        On a side note, I notice that the signs of the HEAT_WAVE coefficients can vary a lot depending on whether or not I include the base term along with the interaction term.

        For example, if I run the above model
        AGRI_GDP = B0 + B1*HEAT_WAVE + B2*IRRIGATION + B12*HEAT_WAVE*IRRIGATION + B3*OTHER_CONTROLS + e
        then the coefficient on HEAT_WAVE is negative and the interaction is positive and insignificant.

        On the other hand, if I run the model
        AGRI_GDP = B0 + B2*IRRIGATION + B12*HEAT_WAVE*IRRIGATION + B3*OTHER_CONTROLS + e
        then the sign on the interaction term is negative and significant.
        Could you please shed some light on this?



        • #5
          Hi Shailaja,

          As a disclaimer, I am not an expert on this. However, since I have run into similar issues recently, let me try to guide you a bit.

          1. Regarding the model specification issue that you raise in #1, I agree with Doug Hemken. Our theoretical assumptions usually drive the choice to include or exclude the lower-order variables (which you call base variables) of interaction terms. In your case, if you believe that 'HEAT_WAVE' may affect the intercept of your model, include it; otherwise do not. Conceptually, if you believe that 'HEAT_WAVE' not only affects AGRI_GDP through the slope coefficient of IRRIGATION but also has its own separate impact on AGRI_GDP through the intercept of the model, you should include 'HEAT_WAVE' in your model. For a detailed discussion of slope dummies and intercept dummies, refer to any basic econometrics textbook (say, Damodar Gujarati's).

          2. In an FE model with time dummies, variables that take the same value for all cross-sectional units at a given point in time are wiped out. If the 'HEAT_WAVE' dummy takes the value 0 or 1 for all states together in a given year, its own coefficient cannot be estimated once time dummies are included in the model. When you say that some papers do not include a dummy for 'HEAT_WAVE' in their panel FE model, this could be due to the presence of time dummies in their model. For example, if you also include time dummies for T-1 of the T years in your model, you will not be able to estimate a coefficient for 'HEAT_WAVE' either, because of perfect collinearity. However, my argument is only valid if the 'HEAT_WAVE' dummy takes the same value for all states in a given year. If the dummy can take different values for different states at a given point in time, my point is irrelevant and you may ignore it.

          3. Regarding the change in signs when lower-order (base) variables are included in or excluded from the model, this is quite normal, since the specification of the model changes in each scenario. In fact, this is why your theoretical choice regarding the hypothesized impact of 'HEAT_WAVE' on the intercept, the slope, or both becomes very important. For a brilliant discussion of this exact issue, please refer to these two useful papers:

          a) Oh No! I Got the Wrong Sign! What Should I Do? by Peter Kennedy
          b) Hypothesis Testing and Multiplicative Interaction Terms by Bear F. Braumoeller
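The sign change described in point 3 can be reproduced on a toy example. This is a minimal sketch on made-up, noise-free data, not the poster's data or results: the true coefficients (B1 = -8, B12 = +0.1), the irrigation values, and the helper `ols` are all assumptions chosen for illustration. Omitting the HEAT_WAVE base term forces the interaction to absorb the negative intercept shift.

```python
def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * p for x, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

# Hypothetical data: every combination of heat wave (0/1) and four irrigation levels.
data = [(h, irr) for h in (0, 1) for irr in (10.0, 20.0, 30.0, 40.0)]
# Assumed truth: heat waves lower output by 8, irrigation offsets 0.1 per unit.
y = [100.0 - 8.0 * h + 1.0 * irr + 0.1 * h * irr for h, irr in data]

# Full model: constant, HEAT_WAVE, IRRIGATION, HEAT_WAVE*IRRIGATION.
full = ols([[1.0, h, irr, h * irr] for h, irr in data], y)
# Restricted model: same, but without the HEAT_WAVE base term.
restricted = ols([[1.0, irr, h * irr] for h, irr in data], y)

interaction_full = full[3]              # recovers the assumed +0.1
interaction_restricted = restricted[2]  # negative: absorbs the omitted level shift
```

In this constructed case the interaction coefficient is +0.1 in the full model and about -0.17 in the restricted one, even though the data are identical, which is why the choice of specification has to rest on theory or on a test of B1 = 0.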

          Hope this helps!
          Last edited by Prateek Bedi; 28 Dec 2019, 04:44.
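The collinearity argument in point 2 above can be checked directly. This is a small sketch on a hypothetical panel (three states, four years, made-up heat-wave years and irrigation values), assuming the heat-wave dummy is common to all states in a given year. It uses a full set of year dummies, which spans the same space as a constant plus T-1 dummies.

```python
# Hypothetical panel: 3 states observed over 4 years.
states = ["A", "B", "C"]
years = [2015, 2016, 2017, 2018]
heat_wave_years = {2016, 2018}  # assumed: heat waves hit all states together

rows = [(s, t) for s in states for t in years]
heat_wave = [1 if t in heat_wave_years else 0 for s, t in rows]

# One dummy column per year.
year_dummies = {yr: [1 if t == yr else 0 for s, t in rows] for yr in years}

# HEAT_WAVE is exactly the sum of the dummies for the heat-wave years,
# so it is perfectly collinear with the time dummies.
rebuilt = [sum(year_dummies[yr][i] for yr in heat_wave_years)
           for i in range(len(rows))]
is_collinear = rebuilt == heat_wave

# Made-up irrigation levels that differ across states and drift over time.
base_irrigation = {"A": 10.0, "B": 25.0, "C": 40.0}
interaction = [h * (base_irrigation[s] + 0.5 * (t - 2015))
               for h, (s, t) in zip(heat_wave, rows)]

# Within a heat-wave year the interaction still differs across states,
# so it is not absorbed by the year dummies.
values_2016 = {interaction[i] for i, (s, t) in enumerate(rows) if t == 2016}
interaction_varies_within_year = len(values_2016) > 1
```

Under these assumptions the base dummy drops out while HEAT_WAVE*IRRIGATION remains estimable, which is one possible reason a paper would report only the interaction.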



          • #6
            Correlation between heat waves and amount of irrigation?
            Some high leverage points?
            Polynomial relationships?

            Go back to Marcos' question - which model fits better than the other, and by how much?
            Doug Hemken
            SSCC, Univ. of Wisc.-Madison



            • #7
              Dear Prateek, thank you for your reply.
              The references you mention are helpful.
              Dear Doug Hemken,
              I have checked the correlation between the variables and am in the process of testing polynomial relations. Thank you for the suggestions.

