  • How to square a standardized log variable and build an interaction term?

    Hello everybody,

    I have a simple question about how to calculate something:

    I have a variable V, which is continuous, and a variable dummy, which is a factor variable.

    First I have the regression:

    Code:
     xtreg P c.lnV##i.dummy lnSa Ratio lnAge Closely i.year, fe vce(cluster ID)
    But now I want to add the squared term of V. How do I do that? I would write:

    Code:
     xtreg P lnV lnV_sq dummy i.dummy#c.lnV i.dummy#c.lnV_sq  lnSa Ratio lnAge Closely i.year, fe vce(cluster ID)
    But I also read that I have to center or standardize the variables first so that multicollinearity does not increase too much. So in which order do I apply these steps to the variables?

    Code:
     egen lnV_std = std(lnV)
     egen lnV_std_sq = std(lnV*lnV)
     egen Interaction1 = std(lnV)*dummy
     egen Interaction2 = std(lnV*lnV)*dummy

    Is this the right order of squaring, standardizing, taking the log and multiplying?

    Thank you very much for your help

  • #2
    Code:
    xtreg P c.lnV##c.lnV##i.dummy lnSa Ratio lnAge Closely i.year, fe vce(cluster ID)
    As for centering, it is not necessary. It is sometimes helpful, but often doesn't matter. You might want to try your analysis first without centering and see if you run into a collinearity problem. Unless you end up with results where a variable whose effects you are really interested in (not just one you are including to adjust for its contributions to outcome variance) has a large standard error, you may have multicollinearity, but you don't have a multicollinearity problem.

    In any case, if you choose to center, you must take the logarithms first and then center those. If you try to center and then take logs, you will fail because you will have negative centered values. Once you have done that, just replace c.lnV##c.lnV##i.dummy by c.centered_lnV##c.centered_lnV##i.dummy. Don't create your own square or interaction terms. Let Stata's factor variable notation handle it for you, so you avoid common mistakes. (Stata will calculate the square of the centered lnV terms and also the interactions of those with the dummy.)
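
    The order described above (log first, then center, then let factor-variable notation do the rest) can be sketched as follows. This is only a sketch: it assumes lnV already exists, that the panel has been declared with xtset, and that centering on the full-sample mean is acceptable.

    Code:
     * lnV is assumed to exist already, e.g. from: generate lnV = ln(V), with V > 0
     summarize lnV, meanonly
     generate centered_lnV = lnV - r(mean)
     * factor-variable notation builds the square and the interactions with dummy
     xtreg P c.centered_lnV##c.centered_lnV##i.dummy lnSa Ratio lnAge Closely i.year, fe vce(cluster ID)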


    • #3
      To include in your model terms for lnV, lnV squared, and dummy, and all their interactions, changing your command to the following would do.
      Code:
      xtreg P c.lnV##c.lnV##i.dummy lnSa Ratio lnAge Closely i.year, fe vce(cluster ID)
      I don't do much standardizing myself, so I cannot comment on that part of your question.


      • #4
        I also note that there is a difference between standardizing and centering. While it is conceivable you will need to center your lnV variable to avoid a multicollinearity problem, there is no reason to standardize it. Standardizing it will just make it harder to interpret your results and explain them to people. If you have to center, go ahead. But do not standardize.


        • #5
          Perfect, thank you very much for your answers!!

          I initially created my own interaction terms because in some of my models I also have to work with lags, and I got confused about whether the lag operator comes before or after the c./i. prefix. I guess it works like:

          Code:
           xtreg P L.c.lnV##L.c.lnV##L.i.dummy lnSa Ratio lnAge Closely i.year, fe vce(cluster ID)
          Also, after this I want to run the utest command, which doesn't allow time-series operators or interactions. So for that I guess I have to calculate my own interactions? I find this confusing, because what is utest used for if not interactions? I hope I'm using it the right way! I just type:

          Code:
           utest lnV lnV_sq
          But apart from that, I would first take the ln of V, then center it, and then multiply this term with itself (already log AND centered)?

          Thank you so much for your help again, really appreciate it!


          • #6
            So this line from my first question would be wrong in any case,
            Code:
             egen lnV_std_sq = std(lnV*lnV)
            not only because of standardizing itself but also the order of the commands, because I multiply first, then standardize. So it would become:

            Code:
            gen lnV = log(V)
            egen mean_lnV = mean(lnV)
            gen lnV_centered = lnV - mean_lnV
            gen lnV_c_sq = lnV_centered*lnV_centered
            and then for the interactions just multiply the last two with the dummies, right?
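
            Written out, that last step might look like the sketch below. It assumes dummy is coded 0/1; the names lnV_c_x_dummy and lnV_c_sq_x_dummy are only illustrative, and the utest call follows the syntax used in post #5.

            Code:
             * hand-made interactions of the centered terms with the 0/1 dummy
             generate lnV_c_x_dummy = lnV_centered*dummy
             generate lnV_c_sq_x_dummy = lnV_c_sq*dummy
             xtreg P lnV_centered lnV_c_sq dummy lnV_c_x_dummy lnV_c_sq_x_dummy lnSa Ratio lnAge Closely i.year, fe vce(cluster ID)
             * in this model the two coefficients below describe the dummy == 0 group
             utest lnV_centered lnV_c_sq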

            Again, thank you so much!!


            • #7
              Hi All,

              I am using xtivreg (for panel data IV regression), which does not support interaction terms. So I need to create the interactions manually.
              All the variables in my analysis (apart from binary variables) are standardized.

              My question is how should I generate the square of a standardized variable called "experience"?
              egen exp_std=std(experience)

              Approach 1: egen exp2_std=std(experience*experience)
              OR
              Approach 2: gen exp2_std=exp_std*exp_std

              Thanks!
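
              The two approaches do not produce the same variable, and a quick comparison makes the difference visible. A sketch, assuming exp_std has been created as above; exp2_a1 and exp2_a2 are illustrative names chosen so that both versions can coexist:

              Code:
               * Approach 1: square first, then standardize (result has mean 0, sd 1)
               egen exp2_a1 = std(experience*experience)
               * Approach 2: square of the standardized variable (never negative)
               generate exp2_a2 = exp_std*exp_std
               summarize exp2_a1 exp2_a2
               correlate exp2_a1 exp2_a2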

