  • dummy variable model problems

    My model is as follows:

    y = b1Jan + b2Feb + ..... + b12Dec + X + et

    Info:

    If I keep the constant, I have to omit one of the monthly dummies to avoid collinearity, and I get a load of high p-values and an R-squared below 0.05 (plus the constants are all over the place depending on which month I remove). No good.

    Using the noconstant version above, I end up with an R-squared of 70 and all variables significant apart from X. However, using the link test I get P=0.71 > t= 0.34

    Questions:

    1- Does the fact that my numbers vary so wildly without the constant but not with it mean that the high R-squared is a fluke?
    2- Does the link test statistic mean I haven't specified the model correctly?
    3- Other than the Ramsey test, are there any other specification tests in Stata I could use?
    4- With no other data available is there a way to respecify a problem like this? Perhaps where I could keep the constant?

    Thanks
    Last edited by ineedhelp; 03 Apr 2014, 22:09.

  • #2
    Please follow the Statalist Forum FAQ/etiquette and use your real name, not an alias. (The same source tells you how to change your current alias to your real name.)

    Remember too, as per the FAQ, that posting actual code and output is more likely to generate responses. Without a better description of your data set and research problem, it is unlikely that anyone can advise you on how to handle the within-year variation you hope to model. Please also specify your equation more informatively. You talk about time, but we have no idea at what frequency "y" and "X" are measured, or indeed whether these are pure time-series or panel data.


    • #3
      I agree with Stephen on all counts.

      When you omit the constant, R-square is evaluating how much better your model is than a guess of zero for the response, in contrast to how much better your model is than a guess of the mean response. The answer is usually "enormously better" but this is meaningless in practice. This is logical and in no sense a "fluke".
      Last edited by Nick Cox; 04 Apr 2014, 04:07.
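The contrast between the two benchmarks can be illustrated with a minimal sketch. The data here are hypothetical (36 made-up observations with a large mean and very little seasonal signal), and the model contains only month dummies, so the no-constant OLS fit reduces to within-month means. The same residuals are compared against a guess of the mean (centered) and a guess of zero (uncentered):

```python
# Hypothetical data: 3 observations per month, a large mean, tiny noise.
months = [m for m in range(1, 13) for _ in range(3)]
y = [100.0 + 0.5 * ((7 * i) % 5 - 2) for i in range(len(months))]

# With only month dummies and no constant, the OLS fitted value for each
# observation is simply the mean of y within that observation's month.
month_mean = {
    m: sum(v for v, mm in zip(y, months) if mm == m) / months.count(m)
    for m in set(months)
}
fitted = [month_mean[m] for m in months]

y_bar = sum(y) / len(y)
rss = sum((v - f) ** 2 for v, f in zip(y, fitted))
tss_centered = sum((v - y_bar) ** 2 for v in y)  # benchmark: guess the mean
tss_uncentered = sum(v ** 2 for v in y)          # benchmark: guess zero

r2_centered = 1 - rss / tss_centered
r2_uncentered = 1 - rss / tss_uncentered

print(f"centered R^2   = {r2_centered:.4f}")
print(f"uncentered R^2 = {r2_uncentered:.4f}")
```

Because y sits far from zero in this made-up example, the uncentered figure comes out very close to 1 even though the month means explain little of the variation around the overall mean.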


      • #4
        Please read the FAQ and post with your real name. Also, the answer to this question can be found in many regression textbooks.

        Originally posted by ineedhelp
        If I keep the constant, I have to omit one of the monthly dummies to avoid collinearity, and I get a load of high p-values and an R-squared below 0.05 (plus the constants are all over the place depending on which month I remove).
        Since you are including a constant, your R-squared is probably right, in the sense that it is correctly calculated and can be interpreted in the usual way. The coefficients on the months will vary depending on which month you omit, since the omitted month is the baseline you are comparing against. For example, if you omit January, the coefficient on the February dummy tells you the difference in means between February and January. If you omit March, the coefficient on the February dummy tells you the difference in means between February and March, and so on.
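A small sketch of the baseline point, under two loudly stated assumptions: the data are invented, and the model contains only a constant and month dummies (no X), so each coefficient is exactly a difference of month means. The helper `dummy_coefficients` is hypothetical and written only for this illustration:

```python
from statistics import mean

# Hypothetical data: two observations each for January (1), February (2), March (3).
months = [1, 1, 2, 2, 3, 3]
y = [10.0, 12.0, 20.0, 22.0, 5.0, 7.0]

def dummy_coefficients(y, months, baseline):
    """Constant and dummy coefficients for y on a constant plus month dummies,
    omitting `baseline`. Valid only for a dummies-only model, where the constant
    is the baseline month's mean and each coefficient is a difference in means."""
    means = {m: mean(v for v, mm in zip(y, months) if mm == m)
             for m in sorted(set(months))}
    const = means[baseline]
    coefs = {m: means[m] - const for m in means if m != baseline}
    return const, coefs

const_jan, coefs_jan = dummy_coefficients(y, months, baseline=1)
const_mar, coefs_mar = dummy_coefficients(y, months, baseline=3)

print(coefs_jan[2])  # February minus January: 10.0
print(coefs_mar[2])  # February minus March: 15.0

# The fitted February value is the same under either baseline.
print(const_jan + coefs_jan[2], const_mar + coefs_mar[2])  # 21.0 21.0
```

The February coefficient changes from 10.0 to 15.0 purely because the comparison month changed; the fitted values do not move.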

        Originally posted by ineedhelp
        1- Does the fact that my numbers vary so wildly without the constant but not with it mean that the high R-squared is a fluke?
        R squared in models without a constant is meaningless. See http://www.ats.ucla.edu/stat/mult_pk...noconstant.htm

        Jorge Eduardo Pérez Pérez
        www.jorgeperezperez.com


        • #5
          Everybody's told you about the name thing, so I won't repeat that. Running your model with no constant and all the dummies should be equivalent to running the model with a constant and 11 dummies, and thus produce the same \(R^2\) regardless of which dummy you drop. That is, provided all dummies cover the same observations, i.e. no missing observations or the same missing observations. The difference in the dummy coefficients depending on which one is dropped has already been explained.
          Alfonso Sanchez-Penalver
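The equivalence of the two specifications can be checked numerically. The sketch below uses invented data with three months and one covariate, and a hypothetical helper `ols_fitted` that solves the normal equations by Gauss-Jordan elimination. It compares the fit from "all dummies plus x, no constant" with "constant plus all dummies but one, plus x". Both designs span the same column space, so the fitted values and the residual sum of squares should agree; whether a reported \(R^2\) also agrees depends on which total sum of squares the software uses, as discussed in #3 and #4.

```python
def ols_fitted(X, y):
    """Fitted values from OLS, solving the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    return [sum(b * v for b, v in zip(beta, row)) for row in X]

# Hypothetical data: three months, three observations each, one covariate x.
months = [1, 1, 1, 2, 2, 2, 3, 3, 3]
x = [1.0, 2.0, 4.0, 3.0, 5.0, 6.0, 2.0, 7.0, 9.0]
y = [3.1, 4.0, 5.2, 8.3, 9.1, 9.9, 1.0, 2.7, 3.5]

def dummy(m, k):
    return 1.0 if m == k else 0.0

# Design A: all three dummies plus x, no constant.
X_a = [[dummy(m, 1), dummy(m, 2), dummy(m, 3), xi] for m, xi in zip(months, x)]
# Design B: constant, dummies for months 2 and 3 (month 1 dropped), plus x.
X_b = [[1.0, dummy(m, 2), dummy(m, 3), xi] for m, xi in zip(months, x)]

fit_a = ols_fitted(X_a, y)
fit_b = ols_fitted(X_b, y)
rss_a = sum((v - f) ** 2 for v, f in zip(y, fit_a))
rss_b = sum((v - f) ** 2 for v, f in zip(y, fit_b))

max_gap = max(abs(a - b) for a, b in zip(fit_a, fit_b))
print(f"largest difference in fitted values: {max_gap:.2e}")
print(f"RSS (no constant) = {rss_a:.6f}, RSS (constant) = {rss_b:.6f}")
```

The normal-equations solver is chosen only to keep the sketch dependency-free; a QR-based least-squares routine would be the usual choice for real data.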
