  • Problem with log model: the number of observations got smaller after applying logs

    Dear all,

    I'm currently working with panel data. I used the xtreg, robust command.
    The first model is a linear regression, with almost 200 observations.
    In the second model I used a log-log regression with one exception. One of the independent variables is the GDP growth rate. When I took the log of it and tried to run the regression, Stata told me: insufficient observations
    After running the regression in logs, the number of observations dropped to fewer than 30. I don't understand the reason for this at all.
    Could you give me any hints?

  • #2
    GDP growth rate can be zero or negative. \(f(x)=\log x\) is only defined for \(x > 0\). So a log transformation is not wise for such a variable, as all zero and negative values are turned into missing values.
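
A minimal sketch of this point in Stata, on made-up values. The variable names (id, year, gdp_growth, ln_gdp_growth) and all numbers are hypothetical and only stand in for the poster's data, which was not shared.

```stata
* Toy panel: values are invented purely for illustration
clear
input id year gdp_growth
1 2019  2.1
1 2020 -3.4
1 2021  0.0
2 2019  1.5
2 2020 -0.7
2 2021  4.2
end

* ln() of zero or of a negative number returns missing (.)
generate ln_gdp_growth = ln(gdp_growth)

* Observations a model using the logged variable would lose
count if gdp_growth <= 0
count if missing(ln_gdp_growth)

list, sepby(id)
```

Both counts agree here because the raw variable has no missing values of its own. Any estimation command that uses ln_gdp_growth will silently drop the rows where it is missing.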

    • #3
      One guess: some of your values that you logged were zero or negative, which sounds all too likely with a growth rate, so log of those is missing and the observations will be omitted from the regression.

      It's hard to believe that's the whole story, but it's hard to say without seeing the data or any results. For example, a summarize of all the variables would be illuminating.

      200 observations might not be too big for dataex -- you need to override the default count -- unless you have lots of predictors (in which case you're possibly trying a too complicated model).

      Short summary: Look for missing values. Stata is almost always right in cases like this, and not believing it doesn't make a difference.
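
One way the check suggested above might look, sketched on invented data. The variable names (risk, cap_ratio, gdp_growth) are assumptions for illustration, not the poster's actual variables, and the commented dataex line assumes the count() option is what overrides the default number of observations listed.

```stata
* Toy data with hypothetical variable names; replace with your own
clear
input id year risk cap_ratio gdp_growth
1 2019 0.20 0.12  2.1
1 2020 0.35 0.10 -3.4
1 2021 0.30 0.11  0.0
2 2019 0.10 0.15  1.5
2 2020 0.15 0.14 -0.7
2 2021 0.12 0.16  4.2
end

* A minimum of zero or below flags a variable whose log will be missing somewhere
summarize risk cap_ratio gdp_growth

* Count the non-positive values variable by variable
foreach v of varlist risk cap_ratio gdp_growth {
    quietly count if `v' <= 0
    display "`v': " r(N) " non-positive values"
}

* If the data could be shared (assumed syntax for raising the default limit):
* dataex risk cap_ratio gdp_growth, count(200)
```

Running the same loop over the logged variables with missing() in place of the inequality would show directly how many observations each one removes from the estimation sample.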
      Last edited by Nick Cox; 06 May 2022, 06:44.

      • #4
        Thank you both for your answers. I cannot share my data since it is sensitive. I've noticed that other control variables are sometimes negative. I wanted to use logs so that the results would be easy to interpret.

        • #5
          You can still tell us whether you tried to take logarithms of variables with zero or negative values and whether that satisfies you as an explanation of the problem.

          • #6
            That was exactly the case. Thank you once more.
            I have 6 independent variables which take negative and zero values. Now I have to think about the appropriate model. My dependent variable is an indicator whose absolute value ranges from 0 to 1, indicating whether the bankruptcy risk is high. My variable of interest is the ratio of equity to risky assets.
            I wanted to use a log-log model, so that, for example, a 1% increase in the capital ratio would decrease bankruptcy risk by b%. However, I now know that wasn't a good idea. Interpreting the coefficients from a linear model is a real struggle for me in this case.

            • #7
              It's hard to advise in abstraction, but e.g. crossplot from SSC would give you a first stab at whether you have strong nonlinearity between your outcome and any of the covariates or predictors. (I can't willingly promote the term "independent variables".)

              Outcomes between 0 and 1 might lend themselves to logarithmic transformation if no value is exactly zero, that variable is highly skew in distribution, and that simplifies relationships with the predictors.
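
A rough sketch of how the first two conditions might be checked before logging an outcome, again on invented values. The names risk, cap_ratio and ln_risk are hypothetical, and regress on a toy cross-section stands in for whatever panel estimator is actually appropriate.

```stata
* Toy data; values and names are hypothetical
clear
input risk cap_ratio
0.02 0.30
0.03 0.25
0.05 0.22
0.08 0.18
0.15 0.15
0.40 0.10
0.90 0.06
end

* Condition 1: no value of the outcome is zero (or negative)
count if risk <= 0

* Condition 2: is the outcome strongly skewed?
summarize risk, detail
display "skewness = " r(skewness)

* Only if both checks look reasonable: log the outcome,
* leaving the predictors on their original scale
generate ln_risk = ln(risk)
regress ln_risk cap_ratio
```

With only the outcome logged, a coefficient b on a predictor reads roughly as a 100*b percent change in the outcome per one-unit change in that predictor, for small b. Whether the transformation also simplifies the relationship with the predictors is best judged from plots of the outcome against each predictor before and after logging.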
