

  • #16
    OK, thank you all again.

    DepVar is the log of the ratio of the employed to the unemployed rate of native labour:

    DepVar = ln(y/(1 - y)) = ln(N/(P - N))

    The employment rate of native workers is defined as: y = N/P

    N = native labour (employed)
    P = total native workforce
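
    Spelling out the algebra, with y = N/P as defined in this post (a worked step, nothing beyond those definitions):

    ```latex
    \ln\frac{y}{1-y}
      = \ln\frac{N/P}{1 - N/P}
      = \ln\frac{N/P}{(P-N)/P}
      = \ln\frac{N}{P-N}
    ```

    So DepVar is the log odds (logit) of native employment, and P - N counts the native workforce that is not employed.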

    I tried some of your suggestions and they help to reduce the coefficients, but they are still too big.

    And I'm not sure how to interpret the results.

    Best regards,


    • #17
      Nick Cox, can you tell me the code for the quantile-normal plots you produced?

      I know the command qnorm, but I cannot make the output look like your plots.

      Thank you.


      • #18
        I only used the summarize results for 9 percentiles. So, I was doing what I could with what I could see. The code was very ad hoc and I didn't keep it.

        You can do better with your raw data using qnorm and then graph combine.
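
        A minimal sketch of that route, assuming the auto dataset shipped with Stata stands in for the raw data (the variables price, mpg and weight and the graph names q_* are illustrative choices, not from this thread):

        ```stata
        sysuse auto, clear

        * One quantile-normal plot per variable, held in memory under a name
        * and not drawn until they are combined
        foreach v of varlist price mpg weight {
            qnorm `v', name(q_`v', replace) nodraw
        }

        * Put the stored plots side by side in a single row
        graph combine q_price q_mpg q_weight, rows(1)
        ```

        Each panel keeps its own axis scales here, which is usually what is wanted when the variables are in different units.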

        See for an overview of quantile plotting in Stata.


        • #19
          Here is a relatively painless way to get quantile-normal plots side by side. You need to install multqplot and indeed qplot from the Stata Journal website first.

          sysuse auto, clear
          multqplot price mpg weight, trscale(invnormal(@)) xla(-2/2) xtitle("") combine(row(1) b1title(standard normal deviate) l1title("extremes, quartiles and median are labelled"))
          [Image: multqplot3.png]

          As Yudi Pawitan emphasised (reference in the presentation linked in #18), a normal quantile plot shows much about a distribution, even if that distribution is not remotely close to normal and the idea never even entered your head.


          • #20
            Ok great. Thank you.


            • #21
              Dear all,

              I am encountering a similar issue. I want to log one of my independent variables, which is very skewed, but this variable contains many zeros that are important for my analysis.
              I like the option of taking the square root instead of the logarithm, as suggested by @Mike Lacy. Would you have a reference for this practice?

              Also, an underlying question is: should I worry a lot that my independent variable is skewed? Or is it mainly a concern if the dependent variable is skewed?

              Thanks a lot in advance for your help!
              Best regards,

              [I use Stata 16 for Mac]


              • #22

                There is quite a big difference between transforming a response or outcome and transforming a predictor with a logarithm or similar transformation.

                With a response or outcome there is often (many would say almost always) scope not to transform the response, but to use a model with (in generalized linear model jargon) a logarithmic link. That approach has many advantages. For one, a model that is y = exp(Xb) is compatible with some zero or negative outcomes, because the specification is about the mean function, not all the data. Classically, a Poisson regression model certainly includes the idea that a count could be zero. Other distributions are also compatible with a logarithmic link.
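
                As a rough illustration of the log-link idea, here is a sketch using the auto dataset shipped with Stata as a stand-in. The outcome price and the predictors mpg and weight are assumptions for illustration only; substitute your own variables.

                ```stata
                sysuse auto, clear

                * Poisson regression: models the mean as E(price | x) = exp(xb),
                * so the outcome itself is never logged and zeros would be allowed.
                * Robust standard errors, as the outcome is not a Poisson count.
                poisson price mpg weight, vce(robust)

                * The same mean specification written as a GLM with a log link;
                * the coefficient estimates should agree with those from poisson
                glm price mpg weight, family(poisson) link(log) vce(robust)
                ```

                Other families can be tried in glm with the same link(log), which is the sense in which the link and the distribution are separate choices.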

                As I understand it, asinh and neglog could be link functions for a GLM, as they are monotonic and differentiable, but I have not seen any work under either heading.
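
                For reference, the two functions as they are commonly defined (the neglog form below is the usual sign-preserving definition, stated here as an assumption because no formula is given in this thread):

                ```latex
                \operatorname{neglog}(x) = \operatorname{sign}(x)\,\ln\bigl(1 + |x|\bigr),
                \qquad
                \operatorname{asinh}(x) = \ln\bigl(x + \sqrt{x^{2} + 1}\bigr)
                ```

                Both are defined for zero and negative values and both map 0 to 0. For x >= 0, neglog(x) reduces to ln(1 + x).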

                With a predictor, and contrary to an astonishingly widespread myth, there is no general presumption in modelling that a predictor follows any particular marginal distribution. (Against the particular myth that predictors should be normally distributed, it may be noted that indicator predictors with values say 0 and 1 fail spectacularly to meet that idea.) In practice there remains the question of whether b_j x_j or b_j T(x_j) for some transformation T() is a better idea as a way of capturing a relationship that may be nonlinear. Or it may help a little to tame skewness or subdue outliers in a predictor. Or "theory" may incline the researcher to taking logarithms anyway.
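
                One way to look at the b_j x_j versus b_j T(x_j) question in practice is simply to fit both and compare. This is a sketch only, again on the auto dataset; mpg as outcome, weight as predictor and the new variable names are illustrative assumptions.

                ```stata
                sysuse auto, clear

                * Candidate transformations T() of the predictor
                generate double sqrt_weight = sqrt(weight)
                generate double ln_weight   = ln(weight)

                * Same outcome, predictor entered raw and under each transformation
                regress mpg weight
                regress mpg sqrt_weight
                regress mpg ln_weight
                ```

                Comparing residual plots across the three fits is usually more informative than comparing R-squared alone. Note that sqrt() is defined at zero whereas ln() is not, which matters for a predictor with many zeros.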



                • #23
                  Dear Nick,

                  Thank you very much! This is very helpful. I also found that I could use log(x+1) instead of log(x), and I am considering that as well.

                  Best regards,


                  • #24
                    That's just a special case of neglog. as defined in

