  • Inverse hyperbolic sine (IHS) transformation allowing for non-normality

    Dear user,

    I want to run a basic Tobit regression. I have a continuous outcome variable, medical cost, which I have transformed using the 'asinh' function in Stata; let's call this IHS cost. I was wondering if there is a user-written command that would let me estimate the following equation, similar to equation 11 in Brown, Greene, Harris & Taylor (2015); please see the link to the paper (https://ideas.repec.org/a/eee/ecmode...cp228-236.html).
    Here y is the IHS cost. In particular, I would like to estimate the highlighted gamma parameter.
    Is there a way I can compute this?

    Thanks for your help!

    Kind Regards,
    Aarushi

  • #2
    Dear Aarushi Dhingra,

    I cannot see the gamma parameter you refer to, but my advice is that you ignore the Tobit with IHS transformation and just use Poisson regression with robust standard errors. That is a standard way of modelling medical costs, and it is much more robust than what you are trying to do.

    Best wishes,

    Joao



    • #3
      Dear Joao Santos Silva

      Sorry about that, I tried to upload a picture of the transformation but it did not seem to go through. Please see the attached picture again.

      I was under the impression that Poisson regression can only be used for binary or count data, not for continuous data. Perhaps I am wrong?

      What I am trying to do is similar to this post on the Stata forum: https://www.statalist.org/forums/for...rmation-by-mle

      Not sure if it will be the same for Tobit.
      [Attached image: equation11, brown (2015).PNG]


      Thanks for your help!

      Aarushi



      • #4
        I was under the impression that Poisson regression can only be used for binary or count data, not for continuous data. Perhaps I am wrong?
        You are, indeed, wrong. While nobody is conspiring to limit knowledge of this to some select elite, it is one of the "best kept secrets" of statistics and is often the ideal solution for dealing with highly skew data. It is a particularly good solution to the problem that often arises whereby one is tempted to do a log transform of the outcome variable, but is deterred from doing so by the presence of zeroes or negative numbers.



        • #5
          Dear Aarushi Dhingra,

          As Clyde noted above, Poisson regression can be used for any kind of non-negative data with no upper limit, and its use to model medical expenditures was pioneered by John Mullahy, who often contributes to this forum. The big advantage of Poisson regression over using the IHS transformation is that with Poisson regression the interpretation of the estimates is very simple, whereas with the IHS transformation the interpretation of the results is much less intuitive. Poisson regression is also much better if you want to use the results for prediction. Therefore, I reiterate that my recommendation is that you use Poisson regression with robust standard errors.
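
          A sketch of the algebra behind the interpretation point, under the assumption that the conditional mean is specified as E[y|x]=exp(xb) (the symbols below are illustrative notation, not taken from the paper cited in #1). Holding the other regressors fixed,

          \[
          \frac{\partial \ln E[y \mid x]}{\partial x_j} = \beta_j
          \quad\Longrightarrow\quad
          \frac{E[y \mid x_j + 1,\, x_{-j}]}{E[y \mid x_j,\, x_{-j}]} = \exp(\beta_j),
          \]

          so a one-unit increase in x_j multiplies the expected cost by exp(b_j). With the IHS transformation, the coefficients describe E[asinh(y)|x] instead, where

          \[
          \operatorname{asinh}(y) = \ln\!\left(y + \sqrt{y^{2} + 1}\right),
          \qquad
          E[y \mid x] = E\!\left[\sinh(\operatorname{asinh}(y)) \mid x\right] \;\ge\; \sinh\!\left(E[\operatorname{asinh}(y) \mid x]\right)
          \quad \text{for } y \ge 0 .
          \]

          The inequality follows from Jensen's inequality, because sinh is convex on the non-negative half-line. Simply inverting the transformation therefore tends to understate the expected cost, and getting back to E[y|x] requires further assumptions about the error distribution.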

          Best wishes,

          Joao



          • #6
            I would just add to Clyde's and Joao's helpful comments in #4 and #5 that it may be helpful to reflect on what feature(s) of your data you wish to model. While this is your decision and yours alone, in many instances what is desired is a robust estimate of the conditional mean of the outcome, E[y|x], and nothing more than that. (This is a point that has been emphasized in a number of contexts by Jeff Wooldridge.)

            If so, then one key feature of E[y|x] when dealing with non-negative outcomes (like medical costs) is that it must be positive. The most straightforward (though certainly not the only) way to enforce this is to specify E[y|x]=exp(xb), and a straightforward and robust way to estimate such a model is to use Poisson regression. Importantly, this avoids any transformation of the outcome variable.
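
            A brief sketch of where the robustness comes from (the notation b, x_i, y_i and n is assumed here for illustration): the Poisson pseudo-maximum-likelihood estimator solves the first-order conditions

            \[
            \sum_{i=1}^{n} \left( y_i - \exp(x_i b) \right) x_i = 0 ,
            \]

            which involve the data only through the residuals y_i - exp(x_i b). If E[y|x]=exp(xb) is correctly specified, each term has conditional mean zero whether or not y is a count and whether or not it is Poisson distributed. Only the conditional mean needs to be right for the estimates of b to be consistent. The Poisson variance assumption (variance equal to the mean) will typically fail for cost data, which is why robust standard errors are recommended.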

            In some disciplines working with Generalized Linear Models (GLMs) is perhaps more familiar. If so, then specifying a log-link in the GLM context corresponds to the specification E[y|x]=exp(xb). If in addition one specifies a poisson family then this is equivalent to Poisson regression, i.e.
            Code:
            poisson y x1 x2, vce(robust)
            is equivalent to
            Code:
            glm y x1 x2, fam(poisson) link(log) vce(robust)
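
            To see the equivalence on a continuous outcome, here is a minimal sketch on simulated data. The variable names (cost, x1, x2), the seed, and the data-generating process are all assumptions made up for illustration; they do not come from the original question.
            Code:
            * Sketch only: simulated data with hypothetical variable names
            clear
            set seed 12345
            set obs 1000
            generate x1 = rnormal()
            generate x2 = runiform()
            * Continuous, right-skewed cost; rgamma(2, 0.5) has mean 1,
            * so E[cost|x] = exp(0.5 + 0.3*x1 - 0.2*x2) at this point
            generate cost = exp(0.5 + 0.3*x1 - 0.2*x2) * rgamma(2, 0.5)
            * Set about 20% of costs to exactly zero, independently of x;
            * this scales the mean by 0.8 and so shifts only the intercept
            replace cost = 0 if runiform() < 0.2
            * Fit the same model both ways and compare side by side
            poisson cost x1 x2, vce(robust)
            estimates store pois
            glm cost x1 x2, family(poisson) link(log) vce(robust)
            estimates store glmfit
            estimates table pois glmfit, b(%9.4f) se(%9.4f)
            The two sets of coefficients should agree up to convergence tolerance, and the slopes should be close to 0.3 and -0.2. Stata will note that the dependent variable has non-integer values; that is expected here.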



            • #7
              Thanks for the advice Joao, Clyde, and John. Much appreciated.

              Kind regards,
              Aarushi



              • #8
                Originally posted by Clyde Schechter
                You are, indeed, wrong. While nobody is conspiring to limit knowledge of this to some select elite, it is one of the "best kept secrets" of statistics and is often the ideal solution for dealing with highly skew data. It is a particularly good solution to the problem that often arises whereby one is tempted to do a log transform of the outcome variable, but is deterred from doing so by the presence of zeroes or negative numbers.
                Dear Clyde Schechter

                This is a rather old thread, but I'm going to shoot my shot anyway: do you have any literature I can read about using Poisson regression for such (continuous) data? The general handbooks I've been referring to, such as Wooldridge's Econometric Analysis of Cross Section and Panel Data and Verbeek's A Guide to Modern Econometrics, only introduce Poisson (and negative binomial) regression for count data.

                Thanks!



                • #9
                  https://blog.stata.com/2011/08/22/use-poisson-rather-than-regress-tell-a-friend/ is relevant. In my reading it's still true that this point is well known but not widely explained.



                  • #10
                    Dear Yodefia Rahmad,

                    Standard references for this include

                    Willard G Manning, John Mullahy, 2001, Estimating log models: to transform or not to transform?, Journal of Health Economics, 20(4), pp. 461-494

                    and

                    J.M.C. Santos Silva and Silvana Tenreyro, 2006, The Log of Gravity, The Review of Economics and Statistics, 88(4), pp. 641-658.

                    The 8th edition of Wooldridge's introductory textbook also covers the topic.

                    Best wishes,

                    Joao

