  • using logs in a regression without creating a new variable for log

    Good evening,

    I was wondering if it is possible to use logs in a regression without creating a new variable for log.
    If not, what's the best way to log-transform many variables at once? For example, assume my variables are A, B, C, D1, D2, ..., D50, E, F, and I want to log D1 through D50.

    Thank you,
    Stan

  • #2
    There is no way to use a log-transformed version of a variable in Stata without creating a new log-variable. To log-transform 50 variables:
    Code:
    forvalues i = 1/50 {
        gen log_D`i' = log(D`i')
    }



    • #3
      Thank you, Clyde, for your quick response.
      I actually inadvertently oversimplified my problem. My variables are more like A, B, C, Ddsfgs, Dkgfghgj, Deyrytr, ... E, F. In other words, not intrinsically ordered. What is the best solution in this case? Perhaps using the * operator somehow?



      • #4
        I think you can get what you want via -ds- and -foreach-. E.g.,

        Code:
        clear
        input a b c da db dc e
        1 2 3 4 5 6 7
        end
        quietly ds d*
        local vlist = r(varlist)
        foreach v of local vlist {
            generate log`v' = ln(`v')
        }
        list, clean noobs
        Output from -list- command:
        Code:
        . list, clean noobs
        
            a   b   c   da   db   dc   e      logda      logdb      logdc  
            1   2   3    4    5    6   7   1.386294   1.609438   1.791759
        For the variable names you showed, I suppose the code would look something like this:

        Code:
        quietly ds D*
        local vlist = r(varlist)
        foreach v of local vlist {
            generate log_`v' = ln(`v')
        }
        Last edited by Bruce Weaver; 08 Jan 2025, 20:36. Reason: Added my guess at the code for Stan's variable names.
        --
        Bruce Weaver
        Email: [email protected]
        Version: Stata/MP 18.5 (Windows)



        • #5
          That worked! Thank you very much!



          • #6
            It is even easier than Bruce Weaver showed.


            Code:
            foreach v of var D* {
                gen log_`v' = log(`v')
            }
            Be clear that log() = ln() while log10() differs from both. The more mathematics you and your peers know, the more natural ln() will seem to be. If at some point you are also using exp(), using log10() will just lead to complications.
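
            A quick way to see the point about log(), ln() and log10() is to evaluate them side by side. This is only a minimal sketch; the value 100 is an arbitrary choice for illustration.

            ```stata
            * log() and ln() are the same function (natural log) in Stata
            display log(100)
            display ln(100)
            assert log(100) == ln(100)

            * log10() is base 10, so it returns a different number
            display log10(100)

            * exp() undoes ln(); undoing log10() needs 10^x instead
            display exp(ln(100))
            display 10^log10(100)
            ```

            The last two lines show the complication mentioned above: once log10() is in play, exp() is no longer the inverse you need.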

            There is a celebrated put-down in Paul Halmos' autobiography that no real mathematician would understand log() to be anything but natural log; hence the notation ln() (which goes back to the 19th century) is reserved for dirty-handed aliens. As someone who admires mathematics immensely but merely uses it at a low level, I tend to use ln() to remind myself and inform my readers that I don't mean log base 10 -- and equally, on the odd occasions when I do use log10(), that's necessarily explicit. Log base 10 can be useful, say, for quick explicit axis labels. Numeracy includes being able to read log base 10 values of 3 and 6 as meaning a thousand and a million, and so forth, although I would be impressed to encounter anyone who could work equally comfortably in their head with exp(3) or exp(6) or its powers other than 1.

            On a different note, although ds remains useful, I consider findname from the Stata Journal to be a better command. Reasons to use ds include preferring official commands, being (very) familiar with its syntax, and finding that it does what you want, so why use anything else. Reasons to use findname are that you too find part of the syntax of ds rather awkward (and that part can be blamed on me, although StataCorp folded it back into the official command) or that you need its extra functionality.

            The history of findname is a little long, given various fixes and additions, but all anyone need do is look at the latest version:


            Code:
            . search findname, sj
            
            Search of official help files, FAQs, Examples, and Stata Journals
            
            SJ-23-4 dm0048_5  . . . . . . . . . . . . . . . . Software update for findname
                    (help findname if installed)  . . . . . . . . . . . . . . .  N. J. Cox
                    Q4/23   SJ 23(4):1096
                    options vallabelcountdef() and vallabelcountuse() have been
                    extended to allow zero (0) as an argument
            
            SJ-20-2 dm0048_4  . . . . . . . . . . . . . . . . Software update for findname
                    (help findname if installed)  . . . . . . . . . . . . . . .  N. J. Cox
                    Q2/20   SJ 20(2):504
                    new options include columns()
            
            SJ-15-2 dm0048_3  . . . . . . . . . . . . . . . . Software update for findname
                    (help findname if installed)  . . . . . . . . . . . . . . .  N. J. Cox
                    Q2/15   SJ 15(2):605--606
                    updated to be able to find strL variables
            
            SJ-12-1 dm0048_2  . . . . . . . . . . . . . . . . Software update for findname
                    (help findname if installed)  . . . . . . . . . . . . . . .  N. J. Cox
                    Q1/12   SJ 12(1):167
                    correction for handling embedded double quote characters
            
            SJ-10-4 dm0048_1  . . . . . . . . . . . . . . . . Software update for findname
                    (help findname if installed)  . . . . . . . . . . . . . . .  N. J. Cox
                    Q4/10   SJ 10(4):691
                    update for not option
            
            SJ-10-2 dm0048  . . . . . . . . . . . . . .  Speaking Stata: Finding variables
                    (help findname if installed)  . . . . . . . . . . . . . . .  N. J. Cox
                    Q2/10   SJ 10(2):281--296
                    produces a list of variable names showing which variables
                    have specific properties, such as being of string type, or
                    having value labels attached, or having a date format



            • #7
              Thank you, Nick Cox. I did not know that foreach v of var D* would work. That is much easier indeed.
              --
              Bruce Weaver
              Email: [email protected]
              Version: Stata/MP 18.5 (Windows)



              • #8
                The –gmm– procedure allows one to estimate using transformed LHS variables without creating the transformed variables.
                Code:
                sysuse auto
                
                gen lprice=log(price)
                
                reg lprice mpg foreign, vce(robust)
                
                qui gmm (log(price) - {xb: mpg foreign _cons}), vce(robust) instr(mpg foreign) igmm
                gmm
                which yields
                Code:
                . reg lprice mpg foreign, vce(robust)
                
                Linear regression                               Number of obs     =         74
                                                                F(2, 71)          =      16.24
                                                                Prob > F          =     0.0000
                                                                R-squared         =     0.3340
                                                                Root MSE          =     .32448
                
                ------------------------------------------------------------------------------
                             |               Robust
                      lprice | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
                -------------+----------------------------------------------------------------
                         mpg |  -.0421151   .0079262    -5.31   0.000    -.0579195   -.0263107
                     foreign |   .2824445   .0789464     3.58   0.001     .1250298    .4398591
                       _cons |     9.4536   .1733665    54.53   0.000     9.107917    9.799283
                ------------------------------------------------------------------------------
                
                .
                . qui gmm (log(price) - {xb: mpg foreign _cons}), vce(robust) instr(mpg foreign) igmm
                
                . gmm
                
                GMM estimation
                
                Number of parameters =   3
                Number of moments    =   3
                Initial weight matrix: Unadjusted                 Number of obs   =         74
                GMM weight matrix:     Robust
                
                ------------------------------------------------------------------------------
                             |               Robust
                             | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
                -------------+----------------------------------------------------------------
                         mpg |  -.0421151   .0077639    -5.42   0.000     -.057332   -.0268982
                     foreign |   .2824445   .0773296     3.65   0.000     .1308813    .4340077
                       _cons |     9.4536    .169816    55.67   0.000     9.120767    9.786433
                ------------------------------------------------------------------------------
                Instruments for equation 1: mpg foreign _cons
                (I suspect the difference in std. errs. is due to different degrees-of-freedom corrections.)
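
                That suspicion can be checked with one line of arithmetic. As a sketch, assume -regress, vce(robust)- scales the variance by n/(n - k), with n = 74 observations and k = 3 parameters, while -gmm- as called here applies no such factor. Multiplying the -gmm- standard error for mpg by sqrt(74/71) should then reproduce the -regress- one:

                ```stata
                * gmm std. err. for mpg, rescaled by the assumed small-sample factor
                display .0077639 * sqrt(74/(74 - 3))
                ```

                This comes to about .0079262, the -regress- value, and the same factor takes .0773296 to about .0789464 for foreign, so the two sets of results agree up to that correction.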

                It's not obvious that such use of –gmm– yields benefits beyond those available from simply creating the transformed variables.

                I suppose it could be beneficial in a situation where one wanted to explore a variety of LHS-variable transformations, e.g.
                Code:
                forval j=1/5 {
                 gmm (price^(1/`j') - {xb: mpg foreign _cons}), vce(robust) instr(mpg foreign) igmm
                }
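
                For comparison, the same exploration can be sketched with temporary variables and -regress-. This is only an illustration of the "simply create the transformed variables" route; the names y and j are arbitrary, and the standard errors will differ slightly from the -gmm- ones, as in the output above.

                ```stata
                sysuse auto, clear
                forvalues j = 1/5 {
                    * temporary variable holding the j-th root of price
                    tempvar y
                    generate double `y' = price^(1/`j')
                    regress `y' mpg foreign, vce(robust)
                    * drop it so nothing extra accumulates in the dataset
                    drop `y'
                }
                ```

                One cost of this route is that the regression header shows the temporary variable's internal name rather than the transformation, whereas the -gmm- call keeps the transformation visible in the command itself.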

