
  • Using weights in regression

    Hi everyone,
    I want to run a weighted regression in Stata. I already know which command to use: reg y v1 v2 v3 [pweight=weights]. What I would like to find out is how exactly Stata uses the weights and how it weights the individual observations.
    In the Stata documentation I read the attached description.

    I tried to reproduce the regression manually in Stata by first multiplying all variables of observation i by sqrt(w_i) and then running a multiple linear regression. However, I don't get the same results as when I run the regression with [pweight=weights].
    Does anyone know why my calculation is wrong and how Stata applies the weights to the observations?

    Thank you for your help!
    Last edited by Yolanda Schmidt; 20 Jul 2020, 05:35.

  • #2
    You get different results because pweights and aweights are different. The picture you posted describes aweight, whereas your post uses pweight.

    Here is an explanation of what is going on

    https://www.stata.com/support/faqs/s...ry-statistics/



    • #3
      Here is another tutorial on weights in Stata. In short, it is a bit of a headache to figure out (1) what you need to do (how to weight) and (2) what Stata does, and (3) it all depends on the estimator.

      https://www.parisschoolofeconomics.e...n-stata(1).pdf



      • #4
        Here is an example showing the equivalence in #1:

        Code:
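        * Load the example data (clear replaces any data in memory) and keep 10 observations
        sysuse auto, clear
        keep in 1/10
        keep mpg weight
        * Illustrative weights: 1, 4, 9, ..., 100
        gen w=_n^2
        l, sep(10)
        * Weighted regression using analytic weights
        regress mpg weight [aweight=w]
        * Manual equivalent: scale the constant and every variable by sqrt(w)
        gen wcons= sqrt(w)
        gen wmpg= sqrt(w)*mpg
        gen wweight= sqrt(w)*weight
        * Suppress Stata's own constant; wcons takes its place
        regress wmpg wweight wcons, nocons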
        Results:

        Code:
        . l, sep(10)
        
             +--------------------+
             | mpg   weight     w |
             |--------------------|
          1. |  22    2,930     1 |
          2. |  17    3,350     4 |
          3. |  22    2,640     9 |
          4. |  20    3,250    16 |
          5. |  15    4,080    25 |
          6. |  18    3,670    36 |
          7. |  26    2,230    49 |
          8. |  20    3,280    64 |
          9. |  16    3,880    81 |
         10. |  19    3,400   100 |
             +--------------------+
        
        . regress mpg weight [aweight=w]
        (sum of wgt is 385)
        
              Source |       SS           df       MS      Number of obs   =        10
        -------------+----------------------------------   F(1, 8)         =    477.32
               Model |  95.5590801         1  95.5590801   Prob > F        =    0.0000
            Residual |  1.60158778         8  .200198473   R-squared       =    0.9835
        -------------+----------------------------------   Adj R-squared   =    0.9815
               Total |  97.1606679         9  10.7956298   Root MSE        =    .44744
        
        ------------------------------------------------------------------------------
                 mpg |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
              weight |    -.00588   .0002691   -21.85   0.000    -.0065007   -.0052594
               _cons |   39.02116   .9195017    42.44   0.000     36.90078    41.14153
        ------------------------------------------------------------------------------
        
        . regress wmpg wweight wcons, nocons
        
              Source |       SS           df       MS      Number of obs   =        10
        -------------+----------------------------------   F(2, 8)         =   9418.14
               Model |  145183.339         2  72591.6694   Prob > F        =    0.0000
            Residual |  61.6611296         8   7.7076412   R-squared       =    0.9996
        -------------+----------------------------------   Adj R-squared   =    0.9995
               Total |      145245        10     14524.5   Root MSE        =    2.7763
        
        ------------------------------------------------------------------------------
                wmpg |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
             wweight |    -.00588   .0002691   -21.85   0.000    -.0065007   -.0052594
               wcons |   39.02116   .9195017    42.44   0.000     36.90078    41.14153
        ------------------------------------------------------------------------------
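
        The reason the two regressions above agree can be sketched as follows. This is a minimal derivation for the one-regressor case, with y_i = mpg, x_i = weight and w_i = w; the symbols \beta_0 and \beta_1 are notation introduced here, not taken from the thread. Weighted least squares minimises a weighted sum of squared residuals, and moving the weight inside the square multiplies every term, including the constant, by \sqrt{w_i}:

        ```latex
        \begin{aligned}
        (\hat\beta_0,\hat\beta_1)
          &= \arg\min_{\beta_0,\beta_1} \sum_{i=1}^{n} w_i \,\bigl(y_i - \beta_0 - \beta_1 x_i\bigr)^2 \\
          &= \arg\min_{\beta_0,\beta_1} \sum_{i=1}^{n}
             \bigl(\sqrt{w_i}\,y_i - \beta_0\sqrt{w_i} - \beta_1\sqrt{w_i}\,x_i\bigr)^2 .
        \end{aligned}
        ```

        So the "constant" of the transformed model is the variable \sqrt{w_i} (wcons), and Stata's own constant has to be suppressed with nocons. The sums of squares and Root MSE differ between the two outputs only by scale: Stata rescales aweights to sum to the number of observations, so the transformed regression's residual SS is 385/10 = 38.5 times larger (61.6611296 versus 1.60158778), and its Root MSE is sqrt(38.5) times larger (2.7763 versus .44744).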



        • #5
          Originally posted by Joro Kolev
          You get different results because pweights and aweights are different. In the picture that you post you see it is aweight, and in your post you speak of pweight.

          Here is an explanation of what is going on

          https://www.stata.com/support/faqs/s...ry-statistics/
          Thanks for the document, it's very helpful.

          But the instructions mention:
          "Note that point estimates are the same than the one obtained using aweight"

          I also get the same results when I use pweights and aweights.
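
          The observation above can be checked with a short sketch. It reuses the auto data and the illustrative weights w=_n^2 from #4; the matrix and scalar names (b_aw, b_pw, se_aw, se_pw, se_awr) are made up for this example. With regress, aweights and pweights should give the same coefficients, while the standard errors generally differ because pweights imply the robust variance estimator:

          ```stata
          * Same data and illustrative weights as in #4
          sysuse auto, clear
          keep in 1/10
          gen w = _n^2

          * Analytic weights: conventional standard errors
          regress mpg weight [aweight=w]
          matrix b_aw = e(b)
          scalar se_aw = _se[weight]

          * Probability weights: same coefficients, robust standard errors
          regress mpg weight [pweight=w]
          matrix b_pw = e(b)
          scalar se_pw = _se[weight]

          * Analytic weights with robust standard errors requested explicitly
          regress mpg weight [aweight=w], vce(robust)
          scalar se_awr = _se[weight]

          * Relative difference between the two coefficient vectors
          display mreldif(b_aw, b_pw)
          * Standard errors of the slope under the three specifications
          display se_aw, se_pw, se_awr
          ```

          If this runs as expected, the first display is zero up to rounding, and se_pw matches se_awr rather than se_aw.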



          • #6
            Originally posted by Andrew Musau
            Here is an example showing the equivalence in #1:

            Thank you! Now it works! I forgot to weight the constant.
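
            The mistake described here can be reproduced with a small sketch, again assuming the auto data and the illustrative weights w=_n^2 from #4; the scalar names b_wrong, b_right and b_aw are hypothetical. Leaving Stata's ordinary constant in the transformed regression fits a different model, so the slope no longer matches the aweight result:

            ```stata
            * Same data and illustrative weights as in #4, stored in double precision
            sysuse auto, clear
            keep in 1/10
            gen double w = _n^2
            gen double wcons = sqrt(w)
            gen double wmpg = sqrt(w)*mpg
            gen double wweight = sqrt(w)*weight

            * Mistake: variables are scaled, but Stata's unweighted constant is kept
            regress wmpg wweight
            scalar b_wrong = _b[wweight]

            * Fix: use the scaled constant wcons and suppress Stata's own constant
            regress wmpg wweight wcons, noconstant
            scalar b_right = _b[wweight]

            * Reference: built-in weighted regression
            regress mpg weight [aweight=w]
            scalar b_aw = _b[weight]

            display b_wrong, b_right, b_aw
            ```

            Only b_right should agree with b_aw.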



            • #7
              How do I use the weight(1) option in Stata, and how do I explain it?
