  • Question regarding the collapse command in STATA.

    Hello, I have a question about the collapse command in Stata and would be thankful for any clarification. On page 7 of dcollapse.pdf, under "weights", it says, "Weight normalization affects only the sum, count, sd, semean, and sebinomial statistics." Does this mean that if I use collapse to generate means, it does not matter whether I use weights? For example, would the following two commands be equivalent?

    collapse (mean) educ [pweights=x]
    collapse (mean) educ

    Thank you.

  • #2
    [quote]For example, would the following two commands be equivalent?[/quote]
    No! See for yourself:
    Code:
    . sysuse auto, clear
    (1978 automobile data)
    
    .
    . set seed 1234
    
    . gen x = 1/runiform()
    
    .
    . preserve
    
    . collapse (mean) mpg [pweight = x]
    
    . list, noobs clean
    
            mpg  
        19.5521  
    
    .
    . restore
    
    . collapse (mean) mpg
    
    . list, noobs clean
    
            mpg  
        21.2973
    You have misread the sentence "Weight normalization affects only the sum, count, sd, semean, and sebinomial statistics." The focus of the sentence is not whether weights are used but whether the weights are normalized. The mean is not in that list because it comes out the same whether you use the weights as you find them or normalize them: a constant rescaling of the weights cancels between the numerator and the denominator of a weighted mean. By contrast, an sd calculated using normalized weights differs from the sd calculated using the weights as you find them:

    Code:
    . sysuse auto, clear
    (1978 automobile data)
    
    .
    . set seed 1234
    
    . gen x = 1/runiform()
    
    .
    . summ x, meanonly
    
    . gen normalized_x = x/r(N)
    
    .
    . preserve
    
    . collapse (mean) mpg (sd) headroom [iweight = x]
    
    . list, noobs clean
    
            mpg   headroom  
        19.5521        0.7  
    
    .
    . restore
    
    . collapse (mean) mpg (sd) headroom [iweight = normalized_x]
    
    . list, noobs clean
    
            mpg   headroom  
        19.5521        0.8
    Note: In the normalization example I used iweights because the sd statistic does not allow pweights, fweights must be integers, and aweights are automatically normalized, so Stata will not calculate a non-normalized sd with aweights. That leaves iweights as the only weight type that can demonstrate the difference.
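
    A rough sketch of the algebra behind the two results, for readers who want to see why rescaling matters for one statistic and not the other. The symbols here are my own notation, not the manual's: w_i are the weights, y_i the values, and c > 0 is a constant rescaling factor. The sd formula is an assumption, namely that with iweights the denominator is the sum of the weights minus 1, as it is for frequency weights; check the Methods and formulas section of the manual before relying on it.

    ```latex
    % Weighted mean: the factor c cancels, so rescaling changes nothing.
    \[
    \bar{y}_w = \frac{\sum_i w_i y_i}{\sum_i w_i}
    \quad\Longrightarrow\quad
    \frac{\sum_i (c\,w_i)\,y_i}{\sum_i c\,w_i} = \bar{y}_w
    \]

    % Weighted sd (assumed denominator: sum of weights minus 1):
    % the "- 1" does not scale with c, so c no longer cancels.
    \[
    s_w = \sqrt{\frac{\sum_i w_i (y_i - \bar{y}_w)^2}{\sum_i w_i - 1}}
    \quad\Longrightarrow\quad
    \sqrt{\frac{\sum_i c\,w_i (y_i - \bar{y}_w)^2}{c \sum_i w_i - 1}}
    = \sqrt{\frac{\sum_i w_i (y_i - \bar{y}_w)^2}{\sum_i w_i - 1/c}}
    \]
    ```

    Under that assumption, the rescaling in the example (x/r(N), that is, c = 1/N) shrinks the denominator from the sum of the weights minus 1 to the sum of the weights minus N, which makes the sd larger. That agrees with the direction of the change in the output above, 0.7 versus 0.8, while the mean stays at 19.5521.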


    • #3
      [QUOTE=Clyde Schechter;n1768230]The focus of the sentence is not the use of weights but the normalization of whatever weights are used.[/QUOTE]
      Thanks, Clyde! Makes sense.
