Hi all,
Maybe this is a very silly question. I am trying to generate population estimates with sample weights using the svy command in Stata.
I used the following code.
Code:
svyset id [pweight=weight]
svy: total id
This gave the following output.
Number of strata =     1          Number of obs   =      8,245
Number of PSUs   = 8,245          Population size = 36,385,946
                                  Design df       =      8,244
--------------------------------------------------------------
             |             Linearized
             |      Total   std. err.     [95% conf. interval]
-------------+------------------------------------------------
          id |   3.64e+14    2.34e+12     3.60e+14    3.69e+14
--------------------------------------------------------------
The number in the table (3.64e+14) appears to match the population size (36,385,946), which is what I would expect. My problem is that when I run the same commands on a different round of the data, I get the following output.
Number of strata =     1          Number of obs   =      7,859
Number of PSUs   = 7,859          Population size = 41,786,760
                                  Design df       =      7,858
--------------------------------------------------------------
             |             Linearized
             |      Total   std. err.     [95% conf. interval]
-------------+------------------------------------------------
          id |   7.11e+14    7.34e+12     6.97e+14    7.25e+14
--------------------------------------------------------------
Why do the number in the table (7.11e+14) and the population size (41,786,760) not match? For context, this is NHATS data, and 41 million matches the published report. Then what is 71 million? Is this the correct command to generate a population estimate? If both numbers were 71 million, I would certainly think so, but 41 million is the correct population.
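A minimal sketch that may help frame the question, using invented toy data (the three ids and weights below are made up for illustration and are not NHATS values). As far as I understand, svy: total id estimates the weighted sum of the values stored in id (each id multiplied by its weight), whereas the reported population size is the sum of the weights. Totaling a constant equal to 1 should therefore reproduce the population size. The variable name one is a hypothetical helper, not something in the original data.

```stata
* Toy data: invented ids and weights, for illustration only
clear
input id weight
1001 2000
1002 3000
1003 5000
end

* Same design declaration as in the post
svyset id [pweight=weight]

* Weighted sum of the id values themselves:
* 1001*2000 + 1002*3000 + 1003*5000 = 10,023,000
svy: total id

* Hypothetical helper: a constant 1 for every observation
generate byte one = 1

* Weighted sum of 1s = sum of the weights = 10,000,
* which equals the "Population size" shown in the header
svy: total one
```

In this sketch the total for id depends on how the identifiers happen to be numbered, so it would not be expected to track the population size from one round to the next.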
I would appreciate any input. Thanks in advance!