  • Increasing Population Size that Data is Sampling

    Hi,

    I am producing descriptive statistics for a larger population using a sample. A weight (wgt) is provided for every observation. I used this code to get a simple mean wage for women:

    Code:
    svyset [pweight=wgt], clear
    svy: mean wage_month if woman
    However, whatever it is using to calculate "population size" from weights is off. The population size is far lower than it's supposed to be for the country I am analyzing. Is there a way to effectively increase the population size that the data is sampling without messing it up?


    *One thing to note: this was originally a dataset spanning 30 years, and I kept only the year 2016. The population is supposed to be 16 million in 2016, but the output shows a population size of 6 million.

  • #2
    From what I can see, to have a monthly wage you probably dropped everyone who is too young, still in school, or retired, which could be 30-40% attrition. Restricting to women then cuts the remainder roughly in half. If 16 million is the full population, I think the 6 million in your output may be right.
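
    As a rough back-of-the-envelope check of that reasoning, the arithmetic can be run directly in Stata. The 30-40% attrition and the one-half share of women are only the guesses made above, not figures from the survey:

    ```stata
    * Rough check only: attrition rates and the share of women are assumptions
    display %12.0fc 16e6 * (1 - 0.40) * 0.5   // 40% attrition, then half women
    display %12.0fc 16e6 * (1 - 0.30) * 0.5   // 30% attrition, then half women
    ```

    That gives a range of about 4.8 to 5.6 million, which is in the neighborhood of the 6 million reported.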

    If the data are supposed to be analyzed as a 30-year dataset, then it's possible that the weight was based on all 30 years. One way to check is to compute the sum of the weight variable:

    Code:
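    * summarize (not describe) stores the sum in r(sum)
    summarize wgt
    display %15.0fc r(sum)
    and see if this is close to the population size. Do the same for year 2016 (using the full dataset, before dropping the other years):

    Code:
    summarize wgt if year == 2016
    display %15.0fc r(sum)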
    and then you can make a judgment there.
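
    A slightly fuller version of the same check is sketched below. It assumes the full 30-year file is loaded, that year is the survey-year variable, and that woman is a 0/1 indicator as in the code in #1. The "population size" reported by svy: mean is the sum of the weights over the observations actually used in the estimate, so the last sum is the one to compare with the 6 million:

    ```stata
    * Sum of weights within each year: does each year add up to roughly the population?
    tabstat wgt, by(year) statistics(sum) format(%15.0fc)

    * Sum of weights for everyone in 2016
    summarize wgt if year == 2016
    display %15.0fc r(sum)

    * Sum of weights for women with a non-missing monthly wage in 2016
    summarize wgt if year == 2016 & woman == 1 & !missing(wage_month)
    display %15.0fc r(sum)
    ```

    If the 2016 total is near 16 million and the last figure is near 6 million, the weights are fine and the smaller number just reflects the subgroup being analyzed.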


    It's difficult to suggest how to reweight without knowing the sampling scheme. If this is a survey released by an official entity, the best approach is to read the analysis instructions in the data user manual and, if nothing is found there, to contact the distributor.
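
    For what it's worth, if the mean wage is all that is needed, the overall scale of the weights should not matter: multiplying every pweight by the same constant leaves the estimated mean unchanged and only changes the reported population size. This is not a suggestion to reweight, just an illustration; wgt_x2 is a made-up variable name:

    ```stata
    * Illustration only: wgt_x2 is a hypothetical variable, not part of the dataset
    generate double wgt_x2 = 2 * wgt

    svyset [pweight=wgt_x2], clear
    svy: mean wage_month if woman    // population size doubles, mean is unchanged

    * Restore the original survey settings
    svyset [pweight=wgt], clear
    svy: mean wage_month if woman
    drop wgt_x2
    ```

    Totals, on the other hand, do depend on the scale of the weights, so for those the documentation really is the place to look.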
