
  • Complex sample (Svy) population estimate 95%CI differ between SPSS and Stata: which one is better?

    Dear Statalister,

    I estimated population sizes and their 95% CIs in Stata 13, using the svy prefix.

    * svysetting the data
    svyset [pweight=weight1], strata(mystrata) // no clustering; sampling with replacement assumed

    * example of my tabulate.
    svy: tab var1 var2, count for(%20g) ci

    However, my colleague is using SPSS and she thinks the estimates from SPSS are more accurate.

    The percentages and their 95% CIs are exactly the same in both programs, but the 95% CIs for the estimated population sizes differ slightly.

    My question is: which program should I trust and why?

    I am just interested in simple 2x2 tables and population sizes.

    I hope you can help me!

    Kind regards,
    Daniela.

  • #2
    Just one more thing: the discrepancy occurs only with 2x2 tables.



    • #3
      There is too little information here to provide a good answer. You state neither what (exactly) your colleague typed in SPSS nor the (complete and exact) output of either program. An example of a table other than 2 by 2 where the differences do not occur, including complete syntax and output, might also be helpful.

      Best
      Daniel



      • #4
        Dear Daniel,

        Thank you for your response.

        I understand that more output is needed; my apologies. I hope this is better.

        Here is the SPSS syntax:

        TEMPORARY.
        select if regio=1.
        CSTABULATE
        /PLAN FILE='U:\_Special\ewCBSGGD.csaplan'
        /TABLES VARIABLES= var2 BY var1
        /CELLS TABLEPCT POPSIZE
        /STATISTICS CIN(95)
        /MISSING SCOPE=TABLE CLASSMISSING=EXCLUDE.

        Output SPSS:
        Population Size   Estimate                          53966,798
                          95% Confidence Interval   Lower   51286,815
                                                    Upper   56646,780

        And now the Stata syntax:

        tabout var2 var1 if regio==1 ///
        using "titel.xls", svy cell(freq ci) ///
        replace f(1c)

        Output Stata:
        Population size
        No.          CI
        53,966.80    [51,310.3, 56,623.3]

        I noticed that the results differ only when one of the variables has missing data. I know the difference in the 95% CI is small, but I want to understand where it comes from.
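
        To put a number on the discrepancy, the half-widths of the two intervals can be compared. The sketch below uses only the figures from the two outputs above and Stata's display command; it does not identify which ingredient of the interval (the standard error or the critical value) is responsible.

        ```stata
        * Half-width of the SPSS interval (upper minus lower, divided by 2)
        display (56646.780 - 51286.815) / 2

        * Half-width of the Stata interval
        display (56623.3 - 51310.3) / 2

        * Ratio of the two half-widths (SPSS relative to Stata)
        display ((56646.780 - 51286.815) / 2) / ((56623.3 - 51310.3) / 2)
        ```

        Both intervals are symmetric around the same point estimate, and the SPSS interval is a little under 1% wider. That pattern is consistent with a difference in the standard error or in the degrees of freedom, but the outputs shown here cannot distinguish between the two.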

        Looking forward to hearing from you.

        Kind regards,
        Daniela.



        • #5
          I cannot help much here as I have no deeper understanding of SPSS. But I can give some further advice.

          tabout is not an official Stata command, so you need to tell us where it is from. Try using svy: tabulate instead and see whether the results still differ.

          "I noticed that I only get different results when one of the variables has missing data."

          In this case, you need to find out how SPSS handles missing cases, which probably has to do with the line

          Code:
          /MISSING SCOPE=TABLE CLASSMISSING=EXCLUDE.

          and how Stata handles them (listwise deletion, I suppose). As an aside, it might be better to specify the subpop() option instead of an if qualifier to restrict the survey sample.
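
          A minimal sketch of the subpop() suggestion, assuming the design from post #1 (weight1, mystrata) and the variable names used in this thread. It has not been run on the original data.

          ```stata
          * Declare the design as in post #1 (assumed unchanged)
          svyset [pweight=weight1], strata(mystrata)

          * Restrict estimation to regio==1 with subpop() instead of an if qualifier,
          * so observations outside the subpopulation stay in the variance calculation
          svy, subpop(if regio==1): tabulate var2 var1, count ci format(%20g)
          ```

          With an if qualifier, observations outside regio==1 are dropped before estimation. With subpop(), they are kept for the variance computation, so the standard errors can differ between the two approaches.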

          Perhaps someone else has more insights.

          Best
          Daniel
          Last edited by daniel klein; 10 Apr 2017, 09:30.



          • #6
            Thank you! Much appreciated!

            I tried with svy: tab and I got the same results as when using tabout.

            I'll try to find out how missing data are handled in both programs, but no luck so far. At least the differences in the 95% CI are small.
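
            One possible starting point on the Stata side, as a sketch that assumes var1, var2 and regio are the variables from post #4. It only shows how many observations Stata leaves out and what it stores after estimation; it does not reproduce the SPSS calculation.

            ```stata
            * Number of observations in regio==1 with a missing value on either variable
            count if regio==1 & missing(var1, var2)

            * Default behaviour: observations with missing values are left out of the table
            svy: tabulate var2 var1 if regio==1, count ci format(%20g)
            ereturn list    // stored results include e(N), e(N_strata) and e(N_psu)

            * For comparison: treat missing values as a category of their own
            svy: tabulate var2 var1 if regio==1, count ci format(%20g) missing
            ereturn list
            ```

            If the number of observations or strata differs between the two runs, the design degrees of freedom may differ as well, which might account for part of the difference in interval width.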

            Kind regards, Daniela


