  • Estimating means from the same study population (but with or without missing observations)

    Dear members,

    I am struggling to test the null hypothesis that two means from the same study population are equal: one mean comes from the sample excluding respondents with missing observations, the other from the sample including respondents with partially missing observations. I cannot use the usual t-test because the two samples are neither independent nor paired.

    The variable of interest is GST, a composite score constructed from 7 variables (questions). Respondents who answered all 7 questions have the value 0 on the miss_gst variable below.

    - The sample without missing observations (excluding anyone who did not answer all 7 questions) therefore has n = 529 (miss_gst == 0).
    - The other sample, "with miss", includes everyone who answered at least one of the 7 questions for this score. Its size is n = 565 (569 - 4), because 4 respondents had missing data on all 7 variables and are excluded. For respondents who answered at least one of the 7 questions, missing items are given a score of 0 when the mean score is calculated.
    - Each question is answered on a scale of 1-5. The individual mean score is calculated as [sum of variables 1-7 / number of variables answered].
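
    The scoring rule described in the list above could be sketched in Stata roughly as follows. The item names q1-q7, the helper variables gst_total and n_answered, and the four demo rows are assumptions for illustration only; miss_gst, meanscore_gstwithmiss and meanscore_gst are the names used in this post.

    ```stata
    * Hypothetical demo data: 7 items on a 1-5 scale, "." = not answered
    clear
    input q1 q2 q3 q4 q5 q6 q7
    5 4 4 3 5 4 4
    3 . 4 . 2 5 .
    . . . . . . .
    1 2 . 2 1 1 1
    end

    * Number of unanswered items per respondent (0 = complete case)
    egen miss_gst = rowmiss(q1-q7)

    * Sum of the items; rowtotal() counts a missing item as 0
    egen gst_total = rowtotal(q1-q7)

    * Number of items actually answered
    egen n_answered = rownonmiss(q1-q7)

    * Mean over answered items; respondents with no answers stay missing
    generate meanscore_gstwithmiss = gst_total / n_answered if n_answered > 0

    * Complete-case version: defined only when all 7 items were answered
    generate meanscore_gst = meanscore_gstwithmiss if miss_gst == 0

    list miss_gst meanscore_gstwithmiss meanscore_gst
    ```

    With this construction every complete case has the same value in both score variables, so the two variables differ only for respondents with 1-6 missing items.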

    . tab miss_gst

       miss_gst |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              0 |        529       92.97       92.97
              1 |         11        1.93       94.90
              2 |          7        1.23       96.13
              3 |          9        1.58       97.72
              4 |          3        0.53       98.24
              5 |          4        0.70       98.95
              6 |          2        0.35       99.30
              7 |          4        0.70      100.00
    ------------+-----------------------------------
          Total |        569      100.00

    Below are the detailed statistics for the mean score in 1) the sample with missing observations (n=565) and 2) the sample without missing observations (n=529).

    I need to test whether the means of sample 1 and sample 2 are the same or statistically different.

    I tried

    ttest meanscore_gstwithmiss == 3.990818

    but this does not take into account the SD and other distributional features of the two samples (which come from the same study population).

    I also created a dummy variable for miss_gst == 0 versus miss_gst != 0, but this compares the mean of the sample without missing observations (529) with the mean of those with miss_gst = 1-6 (those not in miss_gst == 0), which is only 36 people (565 - 529). So this still does not answer my question.

    Could you please guide me on how to test the statistical difference (t-test) between these two means?

    Thanks
    best, Rinko


    . summarize meanscore_gstwithmiss, detail

         Gender Stereotypical Traits (Mean +/- SD) with
                          missing obs
    -------------------------------------------------------------
          Percentiles      Smallest
     1%          1.2              1
     5%     2.285714              1
    10%     2.714286              1       Obs                 565
    25%     3.428571              1       Sum of wgt.         565

    50%     4.142857                      Mean           3.950135
                            Largest       Std. dev.      .8604686
    75%     4.571429              5
    90%            5              5       Variance       .7404062
    95%            5              5       Skewness      -.9709162
    99%            5              5       Kurtosis       3.657444

    . summarize meanscore_gst, detail

              Gender Stereotypical Traits (Mean +/- SD)
    -------------------------------------------------------------
          Percentiles      Smallest
     1%     1.714286              1
     5%     2.428571       1.285714
    10%     2.714286       1.571429       Obs                 529
    25%     3.571429       1.571429       Sum of wgt.         529

    50%     4.142857                      Mean           3.990818
                            Largest       Std. dev.      .8215419
    75%     4.571429              5
    90%            5              5       Variance        .674931
    95%            5              5       Skewness      -.8749316
    99%            5              5       Kurtosis       3.251743

  • #2
    Code:
    ttest x == y, unequal unpaired



    • #3
      Dear George Ford, thanks so much for the code. It works perfectly, and I got the output below.

      Many, many thanks for your help!
      Rinko

      . ttest meanscore_gst== meanscore_gstwithmiss, unequal unpaired

      Two-sample t test with unequal variances
      ------------------------------------------------------------------------------
      Variable |     Obs        Mean    Std. err.   Std. dev.   [95% conf. interval]
      ---------+--------------------------------------------------------------------
      meansc~t |     529    3.990818    .0357192    .8215419    3.920649    4.060987
      meansc.. |     565    3.950135    .0362002    .8604686    3.879031    4.021238
      ---------+--------------------------------------------------------------------
      Combined |   1,094    3.969807    .0254487    .8417322    3.919873    4.019741
      ---------+--------------------------------------------------------------------
          diff |            .0406834    .0508558               -.0591028    .1404696
      ------------------------------------------------------------------------------
          diff = mean(meanscore_gst) - mean(meanscore_gstw~s)           t =   0.8000
      H0: diff = 0                     Satterthwaite's degrees of freedom =  1091.58

          Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
       Pr(T < t) = 0.7881         Pr(|T| > |t|) = 0.4239          Pr(T > t) = 0.2119
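
      As a cross-check, the same test can be approximately reproduced from the summary statistics alone with the immediate command ttesti, which takes the number of observations, mean, and SD for each group. The sketch below uses only the numbers reported in this thread; the scalar name se_diff is an assumption for illustration. Like the unpaired option above, this calculation treats the two columns as separate samples.

      ```stata
      * Immediate form: n, mean, SD of group 1, then n, mean, SD of group 2
      ttesti 529 3.990818 .8215419 565 3.950135 .8604686, unequal

      * The same t statistic by hand: difference in means divided by
      * the unequal-variance standard error sqrt(s1^2/n1 + s2^2/n2)
      scalar se_diff = sqrt(.8215419^2/529 + .8604686^2/565)
      display "SE of difference = " se_diff
      display "t = " (3.990818 - 3.950135) / se_diff
      ```

      The hand calculation should agree with the diff row and the t value in the output above, up to rounding.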
