
  • Display of group labels using ttest

    Why does ttest not display the complete value labels of the groups? The labels are short (9 characters) but contain a space character. Here are the results comparing the output of ttest and oneway:
    Code:
    . ttest minutes if completed, by(us_version)
    
    Two-sample t test with equal variances
    ------------------------------------------------------------------------------
       Group |     Obs        Mean    Std. err.   Std. dev.   [95% conf. interval]
    ---------+--------------------------------------------------------------------
     version |   2,047    11.24085    .1676527    7.585244    10.91207    11.56964
     version |   2,045    10.03937     .144189    6.520469    9.756593    10.32214
    ---------+--------------------------------------------------------------------
    Combined |   4,092     10.6404    .1109572    7.097792    10.42287    10.85794
    ---------+--------------------------------------------------------------------
        diff |            1.201489    .2211449                .7679245    1.635053
    ------------------------------------------------------------------------------
        diff = mean(version) - mean(version)                          t =   5.4330
    H0: diff = 0                                     Degrees of freedom =     4090
    
        Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
     Pr(T < t) = 1.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 0.0000
    
    . oneway minutes us_version if completed, tab
    
    questionnai |      Summary of duration total
        re part |        Mean   Std. dev.       Freq.
    ------------+------------------------------------
      version A |       11.24        7.59       2,047
      version B |       10.04        6.52       2,045
    ------------+------------------------------------
          Total |       10.64        7.10       4,092
    
                            Analysis of variance
        Source              SS         df      MS            F     Prob > F
    ------------------------------------------------------------------------
    Between groups      1476.77766      1   1476.77766     29.52     0.0000
     Within groups      204622.261   4090   50.0298926
    ------------------------------------------------------------------------
        Total           206099.038   4091   50.3786454
    
    Bartlett's equal-variances test: chi2(1) =  46.5968    Prob>chi2 = 0.000
    Please ignore the question of whether ttest (or oneway) is the optimal procedure for comparing average durations.

  • #2
    As you know, value labels can be up to 32 characters long, so some abbreviation is often inevitable. If I were reporting a t test I would typically use fewer significant figures and leave out most of the detail -- or give a graphical equivalent -- yet also typically show full(er) text, so does this really bite?

    I fear that I am not grasping the question!



    • #3
      My question is: although the value labels of the group variable ("us_version") are only 9 characters wide (including the space character), why does the ttest output not show the complete labels (it drops " A" and " B"), while the oneway output shows them completely? When reading the output of ttest you cannot tell which group's mean is larger, whereas the output table of oneway shows this.
      Last edited by Dirk Enzmann; 19 Aug 2024, 21:49.



      • #4
        I explored this further, and evidently ttest displays value labels in full only if they are at most 8 characters long (not 32). Although I appreciate shorter labels, to my mind 8 characters is far too short.

        Here is a demonstration:
        Code:
        . sysuse auto
        (1978 automobile data)
        
        .
        . // 8 character wide value labels:
        . lab def origin 0 "1234567A" 1 "1234567B", modify
        
        . ttest price, by(foreign)
        
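        Two-sample t test with equal variances
        ------------------------------------------------------------------------------
           Group |     Obs        Mean    Std. err.   Std. dev.   [95% conf. interval]
        ---------+--------------------------------------------------------------------
        1234567A |      52    6072.423    429.4911    3097.104    5210.184    6934.662
        1234567B |      22    6384.682    558.9942    2621.915     5222.19    7547.174
        ---------+--------------------------------------------------------------------
        Combined |      74    6165.257    342.8719    2949.496    5481.914      6848.6
        ---------+--------------------------------------------------------------------
            diff |           -312.2587    754.4488               -1816.225    1191.708
        ------------------------------------------------------------------------------
            diff = mean(1234567A) - mean(1234567B)                        t =  -0.4139
        H0: diff = 0                                     Degrees of freedom =       72
        
            Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
         Pr(T < t) = 0.3401         Pr(|T| > |t|) = 0.6802          Pr(T > t) = 0.6599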
        
        .
        . // 9 character wide value labels:
        . lab def origin 0 "12345678A" 1 "12345678B", modify
        
        . ttest price, by(foreign)
        
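        Two-sample t test with equal variances
        ------------------------------------------------------------------------------
           Group |     Obs        Mean    Std. err.   Std. dev.   [95% conf. interval]
        ---------+--------------------------------------------------------------------
        12345678 |      52    6072.423    429.4911    3097.104    5210.184    6934.662
        12345678 |      22    6384.682    558.9942    2621.915     5222.19    7547.174
        ---------+--------------------------------------------------------------------
        Combined |      74    6165.257    342.8719    2949.496    5481.914      6848.6
        ---------+--------------------------------------------------------------------
            diff |           -312.2587    754.4488               -1816.225    1191.708
        ------------------------------------------------------------------------------
            diff = mean(12345678) - mean(12345678)                        t =  -0.4139
        H0: diff = 0                                     Degrees of freedom =       72
        
            Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
         Pr(T < t) = 0.3401         Pr(|T| > |t|) = 0.6802          Pr(T > t) = 0.6599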
        Hence, my original question now turns into a wishlist topic. Of course, everything would be fine if longer labels were abbreviated, but here the labels are truncated rather than abbreviated.
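
        Until ttest shows longer labels, one possible workaround is to print the group means next to the full value labels, using the results that ttest leaves behind in r(). This is only a minimal sketch on the auto data used above: the macro names g1 and g2 are made up for illustration, and it assumes that group 1 in r() corresponds to the smaller value of the by() variable (here 0).

        ```stata
        sysuse auto, clear

        * Fetch the full value labels of the two groups (codes 0 and 1 in auto.dta)
        local g1 : label (foreign) 0
        local g2 : label (foreign) 1

        * Run the test quietly; group statistics are stored in r()
        quietly ttest price, by(foreign)

        * Show the untruncated labels alongside the stored means and sample sizes
        display "`g1': mean = " %9.2f r(mu_1) "  (n = " r(N_1) ")"
        display "`g2': mean = " %9.2f r(mu_2) "  (n = " r(N_2) ")"
        display "t = " %6.4f r(t) ", df = " r(df_t) ", two-sided p = " %6.4f r(p)
        ```

        The ttest table itself is unchanged by this; the extra lines only make clear which mean belongs to which group.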
        Last edited by Dirk Enzmann; 19 Aug 2024, 22:25.



        • #5
          The length limit wasn't always 32, but I've forgotten, if I ever knew, what it was originally; perhaps even 8!

          As of now, this is a question for StataCorp really. I suppose their main philosophy is that there are many tools now to take results and produce customised tables to taste. Most researchers would in this case gain space by omitting some results and trimming the number of decimal places in some other columns. Otherwise the message is that the most informative text should be at the start of a value label (or variable label too, in other contexts).
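
          The advice to put the most informative text first can be sketched on the auto data from #4. The labels "A version" and "B version" are hypothetical stand-ins for the questionnaire labels, reordered so that the distinguishing letter comes first. Going by the demonstration in #4, ttest would then be expected to keep the first 8 characters, which include the letter.

          ```stata
          sysuse auto, clear

          * Hypothetical labels: the distinguishing letter is moved to the front,
          * so that cutting the label to 8 characters does not remove it
          label define origin 0 "A version" 1 "B version", modify

          * Check the modified labels, then rerun the test
          label list origin
          ttest price, by(foreign)
          ```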
