  • Chi-square test with missing values using table1_mc

    Hi all,

    I am having trouble computing the chi-square statistic from the valid entries only (excluding missing values) with table1_mc from SSC.

    I get the same problem with prtest: the proportions are calculated with the missing values included.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(m1 m2 age female dept offcampus) str11 comment
    
    2 2 18 0 1 0 ""           
    4 4 20 1 4 0 "Great!"     
    2 2 18 1 1 0 ""           
    1 1 19 1 1 1 "I'm leaving"
    2 2 18 0 1 0 ""           
    2 2  . . . 1 ""           
    2 2 18 1 . 0 ""           
    3 4 20 1 4 1 ""           
    4 3 18 1 3 0 ""           
    2 2 19 0 2 1 ""           
    2 2 19 1 2 1 ""           
    2 2 19 1 . 1 ""           
    2 2 20 0 4 1 ""           
    4 3 20 0 . 1 ""           
    2 4 20 1 3 0 ""           
    3 2 20 0 . 1 ""           
    2 2 19 0 4 1 ""           
    2 2 18 1 1 0 ""           
    2 2 18 0 1 0 ""           
    4 4 18 0 2 0 ""           
    4 4 20 0 4 0 ""           
    2 2 18 0 1 0 ""           
    3 3 19 0 1 0 ""           
    2 1 18 1 1 0 ""           
    2 2  . . . 0 ""           
    1 2 18 0 1 0 ""           
    2 2 18 0 4 0 ""           
    4 4 21 0 4 0 ""           
    2 2 19 1 1 0 ""           
    2 2 20 0 1 0 ""           
    4 4 20 1 4 0 ""           
    2 2 18 0 1 1 ""           
    2 2 19 1 4 1 ""           
    2 2 18 0 3 1 ""           
    2 2 19 0 2 1 ""           
    2 2 19 1 1 0 ""           
    4 2 19 1 1 0 ""           
    2 2 19 0 3 1 ""           
    4 4 19 1 4 0 ""           
    2 2 19 1 4 1 ""           
    2 2 18 1 2 0 ""           
    2 2 19 1 3 1 ""           
    2 2 19 0 4 1 ""           
    3 3 19 1 3 0 ""           
    2 2 18 0 . 0 ""           
    2 4 19 0 3 0 ""           
    2 2 20 1 3 0 ""           
    2 2 19 1 4 1 ""           
    1 2 19 0 1 1 ""           
    2 2 20 0 1 0 ""           
    2 2 18 1 1 0 ""           
    2 2 19 0 . 0 ""           
    2 2 18 0 3 1 ""           
    2 2 19 0 1 1 ""           
    2 3 19 1 1 0 ""           
    2 2 19 1 2 0 ""           
    2 2 18 1 4 0 ""           
    2 3 20 1 2 0 ""           
    2 2 19 0 4 1 ""           
    2 2 17 1 1 0 ""           
    2 2  . . . 0 ""           
    2 2 19 1 1 0 ""           
    2 2 19 1 1 1 ""           
    4 4 20 0 4 1 ""           
    2 2 18 1 3 0 ""           
    3 2 18 0 3 0 ""           
    2 2 19 1 2 1 ""           
    2 2 17 0 3 0 ""           
    2 2 19 1 3 0 ""           
    2 2 19 0 3 0 ""           
    2 3 20 0 4 0 ""           
    2 2 20 1 3 1 ""           
    2 2 18 0 1 0 ""           
    4 2 19 0 1 0 ""           
    4 2 20 0 4 0 ""           
    2 2 18 1 2 1 ""           
    2 2 20 1 4 1 ""           
    2 2 18 0 1 1 ""           
    2 2 19 1 2 0 ""           
    2 2 19 0 1 1 ""           
    2 2 19 0 3 1 ""           
    2 2 19 1 1 1 ""           
    2 2 19 0 1 1 ""           
    2 3 21 0 3 1 ""           
    4 4 19 1 4 0 ""           
    2 1 17 0 1 1 ""           
    3 3 19 1 4 0 ""           
    4 4 19 1 1 0 ""           
    2 2 20 0 4 1 ""           
    2 2 19 1 2 0 ""           
    4 3 18 1 4 0 ""           
    4 3 20 0 2 0 ""           
    3 2 18 1 3 0 ""           
    2 3 18 1 4 0 ""           
    2 2 19 0 1 0 ""           
    3 3 18 0 4 0 ""           
    3 4 20 1 2 0 ""           
    2 2 19 0 2 1 ""           
    4 4 19 1 4 0 ""           
    2 3 20 1 2 0 ""           
    end
    
    
    tab female dept,  chi2
    
    
               |                    dept
        female |         1          2          3          4 |     Total
    -----------+--------------------------------------------+----------
             0 |        18          5          9         12 |        44
             1 |        14         10          9         14 |        47
    -----------+--------------------------------------------+----------
         Total |        32         15         18         26 |        91
    
              Pearson chi2(3) =   2.2240   Pr = 0.527
    
    
    
    tab female dept, miss chi2
    
               |                          dept
        female |         1          2          3          4          . |     Total
    -----------+-------------------------------------------------------+----------
             0 |        18          5          9         12          4 |        48
             1 |        14         10          9         14          2 |        49
             . |         0          0          0          0          3 |         3
    -----------+-------------------------------------------------------+----------
         Total |        32         15         18         26          9 |       100
    
              Pearson chi2(8) =  34.0972   Pr = 0.000
    
    
    
    table1_mc, by(female) miss onecol vars(dept cat)  extraspace statistic test clear
    
      +--------------------------------+
      | factor   N_0   N_1   m_0   m_1 |
      |--------------------------------|
      | dept      48    49     0     0 |
      +--------------------------------+
       N_ ... #records used below,   m_ ... #records not used
     
      +------------------------------------------------------------------------------+
      |              female = 0   female = 1   Test         Statistic        p-value |
      |------------------------------------------------------------------------------|
      |              N=48         N=49                                               |
      |------------------------------------------------------------------------------|
      | dept                                   Chi-square   Chi2(4)=  2.98    0.56   |
      |    1         18 (38%)     14 (29%)                                           |
      |    2         5 (10%)      10 (20%)                                           |
      |    3         9 (19%)      9 (18%)                                            |
      |    4         12 (25%)     14 (29%)                                           |
      |    Missing   4 (  8%)     2 (  4%)                                           |
      +------------------------------------------------------------------------------+
    Data are presented as n (%).
    As you can see, table1_mc unfortunately includes the missing values in the chi-square test.

  • #2
    "As you can see, table1_mc unfortunately includes the missing values in the chi-square test."
    Yes, but it's doing that because you specified the -missing- option in your -table1_mc- command. Remove that and you will get what you want.
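
    A minimal sketch of how to check what a complete-case test leaves out, assuming the example data from #1 is in memory (variables female and dept). The count of 9 is inferred from the tables in #1: 100 observations with -missing-, 91 without.

```stata
* Observations with a missing value in either variable (9 in the example data).
count if missing(female, dept)

* -tabulate- leaves these observations out unless -missing- is specified,
* so this is the chi-square test on valid entries only.
tabulate female dept, chi2
```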


    • #3
      Thanks, Clyde Schechter, for your reply. Removing the miss option gives:

      Code:
      table1_mc, by(female)  onecol vars(dept cat)  extraspace statistic test clear
      
        +--------------------------------+
        | factor   N_0   N_1   m_0   m_1 |
        |--------------------------------|
        | dept      44    47     4     2 |
        +--------------------------------+
         N_ ... #records used below,   m_ ... #records not used
       
        +--------------------------------------------------------------------------+
        |          female = 0   female = 1   Test         Statistic        p-value |
        |--------------------------------------------------------------------------|
        |          N=48         N=49                                               |
        |--------------------------------------------------------------------------|
        | dept                               Chi-square   Chi2(3)=  2.22    0.53   |
        |    1     18 (41%)     14 (30%)                                           |
        |    2     5 (11%)      10 (21%)                                           |
        |    3     9 (20%)      9 (19%)                                            |
        |    4     12 (27%)     14 (30%)                                           |
        +--------------------------------------------------------------------------+
      
      
      
      This only removes the missing frequencies and changes the chi-square p-value slightly, which is not the result I expected.

      I expected the result of

      Code:
       tab female dept, miss chi2

                 |                          dept
          female |         1          2          3          4          . |     Total
      -----------+-------------------------------------------------------+----------
               0 |        18          5          9         12          4 |        48
               1 |        14         10          9         14          2 |        49
               . |         0          0          0          0          3 |         3
      -----------+-------------------------------------------------------+----------
           Total |        32         15         18         26          9 |       100

                Pearson chi2(8) =  34.0972   Pr = 0.000
      which gives a p-value of 0.000, and I cannot get this result with table1_mc.


      • #4
        I'm not following you. When you remove the missing option from the -table1_mc- command you are getting:
        Code:
          +--------------------------------------------------------------------------+
          |          female = 0   female = 1   Test         Statistic        p-value |
          |--------------------------------------------------------------------------|
          |          N=48         N=49                                               |
          |--------------------------------------------------------------------------|
          | dept                               Chi-square   Chi2(3)=  2.22    0.53   |
          |    1     18 (41%)     14 (30%)                                           |
          |    2     5 (11%)      10 (21%)                                           |
          |    3     9 (20%)      9 (19%)                                            |
          |    4     12 (27%)     14 (30%)                                           |
          +--------------------------------------------------------------------------+
        and -tab female dept, chi2- gives you

        Code:
                   |                    dept
            female |         1          2          3          4 |     Total
        -----------+--------------------------------------------+----------
                 0 |        18          5          9         12 |        44
                 1 |        14         10          9         14 |        47
        -----------+--------------------------------------------+----------
             Total |        32         15         18         26 |        91
        
                  Pearson chi2(3) =   2.2240   Pr = 0.527
        and these results are the same except that -tab- gives you 3 decimal places in the p-value whereas table1_mc gives you 2.

        So, what is the problem?


        • #5
          I expected to get this chi-square result with table1_mc:

          Code:
           tab female dept, miss chi2        
          
                     |                          dept
              female |         1          2          3          4          . |     Total
          -----------+-------------------------------------------------------+----------
                   0 |        18          5          9         12          4 |        48
                   1 |        14         10          9         14          2 |        49
                   . |         0          0          0          0          3 |         3
          -----------+-------------------------------------------------------+----------
               Total |        32         15         18         26          9 |       100
          
                    Pearson chi2(8) =  34.0972   Pr = 0.000


          Pr = 0.000, so I reject H0,

          whereas with the table1_mc chi-square test:

          Pr = 0.53 or 0.56, so I do not reject H0.
          Last edited by Rodrigo Badilla; 11 Jun 2024, 12:26.


          • #6
            Oh, I see what's going on. The problem is that your data is not compliant with the requirements of table1_mc. From its help file:
            Code:
                  by(varname)           group observations by varname, which must be either (i) string, or (ii) numeric and contain only non-negative integers, whether or not a
                                          value label is attached
            [emphasis added]
            You can't have missing values in the -by()- variable of table1_mc. And, in fact, looking carefully at the output it's giving you, you can see that it is taking the missing values of variable dept into account, but ignoring those of variable female in doing its calculations.

            You can work around this by something like this:
            Code:
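            * Copy female so the original variable is left untouched
            clonevar female2 = female
            * Recode missing to a non-negative integer category, as by() requires
            replace female2 = 9 if missing(female)
            label define female2 0 "Male" 1 "Female" 9 "Missing"
            label values female2 female2
            tab female2 dept, miss chi2
            table1_mc, by(female2) miss onecol vars(dept cat) extraspace statistic test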
            By the way, -table1_mc- is not an official Stata command. When posting about user-written commands, it is best practice to mention that fact and indicate where it comes from. In this case, it is by Mark Chatfield and available from SSC.
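
            The three chi-square values in this thread can be reproduced from the cell counts alone with official Stata's -tabi-. This is only a sketch: the counts are copied from the tables in #1, and the reading of which table -table1_mc- tests is inferred from its reported Chi2(4) = 2.98, not taken from its source code.

```stata
* Complete cases only (2 x 4): the table behind -tab female dept, chi2-.
* Thread output: Pearson chi2(3) = 2.2240.
tabi 18 5 9 12 \ 14 10 9 14, chi2

* Missing dept kept as a fifth column, missing female dropped (2 x 5):
* apparently the table behind -table1_mc, by(female) miss-, Chi2(4) = 2.98.
tabi 18 5 9 12 4 \ 14 10 9 14 2, chi2

* Missing kept on both variables (3 x 5): -tab female dept, miss chi2-.
* Thread output: Pearson chi2(8) = 34.0972.
tabi 18 5 9 12 4 \ 14 10 9 14 2 \ 0 0 0 0 3, chi2
```

            Almost all of the jump from 2.98 to 34.10 comes from the third row, where the 3 observations missing on both variables sit in a single cell.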


            • #7
              Thanks Clyde Schechter for your reply.

              I thought that the miss option in table1_mc worked like miss in tabulate x y, miss chi2.

              I will use your command.

              Regards
              Rodrigo
              Last edited by Rodrigo Badilla; 11 Jun 2024, 13:04.
