
  • If Statement: Variable is Continuous

    Hi,

    I currently have a foreach loop and want to run tabulations on each variable in the variable list (most of them are categorical). However, if a variable in the list is continuous, I want the loop to run ranksum instead. Is there an expression I can use to test whether a variable in a variable list is continuous, without creating a separate variable list?

    scalar k=5
    foreach v of varlist char1 char2 char3 {
    if "`v'" != "char2" {
    tabulate exposure `v' if cond1!=., chi2 exact row col
    putexcel J`=k' = `r(p_exact)'
    }
    scalar k = k + 1
    }

    Currently, I have replaced every continuous variable in the variable list with the placeholder variable "char2", and after the foreach loop I run the code below once for each of them. Incorporating ranksum into the foreach loop would be more efficient than filling in each cell individually.

    ranksum continuousvar if cond1!=. & exposure !=., by (exposure)
    scalar pvalue = 2*normprob(-abs(`r(z)'))
    display pvalue // to check that it matches
    putexcel F6 = pvalue, nformat(number_d3)
    Thank you!
    Priscilla

  • #2
    There is no surefire way to distinguish a continuous variable from a categorical variable. For example, suppose I had a variable that took on all integer values from 1 through 50. That might be a (quasi-) continuous variable such as a scale score with a range from 1 to 50. Or it might be a categorical variable denoting, say, the 50 states of the United States of America. There is no algorithmic way to know which it is.

    As a practical matter, one often picks some threshold number of values and calls the variable categorical if it has fewer values than that, and continuous otherwise. Suppose you decide to say a variable is continuous if it has 10 or more different values. You could do this:
    Code:
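    scalar k = 5
    local threshold 10
    foreach v of varlist char1 char2 char3 /* etc. */ {
        distinct `v'    // community-contributed; leaves the count in r(ndistinct)
        if `r(ndistinct)' >= `threshold' { // "CONTINUOUS" CASE: 10 or more distinct values
            ranksum `v' if cond1 != . & exposure != ., by(exposure)
            // ranksum without its exact option stores no r(p_exact);
            // use the two-sided normal-approximation p-value from r(z), as in #1
            putexcel J`=k' = `=2*normal(-abs(r(z)))'
        }
        else { // "DISCRETE" CASE
            tabulate exposure `v' if cond1 != ., chi2 exact row col
            putexcel J`=k' = `r(p_exact)'
        }
        scalar k = k + 1
    }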
    Added: distinct.ado is written by Gary Longton and Nick Cox. It is available from SSC.
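
If installing a community-contributed command is not an option, the same count can be obtained with official Stata alone. The following is only a sketch under assumptions: it uses the auto dataset as a stand-in for the real data, keeps the illustrative threshold of 10, and merely displays the classification instead of running ranksum or tabulate. Note that bysort changes the sort order of the data.

```stata
* Count distinct nonmissing values with official commands only (no -distinct-).
sysuse auto, clear
local threshold 10                      // same illustrative cutoff as above

foreach v of varlist price mpg rep78 foreign {
    tempvar first
    * flag the first observation within each distinct nonmissing value of `v'
    bysort `v': generate byte `first' = (_n == 1) & !missing(`v')
    quietly count if `first'
    local ndistinct = r(N)
    if `ndistinct' >= `threshold' {
        display as text "`v': `ndistinct' distinct values, treated as continuous"
    }
    else {
        display as text "`v': `ndistinct' distinct values, treated as categorical"
    }
    drop `first'
}
```

In the real loop, the two display commands would be replaced by the ranksum and tabulate branches.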
    Last edited by Clyde Schechter; 03 Aug 2023, 15:55.



    • #3
      Hi Clyde,

      That is a great point!

      Unfortunately, I get the error code "command distinct is unrecognized" when I run it in Stata 15.1.
      When I remove the line "distinct `v'", there is a different error code of ">10 invalid name"
      Is there an alternative method I can try?

      Thanks again!
      Priscilla



      • #4
        -distinct- is not part of official Stata. It is a user-written program available from SSC. I forgot to say that in my original post. I did mention it in an addendum, but I think you read and acted on my post before that. Sorry for the confusion. Install -distinct- from SSC and run the code as shown in #2.
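
As a minimal sketch of that installation step (using the auto dataset purely for illustration), something like the following should work. The second error in #3 most likely arose because, with the distinct line removed, r(ndistinct) is never set, so `r(ndistinct)' expands to nothing and Stata sees an if condition that starts with the comparison operator.

```stata
* -distinct- is community-contributed, so install it once before running #2
ssc install distinct
which distinct                          // confirms that Stata can now find it

sysuse auto, clear
distinct rep78
display r(ndistinct)                    // count of distinct nonmissing values
```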



        • #5
          I am not sure whether this is what you want, but vl-macros may help (see -help vl-; the PDF manual gives much more detail). Note that you can define not only "system" vl-macros but also "user" vl-macros.

          To illustrate the principles from which you can build:
          Code:
          sysuse auto, clear
          
          vl set   // you also can use the option "categorical(#)', see -vl help-
          vl move (price mpg weight length displacement) vlcontinuous
          vl dir
          vl list, min max obs
          
          foreach v of varlist _all {
             local vtype : char `v'[_vlsysname]
             if "`vtype'"== "vlcontinuous" sum `v'
             else {
                if "`vtype'"=="vlcategorical" tab1 `v'
                else di _n as res "`v'" as txt " is neither continuous nor categorical"
             }
          }
          The result will be:

          Code:
          . sysuse auto, clear
          (1978 automobile data)
          
          .
          . vl set   // you also can use the option "categorical(#)', see -vl help-
          
          -------------------------------------------------------------------------------
                            |                      Macro's contents
                            |------------------------------------------------------------
          Macro             |  # Vars   Description
          ------------------+------------------------------------------------------------
          System            |
            $vlcategorical  |       2   categorical variables
            $vlcontinuous   |       2   continuous variables
            $vluncertain    |       7   perhaps continuous, perhaps categorical variables
            $vlother        |       0   all missing or constant variables
          -------------------------------------------------------------------------------
          Notes
          
                1. Review contents of vlcategorical and vlcontinuous to ensure they are correct.  Type
                   vl list vlcategorical and type vl list vlcontinuous.
          
                2. If there are any variables in vluncertain, you can reallocate them to vlcategorical,
                   vlcontinuous, or vlother.  Type vl list vluncertain.
          
                3. Use vl move to move variables among classifications.  For example, type
                   vl move (x50 x80) vlcontinuous to move variables x50 and x80 to the continuous
                   classification.
          
                4. vlnames are global macros.  Type the vlname without the leading dollar sign ($) when
                   using vl commands.  Example: vlcategorical not $vlcategorical.  Type the dollar sign with
                   other Stata commands to get a varlist.
          
          . vl move (price mpg weight length displacement) vlcontinuous
          note: 5 variables specified and 5 variables moved.
          
          ------------------------------
          Macro          # Added/Removed
          ------------------------------
          $vlcategorical               0
          $vlcontinuous                5
          $vluncertain                -5
          $vlother                     0
          ------------------------------
          
          . vl dir
          
          -------------------------------------------------------------------------------
                            |                      Macro's contents
                            |------------------------------------------------------------
          Macro             |  # Vars   Description
          ------------------+------------------------------------------------------------
          System            |
            $vlcategorical  |       2   categorical variables
            $vlcontinuous   |       7   continuous variables
            $vluncertain    |       2   perhaps continuous, perhaps categorical variables
            $vlother        |       0   all missing or constant variables
          -------------------------------------------------------------------------------
          
          . vl list, min max obs
          
          -----------------------------------------------------------------------------------
              Variable | Macro           Values         Levels       Min       Max        Obs
          -------------+---------------------------------------------------------------------
                 rep78 | $vlcategorical  integers >=0        5         1         5         69
               foreign | $vlcategorical  0 and 1             2         0         1         74
              headroom | $vlcontinuous   noninteger                  1.5         5         74
            gear_ratio | $vlcontinuous   noninteger                 2.19      3.89         74
                 price | $vlcontinuous   integers >=0       74      3291     15906         74
                   mpg | $vlcontinuous   integers >=0       21        12        41         74
                weight | $vlcontinuous   integers >=0       64      1760      4840         74
                length | $vlcontinuous   integers >=0       47       142       233         74
          displacement | $vlcontinuous   integers >=0       31        79       425         74
                 trunk | $vluncertain    integers >=0       18         5        23         74
                  turn | $vluncertain    integers >=0       18        31        51         74
          -----------------------------------------------------------------------------------
          
          .
          . foreach v of varlist _all {
            2.    local vtype : char `v'[_vlsysname]
            3.    if "`vtype'"== "vlcontinuous" sum `v'
            4.    else {
            5.       if "`vtype'"=="vlcategorical" tab1 `v'
            6.       else di _n as res "`v'" as txt " is neither continuous nor categorical"
            7.    }
            8. }
          
          make is neither continuous nor categorical
          
              Variable |        Obs        Mean    Std. dev.       Min        Max
          -------------+---------------------------------------------------------
                 price |         74    6165.257    2949.496       3291      15906
          
              Variable |        Obs        Mean    Std. dev.       Min        Max
          -------------+---------------------------------------------------------
                   mpg |         74     21.2973    5.785503         12         41
          
          -> tabulation of rep78 
          
               Repair |
          record 1978 |      Freq.     Percent        Cum.
          ------------+-----------------------------------
                    1 |          2        2.90        2.90
                    2 |          8       11.59       14.49
                    3 |         30       43.48       57.97
                    4 |         18       26.09       84.06
                    5 |         11       15.94      100.00
          ------------+-----------------------------------
                Total |         69      100.00
          
              Variable |        Obs        Mean    Std. dev.       Min        Max
          -------------+---------------------------------------------------------
              headroom |         74    2.993243    .8459948        1.5          5
          
          trunk is neither continuous nor categorical
          
              Variable |        Obs        Mean    Std. dev.       Min        Max
          -------------+---------------------------------------------------------
                weight |         74    3019.459    777.1936       1760       4840
          
              Variable |        Obs        Mean    Std. dev.       Min        Max
          -------------+---------------------------------------------------------
                length |         74    187.9324    22.26634        142        233
          
          turn is neither continuous nor categorical
          
              Variable |        Obs        Mean    Std. dev.       Min        Max
          -------------+---------------------------------------------------------
          displacement |         74    197.2973    91.83722         79        425
          
              Variable |        Obs        Mean    Std. dev.       Min        Max
          -------------+---------------------------------------------------------
            gear_ratio |         74    3.014865    .4562871       2.19       3.89
          
          -> tabulation of foreign 
          
           Car origin |      Freq.     Percent        Cum.
          ------------+-----------------------------------
             Domestic |         52       70.27       70.27
              Foreign |         22       29.73      100.00
          ------------+-----------------------------------
                Total |         74      100.00
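
Building on the classification above, a rough sketch of how the two vl-macros could drive the tests asked for in #1 follows. It assumes Stata 16 or later (vl is not available in Stata 15.1), and it uses foreign as a stand-in for the exposure variable; the p-value for ranksum is the normal approximation computed from r(z), as in #1.

```stata
* Sketch only: requires Stata 16 or later, because -vl- is not in Stata 15.1.
sysuse auto, clear
vl set
vl move (price mpg weight length displacement) vlcontinuous

* "foreign" stands in for the exposure variable of #1
foreach v of varlist $vlcontinuous {
    quietly ranksum `v', by(foreign)
    display as text "`v': ranksum p = " as result %6.4f 2*normal(-abs(r(z)))
}
foreach v of varlist $vlcategorical {
    if "`v'" == "foreign" continue      // skip the exposure itself
    quietly tabulate foreign `v', chi2 exact
    display as text "`v': Fisher exact p = " as result %6.4f r(p_exact)
}
```

Looping over the macros directly avoids testing the characteristic of every variable, at the cost of handling the uncertain variables separately.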
          Last edited by Dirk Enzmann; 03 Aug 2023, 16:40.



          • #6
            It's not going to make much difference here, but distinct was written up in the Stata Journal and is being maintained through the Journal.
            The version on SSC dates to 2012.

            Code:
             search distinct, sj
            
            Search of official help files, FAQs, Examples, and Stata Journals
            
            SJ-23-2 dm0042_4  . . . . . . . . . . . . . . . . Software update for distinct
                    (help distinct, distinctgen if installed)  N. J. Cox and G. M. Longton
                    Q2/23   SJ 23(2):595--596
                    most important change is addition of distinctgen command
            
            SJ-20-4 dm0042_3  . . . . . . . . . . . . . . . . Software update for distinct
                    (help distinct if installed)  . . . . . .  N. J. Cox and G. M. Longton
                    Q4/20   SJ 20(4):1028--1030
                    sort() option has been added
            
            SJ-15-3 dm0042_2  . . . . . . . . . . . . . . . . Software update for distinct
                    (help distinct if installed)  . . . . . .  N. J. Cox and G. M. Longton
                    Q3/15   SJ 15(3):899
                    improved table format and display of large numbers of
                    observations
            
            SJ-12-2 dm0042_1  . . . . . . . . . . . . . . . . Software update for distinct
                    (help distinct if installed)  . . . . . .  N. J. Cox and G. M. Longton
                    Q2/12   SJ 12(2):352
                    options added to restrict output to variables with a minimum
                    or maximum of distinct values
            
            SJ-8-4  dm0042  . . . . . . . . . . . .  Speaking Stata: Distinct observations
                    (help distinct if installed)  . . . . . .  N. J. Cox and G. M. Longton
                    Q4/08   SJ 8(4):557--568
                    shows how to answer questions about distinct observations
                    from first principles; provides a convenience command
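
To install the most recent version from the command line rather than by clicking through the search results, a sketch like the one below may work. The package name dm0042_4 comes from the SJ-23-2 entry above; the site URL is an assumption based on the usual layout of the Stata Journal software archive and should be checked against the link that search offers.

```stata
* Assumed location: package name from the SJ-23-2 entry, URL pattern assumed
net install dm0042_4, from(http://www.stata-journal.com/software/sj23-2) replace
which distinct                          // shows which version is now installed
```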
            This is a point at which I raise what may be a slightly delicate question. My efforts to contact my long-time co-author Gary Longton before the most recent update all failed. I also contacted his former employer to ask them for any information they could give, or if policy rules out releasing details on past or present employees, for them to convey that I wished to establish contact. But I received no reply at all.

            If anyone can give me further information, please contact me privately.
            Last edited by Nick Cox; 03 Aug 2023, 18:24.
