  • Comparing means across groups using svy commands and esttab

    Dear STATA users

    I hope someone can help me. I am having a really hard time making what I thought would be quite a simple table.

    I want to make a table of background factors related to placement breakdown, similar to this one:

                       Probability of placement breakdown   Significance
    Gender
      Boys (n=545)     0.21                                 0.720
      Girls (n=478)    0.23                                 0.720
    Age
      0-5 (n=56)       0.07                                 0.000
      6-12 (n=318)     0.22                                 0.958
      13-17 (n=587)    0.23                                 0.320
      18-22 (n=52)     0.23                                 0.741

    Placement breakdown is a dummy variable called placementbreakdown (1 = placement breakdown, 0 = no placement breakdown).
    Gender is also a dummy variable, called gender (1 = boy, 0 = girl).
    Age consists of multiple dummy variables called age_0_5, age_6_12, age_13_17 and age_18_22.

    If we start with the gender variable, I might be able to figure out the rest on my own.

    My data are survey data collected in clusters, so we use the svy prefix to account for that.

    What I have tried to do is:

    . svy: mean breakdown, over(gender)
    (running mean on estimation sample)

    Survey: Mean estimation

    Number of strata =   1          Number of obs   = 1,007
    Number of PSUs   = 156          Population size = 1,007
                                    Design df       =   155

    --------------------------------------------------------------------
                       |             Linearized
                       |       Mean   std. err.     [95% conf. interval]
    -------------------+------------------------------------------------
    c.breakdown@gender |
                    0  |   .2252632    .024941       .175995    .2745313
                    1  |   .2142857   .0237364      .1673971    .2611744
    --------------------------------------------------------------------

    . test c.breakdown@0.gender = c.breakdown@1.gender

    Adjusted Wald test

    ( 1) c.breakdown@0.gender - c.breakdown@1.gender = 0

    F( 1, 155) = 0.13
    Prob > F = 0.7200

    . estadd scalar p_diff = r(p)

    added scalar:
    e(p_diff) = .71999581

    . esttab ., cells("b") stats(p_diff) nostar noabbrev nonumber eqlabels(none) collabels(Probability of placement breakdown) nomtitle

    -------------------------
    Probability of placement breakdown
    -------------------------
    c.breakdown@0.gender .2252632
    c.breakdown@1.gender .2142857
    -------------------------
    p_diff .7199958
    -------------------------

    I have tried several approaches, but this is the closest I get to the table above. It is not very interpretable, especially since there are no labels on the variables.

    Are there any suggestions on how I could make this better or more interpretable?

    Thank you in advance,

    Sofie.

  • #2
    See #2 here https://www.statalist.org/forums/for...tab-using-over

    • #3
      Thank you Andrew! That was very helpful. It is still not exactly the result I was looking for, though.

      This is my preliminary result:

      . svy: regress breakdown ibn.gender, nocons
      (running regress on estimation sample)

      Survey: Linear regression

      Number of strata =   1                          Number of obs   =  1,007
      Number of PSUs   = 156                          Population size =  1,007
                                                      Design df       =    155
                                                      F(2, 154)       =  66.84
                                                      Prob > F        = 0.0000
                                                      R-squared       = 0.2196

      ------------------------------------------------------------------------------
                   |             Linearized
         breakdown | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
      -------------+----------------------------------------------------------------
            gender |
             Girl  |   .2252632    .024941     9.03   0.000      .175995    .2745313
              Boy  |   .2142857   .0237364     9.03   0.000     .1673971    .2611744
      ------------------------------------------------------------------------------

      . mat list e(b)

      e(b)[1,2]
      0. 1.
      gender gender
      y1 .22526316 .21428571

      . test 0.gender=1.gender

      Adjusted Wald test

      ( 1) 0bn.gender - 1.gender = 0

      F( 1, 155) = 0.13
      Prob > F = 0.7200

      . estadd scalar p_diff = r(p)

      added scalar:
      e(p_diff) = .71999581

      . esttab, cells("b") wide se nostar label stats(p_diff) collabels(Probability of placement breakdown) noabbrev nomtitle nonumber

      ---------------------------------
      Probability of placement breakdown
      ---------------------------------
      Girl .2252632
      Boy .2142857
      ---------------------------------
      p_diff .7199958
      ---------------------------------

      Do you know if there is some way to get the variable label (in this example, "Gender of the child") automatically included in the table, for instance in a row above the variable's values?

      Something like this:

                             Probability of placement breakdown
      Gender of the child
        Girl                 .2252632
        Boy                  .2142857
      p_diff                 .7199958

      In this example it is not really necessary, since the value labels are quite informative. But I have several variables that need some explanation to be interpreted, e.g. yes/no questions: you need some information about the question to understand what "yes" and "no" stand for. I found code where you can type a label manually, but I am looking for a way to do this automatically, since I have to do it for more than 120 variables.
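
      With that many variables, one possible way to automate this is a loop that reads each variable's label and hands it to esttab. What follows is only a sketch under assumptions: every grouping variable is taken to be a labeled 0/1 dummy, svyset is assumed to have been declared already, the estout package must be installed, and the variable list is a placeholder. It uses the lhs() suboption of mlabels() to print the variable label in the top-left cell.

      ```stata
      * Sketch only: run from a do-file (/// line continuation is not valid interactively).
      * Assumes svyset has already been declared and each variable is coded 0/1.
      foreach v of varlist gender age_0_5 age_6_12 age_13_17 age_18_22 {
          local vlab : variable label `v'      // variable label; empty if none is defined
          quietly svy: regress breakdown ibn.`v', nocons
          quietly test 0.`v' = 1.`v'            // adjusted Wald test of equal group means
          estadd scalar p_diff = r(p)           // attach the p-value to the stored estimates
          esttab, cells("b") nostar label noabbrev nonumber stats(p_diff) ///
              collabels(none) mlabels("Probability of placement breakdown", lhs("`vlab'"))
      }
      ```

      This prints one small table per variable rather than a single combined table, so stacking them would still need an extra step.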

      • #4
        Add -label- as an option in esttab. Otherwise show an example where the labels fail to appear. But note that your variables need to be labeled in the first place.

        • #5
          Thank you, and sorry. I thought I gave an example in my last post, but maybe it is clearer with another example. When I use the command (including the label option), I get this table:

          . esttab, cells("b") wide se nostar label stats(p_diff) collabels(Probability of placement breakdown) noabbrev nomtitle nonumber

          ---------------------------------
          Probability of placement breakdown
          ---------------------------------
          No .2252632
          Yes .2142857
          ---------------------------------
          p_diff .7199958
          ---------------------------------

          What I would like is a table that also includes the variable label, so it is clear what yes and no stand for. Something like this:

                         Probability of placement breakdown
          School age
            No           .2252632
            Yes          .2142857
          p_diff         .7199958

          I know I can redefine the value labels and give them more informative names, but since I work with more than 120 variables, I would prefer an easier way.

          Hope someone can help me.

          • #6
            Sorry, my mind was on value labels when I read your question in #3. By example, I mean a reproducible example in the sense of FAQ Advice #12. As long as the dependent variable in svy: regress is labeled, here is an automatic way (see the lhs() suboption of mlab() in the esttab call):

            Code:
            webuse nhanes2f
            svyset psuid [pweight=finalwgt], strata(stratid)
            svy: regress zinc ibn.race, nocons
            esttab, not nostar noobs label nonumbers mlab("Mean", lhs("`:var lab `e(depvar)''"))
            Res.:

            Code:
            . esttab, not nostar noobs label nonumbers mlab("Mean", lhs("`:var lab `e(depvar)''"))
            
            ---------------------------------
            serum zinc (mcg/dL)          Mean
            ---------------------------------
            White                       87.50
            Black                       85.09
            Other                       83.57
            ---------------------------------
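
            The piece doing the work inside lhs() is the extended macro function `: variable label`, written inline. A small sketch of the same lookup in separate steps, using the zinc variable from the example above (after the regression, e(depvar) holds that name):

            ```stata
            webuse nhanes2f, clear
            local lbl : variable label zinc      // store the variable label in a local macro
            display "`lbl'"                      // the label used as the table header above
            display "`: variable label zinc'"    // same lookup written inline, as inside lhs()
            ```

            Any variable name can be substituted for zinc; if the variable has no label, the macro is simply empty.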

            • #7
              Thank you so much for your quick response. Now I get this:

              . esttab, cells("b") wide se nostar label stats(p_diff) collabels(Probability of placement breakdown) noabbrev nomtitle nonumber mlab("Mean", lhs("`:var lab `e(depvar)''"))

              ---------------------------------
              breakdown Mean
              Probability of placement breakdown
              ---------------------------------
              No .219888
              Yes .21843
              ---------------------------------
              p_diff .9683106
              ---------------------------------

              But it is actually the label of the independent variable that I want included in the table (race in your example).

              I tried simply changing e(depvar) to e(indepvars), but that did not work:

              esttab, cells("b") wide se nostar label stats(p_diff) collabels(Probability of placement breakdown) noabbrev nomtitle nonumber mlab("Mean", lhs("`:var lab `e(indepvars)''"))
              nothing found where name expected

              ---------------------------------
              Mean
              Probability of placement breakdown
              ---------------------------------
              No .219888
              Yes .21843
              ---------------------------------
              p_diff .9683106
              ---------------------------------

              Any suggestions?
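
              The message suggests that the inner macro expanded to nothing, so `:var lab` received no variable name. A quick check, sketched here on the nhanes2f example from #6, is to list the stored results and display both macros; as far as I know, regress stores e(depvar) but no e(indepvars).

              ```stata
              webuse nhanes2f, clear
              svyset psuid [pweight=finalwgt], strata(stratid)
              quietly svy: regress zinc ibn.race, nocons
              ereturn list                            // shows which e() results actually exist
              display "depvar:    [`e(depvar)']"
              display "indepvars: [`e(indepvars)']"   // empty brackets if the macro is not stored
              ```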

              • #8
                In the regression, there are two variables. It seems that you want the label of the categorical variable, e.g., gender in your case. From the stored results, the only way I see is to parse the command line to extract the variable name. From my example:



                Code:
                webuse nhanes2f
                svyset psuid [pweight=finalwgt], strata(stratid)
                svy: regress zinc ibn.race, nocons
                esttab, not nostar noobs label nonumbers mlab("Mean", lhs("`:var lab `=ustrregexra("`e(command)'",  "(.*ibn\.)(.*)\,\s+\w+", "$2")''"))
                Res.:

                Code:
                . esttab, not nostar noobs label nonumbers mlab("Mean", lhs("`:var lab `=ustrregexra("`e(command)'",  "(.*ibn\.)(.*)\,\s+\w+"
                > , "$2")''"))
                
                ---------------------------------
                1=white, 2=black, ~r         Mean
                ---------------------------------
                White                       87.50
                Black                       85.09
                Other                       83.57
                ---------------------------------
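
                To see what the regular expression extracts, it can be tried on a literal string first. The string below is an assumption about what e(command) holds after the regression. The second part is an alternative sketch, not the approach above: it takes the variable name from the column names of e(b) instead of parsing the command line, assuming the first coefficient is named like 1bn.race.

                ```stata
                * Part 1: the pattern applied to an assumed command string
                display ustrregexra("regress zinc ibn.race, nocons", "(.*ibn\.)(.*)\,\s+\w+", "$2")

                * Part 2 (alternative sketch): derive the name from the coefficient names
                webuse nhanes2f, clear
                svyset psuid [pweight=finalwgt], strata(stratid)
                quietly svy: regress zinc ibn.race, nocons
                local names : colnames e(b)                  // column names of the coefficient vector
                local first : word 1 of `names'              // first coefficient name
                local v = substr("`first'", strpos("`first'", ".") + 1, .)   // text after the first dot
                display "`v'"
                esttab, not nostar noobs label nonumbers mlab("Mean", lhs("`: variable label `v''"))
                ```

                If the label is cut off in the first column, as in the output above, widening that column with esttab's varwidth() option may help.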
