Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Creating a table of different averages given they comply with a certain criteria and across different variables of interest

    Hi, I am trying to create a table that includes averages of variables given they fullfill something.
    Basically I have 2 dummy variables named “seller” and “customer”. I want to characterise my table as seller: (seller==1), customer: (customer==1) and unidentified: seller==0 & customer==0, and all.

    I have several variables I want averages of, i.e “size”, “bm”, “roa”, “leverage_ratio” ++.

    I basically want a table looking something like this:

    Averages
    Seller Customer Unidentified All
    Size …… ………. ……. .......

    bm ...... ............ ........ .........

    roa ...... ............ ............. .........

    leverage

    So what I attempted was do something like this, however it didn't turn out pretty. It did have size, bm etc after each other, but instead of understanding that there are 4 categories, it just put additional ones next to each other like this:
    Seller Customer Unidentified All Seller Customer Unidentified

    mean size if seller==1
    estimates store size1
    mean size if customer==1
    estimates store size2
    mean size if seller==0 & customer==0

    estimates store size3
    mean size
    estimates store size4

    mean bm if seller==1
    estimates store bm1
    mean bm if customer==1
    estimates store bm2
    .

    ​.
    .

    esttab size1 size2 size3 size4 bm1 bm2 bm3 bm4

    I only need the averages/means, not significance etc. Could someone please show me how to create a table?

    Best,
    Emilie


  • #2
    Here is one way:
    Code:
    statsby _b, by(seller customer) total clear: mean size bm
    reshape long _b_, i(seller customer) j(measure) string
    gen byte type = 3 - (2*seller + customer)
    lab def type 1 "Seller" 2 "Customer" 3 "Unidentified"
    lab val type type
    tabdisp measure type, c(_b_) totals format(%9.3f)

    Comment


    • #3
      Hi Phil, thanks for the reply!
      I have a few issues though, and I have pasted the output below.

      First, the results don't seem to be correct.
      I.e for size variable
      using
      mean size if seller==1 yields 5. The output below states 5.144.
      mean size if customer==1 yields 7.84, whereas below states 8.297.

      Moreover, the command seems to clear all of my data, so that I am left with nothing but the results.
      Am I doing something wrong here?
      ​How does the method below consider that seller=1 or customer=1?

      Best,
      Emilie



      [statsby _b, by(seller customer) total clear: mean size bm
      (running mean on estimation sample)

      command: mean size bm
      by: seller customer

      Statsby groups
      ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
      ....

      . reshape long _b_, i(seller customer) j(measure) string
      (note: j = bm size)

      Data wide -> long
      -----------------------------------------------------------------------------
      Number of obs. 5 -> 10
      Number of variables 4 -> 4
      j variable (2 values) -> measure
      xij variables:
      _b_bm _b_size -> _b_
      -----------------------------------------------------------------------------

      . gen byte type = 3 - (2*seller + customer)
      (2 missing values generated)

      . lab def type 1 "Seller" 2 "Customer" 3 "Unidentified"

      . lab val type type

      . tabdisp measure type, c(_b_) totals format(%9.3f)

      --------------------------------------------------------------------------------
      | type
      measure | 0 Seller Customer Unidentified Total
      ----------+---------------------------------------------------------------------
      bm | 0.883 0.940 1.050 25.741 16.111
      size | 7.656 5.144 8.297 5.601 5.707
      --------------------------------------------------------------------------------]

      Comment


      • #4
        Originally posted by Emilie Gronsund View Post
        First, the results don't seem to be correct. I.e for size variable using
        mean size if seller==1
        yields 5. The output below states 5.144.
        This difference is due to the presence of observations for which seller==1 and customer==1. The means for those observations appear in the first (unlabeled) column of the table in your output above. Note that in your original code, you're counting such observations twice (i.e., they appear in both the seller and customer means). You need to decide how you want to handle those observations, and modify the code I gave accordingly.

        Originally posted by Emilie Gronsund View Post
        Moreover, the command seems to clear all of my data, so that I am left with nothing but the results.
        Right (that's a consequence of using statsby). You could, if you wanted, work a bit harder to retain the data while calculating the means, but in many cases, it'd be simpler just to use preserve/restore.
        Last edited by Phil Schumm; 27 Apr 2014, 10:38.

        Comment


        • #5
          Thanks Phil,

          May I ask how to modify the code? I've been steering at it for ages, and don't seem to understand. Now, I do not want the 0 section in my table. I am trying to look at some statistics on how customers and suppliers differ fundamentally. So if a firm is both a customer and a supplier, it would be counted in both average means for seller and customer.

          Is there a way to combine tables? Like if I did it the way I suggested first?
          I.e lets say you calculate the 3 different means of size, and 3 different for bm. If you combine the first three into a table and then the other into another.

          mean size if seller==1
          estimates store size1
          mean size if customer==1
          estimates store size2
          mean size if seller==0 & customer==0

          estimates store size3
          esttab size1 size2 size3


          mean bm if seller==1
          estimates store bm1
          mean bm if customer==1
          estimates store bm2
          mean bm
          if seller==0 & customer==0
          estimates store bm3
          esttab bm1 bm2 bm3

          And then combine these two tables?
          Means:
          Seller Customer Unidentified
          Size
          Bm
          Thanks again,

          Best,
          Emilie

          Comment


          • #6
            I've not been following this, but it seems likely that you need a new composite variable, say

            Code:
            gen role = 1 if seller == 1 & customer == 0
            replace role = 2 if customer == 1 & seller == 0
            replace role = 3 if customer == 1 & seller == 1
            Naturally, there are briefer ways to do that, such as

            Code:
            gen role = seller + 2 * customer
            Then define value labels as you wish.

            Comment


            • #7
              Originally posted by Emilie Gronsund View Post
              May I ask how to modify the code?
              Based on your original idea of using estimates (and thereby retaining the original data), the following code should generate what you want:
              Code:
              mean size bm if seller
              estimates store Seller
              mean size bm if customer
              estimates store Customer
              mean size bm if !seller & !customer
              estimates store Unidentified
              mean size bm
              estimates store Total
              
              estimates table Seller Customer Unidentified Total, model(12) b(%9.3f)
              You can use the various options for estimates table to tart up the final table, if desired.

              Comment

              Working...
              X