
  • When to use pweight

    Hi,

    I am using cmxtmixlogit to check the relationship between Y (the action chosen by the subject) and X (the potential loss incurred if the action is chosen).
    X depends on what action subjects chose before, leading to the distribution of X by Y shown below.
    Theory suggests that a higher X should significantly decrease the probability of action 1, significantly increase the probability of action 2, and decrease the probability of action 3. The effect on 3 may or may not be significant, but it should be smaller than the effect on 1.
    The table below makes me think that the regression might capture the fact that the proportion of 2 out of three actions increases with X. That is, X=14 has much less obs is not taken into account (or it works like outliers?).
    Thus, I am thinking of using pweight in the regression, and I am wondering whether that is a proper way to go.

    Thanks a lot

    Code:
    tab X Y,
    
               |             Y
      X        |        1           2          3 |     Total
    -----------+---------------------------------+----------
             2 |        21          0         63 |        84
           6.5 |        36         30        167 |       233
             7 |       523          0        492 |     1,015
            11 |        48        164      1,285 |     1,497
            14 |         0         68         50 |       118
            18 |       156          0      1,377 |     1,533
    -----------+---------------------------------+----------
         Total |       784        262      3,434 |     4,480
    
    cmxtmixlogit Y, casevars(X C) // C: controls
    margins, dydx(X) outcomes(,altsubpop) // I have unbalanced alternatives
    ------------------------------------------------------------------------------
                 |            Delta-method
                 |      dy/dx   std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
    X            |
        _outcome |
              1  |  -.0287474   .0047379    -6.07   0.000    -.0380335   -.0194614
              2  |   .0254602     .00438     5.81   0.000     .0168757    .0340448
              3  |   .0061816   .0035598     1.74   0.082    -.0007954    .0131586
    ------------------------------------------------------------------------------

  • #2
    Disclaimer: I know nothing about the -cmxtmixlogit- command other than that it is part of official Stata.

    Nevertheless, I can assure you that using pweights is not justified by anything you have said in #1. -pweights- are used (and are required) when the sample data has been gathered in a way that causes some people (or firms, or whatever the unit of analysis is) to have a higher probability of being recruited than others. For example, if a sample were designed by picking households at random from a list of households in the community and then selecting one person from each household to participate, people from large households would have lower probability of being selected than people from small households. -pweights- are used to adjust for this unequal probability of sampling. Or if a survey of quality of care in a clinic were done by pulling a random sample of all visits in the past year and contacting the patient who made the visit, then people who have more frequent appointments will be overrepresented in the final sample, and pweights are used to adjust for this unequal probability of sampling.
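
As a minimal sketch of the household example above: the toy data and the variable names (hhid, hhsize, y, p_select, pw) are hypothetical and are not from the original poster's study. One person is drawn per sampled household, so the within-household selection probability is 1/hhsize and the pweight is its inverse.

```stata
* Toy data: three sampled households, one respondent each (illustrative only)
clear
input hhid hhsize y
1 1 10
2 2 20
3 4 40
end

* Probability that this respondent was picked within the household
generate double p_select = 1/hhsize

* Sampling weight = inverse of the selection probability (equals hhsize here)
generate double pw = 1/p_select

* Weighted mean: each respondent stands in for hhsize people
mean y [pweight = pw]
```

The unweighted mean would treat the three respondents equally, whereas the weighted mean counts the respondent from the four-person household four times as heavily. That correction is meaningful only because the sampling design made selection probabilities unequal.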

    There is no other use for pweights whatsoever. And you have not described anything suggesting that the 4,480 people (or whatever they are) in your study were recruited in such a way.

    Added: Also, StataCorp has been producing high quality software for over 30 years now. It is inconceivable to me that they would create a command that did not properly deal with unequal numbers of observations, except in the circumstance where there is no proper way to deal with that. And in that case, the help file and documentation would say so, and the command itself would refuse to run with unequal data. So when you say "That is, X=14 has much less obs is not taken into account (or it works like outliers?)" I am quite confident that it is taken into account in whatever way is necessary to achieve correct results.

    Perhaps if you explain what you were hoping to accomplish through pweights, somebody who understands -cmxtmixlogit- will show you an appropriate way to achieve that goal (or explain why it can't be done at all, if that is the case.)
    Last edited by Clyde Schechter; 07 Aug 2021, 13:52.



    • #3
      Hi Clyde,

      Thanks for your reply.
      Let me first clarify the 4,480 obs. I have panel data with N=56 (subjects) and T=80 (the number of times each subject chooses an action).
      The three actions are not always available. A 0 in the table corresponds to a case in which the action is unavailable.

      I want to check whether the theory is consistent with the data. In particular, I am trying to figure out why the effect of X on Y=3 is positive, which is inconsistent with the theory that a higher X should make people less likely to play 3.
      However, there are far fewer obs with X=14 than with the other values, yet the proportion of Y=3 there is higher. More generally, the proportion of Y=3 increases with X. I am thinking that I should weight the proportion of Y=3 (and of 1 and 2) by the empirical likelihood of each value of X appearing in my data, so that I might get a clearer picture of the effect of X on subjects' likelihood of playing each action (using a graph, a table, or ideally a regression).
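
One way to read the weighting proposed here, written out as a hedged sketch: the symbols are assumptions introduced for illustration, with n_{xj} the number of obs with X=x and Y=j, n_x the total for X=x, n_j the total for Y=j, and n the overall total. If the conditional proportions are averaged using the empirical shares of X as weights, the row totals cancel:

```latex
\[
\sum_{x} \hat{P}(Y=j \mid X=x)\,\hat{P}(X=x)
  = \sum_{x} \frac{n_{xj}}{n_{x}} \cdot \frac{n_{x}}{n}
  = \frac{n_{j}}{n}
\]
```

Under that reading, the weighted average is simply the overall share of action j. It no longer varies with X, so on its own it cannot show how the choice probabilities change with X.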

      I mentioned outliers in #1 because I think that perhaps the few obs with X=14 pull the effect up so that it comes out positive, just as outliers can in OLS.

      That is why I am thinking of using pweight, which works in the way that I want. But it is not designed for such a case, so I am not confident that I actually should use it.
      Last edited by Jasmine Xu; 07 Aug 2021, 14:42.



      • #4
        I just realised I made a mistake using -tab-, since I am switching between different data structures. Here is the right one: just swap actions 2 and 3. Sorry about that.
        Code:
                   |             Y
                 X |         1          3          2 |     Total
        -----------+---------------------------------+----------
                 2 |        21          0         63 |        84 
               6.5 |        36         30        167 |       233 
                 7 |       523          0        492 |     1,015 
                11 |        48        164      1,285 |     1,497 
                14 |         0         68         50 |       118 
                18 |       156          0      1,377 |     1,533 
        -----------+---------------------------------+----------
             Total |       784        262      3,434 |     4,480
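
The proportions of each action within each value of X, discussed in #3, can be read off directly as row percentages. The sketch below is hedged: it re-enters the cell counts from the corrected table above, assuming its column order is 1, 3, 2, and the count variable n is a name introduced here for illustration. Cells shown as 0 (action unavailable) are simply omitted.

```stata
* Cell counts copied from the corrected table; n is a hypothetical count variable
clear
input X Y n
2    1   21
2    2   63
6.5  1   36
6.5  3   30
6.5  2  167
7    1  523
7    2  492
11   1   48
11   3  164
11   2 1285
14   3   68
14   2   50
18   1  156
18   2 1377
end

* Row percentages: share of each action within each value of X
tabulate X Y [fweight = n], row nofreq
```

With the original choice-level data in memory, `tabulate X Y, row nofreq` gives the same table without the frequency weights. This is purely descriptive and pools all 56 subjects, so it ignores the panel structure and the availability of alternatives that -cmxtmixlogit- models.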
