  • Known problem in ologit using pweights?

    Hi all,
    I'm using Stata 16.1 and seem to have hit a bizarre bug. I'm running a series of ologit models on the same data (testing a complex model against a simpler one, and also estimating a weighted model). When I do this a few times in a row, the weighted model, in which I use [pweight=weight], gives very different results on each run of that same model (all other, unweighted models give the same results every run). Running "clear all" and then rerunning doesn't help, but if I completely close Stata and restart it, I get the original results from the first run again. Very mysterious.
    I will upload the syntax and data if needed, but I wanted to first check if this is a known issue?

    Best,
    Annette

  • #2
    I cannot reproduce the problem with a minimal example.

    Code:
    clear all
    version 16.1
    set seed 123
    webuse nhanes2
    gen w = runiform()
    
    
    ologit hlthstat i.sex age bpsystol bpdiast
    ologit hlthstat i.sex age bpsystol bpdiast
    ologit hlthstat i.sex age bpsystol bpdiast
    ologit hlthstat i.sex age bpsystol bpdiast [pweight=w]
    ologit hlthstat i.sex age bpsystol bpdiast [pweight=w]
    ologit hlthstat i.sex age bpsystol bpdiast [pweight=w]
    Please share more information on data and code.
    Best wishes

    (Stata 16.1 MP)



    • #3
      Okay, I guess I'll post the data and syntax then. Note that the data below cover only some of the 18 variables I actually use, because dataex does not allow more. Here we go:
      Code:
      input byte(medtotaal3 geslacht doelp1_lftcat5) double stroomtot_6c float(Gpijl12obbg Pandergew_all Paanh_boei Pafw_noodweer Pafw_derden Pafw_collega Paanvul_gevaar Paanvul_wapen Pinzet_onbekend)
      0 1 3 9 2 1 0 1 0 1 1 0 0
      0 1 5 2 2 0 0 1 0 1 1 0 0
      0 1 2 2 1 1 0 0 0 0 0 0 1
      0 9 4 3 0 0 0 1 1 1 0 0 0
      0 2 3 2 2 1 0 0 0 0 0 0 0
      0 1 1 1 0 0 0 0 0 0 0 0 1
      0 1 4 1 0 0 0 0 0 0 1 1 0
      0 1 3 1 2 1 0 0 0 0 0 0 1
      0 2 3 1 1 1 0 1 1 0 1 1 0
      0 1 3 1 99 0 0 0 0 0 0 0 1
      0 1 2 9 1 1 0 0 0 0 0 0 1
      0 1 4 2 2 0 0 0 0 0 0 0 1
      0 1 3 0 0 1 1 1 1 0 0 0 0
      0 1 2 2 99 1 0 0 0 0 1 0 0
      0 1 4 3 2 1 1 1 0 0 0 0 0
      0 1 2 3 0 1 0 0 0 0 1 0 0
      0 1 3 2 0 0 1 0 0 0 0 0 0
      0 1 4 9 99 1 0 0 0 0 0 0 1
      0 1 2 4 1 1 0 1 0 1 0 0 0
      0 1 4 1 2 1 0 0 0 0 0 0 1
      0 1 4 1 2 1 1 1 0 1 1 0 0
      0 1 3 1 2 1 0 1 0 1 1 0 0
      0 1 2 1 0 1 0 0 0 0 1 0 0
      0 2 4 1 99 0 0 0 0 0 1 1 0
      0 1 2 9 99 0 0 0 0 0 0 0 1
      0 1 2 1 0 1 0 0 0 0 0 0 1
      0 1 4 1 99 0 0 1 0 0 0 0 0
      0 1 1 9 1 1 0 0 0 0 0 0 1
      0 1 2 9 0 0 0 0 0 0 0 0 1
      0 1 1 1 99 0 0 0 0 0 1 1 0
      0 1 1 1 2 0 0 0 0 0 0 0 1
      0 1 3 2 1 0 0 0 0 0 0 0 1
      0 1 2 9 2 1 1 0 0 0 1 0 0
      0 9 3 3 1 1 0 0 0 0 1 0 0
      0 1 4 2 99 0 0 0 0 0 0 0 1
      0 1 4 1 99 1 1 0 0 0 1 0 0
      0 1 3 3 0 1 0 0 0 0 1 1 0
      0 9 3 0 1 0 0 0 0 0 1 0 0
      0 1 4 9 2 1 0 0 0 0 0 0 1
      0 1 4 3 1 1 0 0 0 0 1 0 0
      0 1 4 1 99 1 0 0 0 0 1 0 0
      0 1 2 1 2 0 0 0 0 0 0 0 1
      0 1 2 1 0 0 0 1 0 0 0 0 0
      0 1 2 2 1 0 0 0 0 0 0 0 1
      0 1 2 1 2 0 0 0 0 0 1 1 0
      0 1 4 2 0 1 0 1 1 1 1 1 0
      0 1 3 2 1 1 0 0 0 0 1 0 0
      0 1 3 9 1 1 0 0 0 0 1 0 0
      0 1 2 1 99 1 1 1 1 1 0 0 0
      0 2 2 1 2 0 0 0 0 0 1 1 0
      0 1 2 9 99 1 1 0 0 0 0 0 0
      0 1 4 0 0 1 0 0 0 0 0 0 1
      0 1 4 3 0 0 0 0 0 0 1 0 0
      0 1 4 1 0 0 1 1 0 1 1 1 0
      0 1 3 2 1 1 1 1 0 0 1 0 0
      0 1 4 2 1 0 0 0 0 0 0 0 1
      0 1 3 1 0 1 0 0 0 0 0 0 1
      0 1 3 9 2 0 0 1 1 1 0 0 0
      0 1 4 1 1 0 0 0 0 0 1 0 0
      0 1 4 9 0 1 0 1 0 1 1 0 0
      0 1 2 2 2 0 1 1 0 1 0 0 0
      0 1 4 2 2 0 0 0 0 0 1 1 0
      0 1 3 9 1 0 0 0 0 0 1 1 0
      0 1 3 3 0 1 0 1 0 1 1 0 0
      0 1 2 1 0 1 0 0 0 0 0 0 1
      0 1 3 1 0 1 0 0 0 0 0 0 1
      0 1 4 2 1 0 0 0 0 0 0 0 1
      0 1 3 9 2 0 0 1 1 0 1 0 0
      0 1 2 1 2 0 0 1 0 0 1 0 0
      0 1 2 3 2 1 1 1 0 0 0 0 0
      0 1 2 0 0 1 0 1 0 1 0 0 0
      0 2 4 2 0 0 0 0 0 0 0 0 1
      0 1 4 3 2 1 0 0 0 0 1 0 0
      0 1 4 3 1 1 0 1 0 1 0 0 0
      0 2 4 9 1 1 1 1 0 1 1 0 0
      0 1 3 1 1 1 0 0 0 0 1 0 0
      0 1 2 2 0 1 0 0 0 0 0 0 1
      0 1 1 3 1 1 0 0 0 0 0 0 1
      0 1 4 2 2 0 0 0 0 0 1 0 0
      0 1 4 1 0 1 0 1 0 1 0 0 0
      0 9 4 0 1 0 0 0 0 0 1 1 0
      0 1 4 1 1 0 0 0 0 0 0 0 1
      0 1 2 9 0 0 0 0 0 0 1 0 0
      0 1 5 9 0 1 0 1 1 0 1 1 0
      0 1 4 3 2 0 0 1 0 1 1 1 0
      0 1 3 1 2 1 0 0 0 0 1 0 0
      0 1 2 1 99 1 0 1 1 1 1 0 0
      0 1 2 9 1 0 0 0 0 0 0 0 1
      0 1 3 2 0 0 0 0 0 0 0 0 1
      0 1 4 0 0 1 0 0 0 0 0 0 1
      0 1 2 1 0 1 0 0 0 0 1 0 0
      0 1 3 0 0 1 0 0 0 0 0 0 1
      0 1 2 9 2 1 0 0 0 0 1 0 0
      0 1 3 2 99 0 0 0 0 0 0 0 1
      0 1 4 0 2 1 0 0 0 0 1 1 0
      0 2 3 2 99 0 0 0 0 0 1 0 0
      0 1 . 9 0 0 0 0 0 0 0 0 1
      0 1 2 3 2 0 0 0 0 0 1 0 0
      0 1 4 3 99 0 1 0 0 0 0 0 0
      0 1 3 9 99 1 0 0 0 0 1 0 0
      end
      label values medtotaal3 labels0
      label def labels0 0 "0. Geen letsel", modify
      label values geslacht geslacht
      label def geslacht 1 "1. Man", modify
      label def geslacht 2 "2. Vrouw", modify
      label def geslacht 9 "9. Onbekend", modify
      label values doelp1_lftcat5 lftcat5
      label def lftcat5 1 "1. 13-17", modify
      label def lftcat5 2 "2. 18-25", modify
      label def lftcat5 3 "3. 26-35", modify
      label def lftcat5 4 "4. 36-55", modify
      label def lftcat5 5 "5. 56-84", modify
      label values stroomtot_6c stroomtot_6c
      label def stroomtot_6c 0 "0. geen stroom", modify
      label def stroomtot_6c 1 "1. weinig: 0-4 sec", modify
      label def stroomtot_6c 2 "2. standaard: 5 sec", modify
      label def stroomtot_6c 3 "3. meer dan standaard: 6-10 sec", modify
      label def stroomtot_6c 4 "4. veel: 11-26 sec", modify
      label def stroomtot_6c 9 "9. onbekend", modify
      label values Gpijl12obbg Gpijl12obbg
      label def Gpijl12obbg 0 "beide pijltjes geen obbg", modify
      label def Gpijl12obbg 1 "een pijltje geen obbg", modify
      label def Gpijl12obbg 2 "beide pijltjes wel obbg", modify
      label def Gpijl12obbg 99 "onbekend", modify

      Next, the syntax (again, only some of the predictor variables are included):
      Code:
      gen tel = 1
      bysort medtotaal3: gen freq = sum(tel)
      gen weight = 1 / freq
      tab weight
      * Model weighting for the right skewed distribution of ordinal variable medtotaal3
      ologit medtotaal3 i.geslacht ib3.doelp1_lftcat5 i.stroomtot_6c ib2.Gpijl12obbg i.Pandergew_all i.Paanh_boei i.Pafw_noodweer i.Pafw_derden i.Pafw_collega i.Paanvul_gevaar i.Paanvul_wapen i.Pinzet_onbekend [pweight=weight] //n= 864//
      estat ic

      I know I could use an nbreg or gologit2 model for the dispersion of medtotaal3, but I would really prefer to keep its ordinality.

      Best,
      Annette



      • #4
        The ologit can't run on your data example because all observations of medtotaal3 are 0.



        • #5
          I see that the dataex selection of my data contains only cases with value 0 on medtotaal3. In fact, the variable has 4 categories, ranging from 0 to 3, but it is heavily skewed: 613 of the 895 cases have value 0. Apparently that includes all of the first 100 cases, which are the ones dataex selected.



          • #6
            Originally posted by Hemanshu Kumar
            The ologit can't run on your data example because all observations of medtotaal3 are 0.
            Yes, in this part of the data. Not in the real data, but dataex takes only 100 cases. The actual distribution of medtotaal3 is:

            medtotaal3                                  Freq.
            0. Geen letsel                                613
            1. Gering valletsel, geen hoofdletsel         140
            2. Gering hoofdletsel                          89
            3. Ernstig letsel of ernstige symptomen        53

            Total                                         895

            I don't know how to make dataex take a proportional selection of cases instead of the first 100.



            • #7
              You could randomize the order of your observations with something like:
              Code:
              set seed 123
              gen x = runiform()
              sort x
              and then take the first 100. Try a few different seeds if the first 100 still only contain a single value of medtotaal3.

              If you want to draw proportionately from the different values of medtotaal3, you could instead do something like:
              Code:
              set seed 123
              sample 10, by(medtotaal3)
              Last edited by Hemanshu Kumar; 02 Apr 2025, 11:35.
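
The proportional draw above could be combined with dataex roughly as follows. This is only a sketch: it assumes the full data set from #3 is in memory, and the preserve/restore wrapper, the tabulation check, and the shortened variable list are additions for illustration rather than part of the original suggestion. As noted earlier in the thread, dataex lists the first 100 observations.

```stata
preserve                          // keep the full data set recoverable
set seed 123
sample 10, by(medtotaal3)         // draw about 10% within each outcome category
tab medtotaal3, missing           // check that every category is represented
dataex medtotaal3 geslacht doelp1_lftcat5 stroomtot_6c
restore                           // back to the full data set
```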



              • #8
                Indeed, I could have thought of that, thanks. It worked well, so here is the new data:

                input byte(medtotaal3 geslacht doelp1_lftcat5) double stroomtot_6c float(Gpijl12obbg Pandergew_all Paanh_boei Pafw_noodweer Pafw_derden Pafw_collega Paanvul_gevaar Paanvul_wapen Pinzet_onbekend)
                . 1 4 4 99 1 0 0 0 0 1 1 0
                0 1 3 1 99 1 0 0 0 0 0 0 1
                0 1 2 9 0 0 0 0 0 0 0 0 1
                2 1 3 1 1 1 0 0 0 0 0 0 1
                0 1 4 1 0 0 1 1 0 1 1 1 0
                0 1 4 9 99 1 0 1 0 1 1 0 0
                0 1 3 0 99 1 0 0 0 0 0 0 1
                1 1 4 2 99 0 0 0 0 0 1 0 0
                0 1 4 3 99 1 0 0 0 0 1 0 0
                0 1 2 1 2 0 0 0 0 0 0 0 1
                . 1 3 1 99 1 0 0 0 0 0 0 1
                0 1 4 1 0 1 0 0 0 0 1 1 0
                2 1 . 2 1 0 0 0 0 0 0 0 1
                1 1 3 2 99 0 1 0 0 0 0 0 0
                0 1 4 2 1 0 0 0 0 0 1 1 0
                1 1 2 3 2 1 0 0 0 0 0 0 1
                0 9 3 0 1 0 0 0 0 0 1 0 0
                1 1 4 2 1 0 0 0 0 0 0 0 1
                . 1 2 9 99 1 0 0 0 0 1 1 0
                0 1 2 1 99 0 0 1 0 0 1 1 0
                2 1 4 2 1 0 0 1 0 1 0 0 0
                3 2 4 1 1 0 0 0 0 0 1 1 0
                1 1 5 2 1 0 0 1 0 0 1 0 0
                0 1 3 1 99 0 0 0 0 0 1 1 0
                . 1 3 1 99 0 0 0 0 0 0 0 1
                1 1 2 1 99 0 0 0 0 0 0 0 1
                0 1 1 9 1 0 0 0 0 0 1 0 0
                0 2 4 1 99 0 0 0 0 0 1 1 0
                2 1 3 1 99 1 1 1 1 1 1 0 0
                1 1 3 2 1 0 0 0 0 0 0 0 1
                0 9 2 3 2 0 0 1 0 1 1 1 0
                0 1 5 3 1 1 0 0 0 0 1 0 0
                0 1 3 3 1 1 0 1 0 1 0 0 0
                0 1 3 3 1 1 0 0 0 0 1 0 0
                0 1 2 2 2 0 0 0 0 0 0 0 1
                0 1 4 2 0 1 0 1 0 1 0 0 0
                0 2 4 3 0 0 0 0 0 0 1 0 0
                0 1 4 3 1 1 0 0 0 0 1 1 0
                0 2 1 2 1 1 0 1 1 1 0 0 0
                . 1 4 2 99 1 0 0 0 0 0 0 1
                2 1 2 3 0 0 1 1 0 0 0 0 0
                . 1 4 9 2 1 0 0 0 0 0 0 1
                0 1 3 0 0 1 1 1 1 0 0 0 0
                3 1 4 1 99 0 0 0 0 0 1 1 0
                3 1 2 2 2 0 0 0 0 0 1 1 0
                0 2 4 2 1 0 0 0 0 0 0 0 1
                0 1 4 2 1 1 0 0 0 0 1 1 0
                0 1 5 1 99 0 0 0 0 0 0 0 1
                0 1 4 9 2 1 0 0 0 0 0 0 1
                . 1 2 3 0 1 0 0 0 0 0 0 1
                0 1 3 2 2 0 0 0 0 0 1 0 0
                1 1 3 1 0 0 0 0 0 0 1 0 0
                0 1 3 2 2 0 0 0 0 0 0 0 1
                . 1 3 1 99 1 1 1 0 0 0 0 0
                0 1 1 1 1 0 0 0 0 0 0 0 1
                3 1 2 3 2 0 0 0 0 0 0 0 1
                1 1 4 9 2 0 0 0 0 0 1 1 0
                1 1 3 2 1 0 0 0 0 0 1 1 0
                0 1 3 9 1 1 0 0 0 0 0 0 1
                0 1 2 9 1 0 0 0 0 0 0 0 1
                3 1 2 2 2 0 1 0 0 0 1 0 0
                . 1 2 4 99 1 0 0 0 0 1 1 0
                3 1 3 1 0 0 0 0 0 0 0 0 1
                0 1 4 3 1 1 0 0 0 0 0 0 1
                3 1 4 9 2 0 0 1 0 0 1 1 0
                . 1 4 3 99 0 0 0 0 0 0 0 1
                0 1 3 4 0 1 0 0 0 0 1 0 0
                3 1 2 2 1 0 0 0 0 0 0 0 1
                0 1 3 9 0 0 0 0 0 0 1 1 0
                0 1 3 1 99 1 0 0 0 0 0 0 1
                . 1 4 2 99 0 0 0 0 0 0 0 1
                1 1 4 1 1 0 0 0 0 0 0 0 1
                2 1 2 1 1 0 0 0 0 0 1 0 0
                0 1 4 1 2 1 1 1 0 1 1 0 0
                0 2 2 1 0 0 0 1 1 1 1 1 0
                0 1 4 0 1 0 0 0 0 0 0 0 1
                0 1 4 3 1 1 1 0 0 0 0 0 0
                0 1 3 1 . 0 0 0 0 0 0 0 1
                0 1 4 9 2 1 0 0 0 0 1 0 0
                0 1 4 1 1 0 0 1 1 0 1 0 0
                2 1 4 1 99 1 0 0 0 0 1 1 0
                2 1 . 4 2 1 1 1 1 1 1 1 0
                0 1 4 1 1 0 0 0 0 0 1 0 0
                2 1 4 3 2 0 1 1 1 1 1 0 0
                0 1 3 3 2 1 0 0 0 0 1 0 0
                3 1 4 1 0 1 0 0 0 0 0 0 1
                0 1 4 4 2 0 0 0 0 0 0 0 1
                0 1 2 9 0 0 0 0 0 0 0 0 1
                . 1 4 9 99 0 0 0 0 0 0 0 1
                . 1 3 3 99 1 0 0 0 0 1 0 0
                1 1 2 1 99 0 0 0 0 0 0 0 1
                0 1 4 3 2 0 0 1 0 0 0 0 0
                0 1 3 9 1 1 0 0 0 0 1 0 0
                0 1 1 1 1 0 0 0 0 0 1 0 0
                0 1 4 2 0 0 0 0 0 0 1 0 0
                0 1 3 9 1 1 0 0 0 0 1 0 0
                0 1 3 2 0 0 0 0 0 0 0 0 1
                0 1 3 1 1 0 0 1 0 1 1 1 0
                0 1 3 1 0 1 0 0 0 0 1 1 0
                2 1 4 3 1 0 0 0 0 0 0 0 1
                end
                label values medtotaal3 labels0
                label def labels0 0 "0. Geen letsel", modify
                label def labels0 1 "1. Gering valletsel, geen hoofdletsel", modify
                label def labels0 2 "2. Gering hoofdletsel", modify
                label def labels0 3 "3. Ernstig letsel of ernstige symptomen", modify
                label values geslacht geslacht
                label def geslacht 1 "1. Man", modify
                label def geslacht 2 "2. Vrouw", modify
                label def geslacht 9 "9. Onbekend", modify
                label values doelp1_lftcat5 lftcat5
                label def lftcat5 1 "1. 13-17", modify
                label def lftcat5 2 "2. 18-25", modify
                label def lftcat5 3 "3. 26-35", modify
                label def lftcat5 4 "4. 36-55", modify
                label def lftcat5 5 "5. 56-84", modify
                label values stroomtot_6c stroomtot_6c
                label def stroomtot_6c 0 "0. geen stroom", modify
                label def stroomtot_6c 1 "1. weinig: 0-4 sec", modify
                label def stroomtot_6c 2 "2. standaard: 5 sec", modify
                label def stroomtot_6c 3 "3. meer dan standaard: 6-10 sec", modify
                label def stroomtot_6c 4 "4. veel: 11-26 sec", modify
                label def stroomtot_6c 9 "9. onbekend", modify
                label values Gpijl12obbg Gpijl12obbg
                label def Gpijl12obbg 0 "0. beide pijltjes geen obbg", modify
                label def Gpijl12obbg 1 "1. een pijltje geen obbg", modify
                label def Gpijl12obbg 2 "2. beide pijltjes wel obbg", modify
                label def Gpijl12obbg 99 "99. onbekend", modify



                • #9
                  Thanks for sharing. The problem is that sorting is not stable. To resolve the issue, change:

                  Code:
                  bysort medtotaal3: gen freq = sum(tel)
                  to
                  Code:
                  sort medtotaal3, stable
                  by medtotaal3: gen freq = sum(tel)
                  EDIT: After having thought about this for a bit longer, what you probably really want is:

                  Code:
                  bysort medtotaal3: egen freq = sum(tel)
                  With egen, values are identical within a group. Sorting is irrelevant here as well.
                  Last edited by Felix Bittmann; 02 Apr 2025, 13:21.
                  Best wishes

                  (Stata 16.1 MP)
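
The difference between the two commands can be seen on toy data. With gen, sum() is a running sum, so freq counts 1, 2, 3, ... within each group, and 1/freq depends on where an observation happens to land inside its group. With egen, every observation in a group gets the group total. A minimal sketch; the variables id, g, run_freq and grp_freq are made up for illustration, and total() is, to my knowledge, the documented name of the egen function that sum() refers to:

```stata
clear
input byte(id g)
1 0
2 0
3 0
4 1
5 1
end
gen tel = 1

* Running sum within group: 1, 2, 3 in group 0 and 1, 2 in group 1.
* Which id receives which value depends on the order within the group.
bysort g: gen run_freq = sum(tel)

* Group total: 3 for every observation in group 0 and 2 in group 1,
* whatever the order within the group.
bysort g: egen grp_freq = total(tel)

list id g run_freq grp_freq, sepby(g)
```

With the running sum, a weight of 1/run_freq varies across observations of the same group; with the group total, 1/grp_freq is one weight per category.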



                  • #10
                    Originally posted by Felix Bittmann
                    Thanks for sharing. The problem is that sorting is not stable. To resolve the issue, change:

                    Code:
                    bysort medtotaal3: gen freq = sum(tel)
                    to
                    Code:
                    sort medtotaal3, stable
                    by medtotaal3: gen freq = sum(tel)
                    EDIT: After having thought about this for a bit longer, what you probably really want is:

                    Code:
                    bysort medtotaal3: egen freq = sum(tel)
                    With egen, values are identical within a group. Sorting is irrelevant here as well.

                    Dear Felix,

                    Yes, this does it! The estimates are stable now, using egen for creating the weight variable. Thanks a lot for solving this!

                    best,
                    Annette
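
One way to check this kind of stability without restarting Stata is to reload the data, rebuild the weights, and refit the model twice, then compare the coefficient vectors. The sketch below uses the nhanes2 data from #2 rather than the original data; the matrix names b1 and b2 and the short covariate list are illustrative choices. The data are reloaded inside the loop because data already sorted by the grouping variable would not be re-sorted.

```stata
forvalues r = 1/2 {
    webuse nhanes2, clear
    gen tel = 1
    * group total, identical for all observations in a category
    bysort hlthstat: egen freq = total(tel)
    gen weight = 1 / freq
    quietly ologit hlthstat i.sex age [pweight=weight]
    matrix b`r' = e(b)            // store the coefficients of run `r'
}
* Relative difference between the two coefficient vectors;
* this should be 0 or on the order of rounding error.
display mreldif(b1, b2)
```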



                    • #11
                      A slightly off-topic piece of advice: You should never use the stable option of sort. If you need a stable sort order, define additional variables that establish it.
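
A sketch of what that could look like for the running-sum command discussed in this thread, on toy data. The variable obs_id is a hypothetical name; any set of variables that uniquely identifies the observations would serve the same purpose.

```stata
clear
input byte medtotaal3
0
0
0
1
1
end
gen tel = 1

gen long obs_id = _n              // record the current order once
isid obs_id                       // confirm it uniquely identifies observations

* Sorting on obs_id within medtotaal3 leaves no ties, so the running sum
* is assigned to the same observations on every run.
bysort medtotaal3 (obs_id): gen freq = sum(tel)
list, sepby(medtotaal3)
```

Note that this only makes the running count reproducible; freq still differs within a category, so the egen approach from #9 remains the one that yields a single weight per category.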

