  • A question of coefficient of regression

    Dear Statalist,

    As shown in the table below, I want to know why the coefficient for category 5 is so large.

    The variable groupsibc has 7 categories. My guess is that the sample in this category is too small. Is that right?

    [Attached image: _20250112165659.png]

    Thank you for your time and assistance.

    Best regards,

    Bonnie

  • #2
    Hard to say anything without a sight of your command, your sample size and your data.

    Code:
    tab groupsibc, su(movec)
    may be instructive, as also (e.g.)

    Code:
    dotplot movec, over(groupsibc)


    • #3
      I basically agree with Nick. However, given that the standard error of the coefficient for 5.groupsibc is not much different from that of the other levels of groupsibc, and certainly isn't disproportionately large, it is not very likely that the N for that subgroup is the culprit here. But this is very indirect evidence: follow Nick's advice to get a firmer grasp on the situation.
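
      One way to look at the subgroup directly, sketched here under the assumption that movec is a binary outcome and groupsibc is the seven-level grouping variable, is to cross-tabulate the two and see how many cases fall in each cell:

      Code:
      * counts and row percentages of movec within each level of groupsibc
      tabulate groupsibc movec, row

      A cell with very few observations, or none at all, would point toward a sample-size explanation; reasonably filled cells would not.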


      • #4
        It isn’t obvious to me that the coefficient for 5 is large. The coefficients for the other groups (including the omitted category) are all very close to zero. Does it make substantive sense that group 5 scores about 20 points lower than the other groups? If, say, group 5 was disadvantaged relative to the other groups, a 20 points lower score might make perfect sense.

        The numbers in and of themselves probably wouldn’t concern me until I found them substantively implausible.

        If you do find the numbers implausible or you are not sure if they are plausible, follow the suggestions of Nick and Clyde.

        Also, there are several other variables in your model. Do you have interactions involving groupsibc? Interactions can make main effects seem weird if you don’t know how to interpret them.
        -------------------------------------------
        Richard Williams, Notre Dame Dept of Sociology
        StataNow Version: 18.5 MP (2 processor)

        EMAIL: [email protected]
        WWW: https://www3.nd.edu/~rwilliam


        • #5
          Originally posted by Nick Cox
          Hard to say anything without a sight of your command, your sample size and your data.

          Code:
          tab groupsibc, su(movec)
          may be instructive, as also (e.g.)

          Code:
          dotplot movec, over(groupsibc)
          **===========================

          Sorry about my vague question.

          Code:
          cloglog movec i.groupsibc

          movec is a dummy variable (0,1).
          groupsibc is a categorical variable with seven categories (0 is the reference group).

          In fact, I am not sure whether the coefficient is right; it just looks strange. I have encountered this issue on occasion, even in OLS regression: one of the coefficients of a categorical variable with multiple categories looks much larger than the others, which confuses me. How can I rule out a statistical problem when this happens? In other words, how can I show that this conspicuous coefficient is correct and not the result of a statistical error?

          I hope I have described the problem clearly. Thank you for your help!


          • #6
            Originally posted by Richard Williams
            It isn’t obvious to me that the coefficient for 5 is large. The coefficients for the other groups (including the omitted category) are all very close to zero. Does it make substantive sense that group 5 scores about 20 points lower than the other groups? If, say, group 5 was disadvantaged relative to the other groups, a 20 points lower score might make perfect sense.

            The numbers in and of themselves probably wouldn’t concern me until I found them substantively implausible.

            If you do find the numbers implausible or you are not sure if they are plausible, follow the suggestions of Nick and Clyde.

            Also, there are several other variables in your model. Do you have interactions involving groupsibc? Interactions can make main effects seem weird if you don’t know how to interpret them.

            Thank you for your answer!

            There's no interaction in my model. It just has some common variables, such as gender and age.

            Code:
            cloglog movec i.groupsibc controls

            Maybe this coefficient is right; I was wondering if there is any estimator to help me check if this coefficient is correct.
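
            A possible check, offered as a sketch rather than a definitive recipe: fit the model with factor-variable notation and compare its predicted probabilities with the raw proportions in each group. The control names gender and age below are hypothetical placeholders for whatever controls are actually in the model.

            Code:
            * raw proportion of movec == 1 in each category of groupsibc
            tabulate groupsibc, summarize(movec)
            * cloglog with groupsibc entered as a categorical variable (0 is the base level)
            * gender and age are hypothetical names for the controls
            cloglog movec i.groupsibc i.gender age
            * average predicted probability of movec == 1 for each category
            margins groupsibc

            If the predicted probability for category 5 is roughly in line with its raw proportion, the coefficient is describing the data rather than signalling an estimation problem.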

            Thank you!



            Last edited by Yuyao Li; 16 Jan 2025, 23:43.


            • #7
              Originally posted by Clyde Schechter
              I basically agree with Nick. However, given that the standard error of the coefficient for 5.groupsibc is not much different from that of the other levels of groupsibc, and certainly isn't disproportionately large, it is not very likely that the N for that subgroup is the culprit here. But this is very indirect evidence: follow Nick's advice to get a firmer grasp on the situation.

              Thank you for your answer!

              I learned a lot from the answer. In my sample, there are only 54 individuals in category 5 (the entire sample is 4670). Do you think that is so small that the category should be merged with other groups? Is there any rule that could guide me in setting up the categories?

              Thank you for your help!


              • #8
                Thanks for replying to #2 in #5. However, I suggested trying two commands in #2 and you don't show the results for either. So, sorry, but I can't add much to the discussion, except to say that sub-sample size of 54 doesn't trouble me unduly. Whatever or whoever they are, they are quite different from others.


                • #9
                  And I agree with Nick. An N of 54 is adequate for a single group like this in most situations. And looking at your regression results from #1 I definitely would not merge these people into one of the other groups as they are demonstrably very different from all the other groups.
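
                  If a formal comparison is wanted, one option is a Wald test of whether category 5 differs from another category after fitting the model. This is only a sketch: it assumes the model is fitted with i.groupsibc and that a category coded 4 exists, which is picked here purely for illustration.

                  Code:
                  cloglog movec i.groupsibc
                  * test equality of the coefficients for categories 5 and 4
                  test 5.groupsibc = 4.groupsibc

                  A small p-value would argue against merging the two categories.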


                  • #10
                    Originally posted by Nick Cox
                    Thanks for replying to #2 in #5. However, I suggested trying two commands in #2 and you don't show the results for either. So, sorry, but I can't add much to the discussion, except to say that sub-sample size of 54 doesn't trouble me unduly. Whatever or whoever they are, they are quite different from others.

                    Thank you Nick, really appreciate it!


                    • #11
                      Originally posted by Clyde Schechter
                      And I agree with Nick. An N of 54 is adequate for a single group like this in most situations. And looking at your regression results from #1 I definitely would not merge these people into one of the other groups as they are demonstrably very different from all the other groups.

                      Thank you for your help, and your reply has also shed light on my work.
