A question of coefficient of regression

Yuyao Li

Join Date: Jan 2025

Posts: 12
#1

12 Jan 2025, 02:35

Dear Statalist,

As shown in the table below, I want to know why the coefficient for category 5 is so large.

The variable groupsibc has 7 categories. My guess is that the sample in this category is too small. Is that right?

Thank you for your time and assistance.

Best regards,

Bonnie
Nick Cox

Join Date: Mar 2014

Posts: 35486
#2

12 Jan 2025, 03:49

Hard to say anything without a sight of your command, your sample size and your data.

Code:

tab groupsibc, su(movec)

may be instructive, as also (e.g.)

Code:

dotplot movec, over(groupsibc)
Clyde Schechter

Join Date: Apr 2014

Posts: 29976
#3

12 Jan 2025, 16:45

I basically agree with Nick. However, given that the standard error of the coefficient for 5.groupsibc is not much different from that of the other levels of groupsibc, and certainly isn't disproportionately large, it is not very likely that the N for that subgroup is the culprit here. But this is very indirect evidence: follow Nick's advice to get a firmer grasp on the situation.
Richard Williams

Join Date: Apr 2014

Posts: 4968
#4

12 Jan 2025, 17:42

It isn’t obvious to me that the coefficient for 5 is large. The coefficients for the other groups (including the omitted category) are all very close to zero. Does it make substantive sense that group 5 scores about 20 points lower than the other groups? If, say, group 5 was disadvantaged relative to the other groups, a 20 points lower score might make perfect sense.

The numbers in and of themselves probably wouldn’t concern me until I found them substantively implausible.

If you do find the numbers implausible, or you are not sure whether they are plausible, follow the suggestions of Nick and Clyde.

Also, there are several other variables in your model. Do you have interactions involving groupsibc? Interactions can make main effects seem weird if you don’t know how to interpret them.

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
StataNow Version: 19.5 MP (2 processor)
WWW: https://www3.nd.edu/~rwilliam
Yuyao Li

Join Date: Jan 2025

Posts: 12
#5

16 Jan 2025, 22:04

Originally posted by Nick Cox

Hard to say anything without a sight of your command, your sample size and your data.

Code:

tab groupsibc, su(movec)

may be instructive, as also (e.g.)

Code:

dotplot movec, over(groupsibc)

**===========================

Sorry about my vague question.

Code:

cloglog movec i.groupsibc

movec is a binary (0/1) variable.
groupsibc is a categorical variable with seven categories (0 is the reference group).

In fact, I am not sure whether the coefficient is wrong; it just looks strange. I have encountered this issue on occasion, even in OLS regression: one coefficient of a categorical variable with several categories looks much larger than the others, which confuses me. How can I rule out a statistical problem when this happens? In other words, how can I show that this conspicuous coefficient is correct and not the result of a statistical error?

I hope I have described the problem clearly. Thank you for your help!
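
For readers less familiar with the complementary log-log model, the relation between a category coefficient and the outcome probability can be sketched as below. This is an illustration, not something posted in the thread: it assumes groupsibc is the only predictor with category 0 as the base, and the symbols p_k, \beta_0 (the constant) and \beta_k (the coefficient for category k) are introduced here.

```latex
p_k = \Pr(\text{movec}=1 \mid \text{groupsibc}=k) = 1 - \exp\{-\exp(\beta_0 + \beta_k)\},
\qquad
\beta_k = \log\{-\log(1-p_k)\} - \log\{-\log(1-p_0)\}
```

Under these assumptions a coefficient is a difference on the complementary log-log scale, not a difference in probabilities, so a strongly negative value corresponds to a category in which movec = 1 is rare compared with the base category.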
Yuyao Li

Join Date: Jan 2025

Posts: 12
#6

16 Jan 2025, 22:20

Originally posted by Richard Williams

It isn’t obvious to me that the coefficient for 5 is large. The coefficients for the other groups (including the omitted category) are all very close to zero. Does it make substantive sense that group 5 scores about 20 points lower than the other groups? If, say, group 5 was disadvantaged relative to the other groups, a 20 points lower score might make perfect sense.

The numbers in and of themselves probably wouldn’t concern me until I found them substantively implausible.

If you do find the numbers implausible, or you are not sure whether they are plausible, follow the suggestions of Nick and Clyde.

Also, there are several other variables in your model. Do you have interactions involving groupsibc? Interactions can make main effects seem weird if you don’t know how to interpret them.

Thank you for your answer!

There is no interaction in my model. It only includes some common control variables, such as gender and age.

Code:

cloglog movec i.groupsibc controls

Maybe this coefficient is right, but is there any way to check whether it is correct?

Thank you!

Last edited by Yuyao Li; 16 Jan 2025, 22:43.
Yuyao Li

Join Date: Jan 2025

Posts: 12
#7

16 Jan 2025, 23:35

Originally posted by Clyde Schechter

I basically agree with Nick. However, given that the standard error of the coefficient for 5.groupsibc is not much different from that of the other levels of groupsibc, and certainly isn't disproportionately large, it is not very likely that the N for that subgroup is the culprit here. But this is very indirect evidence: follow Nick's advice to get a firmer grasp on the situation.

Thank you for your answer!

I learned a lot from the answer. In my sample, there are only 54 individuals in category 5 (the entire sample is 4670). Do you think that is so small that the category should be merged with other groups? Is there any rule that could guide me in defining the categories?

Thank you for your help!
Nick Cox

Join Date: Mar 2014

Posts: 35486
#8

17 Jan 2025, 03:56

Thanks for replying to #2 in #5. However, I suggested trying two commands in #2 and you don't show the results for either. So, sorry, but I can't add much to the discussion, except to say that a sub-sample size of 54 doesn't trouble me unduly. Whatever or whoever they are, they are quite different from the others.
Clyde Schechter

Join Date: Apr 2014

Posts: 29976
#9

17 Jan 2025, 08:13

And I agree with Nick. An N of 54 is adequate for a single group like this in most situations. And looking at your regression results from #1 I definitely would not merge these people into one of the other groups as they are demonstrably very different from all the other groups.
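
The checks suggested in this thread can be sketched as follows. This is a minimal sketch, not code posted by any participant: it assumes the variable names movec and groupsibc from #5, leaves out the control variables (which are not named in the thread), and adds margins as one possible way to view the fitted model on the probability scale.

```stata
* Group sizes and the proportion with movec == 1 in each category
* (the mean of a 0/1 variable is that proportion)
tabulate groupsibc, summarize(movec)

* Fit the model with factor-variable notation so that each category
* gets its own coefficient relative to the base category
cloglog movec i.groupsibc

* Predicted probability of movec == 1 for each category
margins groupsibc
```

If the proportion for category 5 in the first table is far from those of the other categories, a coefficient that stands out is what the data imply and not a symptom of a faulty estimate.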
Yuyao Li

Join Date: Jan 2025

Posts: 12
#10

17 Jan 2025, 18:33

Originally posted by Nick Cox

Thanks for replying to #2 in #5. However, I suggested trying two commands in #2 and you don't show the results for either. So, sorry, but I can't add much to the discussion, except to say that a sub-sample size of 54 doesn't trouble me unduly. Whatever or whoever they are, they are quite different from the others.

Thank you Nick, really appreciate it!
Yuyao Li

Join Date: Jan 2025

Posts: 12
#11

17 Jan 2025, 18:40

Originally posted by Clyde Schechter

And I agree with Nick. An N of 54 is adequate for a single group like this in most situations. And looking at your regression results from #1 I definitely would not merge these people into one of the other groups as they are demonstrably very different from all the other groups.

Thank you for your help; your reply has also shed light on my work.