
  • Stata dropping standard errors in regression

    Hi everyone,

    I am trying to run a simple regression on this test data to understand better how to read the coefficients in a multiple regression with two or more binary variables:

    #      Sex  Race y
    # 1   Male White 1
    # 2 Female White 3
    # 3   Male Black 5
    # 4 Female Black 7

    In this case the model is y = B0 + B1*Race + B2*Sex. I have coded males as 0 and females as 1, and white as 0 and black as 1 (a minimal data-entry sketch follows at the end of this post). When I run the regression I get the following results:

    . reg y Race Sex

          Source |       SS           df       MS      Number of obs   =         4
    -------------+----------------------------------   F(2, 1)         =         .
           Model |          20         2          10   Prob > F        =         .
        Residual |           0         1           0   R-squared       =    1.0000
    -------------+----------------------------------   Adj R-squared   =    1.0000
           Total |          20         3  6.66666667   Root MSE        =         0

    ------------------------------------------------------------------------------
               y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
            Race |          4          .       .       .            .           .
             Sex |          2          .       .       .            .           .
           _cons |          1          .       .       .            .           .
    ------------------------------------------------------------------------------

    Stata drops the standard errors and just gives me the coefficients, and I cannot figure out why. When I run the same regression in R, I get the following results, with no standard errors dropped:

    # Coefficients:
    #             Estimate Std. Error  t value Pr(>|t|)    
    # (Intercept)        1   3.85e-16 2.60e+15  2.4e-16 ***
    # SexFemale          2   4.44e-16 4.50e+15  < 2e-16 ***
    # RaceBlack          4   4.44e-16 9.01e+15  < 2e-16 ***
    # ...
    # Warning message:
    # In summary.lm(lm(y ~ Sex + Race, d)) :
    #   essentially perfect fit: summary may be unreliable

    I understand that in this case my coefficients should be read as differences between the groups, as noted here: https://stats.stackexchange.com/ques...ical-variables However, I am confused about Stata dropping the standard errors. Could somebody please help me understand why it does this? Thanks in advance.
    Last edited by Mike Jim; 02 Aug 2018, 10:29.
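
    For reference, here is a minimal sketch of one way this test data could be entered in Stata (the input statement and value labels below are just one possible setup; only the 0/1 coding is as described above):

    * the four test observations, coded female = 1, black = 1
    clear
    input Sex Race y
    0 0 1
    1 0 3
    0 1 5
    1 1 7
    end

    label define sexlbl  0 "Male"  1 "Female"
    label define racelbl 0 "White" 1 "Black"
    label values Sex  sexlbl
    label values Race racelbl

    reg y Race Sex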

  • #2
    The data in your model fit the equation y = 1 + 2*Sex + 4*Race exactly, given your coding of female = 1 and black = 1. The residuals are all exactly zero. Consequently the residual variance is zero, and so are the standard errors of all the model coefficients. For whatever reason, Stata chooses simply not to bother calculating and showing them. You will see this happen in Stata with any linear model that has R-squared = 1.
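
    To spell out the arithmetic (these are just the standard OLS formulas, nothing Stata-specific):

        \widehat{\operatorname{Var}}(\hat\beta) = \hat\sigma^2 (X'X)^{-1},
        \qquad
        \hat\sigma^2 = \frac{1}{n - k} \sum_{i=1}^{n} \hat e_i^{\,2}

    With every residual \hat e_i = 0, you get \hat\sigma^2 = 0, so every standard error is exactly zero. That is also why your output shows Root MSE = 0 and a missing F statistic: the F ratio divides the model mean square by a residual mean square of zero.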

    Note, by the way, that the standard errors shown in your R output are all very close to zero, but they are not exactly zero. So they are, in fact, wrong.

    • #3
      Thank you so much Clyde! I was puzzled by this difference between R and Stata but now it is all clear. Thank you!

      • #4
        To add to Clyde's comment: if you're going to play with regression, use a reasonable number of observations. It seldom if ever makes sense to run a regression when N is about equal to the number of parameters.
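
        For instance, here is a quick sketch (simulated data, not from this thread) of the same two-indicator design with 40 observations and added noise, which yields ordinary nonzero standard errors:

        * same 0/1 design, but n = 40 and random noise, so the fit is not perfect
        clear
        set obs 40
        set seed 12345
        gen Sex  = mod(_n, 2)              // alternates 0 and 1
        gen Race = _n > 20                 // first half 0, second half 1
        gen y    = 1 + 2*Sex + 4*Race + rnormal(0, 1)

        reg y Race Sex

        With 40 observations the residual mean square is no longer zero, so Stata reports standard errors, t statistics, confidence intervals, and an F test.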
