Marginsplot at defined values

Janine Stubbs

Join Date: May 2021
Posts: 34

18 Mar 2024, 21:30

Hi,

I am running Stata 18 on Windows 10. The aim is to model the impact of filament length on failure. Please find the dataex output pasted below, with the variables fail (0 = no fail, 1 = fail), length (mm) and weeks (age in weeks). There are numerous approaches to analysing these data. Here I am focusing on treating the continuous variable length as a categorical variable, because I stumbled on Stata output that I don’t understand. Code and questions appear below.

Approach 1. Cut length into quartiles, generating a new variable “length_cat” and treat these as categorical.

xtile length_cat = length, nq(4)
logistic fail i.length_cat weeks
margins length_cat
marginsplot

So far so good.

Approach 2. I don’t maintain this is in any way sensible, but suppose we keep length as quartiles, and treat “length_cat” as continuous (by omitting i.), noting the possible values of “length_cat” and the corresponding mean length of each level of length_cat:

codebook length_cat
bysort length_cat: su length

Plot the predicted margins, specifying values of continuous variable to report.

// A.
logistic fail length_cat weeks
margins, at(length_cat=(1 2 3 4))
marginsplot

No questions here: the association is constrained to a monotonic relationship.

// B.
qui logistic fail length_cat weeks
margins, at(length=(4.74 6.41 7.80 10.38))
marginsplot

How is it that Stata plots this, given that the variable “length” is not in the regression? If I change the variable name “length” to “mystery”, Stata understandably complains that “mystery” is not in the list of covariates and produces no margins.

// C.
qui logistic fail length_cat weeks
margins, at(length=(1(2)14))
marginsplot

Same question as for B.

// D.
qui logistic fail length_cat weeks
margins, at(length_cat=(1(2)14))
marginsplot

What has Stata plotted here? The variable “length_cat” only takes the values 1, 2, 3, 4. (I’m guessing the answer to this question will partly explain B. and C.)

Thank you!

Janine

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input float(partID length weeks) long fail float(pop totfail prop_fail) byte length_cat
  1 1.8 75 0  1 0         0 1
  2   2 71 1  1 1         1 1
  3 2.1 60 0  1 0         0 1
  4 2.2 48 0  1 0         0 1
  5 2.3 72 0  1 0         0 1
  6 2.4 26 0  1 0         0 1
  7 2.6 63 0  3 0         0 1
  8 2.6 45 0  3 0         0 1
  9 2.6 50 0  3 0         0 1
 10 2.7 73 1  3 2  .6666667 1
 11 2.7 69 0  3 2  .6666667 1
 12 2.7 70 1  3 2  .6666667 1
 13 2.8 43 0  7 1 .14285715 1
 14 2.8 68 0  7 1 .14285715 1
 15 2.8 41 0  7 1 .14285715 1
 16 2.8 74 0  7 1 .14285715 1
 17 2.8 60 0  7 1 .14285715 1
 18 2.8 67 1  7 1 .14285715 1
 19 2.8 36 0  7 1 .14285715 1
 20 2.9 69 0 10 1        .1 1
 21 2.9 53 0 10 1        .1 1
 22 2.9 66 1 10 1        .1 1
 23 2.9 56 0 10 1        .1 1
 24 2.9 70 0 10 1        .1 1
 25 2.9 48 0 10 1        .1 1
 26 2.9 52 0 10 1        .1 1
 27 2.9 60 0 10 1        .1 1
 28 2.9 55 0 10 1        .1 1
 29 2.9 65 0 10 1        .1 1
 30   3 35 0 10 3        .3 1
 31   3 52 1 10 3        .3 1
 32   3 63 1 10 3        .3 1
 33   3 62 0 10 3        .3 1
 34   3 67 0 10 3        .3 1
 35   3 52 0 10 3        .3 1
 36   3 67 0 10 3        .3 1
 37   3 16 0 10 3        .3 1
 38   3 42 0 10 3        .3 1
 39   3 62 1 10 3        .3 1
 40 3.1 64 0 12 5  .4166667 1
 41 3.1 68 0 12 5  .4166667 1
 42 3.1 32 0 12 5  .4166667 1
 43 3.1 51 0 12 5  .4166667 1
 44 3.1 70 0 12 5  .4166667 1
 45 3.1 76 1 12 5  .4166667 1
 46 3.1 67 1 12 5  .4166667 1
 47 3.1 63 1 12 5  .4166667 1
 48 3.1 75 1 12 5  .4166667 1
 49 3.1 63 0 12 5  .4166667 1
 50 3.1 59 0 12 5  .4166667 1
 51 3.1 66 1 12 5  .4166667 1
 52 3.2 69 1  5 2        .4 1
 53 3.2 41 0  5 2        .4 1
 54 3.2 67 1  5 2        .4 1
 55 3.2 37 0  5 2        .4 1
 56 3.2 63 0  5 2        .4 1
 57 3.3 61 0 15 4 .26666668 1
 58 3.3 57 1 15 4 .26666668 1
 59 3.3 33 1 15 4 .26666668 1
 60 3.3 59 0 15 4 .26666668 1
 61 3.3 47 0 15 4 .26666668 1
 62 3.3 60 0 15 4 .26666668 1
 63 3.3 38 0 15 4 .26666668 1
 64 3.3 34 0 15 4 .26666668 1
 65 3.3 62 0 15 4 .26666668 1
 66 3.3 76 1 15 4 .26666668 1
 67 3.3 68 0 15 4 .26666668 1
 68 3.3 62 0 15 4 .26666668 1
 69 3.3 78 1 15 4 .26666668 1
 70 3.3 33 0 15 4 .26666668 1
 71 3.3 66 0 15 4 .26666668 1
 72 3.4 68 0 17 5 .29411766 1
 73 3.4 65 1 17 5 .29411766 1
 74 3.4 56 0 17 5 .29411766 1
 75 3.4 42 1 17 5 .29411766 1
 76 3.4 64 0 17 5 .29411766 1
 77 3.4 68 0 17 5 .29411766 1
 78 3.4 57 0 17 5 .29411766 1
 79 3.4 69 0 17 5 .29411766 1
 80 3.4 74 1 17 5 .29411766 1
 81 3.4 66 0 17 5 .29411766 1
 82 3.4 64 0 17 5 .29411766 1
 83 3.4 56 1 17 5 .29411766 1
 84 3.4 60 0 17 5 .29411766 1
 85 3.4 50 0 17 5 .29411766 1
 86 3.4 80 0 17 5 .29411766 1
 87 3.4 50 0 17 5 .29411766 1
 88 3.4 48 1 17 5 .29411766 1
 89 3.5 48 0 22 9  .4090909 1
 90 3.5 59 0 22 9  .4090909 1
 91 3.5 62 0 22 9  .4090909 1
 92 3.5 57 0 22 9  .4090909 1
 93 3.5 59 1 22 9  .4090909 1
 94 3.5 75 1 22 9  .4090909 1
 95 3.5 34 0 22 9  .4090909 1
 96 3.5 66 0 22 9  .4090909 1
 97 3.5 69 1 22 9  .4090909 1
 98 3.5 77 1 22 9  .4090909 1
 99 3.5 74 1 22 9  .4090909 1
100 3.5 50 1 22 9  .4090909 1
end
label values fail faillabel
label def faillabel 0 "no fail", modify
label def faillabel 1 "fail", modify


Clyde Schechter

Join Date: Apr 2014

Posts: 29792
#2

18 Mar 2024, 23:16

I am not sure what is going on in B and C. Stata should give an error message complaining that length is not in the regression. But I think what may be happening is that Stata is applying variable abbreviation here, and is interpreting length as an abbreviation for length_cat, which is a variable in the regression. This would even be appropriate behavior if there weren't also a variable whose exact name is length. But given that there is such a variable, Stata should not be allowing you to abbreviate length_cat to length here. I would consider this behavior a bug.

As for D, there is no mystery here. Within the -at()- option you can specify any values you want for a variable; they don't have to be within the observed range of the variable in the data set. (Mind, applying -margins, at()- with values outside the range of the observed data is usually a bad idea, but it is legal.) What Stata does when you run -margins, at()- is, in effect, replace the actual values of the -at()- variables in every observation with the -at()- values, apply -predict- to calculate the expected outcome conditional on those values, and average the predictions. This kind of calculation is not at all constrained by the actual observed values in the data.
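
The calculation described for D can be checked by hand. The following is only a sketch under stated assumptions: it uses the auto dataset shipped with Stata as a stand-in for the filament data (foreign plays the role of fail, weight the role of weeks), because the dataex excerpt in the first post contains only one level of length_cat. The names m and phat are made up for the illustration.

```stata
* Stand-in data (assumption): the shipped auto dataset, not the poster's data
sysuse auto, clear
xtile length_cat = length, nq(4)
quietly logistic foreign length_cat weight

* Predictive margin at a value outside the observed range 1-4
margins, at(length_cat=7)
matrix m = r(b)

* By hand: set length_cat to 7 in every observation, predict, then average
preserve
quietly replace length_cat = 7
predict double phat if e(sample), pr
summarize phat, meanonly
display "manual: " r(mean) "   margins: " m[1,1]
restore
```

The two displayed numbers should agree, which is why an -at()- value such as 7 is accepted even though length_cat never takes that value in the data.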
Janine Stubbs

Join Date: May 2021

Posts: 34
#3

19 Mar 2024, 17:18

Very illuminating, thank you Clyde!
Jeff Pitblado (StataCorp)

StataCorp Employee

Join Date: Mar 2014

Posts: 658
#4

19 Mar 2024, 19:37

The at() option parses variable names relative to the column stripe of e(b) instead of the variable names in the current dataset. If you have variable abbreviations on (the default for set varabbrev), then margins' parsing code will accept abbreviations in option at(). In the above example, length is a non-ambiguous abbreviation for length_cat since variable length is not among the independent variables in the currently fitted model.
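
A minimal sketch of this behavior, again assuming the shipped auto dataset as a stand-in (it happens to contain a variable named length, so the same name clash arises once length_cat is created). The local macro name stripe is made up for the illustration, and the outcome with abbreviation switched off is what the explanation above implies rather than something documented here, so the call is wrapped in -capture noisily-.

```stata
* Stand-in data (assumption): the shipped auto dataset
sysuse auto, clear
xtile length_cat = length, nq(4)
quietly logistic foreign length_cat weight

* The names that at() is matched against: the column stripe of e(b)
local stripe : colnames e(b)
display "`stripe'"

* length exists in the dataset but not in e(b); with abbreviation on
* (the default) it is taken as an abbreviation of length_cat
set varabbrev on
margins, at(length=(1 2 3 4))

* With abbreviation off, the same call is expected to be rejected
set varabbrev off
capture noisily margins, at(length=(1 2 3 4))
display "return code with varabbrev off: " _rc
set varabbrev on
```

Typing the full name length_cat in -at()- sidesteps the question entirely, whatever the varabbrev setting.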
Clyde Schechter

Join Date: Apr 2014

Posts: 29792
#5

19 Mar 2024, 20:03

"The at() option parses variable names relative to the column stripe of e(b) instead of the variable names in the current dataset."

That's good to know. Does that appear in the documentation anywhere? If so, I missed it. If not, it should probably be added.
Janine Stubbs

Join Date: May 2021

Posts: 34
#6

19 Mar 2024, 21:55

Jeff,

Thanks for your contribution!

I take this as a warning to turn variable abbreviation off in the case that one has similarly named variables.

I wonder how ubiquitous this parsing method is among Stata commands and options.
Jeff Pitblado (StataCorp)

StataCorp Employee

Join Date: Mar 2014

Posts: 658
#7

20 Mar 2024, 16:13

Here are some of the commands and elements where Stata parses variable names (varlists) relative to the column stripe elements in e(b).

margins marginlist

margins options dydx(), dyex(), eydx(), and eyex()

test coeflist

testparm varlist

lincom exp

pwcompare marginlist

contrast termlist
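
As a hedged illustration of one item on this list, the sketch below tries -test- with an abbreviated coefficient name. It assumes the shipped auto dataset as a stand-in, and it does not assert what the first call returns; -capture noisily- lets the do-file continue either way so the return codes can be compared.

```stata
* Stand-in data (assumption): the shipped auto dataset
sysuse auto, clear
xtile length_cat = length, nq(4)
quietly logistic foreign length_cat weight

* length is not in e(b); per the list above, test should read it
* relative to the column stripe, i.e. as an abbreviation of length_cat
capture noisily test length
display "return code for abbreviated name: " _rc

* The full coefficient name, for comparison
capture noisily test length_cat
display "return code for full name: " _rc
```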