Why is there no official command for fitting log-linear models in Stata?

Chen Samulsion

Join Date: Jan 2018

Posts: 872
#1


18 Oct 2018, 19:27

Dear Stata users,

This question has bothered me for a long time. Log-linear models for cross-tabulations are used a lot in sociology, especially in social mobility research. However, Stata has no official command designed specifically for them. User-written commands such as -loglin- (D. H. Judson, 1992, from stb8) and -ipf- (Adrian Mander, 2009, from SSC) are old and do not perform well. Some scholars also suggest using -logit-, -poisson-, or -glm- as a substitute, for example German Rodriguez and Maarten Buis:
http://data.princeton.edu/wws509/notes/c5.pdf
http://maartenbuis.nl/presentations/london15b.pdf
But compared with SPSS and other statistical software, these substitutes are unsatisfying in their model-building options, output (form of results, parameters), and interpretation. It remains a mystery to me why Stata does not provide, and does not plan to provide, an official command for log-linear models.

Tags: GLM, loglinear, poisson
Nick Cox

Join Date: Mar 2014

Posts: 35433
#2

19 Oct 2018, 03:42

This is for StataCorp really, but you have partly answered your own question by alluding to poisson, glm and so forth. I don't know the basis for your statement that StataCorp is not planning additions of this kind.

I haven't used SPSS in this century and never used it for log-linear models, so I can't follow what you're missing.

FWIW, one of the reasons I originally wrote contract was to get datasets into a shape standard for these models, but over the following 20 years I have not noticed much interest in them (e.g. on Statalist). I don't doubt that they are heavily used in some fields.
Maarten Buis

Join Date: Mar 2014

Posts: 3426
#3

19 Oct 2018, 05:23

StataCorp never tells us what its plans are, so we don't know that it does not plan to provide official commands for log-linear models. Moreover, StataCorp is the only entity that can tell us about the reasons behind its past decisions, but we can guess.

My guess is that it isn't implemented because it is not used that much. In most disciplines it is not used at all. Even in the sub-discipline you mention, social stratification research (in which I work), the number of people who use it is fairly limited. If you go to the big conferences in this field (RC28, ECSR), there will be talks using log-linear models, but the vast majority of talks will not. Moreover, those who want to fit log-linear models can do so by using contract in combination with poisson or glm, and the current factor-variable notation has made that even easier. Those two points together reduce the added value of a separate command. StataCorp needs to prioritize what it spends its resources on, and, although I would like this to be high on its list of priorities, I understand why it is not.
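
A minimal sketch of the contract-plus-poisson route, assuming unit-level data with two categorical variables. The nlsw88 dataset shipped with Stata and its variables race and married are stand-ins chosen only for illustration; they do not come from this thread.

```stata
* Illustration only: nlsw88, race and married are assumed stand-ins
sysuse nlsw88, clear

* Collapse to one observation per cell of the table; contract stores the
* cell counts in _freq, and the zero option keeps empty cells as rows
* with _freq == 0
contract race married, zero

* Independence model: main effects only, written in factor-variable notation
poisson _freq i.race i.married

* Deviance and Pearson goodness-of-fit tests of the fitted model
estat gof
```

Replacing the main effects with i.race##i.married would add the interaction and give the saturated model for this two-way table.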

In principle we don't need to wait for StataCorp to implement a model. I have thought about writing such a command and decided that a basic implementation was doable, but not worth it. What would be interesting is a wider suite of log-linear models: e.g. taking care of missing values with EM or an RC2 model. I decided that implementing such a suite of log-linear commands myself would cost me too much time.

So, why does SPSS have log-linear models? My guess is that it has to do with the ages of SPSS and Stata. There was a time before the widespread use of logit and probit models for analyzing categorical data, and at that time the go-to method for analyzing categorical data was log-linear analysis. SPSS is older than Stata and was in part written at that time, so it included a module for log-linear analysis. Stata got started in the mid-'80s, by which time log-linear models were being replaced by logit and probit models, which for most applications were a clear improvement.

---------------------------------
Maarten L. Buis
University of Konstanz
Department of history and sociology
box 40
78457 Konstanz
Germany
http://www.maartenbuis.nl
---------------------------------
Chen Samulsion

Join Date: Jan 2018

Posts: 872
#4

19 Oct 2018, 05:36

Nick, an observation about quantitative sociology: scholars relied heavily on SPSS when social mobility research was dominated by a variety of loglinear models (roughly 1974 to 1992), and as that tidal wave receded they turned to Stata.

Chen Samulsion

Join Date: Jan 2018

Posts: 872
#5

19 Oct 2018, 05:46

Thanks, Maarten Buis, for your remarkable explanation, covering both modeling technique and history.

Richard Williams

Join Date: Apr 2014

Posts: 4945
#6

19 Oct 2018, 07:29

A student asked me about this in 2011. Michael Hout has a little green Sage book, "Mobility Tables", published in 1983. With a little bit of work I was able to replicate much of his analysis. Granted, it could be easier, but it is not impossible to do with existing Stata commands. One advantage of doing it the hard way is that it does force you to understand the models a bit better. Here is the code, entirely self-contained. You'll of course understand it better if you have Hout's book handy. (Incidentally, I do not teach this in my courses because I see so little demand for it. If somebody really really really wants to do it I might recommend that they check out SPSS or another package if they think Stata is too hard.)

Code:

* Reproduce Analyses from Hout, 1983, Mobility Tables,
* Little Green Sage Book
***********************************************************************
clear all
input float(freq fathocc sonocc)
1414 1 1
 724 2 1
 798 3 1
 756 4 1
 409 5 1
 521 1 2
 524 2 2
 648 3 2
 914 4 2
 357 5 2
 302 1 3
 254 2 3
 856 3 3
 771 4 3
 441 5 3
 643 1 4
 703 2 4
1676 3 4
3325 4 4
1611 5 4
  40 1 5
  48 2 5
 108 3 5
 237 4 5
1832 5 5
end
label values fathocc Occupation
label values sonocc Occupation
label def Occupation 1 "Upper Nonmanual", modify
label def Occupation 2 "Lower Nonmanual", modify
label def Occupation 3 "Upper Manual", modify
label def Occupation 4 "Lower Manual", modify
label def Occupation 5 "Farm", modify


* Reproduce Hout Table 1, p. 11 
tab2  fathocc sonocc [fw = freq]
* Reproduce Hout Table 2, p. 12
tab2  fathocc sonocc [fw = freq], nofreq col
tab2  fathocc sonocc [fw = freq], nofreq row

* Hout, Perfect Mobility, p. 15. See the stats for Deviance, Pearson, and residual d.f.
glm freq  i.fathocc  i.sonocc, family(poisson) link(log)
* Hout, p. 14. Predicted frequencies under perfect mobility
predict pm_xb
list  pm_xb fathocc sonocc

* Quasi-Perfect Mobility - Hout p. 23
* Create a dummy var for each diagonal element
foreach j of numlist 1/5 {
    gen cell`j'`j' = fathocc == `j' & sonocc == `j'
}
glm freq  i.fathocc  i.sonocc  cell11-cell55, family(poisson) link(log)
predict qpm_xb
list  qpm_xb fathocc sonocc
* The predicted values are the same as Hout reports except for the diagonals. To
* get his predicted values:
preserve
foreach var of varlist  cell11-cell55 {
    replace `var' = 0
}
predict qpmhout_xb
list  qpmhout_xb fathocc sonocc
restore

* Corners model, Hout p. 25
gen cell12 = fathocc == 1 & sonocc == 2
gen cell21 = fathocc == 2 & sonocc == 1
gen cell45 = fathocc == 4 & sonocc == 5
gen cell54 = fathocc == 5 & sonocc == 4
glm freq  i.fathocc  i.sonocc  cell11-cell55 cell12-cell54, family(poisson) link(log)
predict cm_xb
list  cm_xb fathocc sonocc
* get Hout's expected Freqs
preserve
foreach var of varlist  cell11-cell54 {
    replace `var' = 0
}
predict cmhout_xb
list  cmhout_xb fathocc sonocc
restore
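
As a hedged sketch (an addition, not part of Hout's analysis or the code above), the fit of the perfect-mobility and quasi-perfect-mobility models could be compared through the deviance and residual d.f. that glm leaves behind in e(deviance) and e(df). The variable diagcell and the scalar names are hypothetical helpers introduced for this illustration; i.diagcell is assumed to be equivalent to the five diagonal dummies used above.

```stata
clear all
input freq fathocc sonocc
1414 1 1
 724 2 1
 798 3 1
 756 4 1
 409 5 1
 521 1 2
 524 2 2
 648 3 2
 914 4 2
 357 5 2
 302 1 3
 254 2 3
 856 3 3
 771 4 3
 441 5 3
 643 1 4
 703 2 4
1676 3 4
3325 4 4
1611 5 4
  40 1 5
  48 2 5
 108 3 5
 237 4 5
1832 5 5
end

* Perfect mobility (independence): keep the deviance (G2) and residual d.f.
quietly glm freq i.fathocc i.sonocc, family(poisson) link(log)
scalar G2_pm = e(deviance)
scalar df_pm = e(df)

* Quasi-perfect mobility: one extra parameter per diagonal cell.
* diagcell (hypothetical helper) is 0 off the diagonal and 1-5 on it
gen byte diagcell = cond(fathocc == sonocc, fathocc, 0)
quietly glm freq i.fathocc i.sonocc i.diagcell, family(poisson) link(log)
scalar G2_qpm = e(deviance)
scalar df_qpm = e(df)

* The models are nested, so the difference in deviances is a
* likelihood-ratio chi-squared statistic
scalar dG2 = G2_pm - G2_qpm
scalar ddf = df_pm - df_qpm
scalar p   = chi2tail(ddf, dG2)

display "Perfect mobility:       G2 = " %10.2f G2_pm  "  d.f. = " df_pm
display "Quasi-perfect mobility: G2 = " %10.2f G2_qpm "  d.f. = " df_qpm
display "Difference:           chi2 = " %10.2f dG2    "  d.f. = " ddf "  p = " %6.4f p
```

The same pattern should extend to the corners model by adding its four extra cell dummies to the second glm call.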

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
StataNow Version: 19.5 MP (2 processor)
WWW: https://www3.nd.edu/~rwilliam


Chen Samulsion

Join Date: Jan 2018

Posts: 872
#7

19 Oct 2018, 08:46

Dear Richard Williams, thanks a lot for your attention and kind help. I can identify with your student about the inconvenience. I have Hout's book, and your code and suggestions are very helpful (as is the practical advice about resorting to SPSS).
Richard Williams

Join Date: Apr 2014

Posts: 4945
#8

19 Oct 2018, 09:07

A lot of loglinear models should be easy to replicate because the table is usually right in the publication. So, if I were trying to do this sort of thing, I would first check to see if I could replicate the work of others, and then use my code as a template for my own analysis. Of course, when possible, making sure you can replicate work is a good way to learn any method. I like the Stata Press books and The Stata Journal because they almost always show you how to replicate analyses.

Richard Williams

Join Date: Apr 2014

Posts: 4945
#9

19 Oct 2018, 09:24

If you type -findit loglinear- you will see that there are various user-written loglinear routines. I've used one of them, Adrian Mander's ipf, and it was fine for my purposes at that time. See pp. 7-9 of

https://www3.nd.edu/~rwilliam/stats1...ical-Stata.pdf.

There is an article on the ipf command (pp. 10-12) at

https://www.stata.com/products/stb/journals/stb55.pdf

You might see if ipf or some of the other user-written programs would meet your needs. Mobility tables do all these tricky models, so I'm not sure if ipf would be any easier than poisson or glm for such purposes.

Chen Samulsion

Join Date: Jan 2018

Posts: 872
#10

19 Oct 2018, 09:28

"I would first check to see if I could replicate the work of others, and then use my code as a template for my own analysis. Of course, when possible, making sure you can replicate work is a good way to learn any method."

So do I. We think alike.
