  • Interpretation of the multilevel logistic model.

    I am reviewing an article by a colleague who fit a two-level hierarchical logistic model in which the first level is the firm and the second is the sector. In the unconditional model, the likelihood-ratio test was significant, indicating that the multilevel model is more appropriate than a standard single-level logistic GLM. However, when I assess the statistical significance of the random-effects parameter by dividing the variance by its standard error, the z value is ~0.96, which is below the critical value of the standard normal. The output, including rho, is attached below.

    [Attached image: WhatsApp-Image-2023-12-19-at-19.47.46.png — model output]


    This pattern, in which the LR test favors the multilevel model but the random components are not statistically significant, was repeated in what he called the final model (there z = 1.34). What happens in these cases?
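
    One thing to keep in mind: a Wald z test on a variance component is unreliable, because the null hypothesis (variance = 0) sits on the boundary of the parameter space. That is why Stata reports the LR test of rho = 0 as chibar2(01), a 50:50 mixture of chi2(0) and chi2(1), whose p-value is half the usual chi2(1) tail probability. A minimal sketch of reproducing that test by hand (model and variable names are hypothetical, not from your output):

    Code:
    * fit the two-level model and its single-level counterpart (names hypothetical)
    melogit y x1 x2 || sector:
    scalar ll_ml = e(ll)
    logit y x1 x2
    scalar ll_0 = e(ll)
    * boundary-corrected LR test: halve the chi2(1) tail probability
    scalar lr = 2*(ll_ml - ll_0)
    display "chibar2(01) p-value = " chi2tail(1, lr)/2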

  • #2
    This is a very low ICC value, suggesting that there is little between-cluster variance. Some questions to consider before moving forward:
    1. What is the residual ICC when predictors are added?
      • Within-cluster predictors can increase the between-cluster variance and might push it to a level where a mixed model is warranted.
    2. Are you modeling between-cluster predictors in the final model?
      • If so, then a fixed-effects approach would not accommodate those predictors - it controls away (or absorbs) between-cluster variation.
        • You can examine interactions between between-cluster variables and clusters if you use logit/logistic for estimation with cluster dummy variables (i.setorecob3#cluster_variable).
    The use of cluster-robust standard errors - vce(cluster setorecob3) - is not appropriate in this case because you have too few clusters.
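
    The residual ICC in point 1 can be checked directly after fitting the conditional model; a minimal sketch, assuming a random-intercept melogit with hypothetical predictor names:

    Code:
    * residual ICC after adding predictors (variable names hypothetical)
    melogit y x1 x2 || setorecob3:
    estat icc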



    • #3
      Erik Ruzek I have attached the final model below so you can see the output.
      Attached Files



      • #4
        It appears that the predictors you added to the model increase the between-cluster variance quite a bit. That suggests that the cluster means of some of the within-cluster predictors are quite variable across clusters. That is, one or more of your within-cluster predictors have substantial between-cluster variability and are correlated with the outcome. I have simulated data below that shows how a predictor with such characteristics inflates the between-cluster variance in a linear mixed model (I expect that this would hold in the binary outcome case as well):
        Code:
        version 16.1
        clear *
        set seed 683728
        
        * create scalars to be multiplied by rnormal() when creating random intercepts
        scalar sd_sch_id = .5 // ICC = .5/(4+.5) = .11
        
        * Schools
        set obs 50
        gen schid = _n 
        gen re_sch_id = sd_sch_id*rnormal()    // random intercept, school level 
        
        * Students
        expand 20 
        by schid, sort: gen stuid = _n
        
        * create a vector that contains the equivalent of a lower triangular correlation matrix for x1, x2, and y
        matrix c = (1, 0.5968, 1, 0.6623, 0.6174, 1)
        * create a vector that contains the means of the variables
        matrix m = (3.23,2.775,15.645)
        * create a vector that contains the standard deviations
        matrix sd = (1.05,1.47,4) 
        * draw a sample of 1000 cases from a normal distribution with specified correlation structure and specified means and standard deviations
        drawnorm x1 x2 y, n(1000) corr(c) cstorage(lower) means(m) sds(sd)
        corr y x1 x2 // looks good
        
        * Add random effects to x1 and x2
        egen pick1sch = tag(schid)
        *x1 and x2
        scalar sd_x1_sch = 1.2
        scalar sd_x2_sch = .01
        foreach v of varlist x1 x2 {
            gen re_`v'_sch = sd_`v'_sch*rnormal() if pick1sch==1
            bysort schid: replace re_`v'_sch = re_`v'_sch[_n-1] if missing(re_`v'_sch)
            replace `v' = `v' + re_`v'_sch 
        }
        
        *Add random effect to y 
        replace y = y + re_sch_id 
        
        *sanity check that code worked (looking for large sigma_u for x1 and small sigma_u for x2)
        foreach v of varlist x1 x2 y {
            xtreg `v', mle i(schid)
        }
        
        *calculate school means for x1 and x2 as well as school mean centered predictors
        foreach v of varlist x1 x2 y {
            bysort schid: egen sch_mn_`v' = mean(`v')
            gen cws_`v' = `v' - sch_mn_`v'
        }
        
        **Models showing how sizable between-school variability in x1 inflates between-school variance in y
        mixed y || schid: , stddev 
        eststo empty
        mixed y x2 || schid:, stddev
        eststo x2
        mixed y x1 || schid:, stddev 
        eststo x1
        mixed y x1 x2 || schid:, stddev
        eststo x1_x2
        And the results - notice how the between-school standard deviation (lns1_1_1, shown exponentiated) barely changes when we add x2 but increases substantially when x1 is added.
        Code:
        esttab empty x2 x1 x1_x2, se nostar transform(ln*: exp(@) exp(@))
        ----------------------------------------------------------------
                              (1)          (2)          (3)          (4)
                                y            y            y            y
        ----------------------------------------------------------------
        y                                                               
        x2                               1.592                     0.976
                                      (0.0685)                  (0.0740)
        
        x1                                            2.342        1.480
                                                   (0.0904)      (0.100)
        
        _cons               15.44        11.14        7.478        7.769
                          (0.167)      (0.237)      (0.537)      (0.414)
        ----------------------------------------------------------------
        lns1_1_1                                                        
        _cons               0.763        0.758        3.034        2.045
                          (0.185)      (0.146)      (0.338)      (0.260)
        ----------------------------------------------------------------
        lnsig_e                                                         
        _cons               4.041        3.242        3.054        2.876
                         (0.0927)     (0.0744)     (0.0703)     (0.0666)
        ----------------------------------------------------------------
        N                    1000         1000         1000         1000
        ----------------------------------------------------------------
        Standard errors in parentheses
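
        To put the table in ICC terms (the lns1_1_1 and lnsig_e rows are exponentiated, i.e., standard deviations), the implied ICC jumps once x1 enters the model:

        Code:
        * implied ICCs from the exponentiated SDs in the table above
        display "empty: " 0.763^2/(0.763^2 + 4.041^2)   // ~.03
        display "x1:    " 3.034^2/(3.034^2 + 3.054^2)   // ~.50
        display "x1 x2: " 2.045^2/(2.045^2 + 2.876^2)   // ~.34

        One option, if you want the random intercept to reflect only variation net of cluster composition, is a within/between decomposition using the school means and school-mean-centered predictors already computed above:

        Code:
        * within/between decomposition using variables created above
        mixed y cws_x1 cws_x2 sch_mn_x1 sch_mn_x2 || schid:, stddev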
