  • Mixed Linear Regression Postestimation

    Hi to all,
    I am using Stata to find the relationship between a continuous dependent variable and some explanatory variables through mixed linear regression. To check the model's generalization, I have split my data into two parts, train and test.
    How can I apply the model to my test data set so that the predictions include the random-effects part, not just the Xb portion?
    P.S.: The model is a random-intercept model.
    Thank you all.

  • #2
    I think what you want is the "fitted" prediction. Look in help mixed postestimation##predict, and you will find that the fitted option of predict gives "fitted values, fixed-portion linear prediction plus contributions based on predicted random effects".

    Just an observation about predictions from mixed effects models: they are actually quite good at making predictions at the cluster level. What makes them good is something called partial pooling, which is the idea that the prediction for a cluster with few observations is pulled (shrunk) toward the weighted sample mean given the covariates. It seems that you are making predictions for cases within clusters, however, so this may not be particularly useful for you.
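
    To make the partial-pooling idea concrete, here is a sketch for a random-intercept model like the one described in the first post. The notation (u_j for the cluster intercept, n_j for the number of observations in cluster j, sigma_u^2 and sigma_e^2 for the between-cluster and residual variances) is introduced here for illustration, and the formula assumes a single random intercept with independent, homoskedastic residuals:

    ```latex
    % Random-intercept model (assumed form)
    y_{ij} = x_{ij}'\beta + u_j + e_{ij}, \qquad
    u_j \sim N(0, \sigma_u^2), \quad e_{ij} \sim N(0, \sigma_e^2)

    % Predicted random intercept: the cluster's mean residual, shrunk toward 0
    \hat{u}_j = \frac{\hat{\sigma}_u^2}{\hat{\sigma}_u^2 + \hat{\sigma}_e^2 / n_j}
                \left( \bar{y}_j - \bar{x}_j'\hat{\beta} \right)

    % "Fitted" prediction: fixed portion plus predicted random intercept
    \hat{y}_{ij} = x_{ij}'\hat{\beta} + \hat{u}_j
    ```

    The shrinkage factor lies between 0 and 1 and grows with n_j, so small clusters are pulled harder toward the fixed-portion prediction. With no observations for a cluster there is nothing to estimate its intercept from, and the prediction reduces to the fixed portion alone.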


    • #3
      Thank you for your response.
      Yes, I want to get the "fitted" prediction. I have obtained the fitted values for my training data set. However, I don't know how to get the "fitted" values for my cross-sectional set in order to assess how well the model generalizes.


      • #4
        Sorry for not responding sooner. The answer depends on whether your hold-out (test) sample consists of a) clusters that were not included in the original mixed model, or b) units within clusters for which other units were observed in the estimation sample. If it is the former, then no: you have no information whatsoever about those clusters, so the most appropriate prediction for them is the population prediction (fixed portion only, xb in Stata). If it is the latter, you will get cluster-level predictions (random effects, reffects in Stata) for those clusters, and the fitted option adds them to the fixed portion.
        Code:
        use http://www.stata-press.com/data/r16/pig.dta, clear
        *Hold out sample of clusters (ids)
        splitsample, cluster(id) split(.85 .15) generate(hold_out_cl) rseed(834098)
        
        gen weight2 = weight
        replace weight2 = . if hold_out_cl==2
        
        qui mixed weight2 week || id: week, cov(un) reml
        
        predict weight2_fix, xb 
        predict weight2_mix, fitted
        predict weight2_eb*, reffects
        
        sum weight2_* if hold_out_cl==2  // no predictions of random effects
        
            Variable |        Obs        Mean    Std. Dev.       Min        Max
        -------------+---------------------------------------------------------
         weight2_fix |         63    50.34553     16.1147   25.57968   75.11138
         weight2_mix |          0
         weight2_eb1 |          0
         weight2_eb2 |          0
        
        *Hold out sample of units (observations w/in clusters)
        splitsample, split(.85 .15) generate(hold_out_unit) rseed(834098)
        
        gen weight3 = weight
        replace weight3 = . if hold_out_unit==2
        
        mixed weight3 week || id: week, cov(un) reml
        
        predict weight3_fix, xb 
        predict weight3_mix, fitted
        predict weight3_eb*, reffects
        
        sum weight3_* if hold_out_unit==2
        
            Variable |        Obs        Mean    Std. Dev.       Min        Max
        -------------+---------------------------------------------------------
         weight3_fix |         65    49.19337    15.72399   25.53997   75.33659
         weight3_mix |         65    49.12059    16.25986   22.35899   83.86142
         weight3_eb1 |         65   -.0157374    .5312524   -1.42909   1.074583
         weight3_eb2 |         65   -.0441666    2.558147  -4.141885   7.773883
        See also:
        https://www.statalist.org/forums/for...f-sample-blups
        https://www.statalist.org/forums/forum/general-stata-discussion/general/1434511-predicting-multi-level-mixed-model-values-using-fixed-random-effects-for-out-of-sample-records
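
        Since the original question was about checking generalization, one possible follow-up is to compare hold-out errors for the two kinds of prediction. This is only a sketch: it assumes the second example above (hold-out of units) has just been run, so that weight, weight3_fix, weight3_mix and hold_out_unit are in memory, and the variable names se_fix and se_mix are made up for illustration.

        ```stata
        * Squared prediction errors on the held-out observations only.
        * weight still holds the observed outcome; weight3 was set to
        * missing for these rows before estimation.
        gen double se_fix = (weight - weight3_fix)^2 if hold_out_unit == 2
        gen double se_mix = (weight - weight3_mix)^2 if hold_out_unit == 2

        * Root mean squared error: fixed portion only
        quietly summarize se_fix
        display "Hold-out RMSE, xb:     " sqrt(r(mean))

        * Root mean squared error: fixed portion plus predicted random effects
        quietly summarize se_mix
        display "Hold-out RMSE, fitted: " sqrt(r(mean))
        ```

        The same comparison is not possible for the first example (hold-out of whole clusters), because the fitted predictions are missing there and only the xb prediction is available.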
