
  • Prediction after Xtregar

    Dear all,

    I have fitted a model to a subset of my database (90% of the database: training=0). I want to predict the outcome using the remaining 10% of the data (training=1).

    xtset partid temps2
    xtregar Var1 Var2 Var3 Var4 if training==0

    predict Var5 if training==1

    I am not a regular user of panel data. Unfortunately, my predictions did not match my training database very well.

    I guess I have missed something. Could you give me some advice?
    I'm sorry that I can't give more details about the context of this work for reasons of confidentiality.

    Many thanks for your help,
    Olivier


  • #2
    In general, why would you expect the predictions to be similar, assuming you are comparing in-sample and out-of-sample predictions? Unless you have a sufficiently large sample and the panels are relatively homogeneous, there is no reason to think that predicted means will be similar. Also, note that the sample division should be at the panel level with panel data (i.e., not at the observation level).



    • #3
      Hi,
      Thanks for your reply.
      I don't expect my predictions to be similar. However, the range of my initial data (between -51 and -115) is wider than that of the predictions (-71 to -90), and all predictions are close to the constant of my model. I suspect that my model is not sufficiently informative.

      "Also, note that the sample division should be at the panel level with panel data (i.e., not at the observation level)." I am not sure I have understood your point ...

      Many thanks for your time
      Olivier



      • #4
        Originally posted by olivier dejardin
        Hi,
        Thanks for your reply.
        I don't expect my predictions to be similar. However, the range of my initial data (between -51 and -115) is wider than that of the predictions (-71 to -90), and all predictions are close to the constant of my model. I suspect that my model is not sufficiently informative.
        Expecting the ranges to be close is itself an expectation of similarity. Whether that expectation is reasonable depends on whether the way you select the subsamples achieves true randomization, as implied by #2.


        "Also, note that the sample division should be at the panel level with panel data (i.e., not at the observation level)." I am not sure I have understood your point ...
        When you separate the 10 percent from the 90 percent, how are you doing that? You should be doing this at the panel level. This implies that if a certain panel is present in the 10% subsample, all its observations should be in this subsample. If you choose the samples at the observation level, you risk having observations belonging to the same panel in both subsamples.
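
        A minimal sketch of what a panel-level split could look like, assuming the variable names from #1 (partid, temps2, training, Var1 to Var5). The names mixed, tag, u, and holdout and the seed value are illustrative only and do not come from the original posts. The first part checks whether the existing split puts any panel in both subsamples; the second part redraws the split so that each panel falls entirely into one subsample.

        ```stata
        * Check the existing split: does any panel have both training==0 and training==1?
        bysort partid (training): generate byte mixed = training[1] != training[_N]
        tabulate mixed                          // any 1s mean the split was made at the observation level

        * Redo the split at the panel level (tag, u, holdout are illustrative names)
        set seed 12345                          // arbitrary seed, only for reproducibility
        egen byte tag = tag(partid)             // marks exactly one observation per panel
        generate double u = runiform() if tag   // one random draw per panel
        bysort partid (u): replace u = u[1]     // missing sorts last, so u[1] is the panel's draw
        generate byte holdout = u < 0.10        // 1 = held-out panel, 0 = estimation sample

        * Fit on the estimation panels and predict for the held-out panels
        xtset partid temps2
        xtregar Var1 Var2 Var3 Var4 if holdout == 0
        predict Var5 if holdout == 1            // with no option, predict gives the linear prediction (xb)
        ```

        Note that this holds out roughly 10% of the panels, not exactly 10% of the observations; with unbalanced panels the two shares can differ.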
