  • insufficient observations r(2001) issue

    Hello,

    I am trying to run a regression on panel data, using something like: "xtreg wages i.high_qual i.illness_disability i.reason_for_leaving_job if training==1 & education==1, re". However, even though my dataset is large (around 300,000 observations), Stata returns the error message "insufficient observations r(2001)". Does anyone know what the problem may be? I am not convinced that it is a lack of sufficient observations. I would be very grateful for any help!

  • #2
    xtreg is a panel data command, so if your restrictions define a cross-section, you will get the error.

    Code:
    webuse grunfeld, clear
    xtreg invest mvalue kstock if year==1945
    Res.:

    Code:
    . xtreg invest mvalue kstock if year==1945
    insufficient observations
    r(2001);

    That said, I am speculating here. If you want a definitive answer, present a data example using dataex, as recommended in FAQ Advice #12.



    • #3
      Have you tried some cross-tabulations before fitting the model? Try cross-tabulating the variables in the model under the condition "if training==1 & education==1" and see if that flags something.
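
      For example, something along these lines might do it. This is only a sketch using the variable names from #1; which pairs of variables are worth tabulating is an assumption on my part, so adjust it to your data. The missing option shows missing values as their own row and column, which makes empty cells easy to spot.

      ```stata
      * Cross-tabulate a regressor against one restriction variable while
      * conditioning on the other; -missing- reports missing values too.
      tabulate reason_for_leaving_job training if education==1, missing
      tabulate high_qual training if education==1, missing
      ```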

      Roman



      • #4
        Thanks, Andrew Musau. Prior to using the xtreg command, I declared the dataset as a panel using xtset id time.



        • #5
          That applies to #2 (implicitly)

          Code:
          webuse grunfeld, clear
          xtset company year
          xtreg invest mvalue kstock if year==1945
          But my restriction (year==1945) defines a single cross-section. That may or may not be the case with your data. If you can replicate the error with 100 observations and can share your data, that will help in finding out why you get it. In that case, copy and paste the output of the following.

          Code:
          dataex id time wages high_qual illness_disability reason_for_leaving_job training education in 1/100



          • #6
            Thank you, Andrew Musau. Here is what it gave me:

            Code:
            * Example generated by -dataex-. For more info, type help dataex
            clear
            input double id float(time wages) byte(high_qual illness_disability reason_for_leaving_job children general_health marrital_status region age sector) float(training_hrs training education)
              22445 2012  11.49954 3 2  . 0 4 1  7  5 5 168 1 0
              22445 2013 10.007475 3 2  . 0 3 1  7  5 4   0 0 0
              22445 2014  16.31351 3 2  . 0 . 1  7  5 5  30 1 0
              22445 2015 16.195204 3 2  . 0 . 1  7  6 5   0 0 0
              22445 2016 18.026306 3 2  . 0 . 1  7  6 5   0 0 0
              22445 2017 15.332698 3 2  . 0 . 2  7  6 5   0 0 0
              22445 2018 13.799448 3 2  . 1 . 2  7  6 5  60 1 0
              22445 2019 11.073648 1 2  . 2 . 2  7  6 5  80 1 1
              29925 2012         0 1 1  . . 2 .  7  7 .   0 0 1
              29925 2014         0 1 1  . 2 . 4  7  7 4   0 0 1
              29925 2015         0 1 1  . 2 . 5  7  7 4   0 0 1
              29925 2016 14.374425 1 1  0 2 . 5  7  7 3   0 0 1
              29925 2017  13.41613 1 1  . 2 . 5  7  8 3   0 0 1
              29925 2018 16.659334 1 1  . 2 . 5  7  8 3   0 0 1
              29925 2019 14.785123 1 1  . 2 . 5  7  8 3   0 0 1
              76165 2015 17.742147 3 2  . 0 . 2  5  6 2   0 0 0
              76165 2016  22.99908 3 2  . 0 . 2  5  6 2   0 0 0
              76165 2017 101.85307 3 2  . 1 . 2  5  6 2   0 0 0
              76165 2018  26.16638 3 2  . 1 . 2  5  7 2   0 0 0
              76165 2019  28.12459 3 2  . 2 . 2  5  7 2   0 0 0
             223725 2015         0 5 2  . . 2 .  8  8 2   0 0 0
             223725 2016         . 5 2  . . 3 .  8  8 2   0 0 0
             280165 2010  9.318868 4 2  . 1 2 1  8  6 3   0 0 0
             280165 2011         0 4 2  . . 3 .  8  6 2   0 0 0
             280165 2012         0 4 2  . . 2 .  8  6 2   0 0 0
             280165 2013 19.067024 4 1  . 1 1 1  8  6 2   0 0 0
             280165 2014         0 4 2  . . 2 .  8  7 2   0 0 0
             280165 2015  18.15717 4 2  . 1 . 1  8  7 2   0 0 0
             280165 2016  17.24931 4 2  . 1 . 2  8  7 2   0 0 0
             280165 2017  17.24931 4 1  . 1 . 2  8  7 2   0 0 0
             280165 2018  17.24931 4 2  . 1 . 2  8  7 2   0 0 0
             280165 2019 15.130974 4 2  . 1 . 2  8  8 2   0 0 0
             333205 2014  13.20893 4 2  . 0 . 1  5  4 2  56 1 0
             333205 2015  11.49954 2 2  . 0 . 1  5  5 3 420 1 0
             333205 2016  13.20893 2 2  . 0 . 1  5  5 3  72 1 0
             333205 2017 12.431935 2 2  . 0 . 1  5  5 3   0 0 0
             333205 2018 15.572274 2 2  . 0 . 1  5  5 3 147 1 0
             333205 2019 16.783112 2 2  . 0 . 2  5  5 3 280 1 0
             387605 2012         0 3 2  . 0 2 1  9  4 .   0 0 0
             387605 2013         0 3 2  . 0 3 1  9  5 .   0 0 0
             387605 2014         0 3 2  . 0 . 1  9  5 .  56 1 0
             387605 2015 12.882072 3 2  . 0 . 1 10  5 6   0 0 0
             469205 2017  8.213957 5 1  . 1 . 1  4  5 .   0 0 0
             469205 2018  9.774609 5 2  . 1 . 1  4  5 5  16 0 0
             469205 2019  10.45495 5 2  . 2 . 1  4  5 5   0 0 0
             541285 2011         0 1 2  . . 2 .  4  5 5   0 0 1
             541285 2012         0 1 2  . . 1 .  4  5 .   0 0 1
             541285 2013         0 1 2  . . 4 .  4  5 .   0 0 1
             541285 2014         0 1 2  . 0 . 1  4  5 .   0 0 1
             541965 2011         0 3 2  . . 2 .  4  4 3   0 0 0
             599765 2012         0 3 2  . . 3 .  5  5 2   0 0 0
             599765 2013 13.176576 2 2  . 0 1 1  5  5 2 150 1 0
             599765 2017  19.46642 2 2  . 0 . 1  5  6 2  21 1 0
             599765 2018   20.7412 2 2  . 0 . 1  5  6 2  16 0 0
             599765 2019 22.053913 1 2  . 0 . 2  5  6 2 125 1 1
             665045 2011  5.929899 3 2  . 0 2 1  5  5 2   0 0 0
             665045 2012  5.036798 3 2  . 0 2 1  5  6 2   0 0 0
             665045 2013  6.924448 3 2  . 0 2 1  5  6 2   0 0 0
             665045 2014  4.408157 3 2  . 0 . 1  5  6 2   0 0 0
             665045 2016  8.882475 3 2  . 0 . 1  5  6 2   0 0 0
             665045 2018 3.8868446 3 1  0 0 . 1  .  7 2   0 0 0
             665045 2019  5.606026 3 1  . 0 . 1  5  7 2   0 0 0
             732365 2017         0 9 1  . 0 . 1  2  6 .   0 0 0
             732365 2018         0 9 1  . 0 . 1  2  6 .   0 0 0
             732365 2019         0 9 1  . 0 . 1  2  6 .   0 0 0
             813285 2012         0 5 2  . . 2 .  2  8 4   0 0 0
             813285 2013         0 5 2  . . 1 .  2  8 4   0 0 0
             813285 2014         0 5 2  . . 2 .  2  8 4   0 0 0
             813285 2015         0 5 2  . . 2 .  2  9 4   0 0 0
             813285 2016         . 5 2  . . 2 .  2  9 4   0 0 0
             850005 2015         0 4 1  . . 3 .  3  5 4   0 0 0
             956765 2010         . 9 1  . . 4 .  1 11 4   . 1 0
             956765 2011         0 9 1  . . 5 .  1 11 4   0 0 0
             987365 2010         . 3 2  . . 1 .  2  4 .   . 1 0
             987365 2011         0 1 2  . . 1 .  2  4 4   0 0 1
            1114525 2010         . 3 2  . . 3 . 11  7 4   . 1 0
            1558565 2011         0 4 1  . 0 3 1 10  3 .   0 0 0
            1587125 2014         0 2 1  . 0 . 1  1  9 .   0 0 0
            1587125 2015         0 2 2  . 0 . 1  1 10 2   0 0 0
            1587125 2016         . 2 2  . 0 . 1  1 10 2   0 0 0
            1587125 2017         0 2 2  . 0 . 1  1 10 2   0 0 0
            1587125 2018         0 2 2  . 0 . 1  1 10 2   0 0 0
            1587125 2019         0 2 2  . 0 . 1  1 10 2   0 0 0
            1697285 2016  8.432996 1 1  . 0 . 5  4  8 3   0 0 1
            1697285 2017 10.272923 1 1  . 0 . 5  4  8 3   0 0 1
            1697285 2018  8.690586 1 1  . 0 . 5  4  9 3  20 0 1
            1731965 2013  15.33272 4 2  . 0 3 1 11  4 4  84 1 0
            1833965 2010         . 5 1  . 0 4 1  8  9 4   0 0 0
            1833965 2011 10.180926 5 1 97 0 4 1  8  9 3   0 0 0
            1833965 2012         0 5 1  . . 5 .  8  9 2   0 0 0
            1833965 2013  9.255444 5 2 97 0 3 1  8  9 2   0 0 0
            1833965 2014  8.080758 5 2  3 0 . 1  8  9 4   0 0 0
            1833965 2015 15.736213 5 2  0 0 . 1  8 10 2   0 0 0
            1833965 2016 12.777267 5 2  0 0 . 1  8 10 3   0 0 0
            1833965 2017  7.973014 5 2  0 0 . 1  8 10 5   0 0 0
            2067205 2014         0 1 2  . 0 . 1 10  5 .   0 0 1
            2067205 2015  11.51978 1 2  . 0 . 1 10  5 5   0 0 1
            2270525 2016         . 3 2  . 1 . 1  3  3 .   0 0 0
            2270525 2017         0 3 1  . 2 . 1  3  4 .   0 0 0
            2292285 2011         0 9 1  . . 4 .  6  7 .   0 0 0
            end
            label values high_qual b_hiqual_dv
            label def b_hiqual_dv 1 "Degree", modify
            label def b_hiqual_dv 2 "Other higher degree", modify
            label def b_hiqual_dv 3 "A-level etc", modify
            label def b_hiqual_dv 4 "GCSE etc", modify
            label def b_hiqual_dv 5 "Other qualification", modify
            label def b_hiqual_dv 9 "No qualification", modify
            label values illness_disability b_health
            label def b_health 1 "yes", modify
            label def b_health 2 "no", modify
            label values reason_for_leaving_job b_stendreas
            label def b_stendreas 3 "made redundant", modify
            label def b_stendreas 97 "other reason", modify
            label values children b_nchund18resp
            label values general_health b_sf1
            label def b_sf1 1 "excellent", modify
            label def b_sf1 2 "very good", modify
            label def b_sf1 3 "good", modify
            label def b_sf1 4 "fair", modify
            label def b_sf1 5 "or Poor?", modify
            label values marrital_status b_marstat
            label def b_marstat 1 "single, nvr marr/civ p", modify
            label def b_marstat 2 "married", modify
            label def b_marstat 4 "separated legally marr", modify
            label def b_marstat 5 "divorced", modify
            label values region b_gor_dv
            label def b_gor_dv 1 "North East", modify
            label def b_gor_dv 2 "North West", modify
            label def b_gor_dv 3 "Yorkshire and the Humber", modify
            label def b_gor_dv 4 "East Midlands", modify
            label def b_gor_dv 5 "West Midlands", modify
            label def b_gor_dv 6 "East of England", modify
            label def b_gor_dv 7 "London", modify
            label def b_gor_dv 8 "South East", modify
            label def b_gor_dv 9 "South West", modify
            label def b_gor_dv 10 "Wales", modify
            label def b_gor_dv 11 "Scotland", modify
            label values age b_agegr13_dv
            label def b_agegr13_dv 3 "18-19 years old", modify
            label def b_agegr13_dv 4 "20-24 years old", modify
            label def b_agegr13_dv 5 "25-29 years old", modify
            label def b_agegr13_dv 6 "30-34 years old", modify
            label def b_agegr13_dv 7 "35-39 years old", modify
            label def b_agegr13_dv 8 "40-44 years old", modify
            label def b_agegr13_dv 9 "45-49 years old", modify
            label def b_agegr13_dv 10 "50-54 years old", modify
            label def b_agegr13_dv 11 "55-59 years old", modify
            label values sector b_jbrgsc_dv
            label def b_jbrgsc_dv 2 "managerial & technical occupation", modify
            label def b_jbrgsc_dv 3 "skilled non-manual", modify
            label def b_jbrgsc_dv 4 "skilled manual", modify
            label def b_jbrgsc_dv 5 "partly skilled occupation", modify
            label def b_jbrgsc_dv 6 "unskilled occupation", modify



            • #7
              Thank you for the data example. In the attached sample, there are only 8 nonmissing values for the variable "reason_for_leaving_job", and all of them coincide with training==0. Because Stata drops observations with a missing value on any variable in the model, the restriction training==1 leaves no observations in the estimation sample.

              Code:
              sum wages high_qual illness_disability reason_for_leaving_job training education
              sum wages high_qual illness_disability training education if !missing(reason_for_leaving_job)
              sum wages high_qual illness_disability training education if !missing(reason_for_leaving_job) & training==1
              Res.:

              Code:
              . sum wages high_qual illness_disability reason_for_leaving_job training education
              
                  Variable |        Obs        Mean    Std. Dev.       Min        Max
              -------------+---------------------------------------------------------
                     wages |         92    9.132454    12.45144          0   101.8531
                 high_qual |        100        3.34    1.960313          1          9
              illness_di~y |        100        1.72    .4512609          1          2
              reason_for~b |          8      24.625    44.68281          0         97
                  training |        100         .17    .3775252          0          1
              -------------+---------------------------------------------------------
                 education |        100         .19    .3942772          0          1
              
              . sum wages high_qual illness_disability training education if !missing(reason_for_leaving_job)
              
                  Variable |        Obs        Mean    Std. Dev.       Min        Max
              -------------+---------------------------------------------------------
                     wages |          8    10.28311    3.870431   3.886845   15.73621
                 high_qual |          8        4.25    1.488048          1          5
              illness_di~y |          8       1.625    .5175492          1          2
                  training |          8           0           0          0          0
                 education |          8        .125    .3535534          0          1
              
              . sum wages high_qual illness_disability training education if !missing(reason_for_leaving_job) & training==1
              
                  Variable |        Obs        Mean    Std. Dev.       Min        Max
              -------------+---------------------------------------------------------
                     wages |          0
                 high_qual |          0
              illness_di~y |          0
                  training |          0
                 education |          0
              
              .
              There are too many missing values in the reason_for_leaving_job variable. What does it mean when this variable is missing? Is the reason simply not recorded, or did the individual never leave a job, so that the variable does not apply to him or her?
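
              To check this on the full dataset, one option is to count how many observations survive once both the restriction and the missing values are taken into account. The following is a sketch that assumes the variable names from #1 and the restriction used there; it is not the only way to do this.

              ```stata
              * Observations with complete data on every model variable
              * that also satisfy the restriction from #1.
              count if !missing(wages, high_qual, illness_disability, reason_for_leaving_job) & training==1 & education==1

              * Missing-value counts per variable within the restricted sample.
              misstable summarize wages high_qual illness_disability reason_for_leaving_job if training==1 & education==1
              ```

              If the count is zero or very small, that would be consistent with the r(2001) error.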



              • #8
                Thank you so much, Andrew Musau!!! When I run my regression without the "reason for leaving job" variable, it works!! I think the missing values represent respondents recorded as 'inapplicable', so I cannot say for sure whether they have ever left a job or not. There is insufficient information in the survey.



                • #9
                  If it is not a crucial variable, you should probably drop it. Otherwise, you need to contact the data providers for more details. Either way, you do not appear to have a large enough sample of individuals with a reason for leaving a job to conduct any interesting analysis.



                  • #10
                    Andrew Musau, right, got it - thank you so much! The variable is not essential and I decided to drop it - thanks for your help, I really appreciate it !!
