
  • Error 2000- No Observations

    I am having trouble estimating a negative binomial panel regression in Stata. The error returned is r(2000), "no observations", and I suspect it is because some variables specified in the model have no data for certain years. I'm sure a simple modification of the code is required, but I've been racking my brain trying to figure out how to adjust the line of code. I've attached a snapshot of the dataset. This is the code that I'm specifying. Apologies for not using dataex; the dataset has over 500 lines.

    by year, sort : xtnbreg annual_new_jobs_created pct_allagesinpoverty total_black_residents pct_hsgrad pct_somecollegenodegree pct_associates pct_associates pct_bachelors pct_graduateofprofessionaldegree logroadhome logpafund logiafund, re

    ----------------------------------------------------------------------------------
    -> year = 1998
    no observations

    ----------------------------------------------------------------------------------
    -> year = 1999
    no observations

    ----------------------------------------------------------------------------------
    -> year = 2000
    no observations

    ----------------------------------------------------------------------------------
    -> year = 2001
    no observations

    ----------------------------------------------------------------------------------
    -> year = 2002
    no observations

    ----------------------------------------------------------------------------------
    -> year = 2003
    no observations

    [Attached image: Screenshot 2017-08-08 15.23.26.png]

  • #2
    Before turning to your question, I implore you to stop using screenshots to post data examples. What you have here is not readable at all on my setup. I suppose some people may be able to read it. But even if they can, suppose we need to try some code solutions to your problem: how do you get data from a screenshot into Stata? Other than typing it in, you can't. And hardly anybody is going to be willing to do that. The useful way to show data is with the -dataex- command. Run -ssc install dataex- to install it into your Stata setup. The simple directions for using it are at -help dataex-. When you use -dataex-, you enable those who want to help you to create a complete and faithful replica of your Stata example with just a simple copy/paste operation. Moreover, -dataex- also provides information on storage types, labels, and display formats that are, in some cases, crucial to getting the code right, but are not discernible from most other ways of posting data. So please be sure to use -dataex- every time you post example data.
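
As a minimal sketch of what that looks like in practice, the following uses the auto dataset that ships with Stata purely as a stand-in for your own data. The variable names and the choice of the first 10 observations are illustrative assumptions, and it presumes -dataex- is already installed (recent Stata versions may already include it).

```stata
* Load a dataset that ships with Stata (a stand-in for your own data)
sysuse auto, clear

* Print a copy/paste-ready example: three variables, first 10 observations
dataex make price mpg in 1/10
```

The text that -dataex- prints, including its code delimiters, is what gets pasted into the forum post.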

    Now, remember that in any regression analysis, any observation that has a missing value for any variable mentioned in the regression command is omitted from the analysis. So it doesn't take a lot of missing values sprinkled across the data set to end up with no usable observations. Evidently that is what is happening here. Actually, even though I can't read your screen shot, what I can make out is that it looks like there is an enormous amount of missing data here.

    This is not something that you can fix by changing the code, as it looks like nearly all of your variables are implicated, and most of them seem to be missing in most observations. You need to fix your data set: replace those missing values with the actual numbers that should be there. If you don't have and can't get most of those numbers, then you're going to need a plan B because no amount of tweaking the code is going to help here.
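
To see how many observations would survive this listwise deletion, a rough check along the following lines may help. It is only a sketch on toy data: the variables y, x1 and x2 are hypothetical placeholders for the variables named in the regression command.

```stata
* Toy data: hypothetical variables with missing values scattered around
clear
input y x1 x2
1 2 .
. 3 4
5 . 6
7 8 .
. . 9
3 . .
end

* Number of missing values per observation across the model variables
egen nmiss = rowmiss(y x1 x2)

* Distribution of missingness: how many observations miss 0, 1, 2, ... values
tabulate nmiss

* Observations with no missing values, i.e. those a regression could use
count if nmiss == 0
```

In this toy example every observation is missing at least one variable, so the final count is zero and any regression of y on x1 and x2 would stop with "no observations".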


    • #3
      Yes, I don't see any records among those shown that wouldn't be eliminated because of missing data. Maybe the records shown are atypical, but then they don't make for a good example.

      Also is the data xtset by year? If so it may be odd to run it for each year separately. Would any panel have more than one record?
      -------------------------------------------
      Richard Williams, Notre Dame Dept of Sociology
      StataNow Version: 18.5 MP (2 processor)

      WWW: https://www3.nd.edu/~rwilliam


      • #4
        Dear Statalist

        As a follow up on the problem of 'no observations', I suffer a similar fate.
        I am using the cmp command to estimate a triple hurdle model (3 stages). When I specify the three hurdles jointly, I get the error 'no observations'.

        When I specify them separately, I get valid results. However, to get the unconditional average partial effects of the second and third hurdles, I must estimate jointly so that the CAPE and the UAPE can be obtained easily. What may be the problem?
        [code]
        cmp(pdtndecision = plantimp head_age head_gen chkpexpr head_edu radio wlkdsmnm dstfrcop dstextag cultarea lPrice_improved offfarm_income TLU totallab No_ofcropsgrown ag_machi lrainfall plantimpbar head_edubar radiobar wlkdsmnmbar dstfrcopbar dstextagbar cultareabar Price_improvedbar offfarm_incomebar TLUbar totallabbar No_ofcropsgrownbar ag_machibar rainfall i.year i.district) (MP = plantimp residual2 head_age head_gen chkpexpr head_edu hh_income motor_tr m_phone wlkdsmnm qldmnmkt tscstmmkt dstfrcop dstextag cultarea Price_improved offfarm_income TLU totalprod tvalue_h i.year i.district plantimpbar hhsizebar motor_trbar m_phonebar wlkdsmnmbar qldmnmktbar tscstmmktbar dstfrcopbar dstextagbar cultareabar Price_improvedbar offfarm_incomebar TLUbar totalprodbar tvalue_hbar) (qtysold = plantimp residual head_age head_gen chkpexpr head_edu hh_income motor_tr m_phone wlkdsmnm qldmnmkt tscstmmkt dstfrcop dstextag cultarea Price_improved offfarm_income TLU totalprod tvalue_h i.year i.district plantimpbar hhsizebar motor_trbar m_phonebar wlkdsmnmbar qldmnmktbar tscstmmktbar dstfrcopbar dstextagbar cultareabar Price_improvedbar offfarm_incomebar TLUbar totalprodbar tvalue_hbar, trunc(0 .)), indicators("pdtndecision*$cmp_probit" "MP*$cmp_probit" $cmp_cont) difficult nonrtolerance qui
        [/code]


        • #5
          Martin, take a look at the second paragraph of my response to the original post in #2. In the context of a multiple equation model such as yours, it means that any observation that has a missing value for any variable in any of the equations, is omitted from the estimation sample. You have an enormous number of variables here, 52 if I have counted correctly, and it would only take a modest amount of missing data overall, scattered haphazardly through the data set to leave you with no observations that have non-missing values for every one of these many variables.

          For example, if 5% of all the data are missing and they are randomly distributed among the variables, the probability that any observation will have none of these 52 variables missing will be (1 - 0.05)^52, which is approximately 6.9%. If, overall, 10% of the data is missing, a similar calculation shows that only about 0.4% of all observations will, at random, have none of these 52 variables missing and thereby be eligible to participate in the model. I realize that missing data does not generally distribute randomly in this way (independent draws from a Bernoulli distribution), but you get the idea. And to have as much as 10% missing data in a data set is not unusual.
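
The arithmetic behind those two figures can be reproduced directly in Stata. This is just a sketch of the calculation under the same simplifying assumption of independent missingness; the scalar names are arbitrary.

```stata
* k variables, each assumed missing independently with a given probability
scalar k = 52
scalar p05 = (1 - 0.05)^k
scalar p10 = (1 - 0.10)^k

* Expected share of observations with all k variables non-missing
display "Complete share at 5% missing:  " %6.4f p05
display "Complete share at 10% missing: " %6.4f p10
```

The two displayed shares come out at roughly 0.069 and 0.004, matching the percentages quoted above.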

          The other thing to remember is that you can get this same message if any of these variables is stored as a string, because in that case that one variable alone is "missing" in every observation in the data set.
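
One way to check for this is sketched below on toy data. The variable names income_str and age are hypothetical, and whether -destring- with the force option is appropriate depends on why the variable was read as a string in the first place.

```stata
* Toy data: income was read in as a string because of an "n/a" entry
clear
input str3 income_str age
"10"  30
"20"  41
"n/a" 25
end

* List every variable stored as a string
ds, has(type string)

* Convert to numeric; force turns non-numeric entries into missing values
destring income_str, generate(income) force

* Confirm the result: the "n/a" row is now a numeric missing value
list income_str income
```

With the force option, the entries that could not be converted become ordinary missing values, so they are worth inspecting before the new variable goes into a model.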


          • #6
            Thanks very much Clyde.
            Firstly, I have no string variables in my dataset.
            Secondly, I only have missing values for my lagged variables. Since my dataset is a panel with three intervals, I get missing values for the first year.
            Nevertheless, when I drop the lagged variables, I still get this error. How do I proceed with this estimation even with my lagged variables?
            What is your advice?


            • #7
              Davia:
              as Clyde and Richard have already highlighted, your dataset is plagued with missing values.
              From a (difficult) scan of the screenshot you posted (please do follow Clyde's wise advice for the future), there seems to be no variable that Stata can actually include in any inferential procedure (dots spread everywhere and Stata applies listwise deletion).
              Unfortunately, I do not think that you can get anything informative out of the current version of your database.
              I think you have two options:
              - go out and search for more data (hardly feasible, I think);
              - define missing mechanism and patterns; then, go -mi- (if feasible).
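
As a starting point for the second option, the missing-data patterns can be tabulated before deciding whether -mi- is feasible. The sketch below uses toy data with hypothetical variables y, x1 and x2; it only describes the patterns and does not perform any imputation.

```stata
* Toy data: hypothetical variables with different missingness patterns
clear
input id y x1 x2
1 4 2 .
2 . 3 1
3 6 . .
4 2 5 7
5 9 1 .
end

* Tabulate which combinations of variables are missing together
misstable patterns y x1 x2, frequency

* Observations complete on all three variables
count if !missing(y, x1, x2)
```

Here only one of the five observations is complete, which is the kind of picture that would prompt a closer look at the missingness mechanism.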
              Kind regards,
              Carlo
              (StataNow 18.5)


              • #8
                Dear all,
                I am using Stata 15.1 on Windows 10. I am conducting research on the impact of access to microfinance on child nutrition, using a seemingly unrelated regression model. I am almost done with my Stata do-file, which is due next Tuesday. The problem is that whenever I run the very last two commands, I get a "no observations" error. For the past month I have tried everything, such as watching YouTube videos, consulting friends, and reading PDFs on the subject, all to no avail.

                Below are the Stata commands that produce the error described above:

                *Correlation analysis*
                sureg (haz Loan_access LOCATION sex_hhhead HH_Food_Consumed emplo_head child_illness none_hh plsc_hh sec_hh postsec_hh hhsize age_n_months SEXChild)(waz Loan_access LOCATION sex_hhhead HH_Food_Consumed emplo_head none_hh plsc_hh sec_hh postsec_hh child_illness hhsize age_n_months SEXChild), corr

                *Regression analysis*
                sureg (haz Loan_access LOCATION sex_hhhead HH_Food_Consumed emplo_head child_illness none_hh plsc_hh sec_hh postsec_hh hhsize age_n_months SEXChild)(waz Loan_access LOCATION sex_hhhead HH_Food_Consumed emplo_head none_hh plsc_hh sec_hh postsec_hh child_illness hhsize age_n_months SEXChild)

                I will greatly appreciate your help on this.


                • #9
                  The other advice is already nice, please do include an example of your data. My Stata can hold up to a billion observations, a 500 observation dataex shouldn't give it very much trouble.

                  The issue here is likely a misunderstanding of syntax. Unless you really do wanna run the estimator by the year (which I suspect is not the case) you don't need or want the -by year, sort:- syntax at all. In fact, and please correct me if I'm wrong, xt commands automatically sort the data by the panel and time variables, right, so it isn't needed in the first place.
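
A small illustration of why the by-year prefix works against a panel estimator, using a made-up panel. The names id and year, and the 3 panels by 4 years layout, are assumptions for the sketch rather than anything taken from the actual data.

```stata
* Toy panel: 3 hypothetical panels observed in 1998-2001
clear
set obs 12
generate id = ceil(_n/4)
bysort id: generate year = 1997 + _n

* Declare the panel structure once; no by-prefix is involved
xtset id year

* Within any single year, each panel contributes exactly one record
bysort year id: generate n_in_year = _N
tabulate n_in_year
```

Because every panel has only one record within a given year, a per-year run leaves nothing for a random-effects panel model to work with. Under these assumptions the model would be fitted once over all years, that is, the same xtnbreg command with the by-prefix dropped.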

                  Anyways, for me or anyone to really help you, aside from your example data, I'd need to know what all those missing variables are. Seems like survey or census data. What're these data meant to measure, where'd you get them from, how often are they even collected. And anyways, what's even the research question? I know this is about Stata, but it's far easier for us to help when we have some context as to what the problem even is. Davia Downey
