  • Ambiguous abbreviation error

    Hi, I am pretty new to Stata. Forgive me if my question sounds stupid.

    I am trying to run a regression with
    dependent variable: lnratio
    independent variables: dummies year2000 through year2014

    I am trying to get Stata to omit the first dummy (year2000), but it did not follow my instruction, nor did it drop the first variable automatically. My attempts are below; I need help with how to make Stata drop the base-year dummy, year2000.

    . reg lnratio year2000-year2014 i.year
    year ambiguous abbreviation
    r(111);

    . regress lnratio year2000-year2014 ibyear2000.year
    ibyear2000.year invalid name
    r(198);

    . xtreg lnratio* year*,fe
    must specify panelvar; use xtset
    r(459);

    Thanks

    Esther

  • #2
    It is not clear what format your data is in. Do you have dummy variables named year2000, year2001, year2002..., or do you have a single variable named year that contains the values 2000, 2001...?

    The error message year ambiguous abbreviation is issued because your command contains a variable name (I guess -year-) that Stata treats as an abbreviation but cannot resolve to a single variable: if you don't have a variable named -year-, then year could be an abbreviation for year2000, year2001, year2002...

    This leads me to think that you have a series of dummy variables named year2000...year2014. If so, then you can just leave year2000 out of your command: reg lnratio year2001-year2014.

    The second error message, ibyear2000.year invalid name, is issued because the factor-variable notation for the base level was specified incorrectly. It should be ib2000.year. However, you would only use factor-variable notation if you have a single categorical variable, say, -year-, that takes the values 2000, 2001...2014. Then you would give the command: reg lnratio ib2000.year. See: help fvvarlist
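
    A minimal sketch of the factor-variable approach, assuming long-format data with a single -year- variable. The data here are simulated purely for illustration (the seed, sample size, and coefficients are arbitrary and not from the original poster's dataset):

    ```stata
    * Toy long-format data: one row per observation, a single -year- variable.
    * All values are simulated for illustration only.
    clear
    set seed 12345
    set obs 300
    generate year = 2000 + mod(_n, 15)
    generate lnratio = 0.02*(year - 2000) + rnormal(0, 0.1)

    * ib2000. makes 2000 the base (omitted) level of the categorical variable
    regress lnratio ib2000.year
    ```

    In the output, the row for 2000 is the base level and the other fourteen years each get a coefficient measured relative to it.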

    The final error message, must specify panelvar; use xtset, is issued because you are attempting to use a panel command but have not declared your data to be panel data. See help xtset.
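
    A minimal sketch of declaring panel data before using an xt command. The panel identifier -id- is hypothetical (the thread does not say what the panel unit is), and the data are simulated:

    ```stata
    * Toy panel: 50 hypothetical units (id) observed in 2000-2002.
    * All values are simulated for illustration only.
    clear
    set seed 12345
    set obs 50
    generate id = _n
    expand 3
    bysort id: generate year = 1999 + _n
    generate lnratio = 0.05*(year - 2000) + rnormal(0, 0.1)

    * Declare the panel structure first; xt commands will not run without it
    xtset id year
    xtreg lnratio i.year, fe
    ```

    Whether a fixed-effects panel model is appropriate at all depends on the data; this only shows the order of the commands.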

    I think you and I may both be confused because of the format of the data. You may have "wide" data, and I am imagining that you have "long" data. Take a moment to read the manual entry on [D] reshape and see if this applies to your data. In most cases, you will want your data in long format.
    Stata/MP 14.1 (64-bit x86-64)
    Revision 19 May 2016
    Win 8.1


    • #3

      Hi Carole,
      Thanks for your help! You are right! I have very long data, with thousands of data points. The variable names are lnratio year2000 year2001 ... year2014. The attached photo shows the ideal result I am looking for. My question is what to write to run the regression so that the results table shows that I omitted the base-year dummy (for example, year1987 in the attached photo).

      Thanks
      Esther


      • #4
        Hi Esther,
        In order to get the results you want, you need to create a single categorical variable called year containing the values 2000, 2001...2014 (if the value of year2000=1, then year will be 2000; if year2001=1, then year will be 2001; etc). Then you can use the factor notation to tell Stata to automatically create dummy variables for the regression and omit the category for year 2000: reg lnratio ib2000.year. To create the variable, you can use a simple loop like this:

        Code:
        gen year=.
        * This assumes your years run from 2000 to 2014 with no years missing
        * and that if year2000=1, all other year20** variables are 0
        forvalues i=2000/2014 {
             replace year=`i' if year`i'==1
        }
        
        tab year
        
        reg lnratio ib2000.year
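
        The assumption stated in the comments of the loop (exactly one year dummy equals 1 in each row) can be checked before building -year-. A minimal sketch on simulated data with only three dummies; on the real data the varlist would presumably be year2000-year2014, and -n_ones- is a hypothetical helper variable:

        ```stata
        * Toy wide-format dummies for three years only (simulated).
        clear
        set obs 6
        generate year2000 = inlist(_n, 1, 2)
        generate year2001 = inlist(_n, 3, 4)
        generate year2002 = inlist(_n, 5, 6)

        * Count, per row, how many of the dummies equal 1
        egen n_ones = anycount(year2000-year2002), values(1)
        tabulate n_ones

        * Stops with an error if any row does not have exactly one dummy set
        assert n_ones == 1
        ```

        If the assert fails, some rows have no year flagged or more than one, and the loop would leave -year- missing or overwrite it.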



        • #5
          Hi Carole,

          Thanks for the code!!

          But one more question, regarding this line in your code: "** and if year2000=1, all other year20** variables are 0". All my year-related variables are dummies, but some of them are -1, some are 1 and others are 0. If this is how my dummies are coded, can I still use your code, or do I need to modify it?
          Thanks for your reply!!


          • #6
            Meanwhile, when I look at the ideal table I want, it doesn't show ib1987.year, so I am wondering whether there is any other way I can go about it. Hopefully I have made myself understood.


            • #7
              1) You don't have properly formed dummy variables if they take the values -1, 0, 1. What do these values mean? In some coding schemes, negative numbers denote missing values, in which case you need to recode all your variables so that -1 becomes missing (.). If -1 has a meaning other than missing, then you've got to back up and re-form your variables. A dummy variable usually indicates the presence or absence of a characteristic/attribute/choice and is usually coded 1 for yes and 0 for no. Any other values (except system missing) will mess up the dependency in the system of dummy variables.
              Code:
              *to recode -1 to missing
              forvalues i=2000/2014 {
                  recode year`i' (-1=.)
                  }

              2) The -1 values would have prevented Stata from omitting a category, so your original regression may now work (reg lnratio year2000-year2014). However, Stata may not drop the dummy variable you want (it may omit year2014). The proper way to specify dummy variables and omitted categories is to use factor notation as described above. You should run the recode loop (if -1 means missing) and then the loop in #4 (although the loop in #4 will still work without the recode). Once you recode your dummy variables, you can compare the results of:

              Code:
              reg lnratio year2000-year2014
              reg lnratio ib2000.year
              If Stata automatically omits year2000 in the first command, then the results will be the same as in the second command.
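
              Before running the recode loop in point 1, it may help to look at how the -1 values are distributed, since that bears on whether they mean "missing". A minimal sketch on made-up rows; -n_neg- and -n_pos- are hypothetical helper variables, and on the real data the varlist would presumably be year2000-year2014:

              ```stata
              * Toy rows with -1, 0 and 1 entries (made up for illustration).
              clear
              input year2000 year2001 year2002
              -1  1  0
              -1  0  1
               0 -1  1
              end

              * Count, per row, how many entries are -1 and how many are 1
              egen n_neg = anycount(year2000-year2002), values(-1)
              egen n_pos = anycount(year2000-year2002), values(1)
              list, noobs
              ```

              If -1 shows up in a regular pattern (for instance, exactly once in every row), it is probably a meaningful code rather than a missing-value flag, and recoding it to missing would discard information.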


              • #8
                Hi Carole,

                I tried your code. It did help a lot!! But as you foresaw, Stata omitted the last variable instead of the first one (year2000, as I wish). Do you know how to do that?
                Meanwhile, the method I am using calculates a housing price index by the repeated sales method, and -1, 0, 1 are all properly defined dummies (as attached). The data looks exactly like the sample data I attached. I understand your reasoning about the definition of a dummy, but here, I guess, it is more like a special case.

                hope to hear from you soon

                Esther


                • #9
                  I've never heard of a dummy variable that takes on more than two values. In any case, using all the year variables will produce collinearity: with standard 0/1 dummies, the year variables necessarily sum to 1 in every observation, so if you know the values of year2000-year2013, you know the value of year2014. To choose which variable to omit while still having it displayed as omitted in the results table, use the o. operator:

                  Code:
                  reg lnratio o.year2000 year2001-year2014
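
                  A minimal sketch of the o. operator on simulated data, assuming standard 0/1 dummies for three years only. The helper variable -yr- and the matrix name B are hypothetical:

                  ```stata
                  * Toy data with three 0/1 year dummies (simulated).
                  clear
                  set seed 12345
                  set obs 300
                  generate yr = 2000 + mod(_n, 3)
                  generate year2000 = (yr == 2000)
                  generate year2001 = (yr == 2001)
                  generate year2002 = (yr == 2002)
                  generate lnratio = 0.05*(yr - 2000) + rnormal(0, 0.1)

                  * o. forces year2000 to be the omitted variable; it still appears in the table
                  regress lnratio o.year2000 year2001 year2002

                  * The omitted variable is kept in e(b) with a coefficient of zero
                  matrix B = e(b)
                  matrix list B
                  ```

                  The same pattern should extend to the full list: o.year2000 year2001-year2014.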
