
  • Creating dummy including range of categories

    Statalist,

    I am working with the following longitudinal panel data and need to create a dummy that is coded 1 for a unit only if that unit has every value from 0 to 10 for the variable TimeSeries. Because some units have only 0-6 or 0-8 rather than the full range of 0-10, my attempts with -inlist()- and -cond()- have left me with a dummy (Ten, below) that is coded 1 even for units that lack the complete range. Any suggestions would be greatly appreciated. - Jenny

    FIPS year TimeSeries Ten
    54107 2010 0 1
    54107 2011 1 1
    54107 2012 2 1
    54107 2013 3 1
    54107 2014 4 1
    54107 2015 5 1
    54107 2016 6 1
    Last edited by Jenny Savely; 23 Dec 2017, 13:28.

  • #2
    Your question leaves a lot to the imagination. What is a "unit" here? Does it correspond to FIPS, or combination of FIPS and year?

    Also, what are all the possible values of TimeSeries? Are 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 the only possibilities? If so, there is a very simple solution. If there are other possibilities, then it is still not hard, but a different, slightly more complicated approach is required. Also, is it possible for the same "unit" to have the same value of timeseries more than once?

    Finally, in the example you show, timeseries = year - 2010 in all observations. Is that true throughout your data?
    Last edited by Clyde Schechter; 23 Dec 2017, 14:12.

    • #3
      Clyde,

      Yes, FIPS is my unit of measure. I have tsset FIPS year. The only possible values of TimeSeries are 0-10. For each unit I have data from 1990-2016, but I want to create a dummy for a specific 10-year interval that is unique to each unit; hence the TimeSeries variable, which indexes the years of that interval for a particular unit. The example just happens to be a unit whose interval begins in 2010. I will also create a dummy for five-year intervals, but because each unit has a minimum of five years of data available, that dummy is much easier to create. Thank you!

      • #4
        So, the following should work:

        Code:
        //  VERIFY TimeSeries IS ALWAYS AN INTEGER FROM 0 TO 10
        assert inrange(TimeSeries, 0, 10) & TimeSeries == int(TimeSeries)
        //  WITHIN EACH FIPS, SORTED BY TimeSeries: 11 ROWS HOLDING 0, 1, ..., 10 IN ORDER
        by FIPS (TimeSeries), sort: gen byte has_all_0_to_10 = ///
            _N == 11 & TimeSeries == _n-1
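
A minimal sketch of how this test behaves, run on made-up data rather than the poster's file. The second unit (FIPS 54109) and its 0-6 range are hypothetical, and the variable names are capitalized to match the data listing in #1. Note that `///` line continuation works in a do-file, not at the interactive prompt.

```stata
* Toy data (hypothetical): FIPS 54107 has TimeSeries 0-10, FIPS 54109 only 0-6
clear
set obs 18
gen long FIPS = cond(_n <= 11, 54107, 54109)
by FIPS, sort: gen byte TimeSeries = _n - 1

* Sanity check, then the within-unit test
assert inrange(TimeSeries, 0, 10) & TimeSeries == int(TimeSeries)
by FIPS (TimeSeries), sort: gen byte has_all_0_to_10 = ///
    _N == 11 & TimeSeries == _n - 1

* Only the unit with the complete 0-10 range should be flagged
assert has_all_0_to_10 == (FIPS == 54107)
```

The logic relies on the sort order: once each unit is sorted by TimeSeries, a unit with exactly 11 rows whose k-th row holds the value k-1 must contain each of 0 through 10 exactly once.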

        • #5
          Clyde - It seems that the missing values (.) in my TimeSeries variable, which occur where 0-10 do not apply because the years fall outside the interval, are causing the assertion to fail.

          . assert inrange(TimeSeries, 0, 10) & TimeSeries == int(TimeSeries)
          11,470 contradictions in 18,549 observations
          assertion is false
          r(9);

          As a result, the dummy is coded 0 for every observation. Is there a command appropriate here to ignore the missing values?
          Last edited by Jenny Savely; 23 Dec 2017, 14:59.
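
One way to make the check itself skip missing values, offered as a sketch rather than as the fix for the dummy, is to add an `if` qualifier so that only observations with a recorded TimeSeries are tested:

```stata
* Check only the observations where TimeSeries is recorded
assert inrange(TimeSeries, 0, 10) & TimeSeries == int(TimeSeries) if !missing(TimeSeries)
```

Passing this check does not by itself rescue the `_N == 11` test from #4, because `_N` still counts the rows in which TimeSeries is missing (each unit has a row for every year from 1990 to 2016).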

          • #6
            OK, so 0 through 10 are not the only possible values; missing values occur as well. So a different approach is needed.

            Code:
            gen byte has_all_0_to_10 = 1
            forvalues i = 0/10 {
                by FIPS, sort: egen has_`i' = max(TimeSeries == `i')
                replace has_all_0_to_10 = 0 if has_`i' == 0
            }
            After that has run, you can -drop has_0-has_10- if you don't think you will need those individual variables later. I left them there mainly so you could see what's going on and how the code works.
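
As a possible cross-check on the loop, here is a sketch that counts the distinct non-missing values of TimeSeries within each unit. It assumes, as stated in #3, that the non-missing values are integers from 0 to 10, and that has_all_0_to_10 already exists from the code above. The names first_seen, n_distinct, and has_all_check are illustrative only.

```stata
* Tag one observation per distinct (FIPS, TimeSeries) pair;
* egen's tag() leaves observations with missing values untagged by default
egen byte first_seen = tag(FIPS TimeSeries)

* Number of distinct non-missing TimeSeries values within each unit
by FIPS, sort: egen n_distinct = total(first_seen)

* With integer values restricted to 0-10, 11 distinct values means the full range
gen byte has_all_check = n_distinct == 11
assert has_all_check == has_all_0_to_10
```

If the final assertion passes, the two approaches agree for every observation.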

            • #7
              Thank you. That worked wonderfully. I appreciate your time and help.

              • #8
                I am using xtsur in Stata 14.0. I have a panel data set for US states from 2001-2015. I ran a SUR regression with state fixed effects:
                xtsur (Y1 x1 x2 x3 x4 x5 x6 i.state) (Y2 x1 x2 x3 x4 x5 x6 i.state), corr

                Y1 and Y2 are input shares, x1 is output, x2 and x3 are relative prices, and x4 and x5 are interactions between relative price and a dummy (x6).
                Running this model, I get this error:
                classdef _b_stat() in use
                (nothing dropped)
                (327 lines skipped)
                (error occurred while loading xtsur.ado)
                r(310);

                Please help
                Douglas Caiphas

                • #9
                  This question is completely unrelated to the topic of this thread. A substantial fraction of Statalist members search for answers to questions, relying on the title of the thread. By posting an unrelated question in a thread, you deprive others of the opportunity to see the follow-up to your question if they have a similar one, and you waste the time of those who are seeking answers to the question that started this thread.

                  Please repost this as a New Topic.
