  • Generating quarterly dummy variables

    Hi all:

    I am trying to generate a set of quarterly dummy variables for a range of dates but can't figure out how to do it. I have the date variable formatted as %tq in one column. I tried tabulate with gen, but that didn't work because the 1s did not correspond to the correct year and quarter. I assume the way to go is to take min(date) as the beginning date and increment it one quarter at a time until max(date), but I can't work out how to do that. I would appreciate it if someone could show me the correct procedure to generate the dummies. Thanks.
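
    The min-to-max idea described above can be sketched as follows. This is only a minimal illustration: the variable name qdate, the toy values, and the y<year>q<quarter> naming are assumptions, not part of the original question.

    ```stata
    * Toy data (assumed): a %tq variable with a gap, since 2012q3 is absent
    clear
    input int qdate
    208
    209
    211
    212
    end
    format qdate %tq

    * Walk from the earliest to the latest quarter, one indicator per quarter
    summarize qdate, meanonly
    local first = r(min)
    local last  = r(max)
    forvalues q = `first'/`last' {
        local year = year(dofq(`q'))
        local qtr  = quarter(dofq(`q'))
        gen byte y`year'q`qtr' = (qdate == `q')
    }
    list, noobs
    ```

    Because the loop visits every quarter between the minimum and the maximum, a quarter with no observations (2012q3 in this toy data) still gets a column, and that column is all zeros.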

  • #2
    Why do you want to create your own indicator ("dummy") variables? If you are going to be using them in some kind of regression command, it is unnecessary. Stata's factor variable notation will handle it for you automatically (see -help fvvarlist-). So if your quarterly date is in a variable called qdate, it's just

    Code:
    regression_command outcome_var i.qdate other_predictors_and_covariates, appropriate_options
    If you want them for some other purpose, post back and we can point to other ways to create them.
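
    As a concrete illustration of the factor-variable approach, here is a minimal sketch on simulated data. The variable names outcome and x and the simulated values are assumptions made only for this example.

    ```stata
    * Simulated data (assumed): eight quarters, 2012q1 through 2013q4
    clear
    set seed 12345
    set obs 200
    gen int qdate = tq(2012q1) + mod(_n, 8)
    format qdate %tq
    gen x = rnormal()
    gen outcome = 0.5*x + 0.1*(qdate - tq(2012q1)) + rnormal()

    * i.qdate expands into one indicator per quarter, omitting a base level
    regress outcome i.qdate x

    * ib(last).qdate uses the latest quarter as the base level instead
    regress outcome ib(last).qdate x
    ```

    No indicator variables are added to the dataset; Stata builds them on the fly for the estimation command.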



    • #3
      Thanks for your response, Clyde. I will certainly be using the dummy variables in a regression, but at this time I am trying to set up the data to eventually obtain the following result:
      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input byte propid int date2 long price2 int date1 long price1 byte(y2012q2 y2012q3 y2012q4 y2013q1 y2013q2 y2013q3 y2013q4 y2014q1 y2014q2 y2014q3 y2014q4)
      3 19857 3800000 19432 2000000 0  0  0 -1 0 0  0 0 1 0 0
      4 19627 2500000 19358 2600910 0  0 -1  0 0 1  0 0 0 0 0
      5 19536 5100000 19038 4500000 0  0  0  0 1 0  0 0 0 0 0
      7 19667 5054085 19319 5000000 0  0 -1  0 0 0  1 0 0 0 0
      7 20013 5830000 19667 5054085 0  0  0  0 0 0 -1 0 0 0 1
      8 19474 6100000 19211 4445885 0 -1  0  0 1 0  0 0 0 0 0
      end
      format %tdnn/dd/CCYY date2
      format %tdnn/dd/CCYY date1
      I would certainly appreciate any guidance. Thanks.






      • #4
        1. You have a bunch of -1's scattered among the values of these "dummy" variables. Dummy variables are normally understood to be 0/1 only, and, in any case, I cannot discern the pattern leading you to put -1's in some places and not others. Pending further advice, I will ignore the -1's and put zeroes there.

        2. The following will do it:
        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input byte propid int date2 long price2 int date1 long price1 byte(y2012q2 y2012q3 y2012q4 y2013q1 y2013q2 y2013q3 y2013q4 y2014q1 y2014q2 y2014q3 y2014q4)
        3 19857 3800000 19432 2000000 0  0  0 -1 0 0  0 0 1 0 0
        4 19627 2500000 19358 2600910 0  0 -1  0 0 1  0 0 0 0 0
        5 19536 5100000 19038 4500000 0  0  0  0 1 0  0 0 0 0 0
        7 19667 5054085 19319 5000000 0  0 -1  0 0 0  1 0 0 0 0
        7 20013 5830000 19667 5054085 0  0  0  0 0 0 -1 0 0 0 1
        8 19474 6100000 19211 4445885 0 -1  0  0 1 0  0 0 0 0 0
        end
        format %tdnn/dd/CCYY date2
        format %tdnn/dd/CCYY date1
        
        drop y*
        
        gen int quarterly_date = qofd(date2)
        format quarterly_date %tq
        
        // CREATE THE INDICATOR VARIABLES FOR EACH QUARTERLY DATE
        levelsof quarterly_date, local(qds)
        foreach q of local qds {
            local year = year(dofq(`q'))
            local quarter = quarter(dofq(`q'))
            gen y`year'q`quarter' = (quarterly_date == `q')
        }
        Thank you for using -dataex- to post your example data. Saves a lot of time on my end.

        I'm still wondering what you are going to do with the data in this layout. It is hard for me to think of anything in Stata for which this would be useful.



        • #5
          Many thanks for the code. I really appreciate it. I will be running a repeat sales regression, and unfortunately, it does require the -1 in the setup. The -1 represents the first time a property is sold and the +1 represents the second time it is sold; all other quarters in the sample period are 0. Only repeated pairs of sales transactions are used, and the price index is created by regressing the difference in log sale prices on the time dummies. It's a method used in the real estate and housing literature. I know how to run the regressions, but the difficult part is getting the data into the proper format for the analysis.
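
          The regression described above can be sketched as follows. This is a hedged illustration on simulated data, not the poster's actual specification: the variable names q1, q2, and dlnp, the simulated prices, and the choice of 2012q1 as the omitted base quarter are all assumptions.

          ```stata
          * Simulated sale pairs (assumed): first sale in q1, second sale strictly later in q2
          clear
          set seed 2024
          set obs 300
          local first = tq(2012q1)
          local last  = tq(2014q4)
          gen int q1 = `first' + floor(11*runiform())
          gen int q2 = q1 + 1 + floor((`last' - q1)*runiform())
          format q1 q2 %tq
          gen double price1 = exp(13 + rnormal(0, 0.3))
          gen double price2 = price1 * exp(0.02*(q2 - q1) + rnormal(0, 0.1))

          * -1 in the quarter of the first sale, +1 in the quarter of the second, 0 elsewhere
          forvalues q = `first'/`last' {
              local year = year(dofq(`q'))
              local qtr  = quarter(dofq(`q'))
              gen byte y`year'q`qtr' = (q2 == `q') - (q1 == `q')
          }

          * Dependent variable: difference in log prices, i.e. ln(price2/price1)
          gen double dlnp = ln(price2) - ln(price1)

          * Omit the base quarter and fit without a constant
          drop y2012q1
          regress dlnp y20*, noconstant
          ```

          Under these assumptions each coefficient is the log price level of that quarter relative to the base quarter, so exp(_b[y2014q4]) would be the index value for 2014q4 with 2012q1 set to 1. Note that dlnp is simply negative when a price fell; only the prices themselves need to be positive.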



          • #6
            Now, I get it! Thank you for explaining. The following code will do what you need:

            Code:
            * Example generated by -dataex-. To install: ssc install dataex
            clear
            input byte propid int date2 long price2 int date1 long price1
            3 19857 3800000 19432 2000000
            4 19627 2500000 19358 2600910
            5 19536 5100000 19038 4500000
            7 19667 5054085 19319 5000000
            7 20013 5830000 19667 5054085
            8 19474 6100000 19211 4445885
            end
            format %tdnn/dd/CCYY date2
            format %tdnn/dd/CCYY date1
            
            // CREATE QUARTERLY DATES FOR DATES 1 AND 2
            forvalues i = 1/2 {
                gen int quarterly_date`i' = qofd(date`i')
                format quarterly_date`i' %tq
            }
            
            // BUILD A LIST OF ALL QUARTERLY DATES
            levelsof quarterly_date1, local(qds1)
            levelsof quarterly_date2, local(qds2)
            local qds: list qds1 | qds2
            
            // GENERATE THE INDICATORS
            foreach q of local qds {
                local year = year(dofq(`q'))
                local quarter = quarter(dofq(`q'))
                gen y`year'q`quarter' = (quarterly_date2 == `q')
                replace y`year'q`quarter' = -1 if quarterly_date1 == `q'
            }
            It's just a minor modification of the earlier version.

            By the way, one small problem: if the first and second time a property is sold are both in the same quarter, it is unclear whether that quarter should be coded 1 or -1. The code above will make it -1, my arbitrary choice.






            • #7
              Thank you very much, Clyde, for providing the code! It works. If a property is bought and sold in the same quarter, the indicator would be zero and the observation would be discarded. Only pairs whose two sale quarters differ make up the sample. I can make the necessary adjustment to the code for those situations if they arise. Again, thanks for your valuable contribution. I really appreciate it.

              Best Regards
              Amrik
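
              One way to make the adjustment mentioned above is to drop same-quarter pairs before generating the indicators. The sketch below uses two rows from the thread's example data plus one hypothetical pair (propid 9, both sales in 2013q1) added purely for illustration; the variable name same_quarter is also an assumption.

              ```stata
              clear
              input byte propid int date2 long price2 int date1 long price1
              3 19857 3800000 19432 2000000
              4 19627 2500000 19358 2600910
              9 19400 1000000 19360  950000
              end
              format %tdnn/dd/CCYY date1 date2

              * Flag pairs whose two sales fall in the same quarter, then discard them
              gen byte same_quarter = qofd(date1) == qofd(date2)
              count if same_quarter
              drop if same_quarter
              drop same_quarter
              ```

              Run before the indicator loop, this removes the ambiguous case entirely, so the choice between 1 and -1 never arises.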



              • #8
                Originally posted by Amrik Singh (see #5)

                Hi, I am also running a repeat sales regression and am having difficulties with using the log of my returns. A lot of my returns are negative. How do you get around this?
                Thanks!



                • #9
                  Johannes: This seems to have nothing at all to do with quarterly dummy variables, so please start a new thread.



                  • #10
                    Thank you for your attention.

                    I generated a number of dummy variables from a variable indicating the relevant quarter, named quarter, with the following command:

                    tabulate quarter, generate(timeq)

                    This generates a set of dummy variables named timeq1, timeq2, timeq3, and so on through timeq68.

                    I am trying to think of a way to rename these variables as follows:

                    timeq1 into 1995q1
                    timeq2 into 1995q2
                    timeq3 into 1995q3
                    timeq4 into 1995q4
                    timeq5 into 1996q1

                    .....

                    timeq68 into 2011q1

                    I had an extensive look at the previous posts, but I am not sure whether I have to use a loop or whether there is a simpler way that I am missing.

                    To comply with the rules of the forum and avoid duplication of efforts, I report here the link where I posted the same question:

                    http://stackoverflow.com/questions/4...t-of-variables

                    Many thanks.



                    • #11
                      The variable names you are trying to create are not legal in Stata. All variable names must start with a letter or an underscore character.

                      This question is only tangentially relevant to the topic of the original thread. Please re-frame your question with a set of legal variable names and start a new thread for it.
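
                      For illustration only, here is a sketch of a rename loop that produces legal names by prefixing a letter, in the y<year>q<quarter> style used earlier in this thread. It assumes that quarter is a numeric %tq variable and that the timeq variables came from -tabulate quarter, generate(timeq)-; the 12-quarter toy dataset is made up for the example.

                      ```stata
                      * Toy data (assumed): 12 consecutive quarters starting in 1995q1
                      clear
                      set obs 12
                      gen int quarter = tq(1995q1) + _n - 1
                      format quarter %tq
                      tabulate quarter, generate(timeq)

                      * tabulate numbers its indicators in ascending order of the values,
                      * which is the same order in which levelsof lists them
                      levelsof quarter, local(qs)
                      local i = 0
                      foreach q of local qs {
                          local ++i
                          local year = year(dofq(`q'))
                          local qtr  = quarter(dofq(`q'))
                          rename timeq`i' y`year'q`qtr'
                      }
                      describe y*
                      ```

                      Driving the loop from the observed levels rather than from a fixed starting quarter keeps the names correct even if some quarters are missing from the data.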



                      • #12
                        Thank you Prof. Schechter. I will proceed as soon as possible.



                        • #13
                          giansoldati: Although you were asked to start a new thread, it's always a good idea to cross-reference both explicitly. There is no reason to make people search for the new thread if they are interested.



                          • #14
                            Originally posted by Clyde Schechter (see the code in #6)

                            Hi, what would be the regression command for this? I'm not sure the results I'm obtaining are correct, so I must be doing something wrong.



                            • #15
                              I don't understand your question. The material you quote gives code for creating some variables, but there is no regression involved. If you have a question about creating variables relating to quarterly dates, please clarify it and repost it here. If your question is about something else, then please post it in a new thread (use the New Topic button), and again, provide enough information that those who might want to help you can understand what you are asking for. For good advice about how to do that well, please read the Forum FAQ.

