  • Getting Standard Error of Regression of previous observations

    Hi all,

    I am fairly new to Stata, so I would appreciate some help. To simplify, I have a dataset with two columns, year and money, as shown below (fictitious data, for illustration):
    2000 95256
    2001 98531
    2002 65465
    2003 54612
    2004 54624
    2005 54652
    2006 64671
    2007 32132
    2008 13211
    2009 25896
    What I would like to do is, for the year 2009, generate a variable that holds the standard error I would get if I regressed money on year over the five previous years. So the new column in 2009 would contain the standard error from regressing money on year for 2004-2008. I can do this in Excel fairly easily, but I cannot seem to do it in Stata. If it helps, I have been able to create five columns for each row that hold the values of 'money' from the previous five years, but I do not think that will help.

    Thanks everyone. Really appreciate the assistance.
    -David

  • #2
    I don't find the question very clear, but try the following code:

    Code:
    gen se=.
    forval i=2005/2009 {
    regress money year if year>`i'-6 & year<`i'
    replace se=_se[year] if year==`i'
    }
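
    A note on which quantity this stores: _se[year] is the standard error of the slope on year. If "standard error of regression" is instead meant as the root mean squared error of the fit, regress leaves that in e(rmse). A minimal sketch of a variant of the loop above that keeps both, assuming the two-column data from the question; the variable names se_slope and rmse are only illustrative:

    Code:
    gen se_slope = .
    gen rmse = .
    forval i = 2005/2009 {
        * window is the five years before the target year, e.g. 2004-2008 for 2009
        quietly regress money year if inrange(year, `i' - 5, `i' - 1)
        replace se_slope = _se[year] if year == `i'
        replace rmse = e(rmse) if year == `i'
    }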
    rich


    • #3
      David,

      Here is a very simple-minded way to do what you are asking for:

      Code:
      regress money year if inrange(year,2004,2008)
      gen se=_se[year] if year==2009
      However, if you want to repeat this process for multiple years (e.g., for each year, generate the SE of the regression for the previous five years), that would require more work. If that is the case, let us know more about your data and we can suggest a more general solution.

      Regards,
      Joe


      • #4
        I would need to do it for multiple years, but I can do that with a loop, correct?


        • #5
          Hello all,

          First I would like to thank everyone for the previous help, but I am still struggling. I have done this in the past in Excel, but I am attempting to push my skills with Stata and to remove the chance of manual error. My goal is to find a way to record the standard error of a regression in a variable, while regressing only on the previous five observations. Nick Fox suggested that I look at rolling commands, which I have been trying, but it is not working. I would appreciate it if you could give me some advice.

          Since I seem to have been unclear previously, my data look like the listing below. I have different airline names with different revenues for different years. For some airlines I have more years of data than for others. My goal is to regress the yearly operating revenues (yr_op_revenues) on the year variable, using the five previous observations (not including the current observation, which rolling seems to include). For example, for a 2006 observation I would like to regress the data from 2001-2005 and record the standard error of the coefficient on year (the only independent variable) in the 2006 observation. If there are fewer than five years of data, I want the standard error to be missing.

          I have tried by statements, but the last line of this code gives me a 'weights not allowed' error so I cannot put the se variable into the observation I desire.
          Code:
          gen se = .
          by unique_carrier_name: regress yr_op_revenues year if inrange(year,year[_n],year[_n-6])
          replace se[_n] = _se[year]
          I have tried nested foreach commands to get within the airline, and then do the regression for each year, but the code below has not been working.

          Code:
          foreach airline of varlist unique_carrier_name {
          foreach yearnum of varlist year {
          regress yr_op_revenues year if `yearnum' == `yearnum'[_n-6]+6
          gen se=_se[year]
          }
          }
          I have also tried a foreach with a rolling command. I needed to do a foreach before the rolling, or else there were non-unique years of data. However, I keep getting an invalid syntax error.

          Code:
          foreach airline of varlist unique_carrier_name {
          local egen min_year = min(year)
          local egen max_year =max(year)
          
          tsset year
          time variable: year min_year to max_year
          delta: 1 unit
          
          rolling _b _se, window(5) saving(betas, replace): regress yr_op_revenues year
          }
          I know this is a lot to ask, but how in the world do I do this? I do not want to go back to making datasets in Excel, but Stata is driving me a little nuts. Thanks so much everyone.
          unique_carrier_name year yr_op_revenues se
          American Airlines 1996 23541 .
          American Airlines 1997 26526 .
          American Airlines 1998 15486 .
          American Airlines 1999 23654 .
          American Airlines 2000 58962 .
          American Airlines 2001 25426 .
          American Airlines 2002 35214 .
          American Airlines 2003 25632 .
          American Airlines 2004 25415 .
          American Airlines 2005 26842 .
          American Airlines 2006 26547 .
          American Airlines 2007 21456 .
          United Airlines 1993 36514 .
          United Airlines 1994 25632 .
          United Airlines 1995 21456 .
          United Airlines 1996 23698 .
          United Airlines 1997 57852 .
          United Airlines 1998 21453 .
          United Airlines 1999 26987 .
          United Airlines 2000 24563 .
          United Airlines 2001 25426 .
          United Airlines 2002 26541 .
          Delta 1985 26523 .
          Delta 1986 25413 .
          Delta 1987 26532 .
          Delta 1988 24123 .
          Delta 1989 25412 .
          Delta 1990 26351 .
          Delta 1991 23652 .
          Delta 1992 23652 .
          Delta 1993 25413 .
          Delta 1994 25632 .
          Delta 1995 25874 .
          Delta 1996 26531 .
          Delta 1997 25412 .
          Delta 1998 36521 .
          Delta 1999 35632 .
          Delta 2000 96542 .
          Delta 2001 25632 .


          • #6
            Lots of confusion here! Frankly, the impression is that you are making lots of wild guesses about what the syntax might be and then are surprised and disappointed when you get lots of error messages in return. Still, you are trying something that isn't trivial at your stage of learning Stata.

            On the rolling suggestion that I made (not Nick Fox, whoever he may be), it's an error to tsset your data just in terms of the time variable: it's panel data. You need to declare the panel identifier too.

            This code, for example,

            Code:
            local egen min_year = min(year)
            local egen max_year = max(year)
            is fairly bizarre. It is important that you learn not to write code randomly like this. (That may well sound patronising, but it's true.) It happens to be legal. But what does it do?

            The first statement creates a local macro named egen containing the text that follows. The second statement then replaces the macro's contents with different contents. But you never use the macro, so nothing is achieved. Fortunately, if you just tell tsset what your set-up is, you don't need anything like it.
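
            To see this for yourself, here is a minimal sketch that displays the macro after each definition. The summarize lines show one common way to actually obtain a minimum and maximum; they assume only that a numeric variable year is in memory, and the macro names min_year and max_year are just illustrative.

            Code:
            local egen min_year = min(year)
            * shows the stored text: min_year = min(year)
            display "`egen'"
            local egen max_year = max(year)
            * the same macro, now overwritten: max_year = max(year)
            display "`egen'"

            * summarize leaves the range in r(min) and r(max)
            summarize year, meanonly
            local min_year = r(min)
            local max_year = r(max)
            display "years run from `min_year' to `max_year'"

            Run these lines together in one do-file, since local macros do not survive from one run to the next.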

            You should be able to adapt this example to get (closer to) what you want.

            Code:
            webuse grunfeld
            tsset company year
            rolling _b _se, window(6): regress mvalue year
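
            As a rough sketch of how that might be adapted to the airline data in #5 (not tested against the real dataset): rolling writes one row per window, with start and end marking the window, so each estimate can be attached to the year after its window ends. The names carrier, se_roll and se_prev5 are made up for illustration, and the sketch assumes one observation per airline and year, no gaps in the years, and no existing variables called carrier or se_prev5.

            Code:
            * tsset needs a numeric panel identifier
            encode unique_carrier_name, gen(carrier)
            tsset carrier year
            * one regression per airline and 5-year window, saved to se_roll.dta
            rolling _b _se, window(5) saving(se_roll, replace): regress yr_op_revenues year
            preserve
            use se_roll, clear
            * the window start..end feeds the observation for year end + 1
            gen year = end + 1
            rename _se_year se_prev5
            keep carrier year se_prev5
            save se_roll, replace
            restore
            * years without five earlier years get no match, so se_prev5 stays missing
            merge 1:1 carrier year using se_roll, keep(master match) nogenerate

            If some airlines have gaps in their years, a window would cover fewer than five observations, so check the counts before relying on the result.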


            • #7
              Hi Nick,

              Sorry about getting your name wrong previously. To answer your question, I wrote the min_year and max_year variables to calculate the starting and ending year of data for each airline. As I read rolling, it seemed that I needed to pre-define a time period, but since the years for airlines were different, I was trying to make the time period variable with min_year/max_year.


              • #8
                OK, but I am not clear that you take my small point. Your local definitions just define the contents of a little macro, meaning here one piece of text. They would not themselves do anything to calculate minimum and maximum years. The egen statements, applied separately, would themselves fail for other reasons.

                The positive advice remains. As I understand it, your central problem yields easily to rolling.
