  • Estimate and predict: by groups.

    Dear All, I was asked the following question and wonder if anyone can help. The data are as follows.
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float ID str6 stkcd float(date event_date ret) double market_ret float(dif event_window estimation_window)
    1 "000001" 17548 17611  .027638 -.025007 -40 0 0
    1 "000001" 17549 17611        0  .009484 -39 0 0
    1 "000001" 17552 17611 -.075795 -.048463 -38 0 0
    1 "000001" 17553 17611  -.05291 -.080253 -37 0 0
    1 "000001" 17554 17611  .055307   .04778 -36 0 0
    1 "000001" 17555 17611 -.010323   .01711 -35 0 0
    1 "000001" 17556 17611  .018722  .007905 -34 0 0
    1 "000001" 17559 17611 -.069835  -.07027 -33 0 0
    1 "000001" 17560 17611  .008185  .009767 -32 0 0
    1 "000001" 17561 17611 -.049552 -.006352 -31 0 0
    1 "000001" 17562 17611 -.019146 -.025233 -30 0 1
    1 "000001" 17563 17611  .037538 -.022311 -29 0 1
    1 "000001" 17566 17611  .097829   .08427 -28 0 1
    1 "000001" 17567 17611  -.01661 -.002606 -27 0 1
    1 "000001" 17575 17611 -.024665  -.01694 -26 0 1
    1 "000001" 17576 17611  .016767  .015615 -25 0 1
    1 "000001" 17577 17611 -.025142 -.011151 -24 0 1
    1 "000001" 17580 17611  .011925  .024991 -23 0 1
    1 "000001" 17581 17611   .02165  .021662 -22 0 1
    1 "000001" 17582 17611 -.066792 -.019228 -21 0 1
    1 "000001" 17583 17611 -.022995 -.002151 -20 0 1
    1 "000001" 17584 17611 -.051486 -.032832 -19 0 1
    1 "000001" 17587 17611 -.047146 -.037278 -18 0 1
    1 "000001" 17588 17611  .023763  .001051 -17 0 1
    1 "000001" 17589 17611  .042925  .026486 -16 0 1
    1 "000001" 17590 17611        0 -.002349 -15 0 1
    1 "000001" 17591 17611  .010671   .01271 -14 0 1
    1 "000001" 17594 17611 -.004525  .027603 -13 0 1
    1 "000001" 17595 17611 -.098485 -.020566 -12 0 1
    1 "000001" 17596 17611 -.011429 -.004749 -11 0 1
    1 "000001" 17597 17611  .030262  .011455 -10 0 1
    1 "000001" 17598 17611 -.019802 -.016813  -9 0 1
    1 "000001" 17601 17611 -.074074 -.039445  -8 0 1
    1 "000001" 17602 17611 -.018545  .004833  -7 0 1
    1 "000001" 17603 17611 -.040385  -.02965  -6 0 1
    1 "000001" 17604 17611  .006178  -.02997  -5 0 1
    1 "000001" 17605 17611 -.044129 -.009482  -4 0 1
    1 "000001" 17608 17611  .004416 -.056052  -3 1 0
    1 "000001" 17609 17611  .007194 -.057542  -2 1 0
    1 "000001" 17610 17611  .072222  .036885  -1 1 0
    1 "000001" 17611 17611  .073279  .031677   0 1 0
    1 "000001" 17612 17611 -.023103  .008794   1 1 0
    1 "000001" 17615 17611 -.064243 -.040167   2 1 0
    1 "000001" 17616 17611  .036967  .012121   3 1 0
    2 "000001" 17911 17976  .077086  .042738 -40 0 0
    2 "000001" 17912 17976  .009804   .00339 -39 0 0
    2 "000001" 17913 17976  .031068  .014589 -38 0 0
    2 "000001" 17916 17976  .046139  .007609 -37 0 0
    2 "000001" 17917 17976  .022502  .008721 -36 0 0
    2 "000001" 17918 17976  .037852 -.002518 -35 0 0
    2 "000001" 17919 17976        0   .01292 -34 0 0
    2 "000001" 17920 17976 -.012723 -.004521 -33 0 0
    2 "000001" 17930 17976  .001718   .01843 -32 0 0
    2 "000001" 17931 17976  .024871  .025772 -31 0 0
    2 "000001" 17932 17976  .091213  .025118 -30 0 1
    2 "000001" 17933 17976 -.018405 -.009551 -29 0 1
    2 "000001" 17934 17976  .030469  .041989 -28 0 1
    2 "000001" 17937 17976  .042456  .028243 -27 0 1
    2 "000001" 17938 17976 -.006545  .017543 -26 0 1
    2 "000001" 17939 17976  .010249  .002707 -25 0 1
    2 "000001" 17940 17976 -.037681  .002137 -24 0 1
    2 "000001" 17941 17976  .024849  .035635 -23 0 1
    2 "000001" 17944 17976  .054372  .024336 -22 0 1
    2 "000001" 17945 17976  .000697 -.034699 -21 0 1
    2 "000001" 17946 17976  -.02507  -.04453 -20 0 1
    2 "000001" 17947 17976 -.039286  .013929 -19 0 1
    2 "000001" 17948 17976  .013383  .024308 -18 0 1
    2 "000001" 17951 17976   .09978  .029733 -17 0 1
    2 "000001" 17953 17976  .010674  .001857 -16 0 1
    2 "000001" 17954 17976 -.089109 -.057169 -15 0 1
    2 "000001" 17958 17976  .014493  .015424 -14 0 1
    2 "000001" 17959 17976    -.035 -.004403 -13 0 1
    2 "000001" 17960 17976  .097705  .065896 -12 0 1
    2 "000001" 17961 17976  .021578  .008718 -11 0 1
    2 "000001" 17962 17976 -.010561 -.007622 -10 0 1
    2 "000001" 17965 17976  -.02068 -.040027  -9 0 1
    2 "000001" 17966 17976   .01158  .020552  -8 0 1
    2 "000001" 17967 17976 -.012795 -.007518  -7 0 1
    2 "000001" 17968 17976  .031378 -.000543  -6 0 1
    2 "000001" 17969 17976 -.019841 -.006132  -5 0 1
    2 "000001" 17972 17976  .029015  .015938  -4 0 1
    2 "000001" 17973 17976  .041967  .036093  -3 1 0
    2 "000001" 17974 17976 -.011328  .006787  -2 1 0
    2 "000001" 17975 17976  .005092  .021895  -1 1 0
    2 "000001" 17976 17976 -.029132 -.002316   0 1 0
    2 "000001" 17979 17976  .017613  .024565   1 1 0
    2 "000001" 17980 17976  .004487  .006408   2 1 0
    2 "000001" 17981 17976  .008296 -.022659   3 1 0
    end
    format %dCY-N-D date
    format %dCY-N-D event_date
    The purpose is to do the following. With more than 16,000 firms, doing it with a loop, as below, takes far too long. The questioner's code is as follows.
    Code:
    gen predicted_return=.
    forvalues i=1(1)16307 {
            qui reg ret market_ret if ID==`i' & estimation_window==1
            predict p if ID==`i'
            replace predicted_return=p if ID==`i' & event_window==1
            drop p
            }
    Basically, for each ID (firm), we use observations in the estimation window (estimation_window==1) to do the estimation. Given the estimated coefficients, we want to obtain the predicted values of ret (the dependent variable) in the event window (event_window==1). Any suggestion is highly appreciated!
    Ho-Chuan (River) Huang
    Stata 17.0, MP(4)

  • #2
    It should be much faster this way:

    Code:
    capture program drop one_regression
    program define one_regression
        if estimation_window == 1 {
            reg ret market_ret
            predict predicted_return
        }
        exit
    end
    
    runby one_regression, by(ID estimation_window) status
    Note: If you don't already have the -runby- command (by Robert Picard and me, available from SSC), you will need to install it first (-ssc install runby-).

    Although this will run much faster than the loop you wrote, you are working with a large data set, so don't expect miracles. I have specified the -status- option in the -runby- command so that Stata will periodically report on its progress through the data set and also give you an estimate on how much more time will be needed to complete the task. That report should help support patience as you can easily see progress being made.

    Added: the -if estimation_window == 1 { - command is not a mistake. When one_regression runs, the data in memory will consist only of those observations for a single ID and value of estimation_window. We want to do the regression only on those batches of data for which estimation_window == 1. Since estimation_window will necessarily be a constant at the time this is executed, the fact that estimation_window will, by default, be interpreted as estimation_window[1] is fine; any observation will do for this purpose.
    Last edited by Clyde Schechter; 10 Jan 2018, 21:43.
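The point in #2 about the -if- command can be checked on a small scale. The following is a minimal sketch with made-up toy data (the variable name flag is hypothetical, not from the thread). It shows that the -if- programming command looks only at the first observation, whereas the -if- qualifier is evaluated observation by observation.

```stata
* Toy data (hypothetical): flag is 0 in the first observation, 1 afterwards
clear
input byte flag
0
1
1
end

* The -if- command evaluates flag as flag[1], which is 0 here,
* so the braces are skipped even though later observations have flag == 1
local branch "skipped"
if flag == 1 {
    local branch "entered"
}
display "branch: `branch'"

* The -if- qualifier, by contrast, is applied to every observation
count if flag == 1
```

Inside a -runby- program called with by(ID estimation_window), the variable is constant within each batch, which is why looking at the first observation alone is safe there.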


    • #3
      Dear Clyde, Thanks again for your helpful suggestions.

      Ho-Chuan (River) Huang
      Stata 17.0, MP(4)


      • #4
        Dear Clyde, I ran the code you suggested, but found that the results are not what I had in mind. Let me explain in more detail.
        1. For each ID, I have an event date, occurring at dif=0.
        2. First, I need to estimate the regression using the observations in the estimation window (estimation_window=1), which runs from the 30th day to the 4th day before the event date (dif=-30 to dif=-4).
        3. Second, given the coefficient estimates from the first stage, I need to predict the return (ret, the dependent variable) in the event window (event_window=1), which covers the three days before and after the event date (dif=-3,-2,-1,0,1,2,3).
        If you run the code, you will find that the prediction is made in the estimation window, not in the event window. Any suggestion? Thanks.

        Ho-Chuan (River) Huang
        Stata 17.0, MP(4)


        • #5
          Oh, sorry. I read your original loop-based code too quickly and didn't notice that there were two different windows in play here. The following code will do the regression in the estimation window and the prediction in the event window.

          Code:
          capture program drop one_regression
          program define one_regression
              reg ret market_ret if estimation_window
              predict predicted_return if event_window
              exit
          end
          
          runby one_regression, by(ID) status
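Because the model has a single regressor, the same per-ID fit can also be written in closed form (slope = Sxy/Sxx, intercept = ybar - slope*xbar) using only -egen- and -generate-. This is a swapped-in alternative, not the method used in #5, and it is only a sketch: the toy values and the helper variables est, xbar, ybar, dxy, dxx, sxy, sxx, b, and a are hypothetical, and whether it is faster than -runby- on the full data has not been tested here.

```stata
* Toy data (hypothetical): ret is an exact linear function of market_ret
* within each ID's estimation window, so the expected predictions are known
clear
input byte ID double(market_ret ret) byte(estimation_window event_window)
1 .01 .03 1 0
1 .02 .05 1 0
1 .03 .07 1 0
1 .05 .   0 1
2 .01 .00 1 0
2 .03 .02 1 0
2 .02 .01 1 0
2 .10 .   0 1
end

* Estimation sample: estimation window with both variables non-missing,
* mirroring the casewise deletion that -regress- would apply
gen byte est = estimation_window == 1 & !missing(ret, market_ret)

* Within-ID means over the estimation sample only
bysort ID: egen double xbar = mean(cond(est, market_ret, .))
bysort ID: egen double ybar = mean(cond(est, ret, .))

* Cross-products and squares of deviations, estimation sample only
gen double dxy = (market_ret - xbar) * (ret - ybar) if est
gen double dxx = (market_ret - xbar)^2 if est
bysort ID: egen double sxy = total(dxy)
bysort ID: egen double sxx = total(dxx)

* OLS slope and intercept for each ID, then predictions in the event window
gen double b = sxy / sxx
gen double a = ybar - b * xbar
gen double predicted_return = a + b * market_ret if event_window == 1

list ID market_ret predicted_return if event_window == 1
```

With these toy values the fitted lines are ret = .01 + 2*market_ret for ID 1 and ret = -.01 + market_ret for ID 2, so the event-window predictions are .11 and .09.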


          • #6
            Dear Clyde, Many thanks. This is what I expected.

            Ho-Chuan (River) Huang
            Stata 17.0, MP(4)


            • #7
              @Clyde Schechter Hi Clyde, may I ask whether the prediction in your post #5 is an out-of-sample prediction? Many thanks!
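One way to check this on a small scale is to compare the estimation sample, which -regress- records in e(sample), with the observations that receive a prediction. The sketch below uses made-up toy data and assumes, as in the example data at the top of the thread, that the estimation and event windows do not overlap.

```stata
* Toy data (hypothetical values): one ID, three estimation-window days
* and one event-window day
clear
input byte ID double(market_ret ret) byte(estimation_window event_window)
1 .01 .03 1 0
1 .02 .04 1 0
1 .03 .08 1 0
1 .05 .06 0 1
end

* Same two steps as the program in #5, run here for a single ID
regress ret market_ret if estimation_window
predict double predicted_return if event_window

* e(sample) flags the observations used in the fit; none of them
* lies in the event window, so this count is 0
count if e(sample) & event_window
```

Since no event-window observation enters the regression, every value of predicted_return is computed for data outside the estimation sample, that is, out-of-sample.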
