  • #16
    I actually want to use the code in #15(2) as an extension of some earlier code that we have at this link:

    https://www.statalist.org/forums/for...70#post1643170

    And I do not want the last few lines of that code to be discarded, as you suggested. Can we write the code in #15(2) so that it runs correctly even with the following lines in the main code:

    Code:
    gen str32 group = "q" + string(mcap_group) +"_idiovol_q" +string(idiovol_group) + "_"
    drop *_group
    
    reshape wide vw_mean_@rt unwtd_mean_@rt, i(mdate) j(group) string
    Last edited by Sartaj Hussain; 11 Jan 2022, 12:55.


    • #17
      Sure. Just do it in a different frame:

      Code:
      //  CALCULATE MEANS AND STANDARD DEVIATIONS OF EW AND VW MEAN RETURNS
      frame copy default means_and_sds
      frame change means_and_sds
      collapse (mean) mean_vw_mean_rt = vw_mean_rt mean_ew_mean_rt = ew_mean_rt ///
          (sd) sd_vw_mean_rt = vw_mean_rt sd_ew_mean_rt = ew_mean_rt, ///
          by(mcap_quintile idiovol_quintile)
          
      rename *_rt =_idiovol_
      reshape wide *_rt_idiovol_, i(mcap_quintile) j(idiovol_quintile)
      That will give you the means and standard deviations, in "matrix" form, sitting in frame means_and_sds, and from there you can save them, or export them or whatever it is you need to do with them.

      Then, after that, you can return to the data before that and run that other code:
      Code:
      frame change default
      gen str32 group = "q" + string(mcap_group) +"_idiovol_q" +string(idiovol_group) + "_"
      drop *_group
      
      reshape wide vw_mean_@rt unwtd_mean_@rt, i(mdate) j(group) string
      What you can't do is have both of these results in the same Stata data set, because their organizations of the data are incompatible with each other. (That's one of the key differences between Stata and a spreadsheet.)
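
      For what it's worth, here is a minimal sketch of the "save them, or export them" step. The file names are hypothetical, and it assumes the frame means_and_sds has already been built as above:

      ```stata
      * ASSUMPTION: hypothetical file names; adjust the paths as needed.
      * The frame prefix runs each command in means_and_sds without
      * changing the current working frame.
      frame means_and_sds: save "means_and_sds.dta", replace
      frame means_and_sds: export excel using "means_and_sds.xlsx", ///
          firstrow(variables) replace
      ```

      Because each command carries the frame prefix, these lines can go anywhere after the frame is built, including after the switch back to frame default.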




      • #18
        For code #17(1), this error is encountered:

        variable mcap_quintile not found
        (error in option by())
        r(111);

        Variables are named slightly differently here:

        unwtd_mean_q1_idiovol_q1_rt (Label: q1_idiovol_q1_ unwtd_mean_rt)

        vw_mean_q1_idiovol_q2_rt (Label: q1_idiovol_q2_ vw_mean_rt)

        #17(1) needs modification, please.


        • #19
          Well, the error message is referring specifically to variable mcap_quintile, and it's hard to see why that variable wouldn't be found if used at the point in the code that I originally suggested. I suspect you are using this code in a different place, or have added some other code that removes or renames the mcap_quintile variable, or are operating in a different frame. I need to see the full context. Please post example data and code that reproduces this error, and I'll try to troubleshoot.
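
          In the meantime, a quick diagnostic sketch along these lines may help narrow down which of those possibilities applies. It only uses built-in commands and is meant to be run immediately before the failing collapse:

          ```stata
          frame pwf        // shows the current working frame
          frames dir       // lists every frame in memory

          * Which grouping variables exist in this frame?
          * -capture noisily- keeps the do-file going if none match.
          capture noisily describe mcap* idiovol*, simple

          * Test for the specific variable named in the error message.
          capture confirm variable mcap_quintile
          if _rc {
              display as error "mcap_quintile is not in this frame"
          }
          ```

          If the variable turns out to exist under another name, the by() option of the collapse has to use that name instead.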


          • #20
            OK. Here is the example data, followed by the code.

            Code:
            * Example generated by -dataex-. For more info, type help dataex
            clear
            input int stock_id str56 stock float(date mdate rt mcap idiovol)
            1 "3M India Ltd."                   15127 497           .  428.19  .10065009
            1 "3M India Ltd."                   15157 498  -.14469875  366.23          .
            1 "3M India Ltd."                   15188 499  -.00645957  363.86  .06956539
            1 "3M India Ltd."                   15219 500   -.1764706  299.65    .088255
            1 "3M India Ltd."                   15249 501  .015037594  304.16  .07604478
            1 "3M India Ltd."                   15280 502   .08518519  330.07  .08619624
            1 "3M India Ltd."                   15310 503   -.0887372  300.78  .06342645
            1 "3M India Ltd."                   15341 504  -.02640445  292.84  .11689825
            1 "3M India Ltd."                   15372 505   .04250813  305.28  .14444776
            1 "3M India Ltd."                   15400 506   .14926204  350.85  .12306575
            1 "3M India Ltd."                   15431 507  .008990168     354  .07602096
            1 "3M India Ltd."                   15461 508  -.10039774  318.46  .09038305
            1 "3M India Ltd."                   15492 509  .008666365  321.22   .0916427
            2 "A B B India Ltd."                15127 497           . 1034.63  .06530816
            2 "A B B India Ltd."                15157 498  .008807095 1043.74  .05433929
            2 "A B B India Ltd."                15188 499   -.0930556  946.62  .05364652
            2 "A B B India Ltd."                15219 500  -.13563773  818.22  .11843422
            2 "A B B India Ltd."                15249 501 -.030118924  793.58    .046893
            2 "A B B India Ltd."                15280 502   .06915453  848.46  .10159888
            2 "A B B India Ltd."                15310 503 -.004149426  864.59   .0457102
            2 "A B B India Ltd."                15341 504    .0906863  942.99  .05675213
            2 "A B B India Ltd."                15372 505     .154382 1088.57  .13100874
            2 "A B B India Ltd."                15400 506   .03387194 1125.45   .1140816
            2 "A B B India Ltd."                15431 507  .002824284 1128.62  .09106217
            2 "A B B India Ltd."                15461 508  -.07641757 1042.38  .06690538
            2 "A B B India Ltd."                15492 509    .0927018 1139.01  .09048145
            3 "A C C Ltd."                      15127 497           . 2328.01   .0797946
            3 "A C C Ltd."                      15157 498   .01906163 2372.39  .09540155
            3 "A C C Ltd."                      15188 499  -.07949643 2183.79  .05714733
            3 "A C C Ltd."                      15219 500 -.026963634 2124.91  .13976797
            3 "A C C Ltd."                      15249 501  .069879495 2273.39   .1118523
            3 "A C C Ltd."                      15280 502   .28115615 2912.57   .1072322
            3 "A C C Ltd."                      15310 503  -.11045995 2591.13  .07732503
            3 "A C C Ltd."                      15341 504   .05401842 2731.58  .11700913
            3 "A C C Ltd."                      15372 505     .021875  2791.8  .11240999
            3 "A C C Ltd."                      15400 506  -.05840981 2628.73  .07147745
            3 "A C C Ltd."                      15431 507  -.02143555 2573.11  .04187184
            3 "A C C Ltd."                      15461 508  .003982782 2583.53  .07652828
            3 "A C C Ltd."                      15492 509   .05057847 2714.39   .0494189
            4 "A D C India Communications Ltd." 15127 497           .   58.65  .22523586
            4 "A D C India Communications Ltd." 15157 498  -.28196076   42.11  .13027842
            4 "A D C India Communications Ltd." 15188 499    .2113599   51.01  .18007316
            4 "A D C India Communications Ltd." 15219 500  -.29711455   35.86  .22130314
            4 "A D C India Communications Ltd." 15249 501 -.031430367   34.73  .12866698
            4 "A D C India Communications Ltd." 15280 502   .22052982   42.39  .14436947
            4 "A D C India Communications Ltd." 15310 503  -.05588716   40.02   .1278039
            4 "A D C India Communications Ltd." 15341 504  -.01724138   39.33  .09830305
            4 "A D C India Communications Ltd." 15372 505  -.03567255   37.93  .14645375
            4 "A D C India Communications Ltd." 15400 506 -.003638513   37.79  .08370786
            4 "A D C India Communications Ltd." 15431 507    .1905052   44.99  .23667724
            4 "A D C India Communications Ltd." 15461 508  -.10838452   40.11  .16488414
            4 "A D C India Communications Ltd." 15492 509 -.028096296   38.98   .0899244
            5 "A G C Networks Ltd."             15127 497           .  116.66  .10551737
            5 "A G C Networks Ltd."             15157 498  -.10980967  103.85  .15191686
            5 "A G C Networks Ltd."             15188 499 -.064418815   97.16  .10186497
            5 "A G C Networks Ltd."             15219 500   -.3108702   66.98  .13506944
            5 "A G C Networks Ltd."             15249 501   .14795916   76.87  .12192778
            5 "A G C Networks Ltd."             15280 502   .13518517   87.26  .06552172
            5 "A G C Networks Ltd."             15310 503    .1663948  101.78   .1620788
            5 "A G C Networks Ltd."             15341 504    .6041958  163.28   .1820102
            5 "A G C Networks Ltd."             15372 505   .15082827  187.91   .2737067
            5 "A G C Networks Ltd."             15400 506   .28606057  241.65  .13774931
            5 "A G C Networks Ltd."             15431 507   .08152694  261.36  .14321175
            5 "A G C Networks Ltd."             15461 508  -.10675384  233.46  .14417796
            5 "A G C Networks Ltd."             15492 509 -.000609793  233.32 .072601065
            6 "Aarti Industries Ltd."           15127 497           .   35.74          .
            6 "Aarti Industries Ltd."           15157 498  -.13281247   31.03          .
            6 "Aarti Industries Ltd."           15188 499    .2072072   37.48          .
            6 "Aarti Industries Ltd."           15219 500  .007462679   37.71          .
            6 "Aarti Industries Ltd."           15249 501   .09629629   41.14          .
            6 "Aarti Industries Ltd."           15280 502  -.00675675   41.03          .
            6 "Aarti Industries Ltd."           15310 503  -.04761908   38.93          .
            6 "Aarti Industries Ltd."           15341 504   .04285719   40.68          .
            6 "Aarti Industries Ltd."           15372 505    .1712329   49.75 .069698885
            6 "Aarti Industries Ltd."           15400 506  .017543843   50.66  .09248569
            6 "Aarti Industries Ltd."           15431 507    .2298851   62.37  .13722476
            6 "Aarti Industries Ltd."           15461 508   .11214953   69.17  .14189382
            6 "Aarti Industries Ltd."           15492 509   .15546213   80.09  .07460695
            7 "Aban Offshore Ltd."              15127 497           .   20.44  .12911019
            7 "Aban Offshore Ltd."              15157 498    .4193548   29.01  .19600622
            7 "Aban Offshore Ltd."              15188 499    -.004329   28.88  .09510764
            7 "Aban Offshore Ltd."              15219 500  -.23804344   22.01   .0899549
            7 "Aban Offshore Ltd."              15249 501   .06419398   23.42          .
            7 "Aban Offshore Ltd."              15280 502  .036193028   24.27   .1055649
            7 "Aban Offshore Ltd."              15310 503 -.029754207   23.55          .
            7 "Aban Offshore Ltd."              15341 504  -.04000003    22.6          .
            7 "Aban Offshore Ltd."              15372 505   .11111114   25.12  .24280624
            7 "Aban Offshore Ltd."              15400 506      .75375   44.05          .
            7 "Aban Offshore Ltd."              15431 507 -.036350626   42.44  .17515473
            7 "Aban Offshore Ltd."              15461 508   .16863903   58.28  .21415813
            7 "Aban Offshore Ltd."              15492 509    .3487341   78.61  .16154057
            8 "Abbott India Ltd."               15127 497           .  479.92 .036127977
            8 "Abbott India Ltd."               15157 498  -.09181438  435.86  .07008448
            8 "Abbott India Ltd."               15188 499   .03865462  452.71  .07183885
            8 "Abbott India Ltd."               15219 500  -.12685636  395.28  .08310906
            8 "Abbott India Ltd."               15249 501   .00573768  397.55  .04329815
            8 "Abbott India Ltd."               15280 502   .05175229  418.12  .07003711
            8 "Abbott India Ltd."               15310 503   -.0803952  384.51 .031918976
            8 "Abbott India Ltd."               15341 504   .03897198  399.49  .05026837
            8 "Abbott India Ltd."               15372 505    .2489862  498.96  .11764716
            end
            format %td date
            format %tm mdate
            Code:
            gen moy = month(dofm(mdate))
            gen year = year(dofm(mdate))
            //  CREATE A "FISCAL YEAR" RUNNING FROM JULY THROUGH SUBSEQUENT JUNE
            gen fyear = cond(moy > 6, year, year-1)
            frame put stock_id fyear mcap idiovol if moy == 6, into(mcap_idiovol_work)
            frame change mcap_idiovol_work
            collapse (count) n_mcap = mcap n_idiovol = idiovol (firstnm) mcap idiovol, by(stock_id fyear)
            assert n_mcap <= 1 & n_idiovol <= 1 // VERIFY UNIQUE VALUE OF MCAP AND idiovol
            replace fyear = fyear + 1 // CHANGE THE FYEAR TO WHICH THEY WILL APPLY
            rename (mcap idiovol) prior_june_=
            frame change default
            
            frlink m:1 stock_id fyear, frame(mcap_idiovol_work)
            frget prior_june_*, from(mcap_idiovol_work)
            frame drop mcap_idiovol_work
            drop mcap_idiovol_work
            egen byte representative = tag(stock_id fyear)
            tab representative
            
            //  QUINTILES BASED ON PREVIOUS FY JUNE VALUE OF mcap AND idiovol
            capture program drop double_quintiles
            program define double_quintiles
                capture assert missing(prior_june_mcap)
                if `c(rc)' {
                    xtile mcap_group = prior_june_mcap, nq(5)
                }
                capture assert missing(prior_june_idiovol)
                if `c(rc)' {
                    xtile idiovol_group = prior_june_idiovol, nq(5)
                }
                exit
            end
            
            frame put stock_id fyear prior_june_mcap prior_june_idiovol if representative, into(representatives)
            frame change representatives
            runby double_quintiles, by(fyear) verbose
            frame change default
            frlink m:1 stock_id fyear, frame(representatives stock_id fyear) // ***
            frget mcap_group idiovol_group, from(representatives)
            frame drop representatives
            drop representatives representative
            
            capture program drop one_weighted_return
            program define one_weighted_return
                    egen numerator = total(prior_june_mcap*rt)
                    egen denominator = total(prior_june_mcap)
                    gen vw_mean_rt = numerator/denominator
                    egen unwtd_mean_rt = mean(rt)
                exit
            end
            
            drop if missing(mcap_group, idiovol_group)
            runby one_weighted_return, by(mdate mcap_group idiovol_group)
            
            collapse (first) vw_mean_rt unwtd_mean_rt, by(mdate mcap_group idiovol_group)
            drop if missing(vw_mean_rt)
            keep mdate mcap_group idiovol_group *_mean_rt
            
            gen str32 group = "q" + string(mcap_group) +"_idiovol_q" +string(idiovol_group) + "_"
            drop *_group
            
            reshape wide vw_mean_@rt unwtd_mean_@rt, i(mdate) j(group) string


            • #21
              Thanks. So it was a combination of a few things. I hadn't noticed that we switched from mcap_quintile (and idiovol_quintile) to mcap_group in the variable names. And then there is the question of the correct placement of the new code in the old code. (I also didn't know about the change from ew_ to unwtd_.)

              So the following works with your example:
              Code:
              gen moy = month(dofm(mdate))
              gen year = year(dofm(mdate))
              //  CREATE A "FISCAL YEAR" RUNNING FROM JULY THROUGH SUBSEQUENT JUNE
              gen fyear = cond(moy > 6, year, year-1)
              frame put stock_id fyear mcap idiovol if moy == 6, into(mcap_idiovol_work)
              frame change mcap_idiovol_work
              collapse (count) n_mcap = mcap n_idiovol = idiovol (firstnm) mcap idiovol, by(stock_id fyear)
              assert n_mcap <= 1 & n_idiovol <= 1 // VERIFY UNIQUE VALUE OF MCAP AND idiovol
              replace fyear = fyear + 1 // CHANGE THE FYEAR TO WHICH THEY WILL APPLY
              rename (mcap idiovol) prior_june_=
              frame change default
              
              frlink m:1 stock_id fyear, frame(mcap_idiovol_work)
              frget prior_june_*, from(mcap_idiovol_work)
              frame drop mcap_idiovol_work
              drop mcap_idiovol_work
              egen byte representative = tag(stock_id fyear)
              tab representative
              
              //  QUINTILES BASED ON PREVIOUS FY JUNE VALUE OF mcap AND idiovol
              capture program drop double_quintiles
              program define double_quintiles
                  capture assert missing(prior_june_mcap)
                  if `c(rc)' {
                      xtile mcap_group = prior_june_mcap, nq(5)
                  }
                  capture assert missing(prior_june_idiovol)
                  if `c(rc)' {
                      xtile idiovol_group = prior_june_idiovol, nq(5)
                  }
                  exit
              end
              
              frame put stock_id fyear prior_june_mcap prior_june_idiovol if representative, into(representatives)
              frame change representatives
              runby double_quintiles, by(fyear) verbose
              frame change default
              frlink m:1 stock_id fyear, frame(representatives stock_id fyear) // ***
              frget mcap_group idiovol_group, from(representatives)
              frame drop representatives
              drop representatives representative
              
              capture program drop one_weighted_return
              program define one_weighted_return
                      egen numerator = total(prior_june_mcap*rt)
                      egen denominator = total(prior_june_mcap)
                      gen vw_mean_rt = numerator/denominator
                      egen unwtd_mean_rt = mean(rt)
                  exit
              end
              
              drop if missing(mcap_group, idiovol_group)
              runby one_weighted_return, by(mdate mcap_group idiovol_group)
              
              //  CALCULATE MEANS AND STANDARD DEVIATIONS OF EW AND VW MEAN RETURNS
              frame copy default means_and_sds
              frame change means_and_sds
              collapse (mean) mean_vw_mean_rt = vw_mean_rt mean_unwtd_mean_rt = unwtd_mean_rt ///
                  (sd) sd_vw_mean_rt = vw_mean_rt sd_unwtd_mean_rt = unwtd_mean_rt, ///
                  by(mcap_group idiovol_group)
                  
              rename *_rt =_idiovol_
              reshape wide *_rt_idiovol_, i(mcap_group) j(idiovol_group)
              
              frame change default
              collapse (first) vw_mean_rt unwtd_mean_rt, by(mdate mcap_group idiovol_group)
              drop if missing(vw_mean_rt)
              keep mdate mcap_group idiovol_group *_mean_rt
              
              gen str32 group = "q" + string(mcap_group) +"_idiovol_q" +string(idiovol_group) + "_"
              drop *_group
              
              reshape wide vw_mean_@rt unwtd_mean_@rt, i(mdate) j(group) string
              Just a reminder: your end results will be in two separate frames. The means and standard deviations will be in frame means_and_sds, and the monthly data will be in frame default. (With the example data, as previously, we do not actually have 25 portfolios because there aren't enough data to populate 5 separate levels of mcap and 5 separate levels of idiovol, but with the full data set there will be no problem.)
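
              To see how many of the 25 portfolios are actually populated in a given data set, something like the following sketch could be used. It assumes it is run right after the -drop if missing(mcap_group, idiovol_group)- line, while both grouping variables still exist in frame default; cell_tag is a hypothetical helper name:

              ```stata
              * Cross-tabulate the two groupings; empty cells are unpopulated portfolios.
              tabulate mcap_group idiovol_group

              * Count the distinct (mcap_group, idiovol_group) combinations present.
              egen byte cell_tag = tag(mcap_group idiovol_group)
              count if cell_tag
              display "populated portfolios: " r(N) " of 25"
              drop cell_tag
              ```

              Any count below 25 means the wide-format output will have correspondingly fewer vw_ and unwtd_ columns.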


              • #22
                Well, the problem is still there. I am actually running the code in #20 for each year separately. I then get data in wide format comprising 25 vw and 25 ew series of returns for each year, which I save to Excel. Then I consolidate that output for all the years together. After that, I want the matrix of means and sds for the whole period. The code in #21 will produce a separate matrix for each year, which is not what I want. I hope that makes it clear.
                Last edited by Sartaj Hussain; 11 Jan 2022, 15:41.


                • #23
                  So the whole thing will be:
                  Code:
                  gen moy = month(dofm(mdate))
                  gen year = year(dofm(mdate))
                  //  CREATE A "FISCAL YEAR" RUNNING FROM JULY THROUGH SUBSEQUENT JUNE
                  gen fyear = cond(moy > 6, year, year-1)
                  frame put stock_id fyear mcap idiovol if moy == 6, into(mcap_idiovol_work)
                  frame change mcap_idiovol_work
                  collapse (count) n_mcap = mcap n_idiovol = idiovol (firstnm) mcap idiovol, by(stock_id fyear)
                  assert n_mcap <= 1 & n_idiovol <= 1 // VERIFY UNIQUE VALUE OF MCAP AND idiovol
                  replace fyear = fyear + 1 // CHANGE THE FYEAR TO WHICH THEY WILL APPLY
                  rename (mcap idiovol) prior_june_=
                  frame change default
                  
                  frlink m:1 stock_id fyear, frame(mcap_idiovol_work)
                  frget prior_june_*, from(mcap_idiovol_work)
                  frame drop mcap_idiovol_work
                  drop mcap_idiovol_work
                  egen byte representative = tag(stock_id fyear)
                  tab representative
                  
                  //  QUINTILES BASED ON PREVIOUS FY JUNE VALUE OF mcap AND idiovol
                  capture program drop double_quintiles
                  program define double_quintiles
                      capture assert missing(prior_june_mcap)
                      if `c(rc)' {
                          xtile mcap_group = prior_june_mcap, nq(5)
                      }
                      capture assert missing(prior_june_idiovol)
                      if `c(rc)' {
                          xtile idiovol_group = prior_june_idiovol, nq(5)
                      }
                      exit
                  end
                  
                  frame put stock_id fyear prior_june_mcap prior_june_idiovol if representative, into(representatives)
                  frame change representatives
                  runby double_quintiles, by(fyear) verbose
                  frame change default
                  frlink m:1 stock_id fyear, frame(representatives stock_id fyear) // ***
                  frget mcap_group idiovol_group, from(representatives)
                  frame drop representatives
                  drop representatives representative
                  
                  capture program drop one_weighted_return
                  program define one_weighted_return
                          egen numerator = total(prior_june_mcap*rt)
                          egen denominator = total(prior_june_mcap)
                          gen vw_mean_rt = numerator/denominator
                          egen unwtd_mean_rt = mean(rt)
                      exit
                  end
                  
                  drop if missing(mcap_group, idiovol_group)
                  runby one_weighted_return, by(mdate mcap_group idiovol_group)
                  
                  frame create regress_results str32 indvar int (idiovol_group mcap_group) ///
                      float(intercept tstat r2 adj_r2)
                  foreach v of varlist rmrf smb hml {   
                      forvalues iv = 1/5 {
                          forvalues mc = 1/5 {
                              regress unwtd_mean_mcap_q`mc'_idiovol_q`iv'_rt `v'
                              matrix M = r(table)
                              frame post regress_results ("`v'") (`iv') (`mc') (M["b", "_cons"]) ///
                              (M["t", "_cons"]) (e(r2)) (e(r2_a))
                          }
                      }
                  }
                  frame change regress_results
                  rename (intercept tstat r2 adj_r2) =_idiovol_
                  reshape wide *_idiovol_, i(mcap_group) j(idiovol_group)
                  
                  //  CALCULATE MEANS AND STANDARD DEVIATIONS OF EW AND VW MEAN RETURNS
                  frame copy default means_and_sds
                  frame change means_and_sds
                  collapse (mean) mean_vw_mean_rt = vw_mean_rt mean_unwtd_mean_rt = unwtd_mean_rt ///
                      (sd) sd_vw_mean_rt = vw_mean_rt sd_unwtd_mean_rt = unwtd_mean_rt, ///
                      by(mcap_group idiovol_group)
                      
                  rename *_rt =_idiovol_
                  reshape wide *_rt_idiovol_, i(mcap_group) j(idiovol_group)
                  
                  frame change default
                  collapse (first) vw_mean_rt unwtd_mean_rt, by(mdate mcap_group idiovol_group)
                  drop if missing(vw_mean_rt)
                  keep mdate mcap_group idiovol_group *_mean_rt
                  
                  gen str32 group = "q" + string(mcap_group) +"_idiovol_q" +string(idiovol_group) + "_"
                  drop *_group
                  
                  reshape wide vw_mean_@rt unwtd_mean_@rt, i(mdate) j(group) string
                  The regression results from the code just added (the block running from frame create regress_results through the reshape wide *_idiovol_ line, italicized in the original post) will be found in yet a third frame: regress_results.

                  Note: that added code is not tested because the example data does not contain the variables rmrf, smb, and hml. But I don't foresee any problems with that part.
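
                  Since that part could not be tested here, this small sketch shows the same r(table) extraction on Stata's built-in auto data, so the row and column names can be checked by eye. The data set and variables are stand-ins and have nothing to do with the portfolio data:

                  ```stata
                  sysuse auto, clear
                  regress price mpg

                  * r(table) has rows b, se, t, pvalue, ... and one column
                  * per regressor plus _cons.
                  matrix M = r(table)
                  matrix list M

                  display "intercept = " M["b", "_cons"]
                  display "t-stat    = " M["t", "_cons"]
                  display "R2 = " e(r2) ",  adj. R2 = " e(r2_a)
                  ```

                  The four displayed quantities correspond to the values posted to intercept, tstat, r2, and adj_r2 in the loop.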


                  • #24
                    This is OK. Please read my post #22, which you seem to have missed.


                    • #25
                      I had not noticed #22.

                      There is nothing in the code that separates the data by years. So I have to assume that either you have multiple one-year data sets and you are feeding them in succession to the code, or, less likely given that it would require a fair amount of coding, that you have embedded all of the code in a loop that accomplishes the same end.

                      In any case, it is now unclear to me exactly what you want to take the means and standard deviations of. The wide-format 25 portfolio means and sds per year--do you want the means and sds of those? Or do you want to, in effect, ignore those, and undo the splitting of the data into separate years, and do the calculation of means and sds of the vw_* and unwtd_* mean returns from the entire data set of all years? These will, in general, be different.

                      If it is the latter, then I would recommend running the code shown below using the entire starting data set (not split up into years):

                      Code:
                      gen moy = month(dofm(mdate))
                      gen year = year(dofm(mdate))
                      //  CREATE A "FISCAL YEAR" RUNNING FROM JULY THROUGH SUBSEQUENT JUNE
                      gen fyear = cond(moy > 6, year, year-1)
                      frame put stock_id fyear mcap idiovol if moy == 6, into(mcap_idiovol_work)
                      frame change mcap_idiovol_work
                      collapse (count) n_mcap = mcap n_idiovol = idiovol (firstnm) mcap idiovol, by(stock_id fyear)
                      assert n_mcap <= 1 & n_idiovol <= 1 // VERIFY UNIQUE VALUE OF MCAP AND idiovol
                      replace fyear = fyear + 1 // CHANGE THE FYEAR TO WHICH THEY WILL APPLY
                      rename (mcap idiovol) prior_june_=
                      frame change default
                      
                      frlink m:1 stock_id fyear, frame(mcap_idiovol_work)
                      frget prior_june_*, from(mcap_idiovol_work)
                      frame drop mcap_idiovol_work
                      drop mcap_idiovol_work
                      egen byte representative = tag(stock_id fyear)
                      tab representative
                      
                      //  QUINTILES BASED ON PREVIOUS FY JUNE VALUE OF mcap AND idiovol
                      capture program drop double_quintiles
                      program define double_quintiles
                          capture assert missing(prior_june_mcap)
                          if `c(rc)' {
                              xtile mcap_group = prior_june_mcap, nq(5)
                          }
                          capture assert missing(prior_june_idiovol)
                          if `c(rc)' {
                              xtile idiovol_group = prior_june_idiovol, nq(5)
                          }
                          exit
                      end
                      
                      frame put stock_id fyear prior_june_mcap prior_june_idiovol if representative, into(representatives)
                      frame change representatives
                      runby double_quintiles, by(fyear) verbose
                      frame change default
                      frlink m:1 stock_id fyear, frame(representatives stock_id fyear) // ***
                      frget mcap_group idiovol_group, from(representatives)
                      frame drop representatives
                      drop representatives representative
                      
                      capture program drop one_weighted_return
                      program define one_weighted_return
                              egen numerator = total(prior_june_mcap*rt)
                              egen denominator = total(prior_june_mcap)
                              gen vw_mean_rt = numerator/denominator
                              egen unwtd_mean_rt = mean(rt)
                          exit
                      end
                      
                      drop if missing(mcap_group, idiovol_group)
                      runby one_weighted_return, by(mdate mcap_group idiovol_group)
                      
                      
                      //  CALCULATE MEANS AND STANDARD DEVIATIONS OF EW AND VW MEAN RETURNS
                      collapse (mean) mean_vw_mean_rt = vw_mean_rt mean_unwtd_mean_rt = unwtd_mean_rt ///
                          (sd) sd_vw_mean_rt = vw_mean_rt sd_unwtd_mean_rt = unwtd_mean_rt, ///
                          by(mcap_group idiovol_group)
                          
                      rename *_rt =_idiovol_
                      reshape wide *_rt_idiovol_, i(mcap_group) j(idiovol_group)
                      This code omits the other sets of results, and therefore leaves the end-product in the default (and, in this case, only) frame. Again, to emphasize, it should be run with a single unified data set that includes the data from all the years that are in the scope of your study. If you are starting from separate one-year data sets, you need to -append- those all together for this. (If you were starting from a single unified data set and were just looping over years, then remove that layer of looping from your code.)
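
                      A minimal sketch of that -append- step, under the assumption that the one-year files sit in the current directory and follow a hypothetical naming pattern such as returns_2001.dta, returns_2002.dta, and so on:

                      ```stata
                      * ASSUMPTION: hypothetical file names matching returns_*.dta
                      * in the current working directory.
                      clear
                      local files : dir "." files "returns_*.dta"
                      foreach f of local files {
                          append using "`f'"
                      }

                      * Sanity check: each stock should appear at most once per month.
                      * -isid- stops with an error if that is not the case.
                      isid stock_id mdate
                      ```

                      If the -isid- check fails, the yearly files overlap in time or contain duplicate observations, and that should be resolved before running the code above.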


                      • #26


                        I have multiple one-year data sets and am feeding them in succession to the code. This is what generates 25 vw and ew time series for each year. I put this data for the separate years into one environment. Then I want the means and sds of the vw_* and unwtd_* mean returns from the entire data set of all years.

                        As an illustration: we run the code in #20 for each year separately. This code gives us 25 portfolio returns for each of the vw and ew cases. Now, put the output of this code for each year in one data file and generate means and sds for all the years.

                        Put differently, if we run code #20 on the example data, it gives us output for one year. Imagine we have similar-format output for 10 years and we have to calculate means and sds on that.

                        I hope I explained it correctly.

                        Actually, I want to do it as we did earlier in #2 of this thread. Simple as that. In that case we had to get intercepts and t-stats; now it is mean and sd. That's the only difference.
                        Last edited by Sartaj Hussain; 11 Jan 2022, 17:04.

                        Comment


                        • #27
                          "Put differently, if we run code #20 on the example data, it gives us output for one year. Imagine we have output in the same format for 10 years and we have to calculate means and SDs on that."
                          "Actually, I want to do it as we did earlier in #2 of this thread. Simple as that. In that case we had to get intercepts and t-stats; now it is means and SDs. That's the only difference."
                          These two statements appear to contradict each other, unless the code in #2 was run on the fully combined data.

                          In addition, there is some discrepancy about the way things are named. Back in #2 we were calling the mcap_group (resp. idiovol_group) mcap_quintile (resp. idiovol_quintile). And we used ew_ as the prefix for what is now unwtd_. In the code below, I am going with groups and unwtd_. If you want to go back to quintiles and ew_, just make the corresponding substitutions in the code below.

                          Code:
                          frame create means_and_sds int (idiovol_group mcap_group) ///
                              float(mean_vw sd_vw mean_unwtd sd_unwtd)
                          forvalues iv = 1/5 {
                              forvalues mc = 1/5 {
                                  summ vw_mean_mcap_q`mc'_idiovol_q`iv'_rt
                                  local mean_vw = r(mean)
                                  local sd_vw = r(sd)
                                  summ unwtd_mean_mcap_q`mc'_idiovol_q`iv'_rt
                                  local mean_unwtd = r(mean)
                                  local sd_unwtd = r(sd)
                                  frame post means_and_sds (`iv') (`mc') (`mean_vw') (`sd_vw') ///
                                      (`mean_unwtd') (`sd_unwtd')
                              }
                          }
                          frame change means_and_sds
                          rename (mean* sd*) =_idiovol_
                          reshape wide *_idiovol_, i(mcap_group) j(idiovol_group)
                          This would be placed right at the end of the code in #2 (assuming that the code in #2 has been modified to use group and unwtd_ in its variable names.)

                          The way I reconcile in my mind the two quotes above is that you already have a spreadsheet containing the outputs of #2 from all of the years put together and you are going to use this spreadsheet as your input to the code shown in this post without actually re-running #2. That should work as well. You just need to get that spreadsheet data into Stata. As I don't know how your Excel file is named, nor how it is laid out, I'm leaving it to you to figure out how to do that importation.
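For what it's worth, the importation step could look roughly like the sketch below. The file name all_years_output.xlsx and the two variables are hypothetical, and the sketch assumes the first row of the sheet holds the variable names. It writes a small workbook first only so that it can run on its own.

```stata
// Toy setup: write a small workbook so the example is self-contained.
// The file name and the variables are hypothetical.
clear
set obs 24
gen mdate = ym(2001, 1) + _n - 1
gen double vw_mean_q1_idiovol_q1_rt = rnormal()
gen double unwtd_mean_q1_idiovol_q1_rt = rnormal()
export excel using "all_years_output.xlsx", firstrow(variables) replace

// Read it back, taking variable names from the first row of the sheet
import excel using "all_years_output.xlsx", firstrow clear
describe
```

If the real sheet is laid out differently (several sheets, header rows, and so on), the sheet() and cellrange() options of -import excel- are the places to look.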
                          Last edited by Clyde Schechter; 11 Jan 2022, 17:42.

                          Comment


                          • #28
                            "These two statements appear to contradict each other, unless the code in #2 was run on the fully combined data."
                            Yes, the code in #2 runs on the fully combined data.

                            "The way I reconcile in my mind the two quotes above is that you already have a spreadsheet containing the outputs of #2 from all of the years put together and you are going to use this spreadsheet as your input to the code shown in this post without actually re-running #2."
                            Yes, correct.

                            Moreover, ew and unwtd have the same interpretation: both denote equal-weighted means. Earlier we used the ew notation; the code now uses unwtd.

                            That said, I tried the code in #27 and the following error came up:


                            variable vw_mean_mcap_q1_idiovol_q1_rt not found
                            r(111);

                            After running #2, the variables are named as follows (the mcap prefix is missing):
                            vw_mean_q1_idiovol_q1_rt
                            unwtd_mean_q1_idiovol_q1_rt

                            Should the following corrections be made in the code?

                            summ vw_mean_q`mc'_idiovol_q`iv'_rt
                            summ unwtd_mean_q`mc'_idiovol_q`iv'_rt


                            With these changes, it does work.
                            Last edited by Sartaj Hussain; 11 Jan 2022, 19:52.

                            Comment


                            • #29

                              Furthermore, the first code block in #15, which deals with regression coefficients, R-squared, etc., needs some more corrections. I observe that it runs each regression using only one of the three variables (rmrf, smb, hml) as the independent variable. However, it should run a single regression with all three as independent variables simultaneously, and then report the intercept and its t-stat, the slope coefficient and t-stat for each of rmrf, smb, and hml, and the R-squared and adjusted R-squared of the regression. That said, I used the code in #15, changing it as in #28 above, but it gave an error at the end:

                              reshape wide *_idiovol_, i(mcap_quintile) j(idiovol_quintile)
                              (j = 1 2 3 4 5)
                              values of variable idiovol_quintile not unique within mcap_quintile
                              Your data are currently long. You are performing a reshape wide. You specified i(mcap_quintile) and j(idiovol_quintile). There are observations
                              within i(mcap_quintile) with the same value of j(idiovol_quintile). In the long data, variables i() and j() together must uniquely identify the
                              observations.

                                    long                                wide
                              +---------------+                   +------------------+
                              | i   j   a   b |                   | i   a1 a2  b1 b2 |
                              |---------------| <--- reshape ---> |------------------|
                              | 1   1   1   2 |                   | 1   1   3   2  4 |
                              | 1   2   3   4 |                   | 2   5   7   6  8 |
                              | 2   1   5   6 |                   +------------------+
                              | 2   2   7   8 |
                              +---------------+
                              Type reshape error for a list of the problem variables.
                              r(9);
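
The message says that some combination of mcap_quintile and idiovol_quintile occurs more than once in the long data. One way to see which combinations are affected is sketched below on made-up data. The variable intercept and all values are hypothetical, and this only diagnoses the duplication; it does not say why it arose.

```stata
// Toy long data in which the pair (1, 2) occurs twice
clear
input mcap_quintile idiovol_quintile intercept
1 1 0.10
1 2 0.20
1 2 0.25
2 1 0.30
2 2 0.40
end

// isid returns a nonzero code when i() and j() do not uniquely identify observations
capture isid mcap_quintile idiovol_quintile
display "isid return code: " _rc

// Count the surplus copies, then tag and list the offending observations
duplicates report mcap_quintile idiovol_quintile
duplicates tag mcap_quintile idiovol_quintile, gen(dup)
list if dup > 0, sepby(mcap_quintile)
```

Typing -reshape error- right after the failed -reshape-, as the message suggests, gives a similar listing.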


                              Comment


                              • #30
                                Re #28: Yes, that would be the appropriate change to respond to the different variable names.
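
To catch this kind of name mismatch before the loop stops with r(111), a check along the following lines may help. It is only a sketch: the toy data simply reproduce the naming pattern reported in #28, with random values.

```stata
// Toy data with the naming pattern reported in #28 (values are random)
clear
set obs 24
forvalues mc = 1/5 {
    forvalues iv = 1/5 {
        gen double vw_mean_q`mc'_idiovol_q`iv'_rt = rnormal()
        gen double unwtd_mean_q`mc'_idiovol_q`iv'_rt = rnormal()
    }
}

// List what is actually in memory, then confirm every expected name exists
ds vw_mean_* unwtd_mean_*
local missing 0
forvalues mc = 1/5 {
    forvalues iv = 1/5 {
        foreach w in vw unwtd {
            capture confirm variable `w'_mean_q`mc'_idiovol_q`iv'_rt, exact
            if _rc {
                display as error "`w'_mean_q`mc'_idiovol_q`iv'_rt not found"
                local ++missing
            }
        }
    }
}
display "variables missing: `missing'"
```
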

                                Re #29, I don't know how to proceed. For starters, I need an example data set that contains the variables rmrf, smb, and hml. I also need to know how these get handled as we move through the code. The issue is that we start out with observations over many firms and many months. But the variables unwtd_mean_mcap*_idiovol*_rt are variables that are aggregated up to the level of mdate, mcap group, and idiovol group. So in order to have these in a regression alongside rmrf, smb, and hml, those, too, must be aggregated up to the mdate, mcap group, and idiovol group level. But those variables do not appear in any of the code up to this point, so I have no idea how they get aggregated. The following block of code will do the regression you want and create the output you want:

                                Code:
                                frame create regress_results int (idiovol_group mcap_group) ///
                                float(intercept tstat b_rmrf t_rmrf b_smb t_smb b_hml t_hml r2 adj_r2)
                                forvalues iv = 1/5 {
                                    forvalues mc = 1/5 {
                                        local group mcap_q`mc'_idiovol_q`iv'
                                        regress unwtd_mean_`group'_rt `group'_rmrf `group'_smb `group'_hml
                                        matrix M = r(table)
                                        local topost (`iv') (`mc') (M["b", "_cons"]) (M["t", "_cons"])
                                        foreach x in rmrf smb hml {
                                            local topost `topost' (M["b", "`group'_`x'"]) (M["t", "`group'_`x'"])
                                        }
                                        frame post regress_results `topost' (e(r2)) (e(r2_a))
                                    }
                                }
                                frame change regress_results
                                rename (intercept tstat b_* t_* r2 adj_r2) =_idiovol_
                                reshape wide *_idiovol_, i(mcap_group) j(idiovol_group)
                                But, to reiterate and emphasize, I have seen no data or code that involves the variables rmrf, smb, and hml. So it is up to you to see to it that these variables are in the data set along with the unwtd_mean_*_rt variables at the time this code is invoked. None of the code I have written for you accomplishes that.
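
In case the way the loop pulls numbers out of each regression is unfamiliar, here is a small sketch of the same mechanism on Stata's bundled auto data. The variables price, mpg, and weight merely stand in for the return and factor variables, which are not available here.

```stata
sysuse auto, clear
regress price mpg weight

// r(table) has one column per coefficient and rows named b, se, t, pvalue, ...
matrix M = r(table)
matrix list M

// Elements can be picked out by row name and column name
display "intercept = " M["b", "_cons"] "   t = " M["t", "_cons"]
display "b[mpg]    = " M["b", "mpg"]   "   t = " M["t", "mpg"]

// R-squared and adjusted R-squared are left behind in e()
display "R2 = " e(r2) "   adj R2 = " e(r2_a)
```

The loop in the code above does the same thing for each of the 25 groups and posts the picked-out values to the regress_results frame.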

                                Comment
