
  • #16
    Thank you for your help. What I need is to calculate the R2 for each firm in each year. To calculate the weight variable, I need to create a new variable that represents the total market value of each industry in each week. To do so, I wrote this code: by industry_id week, sort: egen industry_mv = total(weeklymv)
    Is this code OK, or should I use another one?



    • #17
      Your code
      Code:
      by industry_id week , sort: egen industry_mv = total( weeklymv)
      calculates the total market value of all firms in a specific industry in a specific week. Do I understand that you then want to follow this up with

      Code:
      gen weight = weeklymv/industry_mv

      Since you want to calculate the R2 for each firm in each year, you will probably want to use -statsby- to drive your iterated regressions. So rather than the code shown under "NOW DO YOUR REGRESSION" in #15 it will be something more like this:

      Code:
      gen int year = yofd(date) // CREATE YEAR VARIABLE
      statsby e(r2_o), by(firm_id year) saving(r2_by_firm_by_year, replace): ///
          xtreg weekly_return weekly_market_return L1.weekly_market_return ///
          weighted_mean_industry_return L1.weighted_mean_industry_return
      
      // MAYBE DO OTHER THINGS HERE
      
      // WHEN YOU WANT TO FINALIZE THE CALCULATION OF SYNCHRONIZATION
      use r2_by_firm_by_year, clear
      rename _stat_1 r2
      gen synch = log(r2/(1-r2))
      // ...



      • #18
        Hi Clyde,
        statsby e(r2_o), by(firm_id year) saving(r2_by_firm_by_year, replace): ///
            xtreg weekly_return weekly_market_return L1.weekly_market_return ///
            weighted_mean_industry_return L1.weighted_mean_industry_return
        I need your help again. I calculated everything; however, when I tried to run the above code to calculate R2, this error appeared on the screen:
        no observations
        an error occurred when statsby executed regress
        What do you think causes this error?



        • #19
          Well, it means that for some combination of firm_id and year in your data set, there were no observations that had non-missing values for all of the variables in the regression. You will have to go through your data to find out which firm_ids and years (there may be more than one such combination) are responsible, and why. I'd do something like this:

          Code:
          by firm_id year, sort: egen obs_count = total(!missing(weekly_return, weekly_market_return, ///
                  L1.weekly_market_return, weighted_mean_industry_return, L1.weighted_mean_industry_return))
          tab firm_id year if obs_count == 0



          • #20
            Hi Clyde, it's me again.
            I tried to solve the problem, but it still exists. However, when I run the regression using just the following code, it works perfectly. Could you help me with this, please?

            statsby e(r2), by(firm_id year industry_id firmname) saving("C:\Users\lza023\Desktop\clyde\r22 .dta", replace): regress weekly_return weekly_market_return weighted_mean_industry_return

            I think that if we create two new variables (one for weekly market return and one for weekly industry return) holding the lagged values, we can sidestep this problem.
            Last edited by Issa Almaharmeh; 21 May 2015, 03:32.



            • #21
              Hi,
              Could you help me with this?
              How can I create two new variables that represent last week's market return and last week's industry return? I have tried the code below, but it is incorrect: it fills the new variable with the value from the previous observation even when that observation is not from the immediately preceding week.
              gen Lag.weekly_market_return = weekly_market_return[_n-1]
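
              A minimal sketch of one way to do this with Stata's time-series lag operator, assuming the data hold one row per firm per week and that week is a numeric Stata weekly date (both are assumptions, as are the two new variable names). Once the panel is declared with -xtset-, L1. returns the value from exactly one week earlier and is missing when that week is absent, rather than taking whatever row happens to come before:

              Code:
              * declare the panel: firm_id identifies the firm, week is the time variable (assumed weekly date)
              xtset firm_id week
              * lag_market_return and lag_industry_return are hypothetical names
              gen lag_market_return = L1.weekly_market_return
              gen lag_industry_return = L1.weighted_mean_industry_return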



              • #22
                Hello,

                I would like to calculate one R-squared per year (a yearly R-squared) for the variables LLR and LLP over the period 1995-2018.

                Code:
                * Example generated by -dataex-. To install: ssc install dataex
                clear
                input float(year id LLR LLP)
                2007  1  .011630812    .018261097
                2017  2  .005206013   .0009052709
                2018  2  .005082208   .0004451569
                2015  2  .004713984             0
                2016  2  .005010014   .0005392081
                2014  2  .007704217  .00026062978
                2014  3           0             0
                2015  3           0             0
                2011  4  .010910933    .002482072
                2010  4  .014617286      .0030025
                2012  4  .011325285  .00013020562
                2008  4  .010616484    .003585204
                2009  4  .016729187    .016583314
                2007  4  .007800257    .001282144
                2006  4  .007627713    .000923452
                2010  5  .017166357             0
                2011  6  .021382164     .01136998
                2008  6  .008355642    .003057135
                2012  6   .01443208    .004534268
                2016  6  .007447635    .001197504
                2006  6  .007908663     .00720697
                2017  6  .007080396             0
                2007  6  .008464329   .0009984137
                2005  6  .007931727    .007931727
                2013  6  .016162274    .005791808
                2014  6  .012107523   .0006828538
                end
                As a first step, I tried the collapse command.
                Code:
                collapse(mean) Z_score NPA LLR LLP, by (year)
                Do you have any advice on how to calculate the R-squared per year from here?

                Thank you very much!



                • #23
                  I think you've started down the wrong direction here. If you collapse the data -by(year)- you are left with only one observation per year, so you cannot regress anything separately each year.

                  If I understand what you want, your need is to do a separate regression of LLR and LLP on each year's data, and you want to keep the resulting values of R2 in your data set. The easiest way to do that is:

                  Code:
                  rangestat (reg) LLR LLP, by(year) interval(LLR . .)
                  To do this, you need to install the -rangestat- command, written by Robert Picard, Nick Cox, and Roberto Ferrer, available from SSC.

                  In addition to saving the R2 in new variable reg_r2, it will also save the number of observations in the regression, and the constant and LLP coefficients with their standard errors. If you really don't care about anything but the R2, you can always just drop the others.

                  Applying this to your data you will get mostly missing values, because most of your years have only one or two associated observations in the example data, so no regression can be done for them. Presumably that will not be a problem in your real data set.



                  • #24
                    Thank you very much for the answer.

                    My model has four proxies: Z_score, NPA, LLR, and LLP. I would like to have the R2 from regressing one proxy on the remaining proxies. If I run the command for the variables Z_score and NPA (sorry for not being consistent with the variables in my example):

                    Code:
                    rangestat (reg) Z_score NPA LLR LLP, by(year) interval(Z_score . .)
                    rangestat (reg) Z_score NPA LLR LLP, by(year) interval(NPA . .)
                    I'm getting the same R2 results for both proxies - but I would like to have different R2 per proxy per year, as they are different.
                    The var r2_Z_y and r2_NPA_y are calculated with your recommended code and provide the same R2 for both proxies.

                    I calculated the var r2Z and r2NPA with the following command:

                    Code:
                    rangestat (reg) NPA LLR LLP, by(year) interval(Z_score . .)
                    rangestat (reg) Z_score LLR LLP, by(year) interval(NPA . .)

                    Code:
                    * Example generated by -dataex-. To install: ssc install dataex
                    clear
                    input float(year id) double(r2_Z_y r2_NPA_y r2Z r2NPA)
                    2007  1  .04292201416867168  .04292201416867155  .24540329410061493  .012139066731453374
                    2018  2 .027077205487478023  .02707720548747762 .054693615906753534  .005906411071376012
                    2014  2  .08566086394481906  .08566086394481863   .1668113368144799   .02088940270461034
                    2016  2  .04419999596636129 .044199995966361295  .08320280306320736  .002741493673789138
                    2015  2  .06083513561600463  .06083513561600479  .09546694490679361 .0011404215331245503
                    2017  2 .009036756095033894 .009036756095033934  .06295331827651317 .0001311581364094393
                    2015  3  .06083513561600463  .06083513561600479  .09546694490679361 .0011404215331245503
                    2014  3  .08566086394481906  .08566086394481863   .1668113368144799   .02088940270461034
                    2012  4  .23470561712117832   .2347056171211783   .3006329192901721    .1158844102105314
                    2009  4  .44657116404401265   .4465711640440104  .42396466949624134   .33783959978501665
                    2008  4  .21507077328341231  .21507077328341367   .3730466284194536   .15567357030252635
                    2011  4  .34858644725672916   .3485864472567291  .37157247935137044   .19935044857404371
                    2007  4  .04292201416867168  .04292201416867155  .24540329410061493  .012139066731453374
                    2010  4  .41871550471256525   .4187155047125658   .3986605814540165   .29012966330277556
                    2006  4 .017296440215681108  .01729644021568104  .11861571059604607 .0035124780274363495
                    2010  5  .41871550471256525   .4187155047125658   .3986605814540165   .29012966330277556
                    2017  6 .009036756095033894 .009036756095033934  .06295331827651317 .0001311581364094393
                    2013  6  .13652960537420314  .13652960537420275   .2479562858147592  .037229684886455235
                    2007  6  .04292201416867168  .04292201416867155  .24540329410061493  .012139066731453374
                    2009  6  .44657116404401265   .4465711640440104  .42396466949624134   .33783959978501665
                    2010  6  .41871550471256525   .4187155047125658   .3986605814540165   .29012966330277556
                    2016  6  .04419999596636129 .044199995966361295  .08320280306320736  .002741493673789138
                    2014  6  .08566086394481906  .08566086394481863   .1668113368144799   .02088940270461034
                    2006  6 .017296440215681108  .01729644021568104  .11861571059604607 .0035124780274363495
                    2018  6 .027077205487478023  .02707720548747762 .054693615906753534  .005906411071376012
                    2011  6  .34858644725672916   .3485864472567291  .37157247935137044   .19935044857404371
                    end
                    I would like to explain one risk proxy with the remaining risk proxies.

                    I am just not sure about my code, because the r2NPA values are so low and I know that the NPA R2 values are higher.

                    Thank you!
                    Last edited by Katharina Maier; 06 Aug 2019, 03:37.



                    • #25
                      Part of the difficulty is that you do not understand my code; the other part is that I do not understand what you want.

                      The -rangestat- command takes some getting used to, particularly how the -interval()- option works. I can see why you would think that changing which variable you give as the first argument in -interval()- would change the results: in most situations it would. But in this case it doesn't, because the -interval()- option's second and third arguments are both missing values. This is the admittedly unintuitive way to tell -rangestat- to ignore the -interval()- option. -interval(variable . .)- literally means include observations where variable takes on any value whatsoever, i.e., it does nothing. When either the second or third argument is non-missing, then -interval(variable low high)- tells -rangestat- to use, for each observation, all and only the observations whose value of variable falls within the bounds; when low and high are numbers, they are offsets added to the current observation's own value of variable. In that situation, changing the variable in the first argument would mean doing the calculations on different observations.

                      It might seem more natural to just omit the -interval()- option altogether when no selection is intended, but the syntax of -rangestat- does not permit that. So you can see now that both of your -rangestat- commands calculate exactly the same thing: they just regress Z_score on the rest of the variables, and save the regression results.
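
                      As a minimal sketch of that contrast (the variables id, year, and x are hypothetical, and -rangestat- must already be installed from SSC): with numeric bounds, -interval()- selects a window around each observation's own value of the key variable, whereas with both bounds missing every observation in the by-group is used.

                      Code:
                      * hypothetical panel with variables id, year, x; requires: ssc install rangestat
                      * mean of x over the current year and the two preceding years, within each id
                      rangestat (mean) x, interval(year -2 0) by(id)
                      rename x_mean x_mean_3yr
                      * both bounds missing: no restriction, so this is the mean of x over all observations of each id
                      rangestat (mean) x, interval(year . .) by(id)
                      rename x_mean x_mean_all

                      Replacing year with any other numeric variable in the second command would leave x_mean_all unchanged, while doing so in the first command would change which observations enter each window.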

                      Now, here's what I don't understand about what you want. First, I don't understand the term proxies as you are using it. I suspect this is some kind of financial jargon--not my discipline. From what I see, statistically they are just variables in regressions. More important, it seems that what you are looking for in each of your -rangestat- commands is something that is particularly focused on Z_score in one case and on NPA in the other. But I have no clue what statistics you are actually looking for with relation to those variables. Z_score is actually the outcome variable in both regressions, so there really aren't any statistics to be gathered about Z_score itself: it will not have any coefficients, and I really have no clue what you are looking for here. And there is no r2NPA: there is an overall R2 for the regression. So I suspect you want something rather different. Perhaps you want to regress Z_score against just NPA, without LLR and LLP, and get the R2 from that regression. Or maybe you want to do the regression of Z_score against NPA, LLR, and LLP, and then do it again and get the difference in R2 between those two regressions? Or maybe something else I haven't thought of?



                      • #26
                        Thank you for the answer and sorry for being so unclear - I am a bit in a rush due to my thesis' deadline ..

                        Z_score, NPA, LLR, and LLP are the variables of my model. May I ask you to have a look at the attached paper, page 5, Figure 2b? I need the R-squared of each risk proxy per year to make the graph.
                        Here is my data:

                        Code:
                        * Example generated by -dataex-. To install: ssc install dataex
                        clear
                        input float(year id LLR LLP NPA Z_score)
                        2007  1  .011630812    .018261097   .015062087 -1.5716723
                        2014  2  .007704217  .00026062978    .01757166 -1.6909508
                        2015  2  .004713984             0    .02558387  -1.714116
                        2016  2  .005010014   .0005392081    .01931906 -1.9513158
                        2017  2  .005206013   .0009052709    .01653177  -1.886552
                        2018  2  .005082208   .0004451569   .012838008 -1.8675916
                        2014  3           0             0            0  -3.228743
                        2015  3           0             0            0  -3.197962
                        2006  4  .007627713    .000923452  .0026934016 -1.8396757
                        2007  4  .007800257    .001282144  .0011002329 -1.9295822
                        2008  4  .010616484    .003585204   .013777885  -1.963218
                        2009  4  .016729187    .016583314    .02593907 -1.9244528
                        2010  4  .014617286      .0030025   .017214797 -1.9180492
                        2011  4  .010910933    .002482072   .018298596  -1.586883
                        2012  4  .011325285  .00013020562    .01927564  -1.660184
                        2010  5  .017166357             0    .02333609 -1.6759908
                        2005  6  .007931727    .007931727            0 -2.2908888
                        2006  6  .007908663     .00720697            0 -1.8879156
                        2007  6  .008464329   .0009984137    .02762278 -1.6747932
                        2008  6  .008355642    .003057135   .019206095  -1.449307
                        2009  6   .02964397     .03421423    .09687161  -.8901772
                        2010  6  .029940894    .035797488     .1661957  -.1181851
                        2011  6  .021382164     .01136998    .17405207  -.9856699
                        2012  6   .01443208    .004534268     .1530683 -1.1408491
                        2013  6  .016162274    .005791808    .15281764 -1.3834363
                        2014  6  .012107523   .0006828538    .13720109 -1.4684038
                        end
                        Just as an example, for an earlier calculation of the full-period R2 I did this:

                        Code:
                        xtreg Z_score NPA LLR LLP i.year, fe vce(cluster id)
                        xtreg NPA Z_score LLR LLP i.year, fe vce(cluster id)
                        xtreg LLR  NPA Z_score LLP i.year, fe vce(cluster id)
                        xtreg LLP LLR  NPA Z_score i.year, fe vce(cluster id)
                        So I am running four regressions and in each regression it is another dependent variable.

                        This time I need the R2 per year with Z_score as the dependent variable (and NPA, LLR, LLP as explanatory variables);
                        then I need the R2 per year with NPA as the dependent variable (and Z_score, LLR, LLP as explanatory variables),
                        etc.

                        The goal is to make a graph with the R2 values on the y-axis and the years (1995-2018) on the x-axis.

                        Thank you for your help - and I hope this time it was clearer


                        Attached Files



                        • #27
                          I think I understand what you are looking for now. I think the following code will do it:

                          Code:
                          local vbles LLR LLP NPA Z_score
                          levelsof year, local(years)
                          
                          foreach v of varlist `vbles' {
                              local outcome `v'
                              local ind_vars: subinstr local vbles "`v'" ""
                              gen r2_`v' = .
                              foreach y of local years {
                                  capture regress `outcome' `ind_vars' if year == `y'
                                  if c(rc) == 0 { // SUCCESSFUL REGRESSION
                                      replace r2_`v' = e(r2) if year == `y'
                                  }
                                  else if !inlist(c(rc), 2000, 2001) { // UNEXPECTED REGRESSION ERROR
                                      display as error "Unanticipated error when year == `y'"
                                      exit c(rc)
                                  }
                              }
                          }
                          Notes.

                          1. This code does not work with your example data because in your example data no year has enough observations to support a regression with three predictors. Presumably this will not occur with your real data. But if the code doesn't work, when you post back, be sure to include a new set of example data that contains a minimum of 6 observations per year.

                          2. I have written the code anticipating that although most years will have enough data to support the regressions, some won't. When years with too few observations for the regressions are encountered, they will be skipped over and you will not be given any warnings or error messages: you will be able to recognize them because all the r2 variables will have missing values for those years. If, however, any other error arises during the attempt to do regression, you will get an error message and execution will terminate so that you can identify and fix the problem, and then start over.
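
                          A minimal sketch for checking in advance which years are likely to be skipped, assuming the same four variables as in the code above. The names complete, n_complete, and first_in_year are hypothetical, and the threshold of 6 simply mirrors the minimum mentioned in note 1; -regress- itself decides whether a given year has enough observations.

                          Code:
                          * flag observations with no missing values in any of the four variables
                          gen byte complete = !missing(LLR, LLP, NPA, Z_score)
                          * count the complete observations in each year
                          bysort year: egen n_complete = total(complete)
                          * mark one observation per year so each year is listed once
                          egen byte first_in_year = tag(year)
                          list year n_complete if first_in_year & n_complete < 6, noobs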



                          • #28
                            I agree with Clyde's approach--that will get you the R-squared for the regressions of each indicator on the other indicators by year, which is what I understand your guiding paper to be doing. The final step to make a graph would be

                            Code:
                            sort year id
                            by year: gen graph = _n==1
                            line r2_* year if graph
                            In other words, although the r2 values are being saved for every id in each year, you will only need to plot one point per year.



                            • #29
                              Thank you so much, Clyde and Kye. It worked perfectly! Can't express how thankful I am
                              Throughout my thesis, working with Stata for the first time, I learned so much from the Statalist members.
                              I really appreciate your support!
                              Last edited by Katharina Maier; 08 Aug 2019, 02:36.



                              • #30
                                Hello Clyde Schechter,
                                I have read the above discussion and tried to calculate my variable SYNCHi (a measure of annual synchronicity for firm i). In estimating our model, we require that daily return data be available for at least 200 trading days in each fiscal year.
                                But I am confused about #6: how can I generate my date variable?
                                I want to calculate R2 for each firm in each year using daily data. Although I have split my date variable (trddt) into 3 parts (month, day, year), I am still confused about how to generate year as in #6 above:
                                Code:
                                 gen year = yofd(date)
                                This is the format of my data set. trddt is a string variable, which I have split into 3 parts: date1 is the month, date2 is the day, and date3 is the year.


                                Code:
                                * Example generated by -dataex-. To install: ssc install dataex
                                clear
                                input long firm_id str10 trddt float(firm_return market_return) byte(date1 date2) int date3
                                2 "7/25/2006"  -.006623  .012842  7 25 2006
                                2 "9/20/2006"   .047026   .00358  9 20 2006
                                2 "7/12/2006"   .047782 -.000361  7 12 2006
                                2 "3/7/2006"   -.035316 -.029588  3  7 2006
                                2 "6/30/2006"         0  -.00171  6 30 2006
                                2 "3/6/2006"   -.012844 -.002697  3  6 2006
                                2 "6/5/2006"    .028369  .017466  6  5 2006
                                2 "12/8/2006"   -.04906  -.03576 12  8 2006
                                2 "1/16/2006"  -.029213  -.01354  1 16 2006
                                2 "3/29/2006"   .016897   .00541  3 29 2006
                                2 "3/1/2006"    .005495  .004738  3  1 2006
                                2 "1/18/2006"   .022523  .020927  1 18 2006
                                2 "4/20/2006"  -.023495 -.004301  4 20 2006
                                2 "8/8/2006"        .02  .032161  8  8 2006
                                2 "4/13/2006"  -.032984 -.025997  4 13 2006
                                2 "3/22/2006"   .036395  .011777  3 22 2006
                                2 "6/19/2006"   .001795  .016827  6 19 2006
                                2 "5/29/2006"         0  .032838  5 29 2006
                                2 "6/28/2006"         0 -.000707  6 28 2006
                                2 "4/12/2006"  -.030523 -.005528  4 12 2006
                                2 "10/31/2006"   .02375  .009524 10 31 2006
                                2 "2/24/2006"   .010811   .00931  2 24 2006
                                2 "12/20/2006"    .0349  .014965 12 20 2006
                                2 "9/8/2006"    .010432    .0015  9  8 2006
                                2 "6/7/2006"   -.007156  -.07001  6  7 2006
                                2 "5/10/2006"  -.027417  .018669  5 10 2006
                                2 "8/11/2006"   .001709  .004782  8 11 2006
                                2 "9/7/2006"   -.008863 -.015603  9  7 2006
                                2 "7/26/2006"  -.011667 -.000868  7 26 2006
                                2 "12/6/2006"  -.015748 -.013642 12  6 2006
                                2 "3/24/2006"  -.010017 -.003826  3 24 2006
                                2 "2/21/2006"   .012456  .019256  2 21 2006
                                2 "6/13/2006"  -.024436  .004039  6 13 2006
                                2 "1/17/2006"   .027778   .00371  1 17 2006
                                2 "12/13/2006" -.002317  .003021 12 13 2006
                                2 "5/18/2006"   .016447  .005524  5 18 2006
                                2 "10/12/2006" -.022069  -.00773 10 12 2006
                                2 "4/7/2006"   -.020086  .007383  4  7 2006
                                2 "9/22/2006"  -.003906 -.007956  9 22 2006
                                2 "12/12/2006"   .01251  .002495 12 12 2006
                                2 "9/11/2006"   .041298  .006474  9 11 2006
                                2 "10/17/2006"  .012658 -.001869 10 17 2006
                                2 "12/11/2006"  .099742  .040086 12 11 2006
                                2 "12/1/2006"   .010017  .008993 12  1 2006
                                2 "11/29/2006"  .003687   .01667 11 29 2006
                                2 "8/15/2006"   .049236  .019047  8 15 2006
                                2 "11/1/2006"  -.007326  .005581 11  1 2006
                                2 "8/9/2006"     .00713  -.00176  8  9 2006
                                2 "1/12/2006"   .004338  .017924  1 12 2006
                                2 "3/20/2006"   .045788  .013699  3 20 2006
                                end
                                When I run the commands below, I get just one R2 value per firm, not one per firm per year.
                                So do I need to merge this value into my main data set using the (merge m:1 year code using.......) command or not? Or am I making some mistake? I need a per-firm, per-year value of R2 so that I can calculate my DV.

                                Code:
                                gen year = yofd( date2 )
                                statsby e(r2), by(firm_id year) saving(regression_resultsa, replace): regress firm_return market_return
                                use regression_resultsa, clear
                                rename _stat_1 r2
                                gen funny_statistic = log(r2/(1-r2))
                                Code:
                                * Example generated by -dataex-. To install: ssc install dataex
                                clear
                                input long firm_id float(year r2 funny_statistic)
                                 2 1960   .4400587  -.2409238
                                 4 1960   .3260496   -.726107
                                 5 1960   .4442286 -.22401793
                                 6 1960   .4421526  -.2324303
                                 7 1960  .24893935 -1.1042771
                                 8 1960   .3595278  -.5774141
                                 9 1960   .4568741  -.1729334
                                10 1960   .2068534  -1.343998
                                11 1960   .2573787 -1.0596377
                                12 1960   .4286664 -.28729412
                                14 1960   .4194842   -.324891
                                16 1960   .5378363  .15163514
                                17 1960 .006158102  -5.083809
                                18 1960   .3026383  -.8347657
                                19 1960   .4388093 -.24599595
                                20 1960  .04612292  -3.029225
                                21 1960   .5403796  .16187085
                                22 1960   .5623184  .25057647
                                23 1960   .4775431 -.08988801
                                24 1960     .35695 -.58862656
                                25 1960   .3736776  -.5164719
                                26 1960   .4445098 -.22287884
                                27 1960   .5481623   .1932483
                                28 1960   .3831651  -.4761356
                                29 1960   .5094335  .03773851
                                30 1960   .1908997 -1.4441748
                                31 1960   .4573157  -.1711536
                                32 1960   .5858448   .3468142
                                33 1960  .43767145 -.25061777
                                34 1960  .19768927 -1.4007995
                                35 1960  .22116867  -1.258869
                                36 1960   .4440883   -.224586
                                37 1960  .47868785 -.08530027
                                38 1960  .16475293 -1.6232806
                                39 1960    .554401  .21846873
                                40 1960   .4472794  -.2116692
                                42 1960   .4545721 -.18221404
                                43 1960   .4725165 -.11004477
                                45 1960   .4447936  -.2217294
                                46 1960  .39308295  -.4343715
                                48 1960   .4174713  -.3331627
                                49 1960   .4001679  -.4047656
                                50 1960  .52243215  .08978887
                                55 1960    .511385  .04554797
                                56 1960  .30922085  -.8037644
                                58 1960     .34058  -.6607108
                                59 1960  .56699234  .26959038
                                60 1960  .52493787   .0998343
                                61 1960   .3612084 -.57012314
                                62 1960   .4895909 -.04164248
                                end

                                Thank you in advance for your time.
                                Last edited by Ayub UOM; 13 Sep 2019, 03:46.
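
                                A minimal sketch of one way to build the year variable, assuming trddt always holds month/day/year strings as in the example data; the names date and n_days are hypothetical. Note that yofd() expects a Stata daily date, so applying it to the day-of-month variable date2 returns 1960 for every observation, which would be consistent with the single year 1960 in the results above.

                                Code:
                                * convert the string date to a Stata daily date (assumes month/day/year order)
                                gen date = daily(trddt, "MDY")
                                format date %td
                                gen int year = yofd(date)
                                * count daily observations per firm-year and keep firm-years with at least 200
                                bysort firm_id year: gen n_days = _N
                                keep if n_days >= 200

                                Since date3 in the example already holds the calendar year, copying it into year would be an alternative under the same assumption. The 200-day filter here counts rows per firm-year, so it presumes one row per trading day.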

