  • Estimate the C-score and G-score following Khan and Watts(2009) - Journal of Accounting and Economics

    All: I tried to estimate the C-score and G-score following Khan and Watts(2009) - Journal of Accounting and Economics:

    1) Step 1: estimate the following cross-sectional equation:
    Earn_i = β0 + β1*D_i + R_i*(α1 + α2*Size_i + α3*MTB_i + α4*Leverage_i) + D_i*R_i*(δ1 + δ2*Size_i + δ3*MTB_i + δ4*Leverage_i) + (λ1*Size_i + λ2*MTB_i + λ3*Leverage_i + λ4*D_i*Size_i + λ5*D_i*MTB_i + λ6*D_i*Leverage_i) + ε_i

    2) Step 2: calculate G-score and C-score from the above estimates:

    G-score = β2 = α1 + α2*Size_i + α3*MTB_i + α4*Leverage_i

    C-score = β3 = δ1 + δ2*Size_i + δ3*MTB_i + δ4*Leverage_i

    My code is as follows:

    sort fyear
    by fyear: reg Earn D R R*Size R*MTB R*Leverage D*R D*R*Size D*R*MTB D*R*Leverage Size MTB Leverage D*Size D*MTB D*Leverage
    gen G_score = _b[R]*R + _b[R*Size]*R*Size + _b[R*MTB]*R*MTB + _b[R*Leverage]*R*Leverage
    gen C_score = _b[D*R]*R*D + _b[D*R*Size]*D*R*Size + _b[D*R*MTB]*D*R*MTB + _b[D*R*Leverage]*D*R*Leverage


    The results of C_score and G_score do not seem right.

    Any help would be much appreciated.

    Ben

  • #2
    I haven't sorted through the details here. But I see one obvious problem with your code. Your -by fyear: reg...- command will perform a separate regression for each group of observations defined by a value of fyear, but all of those results will be lost except for the last value of fyear. Your subsequent calculations are therefore based only on the _b values from that last regression. You cannot use -by- for this. The simplest workable approach will be a loop, for which the following gives an outline:

    Code:
    forvalues j = 1/4 {
        gen alpha`j' = .
        gen delta`j' = .
    }
    
    levelsof fyear, local(fys)
    foreach f of local fys {
        reg Earn ... if fyear == `f'
        replace alpha1 = _b[R] if fyear == `f'
        // ... etc.
        replace delta4 = _b[D*R*Leverage] if fyear == `f' // USE CORRECT NAME FOR COEFFICIENT, not D*R*LEVERAGE
    }
    gen G_score = alpha1 + alpha2*Size + alpha3*MTB + alpha4*Leverage
    gen C_score = delta1 + delta2*Size + delta3*MTB + delta4*Leverage


    You'll have to fill in the details, but you can see the general idea. By the way, at least in the current version of Stata, you cannot insert product terms into a regression using the * operator. To do that you have to use factor variable notation. See -help fvvarlist- for details.
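
    One way to find the correct coefficient names is to replay the regression with the coeflegend option. The following is only a minimal sketch, assuming Stata's bundled auto dataset as a stand-in (price, foreign, mpg and weight are not the variables from #1):

    ```stata
    sysuse auto, clear
    * Three-way interaction in factor-variable notation;
    * ## also includes all main effects and two-way interactions
    regress price i.foreign##c.mpg##c.weight
    * Replay the results, showing the _b[] name of every coefficient
    regress, coeflegend
    * Those names can then be used to retrieve individual coefficients
    display _b[mpg]
    display _b[1.foreign#c.mpg#c.weight]
    ```

    The same approach should carry over to a model with a dummy, a continuous return variable and several firm characteristics: run the regression once, check the legend, then copy the names into the -replace- commands.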

    In the future, when asking for help with code, show example data, and use the -dataex- command to do that. See FAQ#12 for information about using -dataex-.

    • #3
      Clyde: Thank you so much. The code now works. I really appreciate it.
      Ben Lee

      • #4
        Ben Lee, glad that the code suggested by Clyde worked for you. I am new to Stata and I am taking my baby steps. I am trying to calculate the C-score and G-score for measuring conditional conservatism. Can you please elaborate on the code suggested by Clyde?

        • #5
          Dear Clyde, May I seek your comments/suggestions on the following post (https://www.statalist.org/forums/for...ar-regressions)?
          Ho-Chuan (River) Huang
          Stata 19.0, MP(4)

          • #6
            River, see my response at that link, just posted a moment ago.

            • #7
              Hello,

              I am trying to run the same model, Khan and Watts (2009), but I cannot spot the problem:

              forvalues j = 1/4 { gen alpha`j' = . gen delta`j' = . } levelsof fyear, local(fys) foreach f of local fys { reg Earn ... if fyear == `f' replace alpha1 = _b[R] // ... etc. replace delta4 = _b[D*R*Leverage] if fyear == `f' // USE CORRECT NAME FOR COEFFICIENT, not D*R*LEVERAGE } gen G_score = alpha1 + alpha2*Size + alpha3*MTB + alpha4*Leverage gen C_score = delta1 + delta2*Size + delta3*MTB + delta4*Leverage


              Also, can I ask what f and fys are, and whether I should write fyears or fys? I am so confused: is it firm years? I understand it depends on how I define it, but what is fys then?

              Best regards,
              Sally

              • #8
                The code you have posted is all jumbled together and is not readable. It appears, as best I can tell, to have been largely copied verbatim from what I posted in #2. As stated there, that block of code was an outline of how to structure the loop--it was not actual code and would not run. For one thing, ... is not a thing in Stata syntax. I put that there because the original regression command in #1 was both lengthy and not valid syntax. And as I was just trying to demonstrate the looping, I did not take the time to fix it. So if you just copied it directly, there will just be error messages, no results.

                I suggest you post back, showing the actual code you are trying to run, along with example data. Before doing that, be sure to read forum FAQ #12 so you understand how to use code delimiters to make the code readable here. And be sure to use the -dataex- command to show your example data. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

                When asking for help with code, always show example data. When showing example data, always use -dataex-.

                Concerning the questions in your second paragraph, fyear was the name of a variable in O.P.'s data set. I do not know what it means in real world terms: fyear is a commonly used name for a variable denoting fiscal year, though it also could be firm-year, or family-year, or foobar-year, or even just an arbitrary name for something that has nothing to do with years at all. The -levelsof- command scans that variable, fyear, to create a list of its distinct values and saves that list in a local macro named fys. The -foreach- command that follows tells Stata to repeat the following bracketed code, once for each value in the list fys, using a new local macro named f, and on each iteration, filling f with the next value it finds in fys. The notation `f' means "substitute the current value of local macro f here." Working with macros in Stata has a learning curve, and if you have not used them before, it will seem confusing as to when you use what. Over time with practice it gets easier.
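
                The -levelsof-/-foreach- mechanics can be seen in a small sketch. It assumes Stata's bundled auto dataset and its variable rep78 purely for illustration; the macro names levels and r are arbitrary, just as fys and f were.

                ```stata
                sysuse auto, clear
                * Store the distinct non-missing values of rep78 in local macro levels
                levelsof rep78, local(levels)
                * On each pass, r holds the next value in the list and `r' substitutes it
                foreach r of local levels {
                    quietly count if rep78 == `r'
                    display "rep78 == `r': " r(N) " observations"
                }
                ```

                Run the whole block in one go from a do-file: a local macro exists only within the run that defines it, so executing the -levelsof- line and the loop separately from the do-file editor would leave the loop with an empty list.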

                • #9
                  Hello Clyde,

                  Thanks for your reply. I have now modified my question, taking into consideration the points you recommended I check in FAQ #12. I am trying to run the Khan and Watts (2009) model of conservatism, which is a regression of earnings (E) on returns (R) and negative returns (D, a dummy variable that equals 1 if the return is < 0 and zero otherwise), taking into consideration the firm characteristics firm size (SIZE), market-to-book ratio (MTB) and leverage (LEV), for a sample of firms per year. I tried to use the code from #2, but I am still confused about how to translate the original regression into Stata commands so that I can run the regression for each firm per year. Specifically, I am using -levelsof- and -foreach- in Stata 14.0, as you suggested in the earlier code.

                  The error message I receive is "no observations r(2000)".


                  Code:
                  * Example generated by -dataex-. To install: ssc install dataex
                  clear
                  input float(E D R D_R SIZE MTB LEV)
                           . .      .     .         .            .           .
                           . .      .     .         .            .           .
                    8.934457 1   -.57  -.57 10.890348  .0014381703   .10444312
                   .52446735 0  8.322     0 11.818806  .0005541936   .24405296
                           . .      .     .         .            .           .
                           . .      .     .         .            .           .
                   65.948204 0   4.08     0 14.993145 .00006559786   .28262252
                           . .      .     .         .            .           .
                           . .      .     .         .            .           .
                           . .      .     .         .            .           .
                   -.7984112 0   1.33     0  9.867861  .0005241101  .003212435
                    42.97538 1   -.32  -.32 12.319973   .002730465    .0986936
                    2.024913 .      .     . 13.857718            .    .6285228
                           . .      .     .         .            .           .
                   22.905127 .      .     . 13.765524            .    .1873971
                           . .      .     .         .            .           .
                           . .      .     .         .            .           .
                    .5742144 0 6.5888     0 13.181875   .002503851  .013422932
                           . .      .     .         .            .           .
                     2.23913 .      .     . 12.850452            .   .27596262
                           . .      .     .         .            .           .
                           . .      .     .         .            .           .
                           . .      .     .         .            .           .
                    80.72317 0 4.4094     0  12.17528   .002712326   .27106506
                           . .      .     .         .            .           .
                   1.9234886 0 6.0998     0 14.279523  .0007602429   .07072809
                    .4253883 1   -.86  -.86  9.716616  .0028930684    .1979989
                           . .      .     .         .            .           .
                           . .      .     .         .            .           .
                           . .      .     .         .            .           .
                   2.9938745 0 3.0534     0 15.038124  .0013911395    .1655735
                           . .      .     .         .            .           .
                    68.42975 0 1.2944     0 13.690545  .0008711907   .34125075
                           . .      .     .         .            .           .
                           . .      .     .         .            .           .
                    1180.204 0      3     0  15.30934   .003312945   .24263737
                   133.72017 .      .     .  13.95988            .           0
                           . .      .     .         .            .           .
                           . .      .     .         .            .           .
                           . .      .     .         .            .           .
                           . .      .     .         .            .           .
                           . .      .     .         .            .           .
                           . .      .     .         .            .           .
                    41.23319 0 4.6076     0 14.914902  .0013196827    .2330447
                    4.811265 0  33.12     0 15.080725   .002301481   .12670316
                           . .      .     .         .            .           .
                           . .      .     .         .            .           .
                           . .      .     .         .            .           .
                           . .      .     .         .            .           .
                   18.553102 0  14.53     0 11.787925  .0014549926      .16346
                           . .      .     .         .            .           .
                           . .      .     .         .            .           .
                           . .      .     .         .            .           .
                    .5892057 1   -.44  -.44  11.76897  .0022691735 .0001315983
                  end

                  Code:
                  
                  forvalues j = 1/4 {
                      gen alpha`j' = .
                      gen delta`j' = .
                  }
                  
                  levelsof yearz, local(yereg)
                  foreach CodeX of local yereg { 
                      reg E D R R * SIZE R * MTB R * LEV D_R D_R * SIZE D_R * MTB D_R * LEV SIZE MTB LEV D * SIZE D * MTB D * LEV 
                      replace alpha1 = _b[R]
                      replace alpha2 = _b[R*SIZE]
                      replace alpha3 = _b[R*MTB]
                      replace alpha4 = _b[R*LEV]
                       
                      replace delta1 = _b[D*R]
                      replace delta2 = _b[D*R*SIZE] 
                      replace delta3 = _b[D*R*MTB] 
                      replace delta4 = _b[D*R*LEV]  if yearz == `yearz' // 
                  }
                  gen G_score = alpha1 + alpha2*SIZE + alpha3*MTB + alpha4*LEV
                  gen C_score = delta1 + delta2*SIZE + delta3*MTB + delta4*LEV

                  I would appreciate your further help.

                  • #10
                    Well, your example data does not include a year variable, so I can't attempt to replicate your results and troubleshoot them.

                    That said, there are a few things wrong with your loop.

                    1. You start with -foreach CodeX of local yereg { -, but then you never again refer to CodeX. I suspect that CodeX somehow got in there by mistake and you meant to say -yearz- there.

                    2. Your regress command, and all of the -replace- commands following it other than the last are not conditioned on year: they are written to be carried out on the entire data set. I'm pretty sure that's not your intent.

                    3. You are using obsolete notation for your interactions, and that will ultimately prevent you from taking advantage of the -margins- command to interpret this complicated model.

                    So I would revise the loop as:
                    Code:
                    levelsof yearz, local(yereg)
                    foreach y of local yereg {
                        reg E (i.D##c.R)##c.(SIZE LEV MTB) if yearz == `y'
                        replace alpha1 = _b[R] if yearz == `y'
                        replace alpha2 = _b[c.R#c.SIZE] if yearz == `y'
                        replace alpha3 = _b[c.R#c.MTB] if yearz == `y'
                        replace alpha4 = _b[c.R#c.LEV] if yearz == `y'
                         
                        replace delta1 = _b[1.D#c.R] if yearz == `y'
                        replace delta2 = _b[1.D#c.R#c.SIZE]  if yearz == `y'
                        replace delta3 = _b[1.D#c.R#c.MTB]  if yearz == `y'
                        replace delta4 = _b[1.D#c.R#c.LEV]  if yearz == `y'
                    }
                    Now, as for the no observations error message, it means what it says. Remember that any observation which contains a missing value for any of the regression variables is omitted from the estimation sample. There may be some years where all observations end up being omitted, so there is nothing left to regress on. Looking at the frequency of missing values in your example data, this seems to be a problem you are quite likely to have.
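
                    A quick way to check this before running the loop is to count the complete cases per year. The following is a sketch on a tiny made-up dataset: the six observations are hypothetical, and only the variable names follow #9.

                    ```stata
                    clear
                    input float(E D R SIZE MTB LEV yearz)
                    8.9 1 -.57 10.9 .0014 .10 2002
                     .5 0  8.3 11.8 .0006 .24 2002
                      . .    .    .     .   . 2003
                    2.0 .    . 13.9     . .63 2003
                     .6 0  6.6 13.2 .0025 .01 2004
                      . .    .    .     .   . 2004
                    end
                    * Number of regression variables that are missing in each observation
                    egen byte n_missing = rowmiss(E D R SIZE MTB LEV)
                    * An observation enters the estimation sample only if none are missing
                    generate byte complete = (n_missing == 0)
                    * A year with no observations under complete == 1 has an empty sample
                    tabulate yearz complete
                    ```

                    In this toy example 2003 has no complete observation, so a regression restricted to that year would stop with "no observations", while 2004 has only one, which is too few to estimate anything.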

                    So you need to decide whether this is just a limitation of your data you have to live with, or whether it means your data set is somehow incomplete or incorrect and needs to be fixed before you do anything more with it. If you decide that this kind of missingness is an acceptable fact of life, then you can code around the years with no usable observations as follows:

                    Code:
                    levelsof yearz, local(yereg)
                    foreach y of local yereg {
                        capture reg E (i.D##c.R)##c.(SIZE LEV MTB) if yearz == `y'
                        if c(rc) == 0 {
                            replace alpha1 = _b[R] if yearz == `y'
                            replace alpha2 = _b[c.R#c.SIZE] if yearz == `y'
                            replace alpha3 = _b[c.R#c.MTB] if yearz == `y'
                            replace alpha4 = _b[c.R#c.LEV] if yearz == `y'
                             
                            replace delta1 = _b[1.D#c.R] if yearz == `y'
                            replace delta2 = _b[1.D#c.R#c.SIZE]  if yearz == `y'
                            replace delta3 = _b[1.D#c.R#c.MTB]  if yearz == `y'
                            replace delta4 = _b[1.D#c.R#c.LEV]  if yearz == `y'
                        }
                        else if inlist(c(rc), 2000, 2001) {
                            display as text "Insufficient observations in year `y'"
                        }
                        else {
                            display as error "Unexpected regression error in year `y'"
                            exit c(rc)
                        }
                    }
                    This will enable Stata to just skip over the years with no (or too few) observations to do the regression. It will display a message informing you that the year has been skipped, but will move on to the next year. On the other hand, if some unexpected error occurs in the regression, it will throw an error message and halt execution there.
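
                    To see the mechanism in isolation, here is a minimal sketch, again assuming the bundled auto dataset rather than the data in this thread. -capture- suppresses the error and records the return code, which c(rc) then reports:

                    ```stata
                    sysuse auto, clear
                    * rep78 never equals 99, so the estimation sample is empty
                    capture regress price mpg if rep78 == 99
                    * Return code left by -capture-; an empty sample gives 2000 ("no observations")
                    display c(rc)
                    * A regression that runs without error leaves a return code of 0
                    capture regress price mpg
                    display c(rc)
                    ```

                    Note that -capture- also suppresses the regression output itself, so nothing is printed for the years that succeed unless you display it explicitly.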

                    Finally, I will just note that you have been talking about firm-year regressions, but there is nothing in the example data, nor in the code, that makes any reference to firm identifier variables. So I'm not sure what's up with that.

                    Note: The above code is not tested because the example data did not contain the crucial yearz variable.

                    • #11
                      Thank you, Clyde, for your further help. Yes, it is a cross-sectional regression. I have now added the yearz variable to the dataex output; yearz represents years from 2002 to 2018, and CodeX is each firm's code. I also tried the two code blocks you kindly provided, and for the second one Stata says "0 real changes made". What does that mean? In addition, regarding the missing values, I think that is normal in my data set.
                      Code:
                      * Example generated by -dataex-. To install: ssc install dataex
                      clear
                      input float(E D R D_R SIZE MTB LEV yearz) int CodeX
                               . .      .     .         .            .           . .  223
                               . .      .     .         .            .           . .  203
                        8.934457 1   -.57  -.57 10.890348  .0014381703   .10444312 . 1106
                       .52446735 0  8.322     0 11.818806  .0005541936   .24405296 .  722
                               . .      .     .         .            .           . . 1377
                               . .      .     .         .            .           . .  196
                       65.948204 0   4.08     0 14.993145 .00006559786   .28262252 .  751
                               . .      .     .         .            .           . .   64
                               . .      .     .         .            .           . .  119
                               . .      .     .         .            .           . .    7
                       -.7984112 0   1.33     0  9.867861  .0005241101  .003212435 . 1387
                        42.97538 1   -.32  -.32 12.319973   .002730465    .0986936 .  447
                        2.024913 .      .     . 13.857718            .    .6285228 . 1176
                               . .      .     .         .            .           . .  535
                       22.905127 .      .     . 13.765524            .    .1873971 .  154
                               . .      .     .         .            .           . .  918
                               . .      .     .         .            .           . .  125
                        .5742144 0 6.5888     0 13.181875   .002503851  .013422932 . 1303
                               . .      .     .         .            .           . .  258
                               . .      .     .         .            .           . .  146
                               . .      .     .         .            .           . .  576
                               . .      .     .         .            .           . .  157
                               . .      .     .         .            .           . . 1276
                         2.23913 .      .     . 12.850452            .   .27596262 . 1188
                               . .      .     .         .            .           . .  247
                               . .      .     .         .            .           . .  827
                               . .      .     .         .            .           . . 1233
                        80.72317 0 4.4094     0  12.17528   .002712326   .27106506 .  710
                               . .      .     .         .            .           . .  652
                       1.9234886 0 6.0998     0 14.279523  .0007602429   .07072809 . 1255
                        .4253883 1   -.86  -.86  9.716616  .0028930684    .1979989 . 1368
                               . .      .     .         .            .           . .  171
                               . .      .     .         .            .           . .  819
                               . .      .     .         .            .           . .  833
                       2.9938745 0 3.0534     0 15.038124  .0013911395    .1655735 . 1079
                               . .      .     .         .            .           . .  220
                        68.42975 0 1.2944     0 13.690545  .0008711907   .34125075 . 1322
                               . .      .     .         .            .           . . 1309
                               . .      .     .         .            .           . .  748
                        1180.204 0      3     0  15.30934   .003312945   .24263737 . 1223
                       133.72017 .      .     .  13.95988            .           0 .  537
                               . .      .     .         .            .           . .   10
                               . .      .     .         .            .           . . 1092
                               . .      .     .         .            .           . . 1284
                               . .      .     .         .            .           . .  180
                               . .      .     .         .            .           . .  897
                               . .      .     .         .            .           . . 1275
                        41.23319 0 4.6076     0 14.914902  .0013196827    .2330447 . 1256
                        4.811265 0  33.12     0 15.080725   .002301481   .12670316 . 1218
                               . .      .     .         .            .           . . 1136
                               . .      .     .         .            .           . .   70
                               . .      .     .         .            .           . .  983
                               . .      .     .         .            .           . .  646
                       18.553102 0  14.53     0 11.787925  .0014549926      .16346 . 1423
                      end

                      So after running the code you provided, I will run the following code to get C_score and G_score:

                      Code:
                      gen G_score = alpha1 + alpha2*SIZE + alpha3*MTB + alpha4*LEV
                      gen C_score = delta1 + delta2*SIZE + delta3*MTB + delta4*LEV

                      • #12
                        Your new example data contains a yearz variable, but has only missing values, so still no way to check this out.

                        I am now unclear whether you want a yearly regression involving all firms, or if you want a separate regression for each firm-year combination. If the latter, the example data is also unsuitable in another way: there is only one observation for each firm, so no regression is possible for any firm.

                        • #13
                          I am sorry, I am still a beginner in Stata, so sorry for confusing you. Yes, I need it for the firm-year combination (cross-sectional). But what should I include in the dataex output to make it possible?

                          • #14
                            You need all of the regression variables themselves, plus yearz plus CodeX. You don't need a large number of years or firms; a couple of each will do. But make sure you include a large number of observations for each yearz CodeX combination. So maybe 2 firms and 2 years with 20 observations for each firm-year pair will do the trick.

                            • #15
                              Do you mean a sample like this?

                              dataex E D R D_R SIZE MTB LEV yearz CodeX if yearz==2002, count (10)

                              ----------------------- copy starting from the next line -----------------------
                              Code:
                              * Example generated by -dataex-. To install: ssc install dataex
                              clear
                              input float(E D R D_R SIZE MTB LEV yearz) int CodeX
                                38.42353 0    29.7 0 13.799076 .0036831354 .41323575 2002  623
                               38.128666 0    4.54 0 14.771496 .0011314551 .18587354 2002  570
                               23.355656 0   71.55 0 16.414972  .008305379 .19664103 2002 1213
                                       . .       . . 10.162963           .  .5082353 2002  338
                               162.21573 0    5.48 0 13.126437 .0040707514 .03984336 2002  682
                               116.36494 0 30.7507 0 16.604809 .0010867633  .3089809 2002 1147
                                24.39782 0  8.4924 0 16.629642  .001347422 .12797527 2002 1318
                               115.80703 0  2.4038 0 14.207417 .0004944297 .20658197 2002  462
                              -347.66165 0 13.8935 0 18.053883 .0011979023 .28331047 2002 1206
                                72.06727 0     .68 0 14.338293 .0009973231  .4652537 2002 1003
                              end
                              dataex E D R D_R SIZE MTB LEV yearz CodeX if yearz==2018, count (10)

                              ----------------------- copy starting from the next line -----------------------
                              Code:
                              * Example generated by -dataex-. To install: ssc install dataex
                              clear
                              input float(E D R D_R SIZE MTB LEV yearz) int CodeX
                               42.71773 0   16.9     0 14.831863 .0012447307     .3223191 2018  946
                              -65.86325 0   4.88     0 14.218187  .000590005    .43722925 2018  687
                               46.76556 0  34.94     0 17.774345 .0009128993     .2707811 2018 1269
                               47.36972 1   -.23  -.23 13.593572    .0013823     .3368329 2018  332
                               40.84294 0  28.32     0 13.494812  .002827908            0 2018 1157
                              34.535786 0   19.4     0   13.5616  .003172649    .26874644 2018  112
                               46.58312 0  45.36     0 14.633336 .0003264726    .09835415 2018 1422
                              -4.992886 1  -.294 -.294 13.685555 .0007296276 .00004327358 2018 1180
                               31.36625 0 54.907     0 17.074263 .0019131257     .3281638 2018  782
                                      . 0  13.18     0         .           .            . 2018  496
                              end
