Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Generate an Fama French industry index in the CRSP dataset.

    Hi Statisters,

    I have another question for which I couldn't find any information on the internet or in this forum. I feel like this is actually quite classical finance stuff, yet I couldn't find any guidance. It concerns the following two datasets: the CRSP daily dataset and the Fama French 48 industry portfolio from his website https://mba.tuck.dartmouth.edu/pages....html#Research.
    The CRSP dataset is on a daily basis for all stock companies in the USA. The Fama French 48 industry classification has only business days, if im correct, while the CRSP Data has also weekends included.
    For the industry index I have created a daily variable. I have also successfully completed the m:1 merge between the two datasets.

    Now my question:
    For my thesis, I need an industry index. From what I understand from my example paper, I would like to assign to each CRSP company on the respective date the corresponding industry index value from the French Industry Index on the specific date and I don't know how I could accomplish that.
    So in my head this could be achieved through a new variable (Industry_index), where all industry returns from the French dataset should, depending on the industry to which the company belongs and the date, be included in the new variable (industry_index) as the respective return of the industry on that day.

    The Fama French Industry Index variables are: date, agric food soda beer smoke toys fun books, and many more
    The CRSP daily variables are: PERMNO date ndate bcal_date SICCD SICCD RET ffi ff_industry_48

    Data explanation:
    CRSP:

    PERMNO=company identifier
    SICCD= CRSP´s SIC Code which is a industry identifier from CRISP (if needed here)
    RET = return of the company (only for example, that there are many more variables
    ffi = is the fama French industry classification in words (for example: electronics)
    ff_industry_48 = is the same but in numeric (for example: 36)

    Fama French Industry Index:

    Industry Returns from French website: agric food soda beer smoke toys fun books ... and many more

    The French Industry Index looks like this:
    date agric soda beer ...
    01.01.2001 0,02 -0,03 0,05 ...
    02.01.2001
    .
    .
    .
    0,025 0,04 0,06 ...
    The CRSP Daily Data looks like this for example:
    date PERMNO (company identification) ffi & ff_industry_48
    01.01.2001 14612 agric or 36
    02.01.2001 14613 electronics or 13
    Maybe it can be done within the merge, than I can provide separate dataex examples.
    If the merge was a good first step, than you see below the datex:

    What aim looking for is, the following :
    01.01.2001 Company A Industry_index: 0,02 Industry: Agric
    02.01.2001 Company A Industry_index: 0,04 Industry: Agric
    01.01.2001 Company B Industry_index: 0,06 Industry: Soda
    Or I might be completely off track, and data like the Fama French Industry Index is typically used in a totally different way, if you want an industry index. In that case, I would certainly appreciate it if you could enlighten me and maybe even provide links where I can transfer the concept to my problem.

    Basically the EQ(1) is what aim try to accomplish, and therefore I need an industry return index (r k,w), or am I totally wrong and I only have to bring all the industry returns to weekly format (because they are alredy value weighted) and do the regression with the 48 weekly industry returns? (for example: reg r ikw r mw weekly_argic weekly_food weekly_soda .... )
    Click image for larger version

Name:	Bildschirmfoto 2023-12-03 um 17.11.08.png
Views:	1
Size:	82.2 KB
ID:	1735933





    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input long PERMNO int(date ndate bcal_date SICCD) double RET float ffi byte ff_industry_48 float(agric food soda beer smoke toys fun books)
    10012 14612 14612  0 3670 -.0038910505827516317 36 36      -.0341      -.0237      -.0052     -.0116       .0167     -.0057      -.0168      -.0255
    10012 14613 14613  1 3670             -.0859375 36 36      -.0268      -.0227      -.0219     -.0111       .0064     -.0303       .0027      -.0275
    10012 14614 14614  2 3670    .09401709586381912 36 36         .03      -.0003       .0112      .0133      -.0041     -.0217      -.0021       -.022
    10012 14615 14615  3 3670                -.0625 36 36       .0024       .0099       .0369      .0129      -.0128      .0003 -.018099999       .0027
    10012 14616 14616  4 3670   .008333333767950535 36 36       .0277       .0118       .0149      .0525       .0275      .0071       .0085      -.0127
    10012 14619 14619  5 3670                     0 36 36      -.0098      -.0112       .0049      -.031       .0107      .0247       .1054       .1945
    10012 14620 14620  6 3670   -.03305784985423088 36 36      -.0049       .0085       .0019      .0174       .0056     -.0024      -.0021       -.026
    10012 14621 14621  7 3670    .04273504391312599 36 36       .0127       -.009       .0186      .0083       .0039      -.002      -.0472      -.0385
    10012 14622 14622  8 3670    .14754098653793335 36 36       .0096       .0025      -.0127     -.0135       .0094      .0121       .0271       .0361
    10012 14623 14623  9 3670    .04642857238650322 36 36       .0081      -.0177       .0063      .0041 -.016900001     -.0189      -.0178      -.0178
    10012 14627 14627 10 3670   .037542663514614105 36 36      -.0055      -.0004       .0241      .0359      -.0397      .0055      -.0005      -.0165
    10012 14628 14628 11 3670   -.02631578966975212 36 36      -.0026      -.0107      -.0087      .0059       .0146     -.0105         .02       .0385
    10012 14629 14629 12 3670  -.054054055362939835 36 36      -.0015      -.0189       .0073      .0031      -.0439     -.0144       -.019      -.0036
    10012 14630 14630 13 3670   -.11071428656578064 36 36      -.0132       -.009       .0294 .007900001      -.0096     -.0039      -.0111       .0235
    10012 14633 14633 14 3670   .012048192322254181 36 36      -.0141      -.0215      -.0107     -.0273      -.0009     -.0167      -.0031 -.036199998
    10012 14634 14634 15 3670    .02380952425301075 36 36       .0024      -.0115        .077       .002      -.0183     -.0226       .0626      -.0122
    10012 14635 14635 16 3670   -.03100775182247162 36 36       .0061       .0165       .0196     -.0309      -.0053      .0123      -.0024      -.0067
    10012 14636 14636 17 3670  -.024000000208616257 36 36       -.005      -.0128      -.0449      -.038      -.0202     -.0174       .0058       .0113
    10012 14637 14637 18 3670  -.016393441706895828 36 36      -.0117      -.0118      -.0419     -.0307       -.016     -.0167       -.019 -.031400003
    10012 14640 14640 19 3670   -.03333333507180214 36 36       .0005       .0239       .0911      .0129       .0043     -.0239      -.0173      -.0131
    10012 14641 14641 20 3670   -.06034482643008232 36 36      -.0053      -.0127       .0279      .0097       .0058          0      -.0008       -.007
    10012 14642 14642 21 3670     .0458715595304966 36 36       .0035      -.0122       .0064     -.0231      -.0173      .0414       .0183       .0489
    10012 14643 14643 22 3670     .0767543837428093 36 36        .016       .0118       .0476     -.0156       .0162 .018099999       .0226       .0085
    10012 14644 14644 23 3670  -.006109979469329119 36 36       .0022      -.0059       .0393      .0059      -.0147     -.0326       .0194      -.0078
    10012 14647 14647 24 3670   -.07377049326896667 36 36 -.016900001      -.0072      -.0197      .0059       .0004      .0018       -.012      -.0118
    10012 14648 14648 25 3670  -.044247787445783615 36 36       .0147        .012      -.0096      .0074      -.0505      .0037       .0117       .0105
    10012 14649 14649 26 3670                     0 36 36        .005       -.014       .0067      .0023       .0089      .0038      -.0099      -.0112
    10012 14650 14650 27 3670   -.07407407462596893 36 36      -.0093      -.0228 -.007900001      .0015         .01     -.0204      -.0102       -.001
    10012 14651 14651 28 3670  -.019999999552965164 36 36       .0128       -.009        .002     -.0175      -.0383      .0105       .0151      -.0164
    10012 14654 14654 29 3670   .010204081423580647 36 36       .0028      -.0028      -.0116     -.0225      -.0051      .0123      -.0133      -.0087
    10012 14655 14655 30 3670                     0 36 36       .0381       .0258       .0001      .0193       .0046     -.0064      -.0118      -.0238
    10012 14656 14656 31 3670    .09090909361839294 36 36      -.0153      -.0083      -.0489     -.0212      -.0087     -.0016      -.0143      -.0253
    10012 14657 14657 32 3670    .03703703731298447 36 36       .0166       .0149       .0151     -.0224       .0286     -.0082      -.0054       .0002
    10012 14658 14658 33 3670     .0223214291036129 36 36      -.0188      -.0172      -.0171     -.0123       .0034     -.0103      -.0251      -.0165
    10012 14662 14662 34 3670  -.013100436888635159 36 36       .0134       .0189      -.0001      .0111       .0586      .0046      -.0273      -.0165
    10012 14663 14663 35 3670  -.017699114978313446 36 36       .0126      -.0162      -.0455     -.0082      -.0274     -.0149       .0222        .047
    10012 14664 14664 36 3670  -.045045044273138046 36 36       .0025       -.029      -.0327     -.0271       -.013     -.0216      -.0268  .016900001
    10012 14665 14665 37 3670   .009433962404727936 36 36      -.0057      -.0128      -.0175     -.0195      -.0209      .0126      -.0465       -.015
    10012 14668 14668 38 3670  -.018691588193178177 36 36      -.0004       .0106      -.0019      .0142  .015800001      .0028       .0262       .0287
    10012 14669 14669 39 3670    .25238096714019775 36 36      -.0004        .004      -.0171     -.0097       .0085      .0111       .0388        .002
    10012 14670 14670 40 3670    .01901140622794628 36 36       -.018      -.0016       .0171      .0292        .014     -.0067       .0214       -.004
    10012 14671 14671 41 3670   -.13059701025485992 36 36       -.011      -.0276      -.0144     -.0295      -.0145     -.0088      -.0065      -.0094
    10012 14672 14672 42 3670    .07296137511730194 36 36       .0021       .0152      -.0115      .0095       .0084      .0295       .0292       .0045
    10012 14675 14675 43 3670    .03200000151991844 36 36       .0259      -.0192      -.0487     -.0216      -.0226     -.0152      -.0036      -.0099
    10012 14676 14676 44 3670   .007751937955617905 36 36      -.0186      -.0232      -.0051     -.0304      -.0148      -.018      -.0072      -.0062
    10012 14677 14677 45 3670   -.03846153989434242 36 36      -.0165       .0299       .0388       .002      -.0152     -.0089      -.0163      -.0153
    10012 14678 14678 46 3670   .024000000208616257 36 36       -.005      -.0026       .0084     -.0032      -.0067      .0231       .0243       .0574
    10012 14679 14679 47 3670               .046875 36 36      -.0116 -.015800001      -.0246     -.0224      -.0049      .0013      -.0201      -.0186
    10012 14682 14682 48 3670   -.02985074557363987 36 36      -.0094      -.0074      -.0268     -.0187       .0364      .0022      -.0121       .0078
    10012 14683 14683 49 3670    .10000000149011612 36 36      -.0113       -.008       .0209     -.0118      -.0253     -.0025      -.0144       .0111
    10012 14684 14684 50 3670   -.13636364042758942 36 36       .0178       .0573       .0505      .0881        .042      .0091        .005      -.0017
    10012 14685 14685 51 3670    .08502024412155151 36 36       .0386        .088       .0281      .0406       .0294      .0284       .0307       .0322
    10012 14686 14686 52 3670    .04477611929178238 36 36      -.0067      -.0279       -.023     -.0172       .0004      .0161       .0277       .0271
    10012 14689 14689 53 3670  -.014285714365541935 36 36      -.0078       .0108      -.0108     -.0089      -.0136      .0037  .032899998       .0096
    10012 14690 14690 54 3670  -.014492753893136978 36 36       .0105       .0053       .0011      .0197       .0189     -.0016        .004       .0179
    10012 14691 14691 55 3670    -.0882352963089943 36 36       .0069      -.0204      -.0127     -.0357       -.043       .005       .0241       .0254
    10012 14692 14692 56 3670   .012096773833036423 36 36      -.0113       .0008      -.0069      .0026       .0065      .0093  .015800001       .0055
    10012 14693 14693 57 3670  .0039840638637542725 36 36      -.0077      -.0089      -.0028     -.0103       .0142      .0187      -.0041       .0029
    10012 14696 14696 58 3670   -.02380952425301075 36 36       .0076       -.023      -.0361     -.0296      -.0073      .0072      -.0038       .0072
    10012 14697 14697 59 3670  -.056910570710897446 36 36      -.0013       .0006       .0196      .0025      -.0056     -.0051      -.0144      -.0046
    10012 14698 14698 60 3670   -.03448275849223137 36 36        .045       .0307       .0128      .0374      -.0313     -.0298      -.0041      -.0377
    10012 14699 14699 61 3670     .2053571492433548 36 36       .0058       .0486       .0117      .0486        .098      .0055 -.015700001      -.0341
    10012 14700 14700 62 3670    .12222222238779068 36 36        .022       .0071  .016900001     -.0287       -.003      .0155       .0162       .0587
    10012 14703 14703 63 3670  -.056105609983205795 36 36       .0112      -.0111       .0228      .0485       .0787     -.0064       .0038 -.031600002
    10012 14704 14704 64 3670    .00699300691485405 36 36      -.0358       .0325       .0022      .0388      -.0271     -.0198       -.016      -.0299
    10012 14705 14705 65 3670    -.0520833320915699 36 36       .0128      -.0103       .0013      -.051       .0101      .0364      -.0094       .0046
    10012 14706 14706 66 3670    .07692307978868484 36 36       .0161      -.0116      -.0353     -.0235       .0246      .0255  .031600002       .0232
    10012 14707 14707 67 3670    .03401360660791397 36 36      -.0049      -.0168      -.0234      -.001      -.0186      .0021       .0087       .0221
    10012 14710 14710 68 3670   -.05263157933950424 36 36      -.0099       .0003       -.008     -.0087      -.0233     -.0099       .0032      -.0068
    10012 14711 14711 69 3670    -.0902777761220932 36 36       .0077        .013       .0505      .0435      -.0097      .0021       .0114 -.015800001
    10012 14712 14712 70 3670   -.04580152779817581 36 36      -.0056       .0237      -.0025      .0134       .0082     -.0066      -.0373      -.0339
    10012 14713 14713 71 3670    -.1459999978542328 36 36      -.0153       .0151       .0113     -.0196       .0193     -.0112      -.0183      -.0367
    10012 14714 14714 72 3670   -.17564402520656586 36 36 -.037100002       -.044      -.0375      -.015      -.0505     -.0706      -.0336      -.0408
    10012 14717 14717 73 3670    .09090909361839294 36 36       -.021       .0174       .0356      .0222       .0062     -.0013      -.0089       .0239
    10012 14718 14718 74 3670                 .0625 36 36       .0166      -.0112       .0163     -.0123      -.0235      .0343       .0363       .0382
    10012 14719 14719 75 3670   -.05882352963089943 36 36      -.0104      -.0064      -.0124      .0124       .0042     -.0047       .0239        .026
    10012 14720 14720 76 3670  .0052083334885537624 36 36       .0007      -.0031      -.0055      .0251       .0223     -.0022       .0132      -.0172
    10012 14724 14724 77 3670    -.0984455943107605 36 36       .0059       .0156      -.0025      .0146      -.0034     -.0177 -.032899998      -.0285
    10012 14725 14725 78 3670     .1034482792019844 36 36       .0492       .0151       .0573      .0057  .031600002       .044       .0187       .0349
    10012 14726 14726 79 3670  -.010416666977107525 36 36       -.019      -.0243      -.0261     -.0094       .0146      .0025       .0363      -.0041
    10012 14727 14727 80 3670   .031578946858644485 36 36      -.0224      -.0234      -.0395     -.0229      -.0124      .0107      -.0113      -.0108
    10012 14728 14728 81 3670    .10204081982374191 36 36       .0048      -.0193      -.0038     -.0187      -.0067     -.0004       .0291       .0094
    10012 14731 14731 82 3670     .0694444477558136 36 36       .0198       .0123      -.0014     -.0105        .076      .0018      -.0081      -.0097
    10012 14732 14732 83 3670   -.03030303120613098 36 36      -.0059      -.0012      -.0019      .0037       .0094      -.021       .0046      -.0203
    10012 14733 14733 84 3670    -.1071428582072258 36 36      -.0001       .0599      -.0102      .0317      -.0222     -.0072      -.0303      -.0077
    10012 14734 14734 85 3670    .03999999910593033 36 36       .0009       .0164       .0141     -.0008       .0021     -.0081      -.0026      -.0096
    10012 14735 14735 86 3670   -.01923076994717121 36 36       .0129      -.0004      -.0195     -.0139       .0076      .0192      -.0054      -.0001
    10012 14738 14738 87 3670   -.04901960864663124 36 36      -.0119       .0109       .0131      .0211       .0254     -.0016       .0124       -.015
    10012 14739 14739 88 3670  -.020618556067347527 36 36      -.0001      -.0004       .0268      .0258      -.0218      .0092       .0037        .004
    10012 14740 14740 89 3670   -.05263157933950424 36 36      -.0207       .0006      -.0069      .0398       .0183     -.0104      -.0091      -.0139
    10012 14741 14741 90 3670    .05000000074505806 36 36       .0205       .0122        .022      .0177      -.0234      .0098       .0026       .0049
    10012 14742 14742 91 3670   .018518518656492233 36 36       .0027       .0031       .0113     -.0214        .036       .001       .0171       .0125
    10012 14745 14745 92 3670    .03896103799343109 36 36       .0093       .0186       .0275     -.0033       .1063       .016       .0056       .0377
    10012 14746 14746 93 3670   -.03500000014901161 36 36       .0156       .0036      -.1922     -.0339       .0171      .0036       .0326       .0006
    10012 14747 14747 94 3670  -.015544041059911251 36 36      -.0159      -.0147       .0239     -.0108      -.0069      .0259       -.005      -.0224
    10012 14748 14748 95 3670   .031578946858644485 36 36      -.0136      -.0065       .0033      .0177      -.0325      .0114      -.0138 -.018099999
    10012 14749 14749 96 3670     .0357142873108387 36 36      -.0019       .0014      -.0463       .002       .0409     -.0291      -.0074      -.0244
    10012 14752 14752 97 3670   -.07389162480831146 36 36      -.0029       .0118       .0064      .0138      -.0028      -.015      -.0146      -.0162
    10012 14753 14753 98 3670   -.01595744676887989 36 36      -.0051       .0014       .0091     -.0072       .0031      .0166      -.0167 -.007900001
    10012 14754 14754 99 3670     .0324324332177639 36 36       .0045       .0306       .0584      .0431       .0389      .0078      -.0229        .009
    end
    format %td date
    format %tdDD.NN.CCYY ndate
    format %tbcalendar bcal_date
    label values ffi ffi
    label def ffi 36 "Electronic Equipment", modify
    Last edited by Max he; 03 Dec 2023, 09:22.

  • #2
    It's great that you have used datex to pull a sample of your data, but this particular sample makes it a little hard to figure out what's going on. It's all for a single permno, (10012) and thus all for single sic code (3670) and FF industry (36). You seem to have merged in by date the returns for a number of FF industries. In fact, I think all you need is the return for the pertinent FF industry, which is industry 36 in this example.
    Last edited by Devra Golbe; 03 Dec 2023, 13:20.
    Devra Golbe
    Professor Emerita, Dept. of Economics
    Hunter College, CUNY

    Comment


    • #3
      Hi Devra, thank you very much for your feedback. In my data example, only PERMNO 10012 is included, which is correct but a bit unfortunate. However, my dataset consists of all stock companies in the entire USA (on a daily basis), meaning there are many PERMNOs. I have successfully merged the Fama French Industry Index on a daily basis, maintaining the format as it can be downloaded from the Fama French website. This format has all 48 industries in columns sequentially (as seen in the data example). Now, I am facing the challenge of creating an industry index for daily stock data of all US companies (as the authors of my study that i want to repliacte do also). My thought was to create a variable (industry_index) where only the Fama French 48 industry return for the respective PERMNO on the specific day is included. This would mean that for PERMNO 10012, the return for Industry 36 is included on the respective days. However, since I have not only PERMNO 10012 but around 20 million observations, I have no idea how to solve this problem in Stata.

      Comment


      • #4
        I think you need to back up a step or two to accomplish what you want. Let's go back to the original (almost) data files.


        The CRSP data, which you've augmented with a couple of extra dates and the FF industry code looks like this once the date variables are formatted:


        Code:
        clear
        input long PERMNO int(date ndate bcal_date SICCD) double RET float ffi byte ff_industry_48
        10012 14612 14612  0 3670 -.0038910505827516317 36 36    
        10012 14613 14613  1 3670             -.0859375 36 36
        10012 14614 14614  2 3670    .09401709586381912 36 36
        10012 14615 14615  3 3670                -.0625 36 36   
        10012 14616 14616  4 3670   .008333333767950535 36 36    
        10012 14619 14619  5 3670                     0 36 36
        10012 14620 14620  6 3670   -.03305784985423088 36 36  
        end
        format date ndate %td
        The Fama French industry returns data (from Ken French's website) looks like this (once we've converted and formatted the date variables):

        Code:
        clear
        input str8(dateraw) float(agric food soda beer smoke chips)
        19991231  2.37    .2  -.59   -.3 -2.75   .76
        20000103 -3.41 -2.37  -.52 -1.16  1.67  2.85
        20000104 -2.68 -2.27 -2.19 -1.11   .64  -5.1
        20000105     3  -.03  1.12  1.33  -.41 -1.77
        20000106   .24   .99  3.69  1.29 -1.28  -7.9
        20000107  2.77  1.18  1.49  5.25  2.75  3.19
        20000110  -.98 -1.12   .49  -3.1  1.07  6.49
        end
        gen date=date(dateraw, "YMD")
        drop dateraw
        format date $td
        The return variables are labeled with the industry names, not the codes. You need to attach the codes in order to merge them with the CRSP data. You can rename them like this:

        Code:
        rename(agric-chips) (FFindret1 FFindret2 FFindret3  FFindret4 FFindret5 FFindret36)

        In order to merge this with the CRSP data you want a single variable holding the appropriate FF industry return. So reshape the data from wide to long:

        Code:
         reshape long FFindret, i(date) j(ind)
        Now you can merge your CRSP data with the reshaped FF data by date and industry code.

        I have not tested this code. I suggest that you take an extract of your CRSP data containing observations from multiple firms, industries and dates and see if the merge works. (The entire FF file will be relatively small, so I think no need to take an extract there. If you want to do that, make sure you extract industries to match what you have in the CRSP file.)

        Devra Golbe
        Professor Emerita, Dept. of Economics
        Hunter College, CUNY

        Comment

        Working...
        X