Hi Guys,
For my research I need firm complexity as a control variable.
Well-known authors proxy for firm complexity with the (revenue-based) Herfindahl-Hirschman index.
It is calculated as "the sum of the squares of each segment's sales as a percentage of the total firm sales".
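Written out, my reading of that definition is the following. The symbols are my own notation, not taken from any particular paper: $s_{j}$ is the sales of segment $j$, $S$ is total firm sales, and $N$ is the number of segments the firm reports in a given year.

```latex
\mathrm{HHI} = \sum_{j=1}^{N} \left( \frac{s_{j}}{S} \right)^{2}
```

Under this reading, a firm with a single segment has an index of 1, and a firm with two equally sized segments has an index of $0.5^2 + 0.5^2 = 0.5$.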
My dataset, retrieved from Compustat, is shown in the -dataex- example further down (for two firms from my sample).
What I want is one row per firm-year combination with the squared net sales of all segments that the firm has in that year, so that I can later merge it with total sales to compute the H-H index. Can anyone help me out with this?
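To make the goal concrete, here is a minimal sketch of the kind of computation I have in mind. It rests on assumptions that may not hold for my data: that there is one row per firm-year-segment, and that a variable holding each segment's own sales exists. I call it SegSales here, which is a hypothetical name and not a variable in my extract. The sketch also uses the sum of segment sales as the denominator rather than merging in total firm sales separately.

```stata
* Sketch only: SegSales is a hypothetical variable with each segment's own sales.
* Assumes exactly one row per firm-year-segment.
destring fyear, replace

* Total of segment sales within each firm-year (used as the denominator)
bysort CIKNumber fyear: egen double TotalSegSales = total(SegSales)

* Squared sales share of each segment
generate double share_sq = (SegSales / TotalSegSales)^2

* One row per firm-year: the index is the sum of the squared shares
collapse (sum) hhi = share_sq, by(CIKNumber CompanyName fyear)
```

Note that collapse (sum) treats missing values as zero, so firm-years with missing segment sales would need to be checked separately.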
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str6 SegmentType double NetSales str28 CompanyName long CIKNumber str4 fyear
"BUSSEG"   6.614 "CERES INC"                 767884 "2010"
"BUSSEG"   6.616 "CERES INC"                 767884 "2011"
"BUSSEG"   6.616 "CERES INC"                 767884 "2011"
"BUSSEG"   6.616 "CERES INC"                 767884 "2011"
"BUSSEG"   5.371 "CERES INC"                 767884 "2012"
"BUSSEG"   5.371 "CERES INC"                 767884 "2012"
"BUSSEG"   5.371 "CERES INC"                 767884 "2012"
"BUSSEG"   5.243 "CERES INC"                 767884 "2013"
"BUSSEG"   5.243 "CERES INC"                 767884 "2013"
"BUSSEG"   5.243 "CERES INC"                 767884 "2013"
"BUSSEG"   2.404 "CERES INC"                 767884 "2014"
"BUSSEG"   2.404 "CERES INC"                 767884 "2014"
"BUSSEG"    2.72 "CERES INC"                 767884 "2015"
"BUSSEG"   2.539 "EAGLE PHARMACEUTICALS INC" 827871 "2012"
"BUSSEG"   2.539 "EAGLE PHARMACEUTICALS INC" 827871 "2012"
"BUSSEG"   2.539 "EAGLE PHARMACEUTICALS INC" 827871 "2012"
"BUSSEG"  13.679 "EAGLE PHARMACEUTICALS INC" 827871 "2013"
"BUSSEG"  13.679 "EAGLE PHARMACEUTICALS INC" 827871 "2013"
"BUSSEG"  13.679 "EAGLE PHARMACEUTICALS INC" 827871 "2013"
"BUSSEG"  19.099 "EAGLE PHARMACEUTICALS INC" 827871 "2014"
"BUSSEG"  19.099 "EAGLE PHARMACEUTICALS INC" 827871 "2014"
"BUSSEG"  19.099 "EAGLE PHARMACEUTICALS INC" 827871 "2014"
"BUSSEG"  66.227 "EAGLE PHARMACEUTICALS INC" 827871 "2015"
"BUSSEG"  66.227 "EAGLE PHARMACEUTICALS INC" 827871 "2015"
"BUSSEG"  66.227 "EAGLE PHARMACEUTICALS INC" 827871 "2015"
"BUSSEG" 189.482 "EAGLE PHARMACEUTICALS INC" 827871 "2016"
"BUSSEG" 189.482 "EAGLE PHARMACEUTICALS INC" 827871 "2016"
"BUSSEG" 189.482 "EAGLE PHARMACEUTICALS INC" 827871 "2016"
"BUSSEG" 236.707 "EAGLE PHARMACEUTICALS INC" 827871 "2017"
"BUSSEG" 236.707 "EAGLE PHARMACEUTICALS INC" 827871 "2017"
"BUSSEG" 213.312 "EAGLE PHARMACEUTICALS INC" 827871 "2018"
end
However, Compustat gives me this dataset, in which each firm-year combination has several rows, all with the same amount of net sales. That is strange (or isn't it?).
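One way I could at least quantify the repetition is with Stata's duplicates commands. This is only a diagnostic sketch using the variable names from my extract; it does not tell me whether the repeated rows are separate segments that all carry the firm-level total or plain duplicate records. The variable dup is a name I made up for the flag.

```stata
* Report how many rows are identical on firm, year, and net sales
duplicates report CIKNumber fyear NetSales

* dup = number of other rows identical on these variables (0 if unique)
duplicates tag CIKNumber fyear NetSales, generate(dup)

* Inspect the repeated rows, grouped by firm-year
list CompanyName fyear NetSales dup if dup > 0, sepby(CIKNumber fyear)
```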
Best,
Mixa