Hi,
I need to collapse a patents dataset at firm and year level.
Code:
bys permno filing_year: egen n_citations = sum(cites) // I computed how many citation per company I have per year
bys permno filing_year: egen sum_xi_nominal = sum(xi_nominal) // I have computed the sum per company per year of xi_nominal
bys permno filing_year: egen sum_xi_real = sum(xi_real) // I have computed the sum per company per year of xi_nominal
bys permno filing_year: egen n_patents = total(inrange(scheme_code, 0,9))
rename filing_year fyear
sort permno fyear
gen patents_per_year = .
by permno fyear: egen patents_per_year_per_company = count(patent_num)
list fyear permno patents_per_year_per_company in 1/20
// I need to collapse this dataset at company and year level
sort permno fyear
//collapse (sum) patent_num (sum)cites (sum) xi_nominal (sum) xi_real (sum)y02a (sum)y02b (sum)y02c (sum)y02d (sum)y02e (sum)y02p (sum)y02t (sum)y02w (sum)y04s (sum)no_green (sum)green_patent (sum)no_green_patent (sum)n_citations (sum)n_patents, by(permno permco fyear)
collapse (sum) patent_num (sum)cites (sum) xi_nominal (sum) xi_real (sum)y02a (sum)y02b (sum)y02c (sum)y02d (sum)y02e (sum)y02p (sum)y02t (sum)y02w (sum)y04s (sum)no_green (sum)green_patent (sum)patents_per_year_per_company, by(permno permco fyear)
sort fyear permno
list in 1/20
/*
+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| permno permco fyear patent~m cites xi_nom~l xi_real y02a y02b y02c y02d y02e y02p y02t y02w y04s no_green green_~t patent~y |
|------------------------------------------------------------------------------------------------------------------------------------------------------------------|
1. | 10001 7953 1993 5327647 5 .1825723 .1190317 0 0 0 0 1 0 0 0 0 0 1 1 |
2. | 10001 7953 1997 6012812 60 .2043032 .1145976 0 0 0 0 0 0 0 0 0 1 0 1 |
3. | 10001 7953 2002 6902370 35 .3731241 .1850453 0 0 0 0 1 0 0 0 0 0 1 1 |
4. | 10006 22156 1921 9758571 13 .0883718 .4960096 0 0 0 0 0 0 0 0 0 6 0 36 |
5. | 10006 22156 1922 11160721 7 .0846827 .4702989 0 0 0 0 0 0 0 0 0 7 0 49 |
|------------------------------------------------------------------------------------------------------------------------------------------------------------------|
6. | 10006 22156 1923 22820352 29 .1887543 1.061288 0 0 0 0 0 0 0 0 0 14 0 196 |
7. | 10006 22156 1924 11540931 11 .1231909 .7043297 0 0 0 0 0 0 0 0 0 7 0 49 |
8. | 10006 22156 1925 48333109 35 1.011997 5.889353 0 0 0 0 0 0 0 0 0 29 0 841 |
9. | 10006 22156 1926 58357652 59 .5742458 3.294533 0 0 0 0 0 0 0 0 0 35 0 1225 |
10. | 10006 22156 1927 49720592 149 .6092515 3.556817 0 0 0 0 0 0 0 0 0 29 0 841 |
|------------------------------------------------------------------------------------------------------------------------------------------------------------------|
11. | 10006 22156 1928 87650102 83 2.213037 13.38762 0 0 0 0 0 0 0 0 0 50 0 2500 |
12. | 10006 22156 1929 34622138 40 .6641808 4.233489 0 0 0 0 0 0 1 0 0 18 1 361 |
13. | 10006 22156 1930 37431801 30 .6321546 4.488455 0 0 0 0 0 0 0 0 0 20 0 400 |
14. | 10006 22156 1931 72999782 69 1.318254 9.527059 0 0 0 0 0 0 0 0 0 38 0 1444 |
15. | 10006 22156 1932 40906863 39 .9460848 6.884185 0 0 0 0 1 0 1 0 0 19 2 441 |
|------------------------------------------------------------------------------------------------------------------------------------------------------------------|
16. | 10006 22156 1933 34182431 39 1.342568 9.494991 0 0 0 0 0 0 0 0 0 17 0 289 |
17. | 10006 22156 1934 45028417 52 2.549865 17.73491 0 0 0 0 0 0 0 0 0 22 0 484 |
18. | 10006 22156 1935 35799230 39 1.721348 11.9177 0 0 0 0 0 0 0 0 0 17 0 289 |
19. | 10006 22156 1936 57247703 110 2.079623 14.33627 0 0 0 0 0 0 0 0 0 27 0 729 |
20. | 10006 22156 1937 65384414 130 1.565747 10.78133 0 0 0 0 0 0 1 0 0 29 1 900 |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
*/
What I would expect is that the latter column (patents_per_year_per_company) would be the sum of (no_green and green_patent) columns.
What am I doing wrong?
Thanks!
I need to collapse a patents dataset at firm and year level.
Code:
bys permno filing_year: egen n_citations = sum(cites) // I computed how many citation per company I have per year
bys permno filing_year: egen sum_xi_nominal = sum(xi_nominal) // I have computed the sum per company per year of xi_nominal
bys permno filing_year: egen sum_xi_real = sum(xi_real) // I have computed the sum per company per year of xi_nominal
bys permno filing_year: egen n_patents = total(inrange(scheme_code, 0,9))
rename filing_year fyear
sort permno fyear
gen patents_per_year = .
by permno fyear: egen patents_per_year_per_company = count(patent_num)
list fyear permno patents_per_year_per_company in 1/20
// I need to collapse this dataset at company and year level
sort permno fyear
//collapse (sum) patent_num (sum)cites (sum) xi_nominal (sum) xi_real (sum)y02a (sum)y02b (sum)y02c (sum)y02d (sum)y02e (sum)y02p (sum)y02t (sum)y02w (sum)y04s (sum)no_green (sum)green_patent (sum)no_green_patent (sum)n_citations (sum)n_patents, by(permno permco fyear)
collapse (sum) patent_num (sum)cites (sum) xi_nominal (sum) xi_real (sum)y02a (sum)y02b (sum)y02c (sum)y02d (sum)y02e (sum)y02p (sum)y02t (sum)y02w (sum)y04s (sum)no_green (sum)green_patent (sum)patents_per_year_per_company, by(permno permco fyear)
sort fyear permno
list in 1/20
/*
+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| permno permco fyear patent~m cites xi_nom~l xi_real y02a y02b y02c y02d y02e y02p y02t y02w y04s no_green green_~t patent~y |
|------------------------------------------------------------------------------------------------------------------------------------------------------------------|
1. | 10001 7953 1993 5327647 5 .1825723 .1190317 0 0 0 0 1 0 0 0 0 0 1 1 |
2. | 10001 7953 1997 6012812 60 .2043032 .1145976 0 0 0 0 0 0 0 0 0 1 0 1 |
3. | 10001 7953 2002 6902370 35 .3731241 .1850453 0 0 0 0 1 0 0 0 0 0 1 1 |
4. | 10006 22156 1921 9758571 13 .0883718 .4960096 0 0 0 0 0 0 0 0 0 6 0 36 |
5. | 10006 22156 1922 11160721 7 .0846827 .4702989 0 0 0 0 0 0 0 0 0 7 0 49 |
|------------------------------------------------------------------------------------------------------------------------------------------------------------------|
6. | 10006 22156 1923 22820352 29 .1887543 1.061288 0 0 0 0 0 0 0 0 0 14 0 196 |
7. | 10006 22156 1924 11540931 11 .1231909 .7043297 0 0 0 0 0 0 0 0 0 7 0 49 |
8. | 10006 22156 1925 48333109 35 1.011997 5.889353 0 0 0 0 0 0 0 0 0 29 0 841 |
9. | 10006 22156 1926 58357652 59 .5742458 3.294533 0 0 0 0 0 0 0 0 0 35 0 1225 |
10. | 10006 22156 1927 49720592 149 .6092515 3.556817 0 0 0 0 0 0 0 0 0 29 0 841 |
|------------------------------------------------------------------------------------------------------------------------------------------------------------------|
11. | 10006 22156 1928 87650102 83 2.213037 13.38762 0 0 0 0 0 0 0 0 0 50 0 2500 |
12. | 10006 22156 1929 34622138 40 .6641808 4.233489 0 0 0 0 0 0 1 0 0 18 1 361 |
13. | 10006 22156 1930 37431801 30 .6321546 4.488455 0 0 0 0 0 0 0 0 0 20 0 400 |
14. | 10006 22156 1931 72999782 69 1.318254 9.527059 0 0 0 0 0 0 0 0 0 38 0 1444 |
15. | 10006 22156 1932 40906863 39 .9460848 6.884185 0 0 0 0 1 0 1 0 0 19 2 441 |
|------------------------------------------------------------------------------------------------------------------------------------------------------------------|
16. | 10006 22156 1933 34182431 39 1.342568 9.494991 0 0 0 0 0 0 0 0 0 17 0 289 |
17. | 10006 22156 1934 45028417 52 2.549865 17.73491 0 0 0 0 0 0 0 0 0 22 0 484 |
18. | 10006 22156 1935 35799230 39 1.721348 11.9177 0 0 0 0 0 0 0 0 0 17 0 289 |
19. | 10006 22156 1936 57247703 110 2.079623 14.33627 0 0 0 0 0 0 0 0 0 27 0 729 |
20. | 10006 22156 1937 65384414 130 1.565747 10.78133 0 0 0 0 0 0 1 0 0 29 1 900 |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
*/
What I would expect is that the latter column (patents_per_year_per_company) would be the sum of (no_green and green_patent) columns.
What am I doing wrong?
Thanks!
Comment