Hi,
I need to collapse my dataset of patents at company and year level.
But I want to keep all my variables in the dataset:
describe
storage display value
variable name type format label variable label
patent_num long %12.0g
permno long %8.0g
issue_date long %12.0g
issue_year float %9.0g
filing_date long %12.0g
filing_year float %9.0g
cpc strL %9s
scheme_code float %9.0g
xi_nominal float %9.0g
xi_real float %9.0g
cites int %8.0g
permco long %12.0g
Sorted by: filing_year issue_year
I have done the following code:
sort filing_year permco
order patent_num permco patent_num permno permco filing_year issue_year scheme_code cites xi_nominal xi_real
gen green_d = 1
replace green_d = 0 if scheme_code == 0
summarize green_d scheme_code
/*
Variable | Obs Mean Std. Dev. Min Max
-------------+---------------------------------------------------------
green_d | 3,160,451 .0434232 .2038079 0 1
scheme_code | 3,160,451 .2368975 1.172737 0 9
*/
sort permco filing_year
collapse (sum) patent_num (sum)cites (sum) xi_nominal (sum) xi_real (sum)green_d, by(filing_year permco cpc scheme_code)
list in 1/20
list in 1/20
+------------------------------------------------------------------------------------------------------------------+
| permco filing~r scheme~e cpc patent~m cites xi_nom~l xi_real green_d |
|------------------------------------------------------------------------------------------------------------------|
1. | 20626 1834 0 G03B21/16 1666047 7 .2017942 1.151761 0 |
2. | 21912 1834 0 H01F19/02 1807022 0 .0858394 .5738302 0 |
3. | 22533 1834 0 F16C33/106 1682190 1 .0643711 .3674046 0 |
4. | 22534 1834 0 A21D2/04 1651718 0 .5634137 3.178565 0 |
5. | 20792 1835 0 1837913 0 .2310263 1.544395 0 |
|------------------------------------------------------------------------------------------------------------------|
6. | 20792 1835 0 H01J2893/0013;H01J19/28 1717239 1 .2393945 1.358425 0 |
7. | 20792 1835 0 H02K47/08 1628379 0 .1100432 .6208216 0 |
8. | 20833 1835 0 E03D1/34 1682306 2 .1912527 1.091595 0 |
9. | 22876 1835 0 B21L15/00 1790464 9 .0824105 .5509083 0 |
10. | 22223 1836 0 B62D5/10 1848464 3 .0463756 .3455161 0 |
|------------------------------------------------------------------------------------------------------------------|
11. | 20792 1837 0 G01R5/04 1804330 4 .3967184 2.652035 0 |
12. | 22541 1837 0 F16D55/26 1698056 1 .5901366 3.348682 0 |
13. | 22615 1837 0 B41D7/04;F16J15/061;Y10S220/03 1705454 7 .0345968 .1963165 0 |
14. | 20792 1838 0 H01F27/245 1717200 0 .2393945 1.358425 0 |
15. | 22496 1868 0 C08G73/0273;C08G73/0627 3127359 1 2.22735 6.967609 0 |
|------------------------------------------------------------------------------------------------------------------|
16. | 20103 1883 0 Y10S101/29;B41M3/001 1642774 6 .3180774 1.794472 0 |
17. | 21329 1887 0 C11D3/3953 1716014 1 .1859463 1.055137 0 |
18. | 22518 1903 0 B65D88/78 3189224 1 .4522994 1.388189 0 |
19. | 20103 1904 0 H04Q3/00 1601061 0 .2148499 1.184709 0 |
20. | 20103 1907 0 H01H67/16;H01H67/06 1585024 0 .6957758 3.836594 0 |
+------------------------------------------------------------------------------------------------------------------+
But I still have more than one row per company per year.
What should I do to collapse it into one raw per company per year?
Thanks a lot!
I need to collapse my dataset of patents at company and year level.
But I want to keep all my variables in the dataset:
describe
storage display value
variable name type format label variable label
patent_num long %12.0g
permno long %8.0g
issue_date long %12.0g
issue_year float %9.0g
filing_date long %12.0g
filing_year float %9.0g
cpc strL %9s
scheme_code float %9.0g
xi_nominal float %9.0g
xi_real float %9.0g
cites int %8.0g
permco long %12.0g
Sorted by: filing_year issue_year
I have done the following code:
sort filing_year permco
order patent_num permco patent_num permno permco filing_year issue_year scheme_code cites xi_nominal xi_real
gen green_d = 1
replace green_d = 0 if scheme_code == 0
summarize green_d scheme_code
/*
Variable | Obs Mean Std. Dev. Min Max
-------------+---------------------------------------------------------
green_d | 3,160,451 .0434232 .2038079 0 1
scheme_code | 3,160,451 .2368975 1.172737 0 9
*/
sort permco filing_year
collapse (sum) patent_num (sum)cites (sum) xi_nominal (sum) xi_real (sum)green_d, by(filing_year permco cpc scheme_code)
list in 1/20
list in 1/20
+------------------------------------------------------------------------------------------------------------------+
| permco filing~r scheme~e cpc patent~m cites xi_nom~l xi_real green_d |
|------------------------------------------------------------------------------------------------------------------|
1. | 20626 1834 0 G03B21/16 1666047 7 .2017942 1.151761 0 |
2. | 21912 1834 0 H01F19/02 1807022 0 .0858394 .5738302 0 |
3. | 22533 1834 0 F16C33/106 1682190 1 .0643711 .3674046 0 |
4. | 22534 1834 0 A21D2/04 1651718 0 .5634137 3.178565 0 |
5. | 20792 1835 0 1837913 0 .2310263 1.544395 0 |
|------------------------------------------------------------------------------------------------------------------|
6. | 20792 1835 0 H01J2893/0013;H01J19/28 1717239 1 .2393945 1.358425 0 |
7. | 20792 1835 0 H02K47/08 1628379 0 .1100432 .6208216 0 |
8. | 20833 1835 0 E03D1/34 1682306 2 .1912527 1.091595 0 |
9. | 22876 1835 0 B21L15/00 1790464 9 .0824105 .5509083 0 |
10. | 22223 1836 0 B62D5/10 1848464 3 .0463756 .3455161 0 |
|------------------------------------------------------------------------------------------------------------------|
11. | 20792 1837 0 G01R5/04 1804330 4 .3967184 2.652035 0 |
12. | 22541 1837 0 F16D55/26 1698056 1 .5901366 3.348682 0 |
13. | 22615 1837 0 B41D7/04;F16J15/061;Y10S220/03 1705454 7 .0345968 .1963165 0 |
14. | 20792 1838 0 H01F27/245 1717200 0 .2393945 1.358425 0 |
15. | 22496 1868 0 C08G73/0273;C08G73/0627 3127359 1 2.22735 6.967609 0 |
|------------------------------------------------------------------------------------------------------------------|
16. | 20103 1883 0 Y10S101/29;B41M3/001 1642774 6 .3180774 1.794472 0 |
17. | 21329 1887 0 C11D3/3953 1716014 1 .1859463 1.055137 0 |
18. | 22518 1903 0 B65D88/78 3189224 1 .4522994 1.388189 0 |
19. | 20103 1904 0 H04Q3/00 1601061 0 .2148499 1.184709 0 |
20. | 20103 1907 0 H01H67/16;H01H67/06 1585024 0 .6957758 3.836594 0 |
+------------------------------------------------------------------------------------------------------------------+
But I still have more than one row per company per year.
What should I do to collapse it into one raw per company per year?
Thanks a lot!
Comment