Hi, I am a Master's student writing my thesis on the value of Chinese AI patents, using the citations a patent has received as a proxy for its value.
I have sourced my dataset and am now trying to find the best way to go about this. My idea is to use the extended family size of the patents and to control for the publication year, so that the older a patent is, the less weight its citations receive (older patents have had more time to accumulate citations than newer ones).
It's a dataset of 46,658 observations spanning from the year 2003 to 2023.
I tried the command: hhi citedbypatentcount, by(extendedfamilysize publicationyear) but the results were very large and did not look reliable.
The command: mean sum_cit, over(publicationyear) gave better results, but when I tried: mean sum_cit, over(publicationyear extendedfamilysize) the results were not good.
I am a bit out of ideas on how to do this. My hope is that someone more knowledgeable than me will know what to do and can point me in the right direction.
Here is a sample of it:
Code:
input long v1 int(publicationyear citedbypatentcount extendedfamilysize)
1 2012 1368 1191
2 2012 807 12
3 2005 783 151
4 2012 551 11
5 2017 519 51
6 2016 469 77
7 2012 463 1191
8 2013 452 15
9 2016 451 1191
10 2019 441 9
11 2013 430 66
12 2011 410 20
13 2014 398 7
14 2007 395 16
15 2012 393 11
16 2012 393 9
17 2013 379 65
18 2017 341 9
19 2012 321 19
20 2011 302 12
21 2017 281 1191
22 2016 275 14
23 2008 273 47
24 2011 269 20
25 2007 263 14
26 2010 261 7
27 2012 258 18
28 2018 253 17
29 2010 250 48
30 2013 245 39
31 2014 237 13
32 2012 233 19
33 2006 233 17
34 2014 233 59
35 2007 230 8
36 2013 229 8
37 2010 226 40
38 2012 223 11
39 2012 223 19
40 2020 221 17
41 2015 219 41
42 2009 218 23
43 2009 217 12
44 2005 217 30
45 2011 214 10
46 2009 212 10
47 2014 210 5
48 2013 209 7
49 2012 208 11
50 2014 208 41
51 2011 208 14
52 2016 207 32
53 2015 207 65
54 2006 206 153
55 2012 206 9
56 2012 203 19
57 2012 202 57
58 2018 200 411
59 2014 197 8
60 2012 194 12
61 2007 192 5
62 2008 190 4
63 2012 189 5
64 2018 186 29
65 2010 185 17
66 2012 185 47
67 2013 185 27
68 2015 185 12
69 2011 184 8
70 2017 182 19
71 2011 179 15
72 2006 178 7
73 2007 176 7
74 2010 173 46
75 2009 172 15
76 2012 170 23
77 2012 169 5
78 2013 168 14
79 2004 166 7
80 2006 166 44
81 2007 165 20
82 2014 164 49
83 2013 164 57
84 2007 161 10
85 2013 161 7
86 2007 158 8
87 2013 156 15
88 2015 156 3
89 2013 156 7
90 2006 155 8
91 2013 154 3
92 2015 154 2
93 2006 153 5
94 2013 153 34
95 2012 153 2
96 2012 152 10
97 2012 151 77
98 2010 151 13
99 2012 150 28
100 2018 150 9
end
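To make the idea of down-weighting citations on older patents more concrete, here is a minimal sketch of one way it could be set up with the variables in the sample above. It is only an illustration under assumptions, not something I have validated: the names mean_cit_year and rel_cit are hypothetical, and both the cohort normalisation and the negative binomial model are just one possible reading of "controlling for the publication year".

```stata
* Average citation count within each publication-year cohort
bysort publicationyear: egen double mean_cit_year = mean(citedbypatentcount)

* Relative citations: 1 means "cited as often as the average patent
* published in the same year", so old and new patents become comparable
generate double rel_cit = citedbypatentcount / mean_cit_year if mean_cit_year > 0

* Alternative: keep the raw count and absorb the age effect with
* publication-year dummies, with family size as a regressor
nbreg citedbypatentcount c.extendedfamilysize i.publicationyear, vce(robust)
```

With the first approach, rel_cit can be summarised or compared across family-size groups directly; with the second, the coefficient on extendedfamilysize is read net of the publication-year effect.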