Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • aggregating values from several variables with IF AND conditions

    Hello Statalisters,

    I have a dataset containing geographic segment sales that I have standardized into four geographic regions. Now I would like to add all segments that are equal to one region.

    I have tried several ways (that are most likely wrong because I had Excel logic in my mind) but was not successful in solving this yet.
    It will probably turn out to be quite basic for Stata professionals, so I hope that everyone could bear with me (thank you for your patience).

    quick infos: I am using a registered version of StataSE15 for Mac.

    First, I need to identify the segment that I want to aggregate, so the basic code is:

    generate Americas = geosales1 if regionsADJseg01=="Americas"
    or
    generate EMEA = geosales1 if regionsADJseg01=="EMEA"

    But now, what I need to get is the "AND" condition, because I want to aggregate all geosales(1-10) IF regionsADJseg(01-10)=="Americas" [or "EMEA"; "APAC"; "ROW" respectively].

    So I tried two options, following the Stata manual for generating new variables, but received invalid syntax errors(198) because I couldnt find information on how to integrate several conditions into one code.

    generate Americas = geosales1 if regionsADJseg01=="Americas" + geosales2 if regionsADJseg02=="Americas"
    generate Americas = geosales1 if regionsADJseg01=="Americas" & geosales2 if regionsADJseg02=="Americas"

    I also tried to use the variable generator, creating following code (which also did not yield any values for my new variable):

    generate Americas = geosales1 + geosales2 + geosales3 + geosales4 + geosales5 + geosales6 + geosales7 + geosales8 + geosales9 + geosales10 if regionsADJseg01 =="Americas" & regionsADJseg02 =="Americas" & regionsADJseg03 =="Americas" & regionsADJseg04 =="Americas" & regionsADJseg05 =="Americas" & regionsADJseg06 =="Americas" & regionsADJseg07 =="Americas" & regionsADJseg08 =="Americas" & regionsADJseg09 =="Americas" & regionsADJseg10 =="Americas"

    In the end, I would like to aggregate the ten segments into the four that I determined (Americas, APAC, EMEA, ROW).

    Here is my dataex for the first five out of ten segments:
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input int year str6 compID str18 regionsADJseg01 double geosales1 str18 regionsADJseg02 double geosales2 str18 regionsADJseg03 double geosales3 str18 regionsADJseg04 double geosales4 str18 regionsADJseg05 double geosales5
    2000 "130042" "Americas"   90176 ""              . ""             . ""        . ""    .
    2001 "130042" "Americas"  114723 ""              . ""             . ""        . ""    .
    2002 "130042" "Americas"  133096 ""              . ""             . ""        . ""    .
    2003 "130042" "Americas"  146569 ""              . ""             . ""        . ""    .
    2004 "130042" "Americas"  168341 ""              . ""             . ""        . ""    .
    2005 "130042" "Americas"  207170 ""              . ""             . ""        . ""    .
    2006 "130042" "Americas"  257485 "ROW"       15878 ""             . ""        . ""    .
    2007 "130042" "Americas"  290161 "ROW"       28141 ""             . ""        . ""    .
    2008 "130042" "Americas"  356303 "EMEA"      53126 "EMEA"      5975 "ROW"     0 ""    .
    2009 "130042" "Americas"  316177 "EMEA"      55024 "EMEA"      3260 ""        . ""    .
    2010 "130042" "Americas"  273900 "EMEA"      51511 "Americas"  3168 "EMEA" 2178 "ROW" 0
    2011 "130042" "Americas"       0 "Americas" 302724 "EMEA"     61704 "EMEA"    0 "ROW" 0
    2012 "130042" "Americas"       0 "Americas" 314076 "EMEA"     67183 "ROW"     0 ""    .
    2013 "130042" "Americas"       . "Americas" 331252 "EMEA"     75234 ""        . ""    .
    2017 "130042" "Americas"  533549 ""              . ""             . ""        . ""    .
    2018 "130042" "Americas"  593229 ""              . ""             . ""        . ""    .
    2018 "130042" "Americas"  593229 ""              . ""             . ""        . ""    .
    2000 "130062" "Americas" 1333000 ""              . ""             . ""        . ""    .
    2001 "130062" "Americas" 1505691 ""              . ""             . ""        . ""    .
    2002 "130062" "Americas" 1497101 ""              . ""             . ""        . ""    .
    2003 "130062" "Americas" 1711453 ""              . ""             . ""        . ""    .
    2004 "130062" "Americas" 1759613 ""              . ""             . ""        . ""    .
    2005 "130062" "Americas" 2004243 ""              . ""             . ""        . ""    .
    2006 "130062" "Americas" 2206401 ""              . ""             . ""        . ""    .
    2007 "130062" "Americas" 2207141 ""              . ""             . ""        . ""    .
    2008 "130062" "Americas" 2120081 ""              . ""             . ""        . ""    .
    2009 "130062" "Americas" 1702603 ""              . ""             . ""        . ""    .
    2010 "130062" "Americas" 1782857 ""              . ""             . ""        . ""    .
    2011 "130062" "Americas" 1672077 "ROW"       41746 ""             . ""        . ""    .
    2012 "130062" "Americas" 1996142 "ROW"       41525 ""             . ""        . ""    .
    2013 "130062" "Americas" 2113068 "ROW"       42483 ""             . ""        . ""    .
    2014 "130062" "Americas" 2912115 "ROW"       45836 ""             . ""        . ""    .
    2015 "130062" "Americas" 3493462 "ROW"       46108 ""             . ""        . ""    .
    2016 "130062" "Americas" 3761651 "ROW"       57098 ""             . ""        . ""    .
    2017 "130062" "Americas" 3901323 "ROW"       64271 ""             . ""        . ""    .
    2018 "130062" "Americas" 4166339 "ROW"       77926 ""             . ""        . ""    .
    2000 "130086" "Americas" 1063000 "APAC"     115000 "ROW"      49000 ""        . ""    .
    2001 "130086" "Americas" 1012000 "APAC"     113000 "ROW"      40000 ""        . ""    .
    2002 "130086" "Americas"  989000 "APAC"      87000 "ROW"      41000 ""        . ""    .
    2003 "130086" "Americas" 1002852 "APAC"      86000 "ROW"      12000 ""        . ""    .
    2004 "130086" "Americas" 1087000 "APAC"      94000 "ROW"      26000 ""        . ""    .
    2005 "130086" "Americas" 1122000 "APAC"      49000 "ROW"      10000 ""        . ""    .
    2006 "130086" "Americas" 1182000 "APAC"      32000 "ROW"      16000 ""        . ""    .
    2007 "130086" "Americas" 1172000 "APAC"      43000 "ROW"      10000 ""        . ""    .
    2008 "130086" "Americas" 1220000 "ROW"       12000 "APAC"         0 ""        . ""    .
    2009 "130086" "Americas" 1136000 "APAC"      33000 ""             . ""        . ""    .
    2010 "130086" "Americas" 1228000 "APAC"      87000 ""             . ""        . ""    .
    2011 "130086" "Americas" 1378000 "APAC"     111000 "ROW"          0 ""        . ""    .
    2012 "130086" "Americas" 1467000 "APAC"     104000 "ROW"          0 ""        . ""    .
    2013 "130086" "Americas" 1428000 "APAC"     280000 ""             . ""        . ""    .
    2014 "130086" "Americas"  318000 "APAC"     286000 ""             . ""        . ""    .
    2015 "130086" "Americas"  302074 "APAC"     242800 ""             . ""        . ""    .
    2016 "130086" "Americas"  507391 "APAC"     280887 ""             . ""        . ""    .
    2017 "130086" "Americas"  419403 "APAC"     400193 ""             . ""        . ""    .
    2018 "130086" "Americas"  390396 "APAC"     425742 ""             . ""        . ""    .
    2000 "130088" "Americas"  759037 ""              . ""             . ""        . ""    .
    2001 "130088" "Americas"  849799 ""              . ""             . ""        . ""    .
    2002 "130088" "Americas" 1209990 ""              . ""             . ""        . ""    .
    2003 "130088" "Americas" 1472885 ""              . ""             . ""        . ""    .
    2004 "130088" "Americas" 1738843 ""              . ""             . ""        . ""    .
    2005 "130088" "Americas" 2067979 ""              . ""             . ""        . ""    .
    2006 "130088" "Americas" 2369612 ""              . ""             . ""        . ""    .
    2007 "130088" "Americas" 2703212 ""              . ""             . ""        . ""    .
    2008 "130088" "Americas" 3007949 ""              . ""             . ""        . ""    .
    2009 "130088" "Americas" 3206937 ""              . ""             . ""        . ""    .
    2010 "130088" "Americas" 3638336 ""              . ""             . ""        . ""    .
    2011 "130088" "Americas" 4232743 ""              . ""             . ""        . ""    .
    2012 "130088" "Americas" 4664120 ""              . ""             . ""        . ""    .
    2013 "130088" "Americas" 5164784 ""              . ""             . ""        . ""    .
    2014 "130088" "Americas" 5711715 ""              . ""             . ""        . ""    .
    2015 "130088" "Americas" 6226507 ""              . ""             . ""        . ""    .
    2016 "130088" "Americas" 6779579 ""              . ""             . ""        . ""    .
    2017 "130088" "Americas" 7256382 ""              . ""             . ""        . ""    .
    2018 "130088" "Americas" 7911046 ""              . ""             . ""        . ""    .
    2013 "130104" "Americas"   57737 "EMEA"       9603 "APAC"      8700 ""        . ""    .
    2014 "130104" "Americas"   49496 "EMEA"      10525 "APAC"      9666 ""        . ""    .
    2015 "130104" "Americas"   54467 "EMEA"      16310 "APAC"     11855 ""        . ""    .
    2016 "130104" "Americas"   65172 "EMEA"      24054 "APAC"     10482 ""        . ""    .
    2017 "130104" "Americas"  103310 "EMEA"      20733 "APAC"      5246 ""        . ""    .
    2018 "130104" "Americas"  104212 "EMEA"      20296 "APAC"      4525 ""        . ""    .
    2018 "130104" "Americas"  104212 "EMEA"      20296 "APAC"      4525 ""        . ""    .
    2018 "130104" "Americas"  104212 "EMEA"      20296 "APAC"      4525 ""        . ""    .
    2018 "130104" "Americas"  104212 "EMEA"      20296 "APAC"      4525 ""        . ""    .
    2018 "130104" "Americas"  104212 "EMEA"      20296 "APAC"      4525 ""        . ""    .
    2018 "130104" "Americas"  104212 "EMEA"      20296 "APAC"      4525 ""        . ""    .
    2018 "130104" "Americas"  104212 "EMEA"      20296 "APAC"      4525 ""        . ""    .
    2018 "130104" "Americas"  104212 "EMEA"      20296 "APAC"      4525 ""        . ""    .
    2018 "130104" "Americas"  104212 "EMEA"      20296 "APAC"      4525 ""        . ""    .
    2018 "130104" "Americas"  104212 "EMEA"      20296 "APAC"      4525 ""        . ""    .
    2018 "130104" "Americas"  104212 "EMEA"      20296 "APAC"      4525 ""        . ""    .
    2018 "130104" "Americas"  104212 "EMEA"      20296 "APAC"      4525 ""        . ""    .
    2018 "130104" "Americas"  104212 "EMEA"      20296 "APAC"      4525 ""        . ""    .
    2018 "130104" "Americas"  104212 "EMEA"      20296 "APAC"      4525 ""        . ""    .
    2000 "130169" "Americas" 6267865 ""              . ""             . ""        . ""    .
    2001 "130169" "Americas" 7694952 ""              . ""             . ""        . ""    .
    2002 "130169" "Americas" 5462704 ""              . ""             . ""        . ""    .
    2003 "130169" "Americas" 6715197 ""              . ""             . ""        . ""    .
    2004 "130169" "Americas" 7497238 ""              . ""             . ""        . ""    .
    2005 "130169" "Americas" 8895994 ""              . ""             . ""        . ""    .
    2006 "130169" "Americas" 9197004 ""              . ""             . ""        . ""    .
    end
    ------------------ copy up to and including the previous line ------------------

    Listed 100 out of 110816 observations


    I highly appreciate your feedback and time to help me out!

    Kind regards,
    Jennifer

  • #2
    You want to generate variable called Americas that takes the value of (goalsales 1 , 2) if regionsADJseg01, 02 is America, is that right? In case of overlapping like id 12 13 14 which value you want? You can't gen new variable that take two values at the same time!

    Comment


    • #3
      Hello Mohammad, thanks for your quick response!

      yes, that is correct, I want to generate a new variables "Americas" from (geosales 1,2,3....) if regionsADJseg01,02,03.... is "Americas".
      In case of overlapping values, such as EMEA or ROW, I will generate for each geographic region a new variable.

      So, in the end, I will have four completely new variables, that do not overlap with each other, but are aggregates from the total ten segments that I have (but I only listed 5 due to space issues):
      gen Americas = for (geosales 1,2,3....) if regionsADJseg01,02,03.... is "Americas".
      gen APAC = for (geosales 1,2,3....) if regionsADJseg01,02,03.... is "APAC".
      gen EMEA = for (geosales 1,2,3....) if regionsADJseg01,02,03.... is "EMEA".
      gen ROW = for (geosales 1,2,3....) if regionsADJseg01,02,03.... is "ROW".

      That should be possible, right?

      Kind regards

      Comment


      • #4
        Maybe another question as a follow up, because I will need the same logic again:
        Is it possible to kind of collapse the data if the values of my variables sicseg`j' (1/10) are equal, as can be seen in this second scenario?
        Because I have more than 300 variations of 4-digit SIC codes, I will not be able to create new variables based on individual SIC because it doesnt make any sense.

        Out of the companies' reported SIC segments, they will not have more than 10 different segments anyways, and out of them, many actually mention the same industry twice or even three and four times, but in a different segment.

        Example id 1-5:

        So, is there a way, to collapse the values in a newly generated variable "SICseg`j'(1/10)new" that adds up all the sicsales`j'(1/10) if the sicseg`j'(1/10) are equal?

        Thanks for any feedback!

        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input float id int year str6 CompanyID str4 sicseg1 double sicsales1 str4 sicseg2 double sicsales2 str4 sicseg3 double sicsales3 str4 sicseg4 double sicsales4 str4 sicseg5 double sicsales5
          1 2000 "130042" "7519"    76084 "7519"   13406 "7519"      686 "."         . "."          .
          2 2001 "130042" "7519"    99684 "7519"   14519 "7519"      520 "."         . "."          .
          3 2002 "130042" "7519"   116169 "7519"   16007 "7519"      920 "."         . "."          .
          4 2003 "130042" "7519"   128482 "7519"   17249 "7519"      838 "."         . "."          .
          5 2004 "130042" "7519"   149856 "7519"   17919 "7519"      566 "."         . "."          .
          6 2005 "130042" "7519"    17499 "7519"    1093 "7519"   188578 "."         . "."          .
          7 2006 "130042" "7519"   245105 "7519"   26824 "7519"     1434 "."         . "."          .
          8 2007 "130042" "7519"   284638 "7519"   31644 "9999"     2020 "."         . "."          .
          9 2008 "130042" "7519"   371560 "7519"   41267 "9999"     2577 "."         . "."          .
         10 2009 "130042" "7519"   284638 "7519"   31644 "9999"     2020 "."         . "."          .
         11 2010 "130042" "7519"   295034 "7519"   33156 "9999"     2567 "."         . "."          .
         12 2011 "130042" "7519"   318863 "7519"   42842 "9999"     2723 "."         . "."          .
         13 2012 "130042" "7519"   340797 "7519"   38281 "9999"     2181 "."         . "."          .
         14 2013 "130042" "4225"   366286 "3499"   38050 "3443"     2150 "."         . "."          .
         15 2014 "130042" "4225"   410362 "4225"   31585 "3443"     3527 "."         . "."          .
         16 2015 "130042" "4225"   494715 "4225"   29953 "3443"     6109 "."         . "."          .
         17 2016 "130042" "4225"   329603 "3443"   98061 "3499"    80958 "."         . "."          .
         18 2017 "130042" "4225"   349694 "3443"   98095 "4225"    85760 "."         . "."          .
         19 2018 "130042" "4225"   387025 "3443"  116205 "4225"    89999 "."         . "."          .
         20 2000 "130062" "1423"  1202581 "3297"  130419 "."           . "."         . "."          .
         21 2001 "130062" "1423"  1406179 "1422"   99512 "."           . "."         . "."          .
         22 2002 "130062" "1423"  1422673 "3297"   74428 "9999"        0 "."         . "."          .
         23 2003 "130062" "1423"  1617040 "3297"   94413 "9999"        0 "."         . "."          .
         24 2004 "130062" "1423"  1641690 "3297"  117923 "9999"        . "."         . "."          .
         25 2005 "130062" "1423"  1859685 "3297"  144558 "9999"        . "."         . "."          .
         26 2006 "130062" "1423"  2039840 "3297"  166561 "9999"        0 "."         . "."          .
         27 2007 "130062" "1423"  2035469 "3297"  171672 "9999"        0 "."         . "."          .
         28 2008 "130062" "1423"  1933248 "3297"  186833 "9999"        0 "."         . "."          .
         29 2009 "130062" "1423"  1542692 "3297"  159911 "9999"        0 "."         . "."          .
         30 2010 "130062" "1423"  1589677 "3297"  193180 "9999"        0 "."         . "."          .
         31 2011 "130062" "1423"  1494700 "3297"  219123 "9999"        0 "."         . "."          .
         32 2012 "130062" "1423"  1816954 "3297"  220713 "9999"        0 "."         . "."          .
         33 2013 "130062" "1422"   924691 "3297"  740703 "3297"   245340 "2899" 244817 "9999"       .
         34 2014 "130062" "1422"  1356283 "3297"  848855 "3297"   274352 "2899" 256702 "3241"  221759
         35 2015 "130062" "1422"  1675021 "3297"  926251 "3241"   387947 "3297" 304472 "2899"  245879
         36 2016 "130062" "1422"  1970165 "3297" 1017098 "3241"   375814 "3297" 321078 "2899"  257058
         37 2017 "130062" "1422"  2279723 "3297" 1053325 "3297"   362555 "2899" 269991 "1442"       .
         38 2018 "130062" "1422"  2309924 "3297" 1223236 "3297"   423382 "2899" 287723 "9999"       .
         39 2000 "130086" "2611"   576000 "0811"  401000 "0811"   280000 "."    -30000 "."          .
         40 2001 "130086" "2823"   547000 "2499"  358000 "2411"   281000 "."    -21000 "."          .
         41 2002 "130086" "2823"   526000 "2499"  247000 "2411"   137000 "9999"  -9000 "9999"  216000
         42 2003 "130086" "2823"   534000 "2499"  259000 "2411"   138000 "9999"  -2000 "9999"  172000
         43 2004 "130086" "2823"   581996 "2499"  282000 "2411"   170000 "9999"  -1000 "9999"  174000
         44 2005 "130086" "2823"   628000 "2499"  294000 "2411"   137000 "9999"  -1000 "9999"  123000
         45 2006 "130086" "2823"   672000 "2499"  207000 "2499"   112000 "2411" 111000 "9999"       0
         46 2007 "130086" "2823"   722000 "2499"  222000 "2499"   116000 "2411"  88000 "9999"       0
         47 2008 "130086" "2823"   798000 "2499"  190000 "2499"   127000 "2411"  86000 "9999"   31000
         48 2009 "130086" "2823"   839000 "2499"  159000 "2499"   101000 "2411"  51000 "9999"       0
         49 2010 "130086" "2823"   881000 "2499"  177000 "2499"    96000 "2411"  68000 "9999"       0
         50 2011 "130086" "2823"  1020000 "2499"  215000 "2499"    71000 "2411"  68000 "9999"       0
         51 2012 "130086" "6512"  1093000 "2411"  230000 "5111"    88000 "6512"  57000 "5111"       .
         52 2013 "130086" "6512"  1042000 "2411"  382000 "6512"   149000 "5111"      . "2611"       .
         53 2014 "130086" "6512"   182000 "2611"  142000 "5031"   104000 "2611" 102000 "6512"   77000
         54 2015 "130086" "6512"   161570 "2611"  139093 "6512"    86493 "5031"  81230 "2611"   76488
         55 2016 "130086" "6512"   299350 "0811"  172574 "2611"   132855 "5031" 108312 "2611"   75187
         56 2017 "130086" "6512"   247609 "6512"  183016 "5031"   152584 "2611" 144510 "2611"   91877
         57 2018 "130086" "6512"   249014 "2611"  169995 "5031"   148814 "6512" 138575 "2611"  109832
         58 2000 "130088" "5999"   166988 "0100"  151807 "5261"   121446 "5261" 121446 "5999"   98675
         59 2001 "130088" "5191"   254939 "5261"  186956 "5261"   144466 "5251" 110474 "5651"   76482
         60 2002 "130088" "5191"  1209990 "."          . "."           . "."         . "."          .
         61 2003 "130088" "5191"  1472885 "."          . "."           . "."         . "."          .
         62 2004 "130088" "5191"  1738843 "."          . "."           . "."         . "."          .
         63 2005 "130088" "5191"  2067979 "."          . "."           . "."         . "."          .
         64 2006 "130088" "5191"  2369612 "."          . "."           . "."         . "."          .
         65 2007 "130088" "5191"  2703212 "."          . "."           . "."         . "."          .
         66 2008 "130088" "5191"  3007949 "."          . "."           . "."         . "."          .
         67 2009 "130088" "5191"  3206937 "."          . "."           . "."         . "."          .
         68 2010 "130088" "5191"  3638336 "."          . "."           . "."         . "."          .
         69 2011 "130088" "5191"  4232743 "."          . "."           . "."         . "."          .
         70 2012 "130088" "5191"  4664120 "."          . "."           . "."         . "."          .
         71 2013 "130088" "5191"  5164784 "."          . "."           . "."         . "."          .
         72 2014 "130088" "5191"  5711715 "."          . "."           . "."         . "."          .
         73 2015 "130088" "5191"  6226507 "."          . "."           . "."         . "."          .
         74 2016 "130088" "5191"  6779579 "."          . "."           . "."         . "."          .
         75 2017 "130088" "5191"  7256382 "."          . "."           . "."         . "."          .
         76 2018 "130088" "5191"  3718192 "5191" 1740430 "5191"  1503099 "5191" 632884 "5191"  316442
         77 2003 "130104" "7372"     3831 "."          . "."           . "."         . "."          .
         78 2004 "130104" "7372"     4731 "."          . "."           . "."         . "."          .
         79 2005 "130104" "3674"    10626 "."          . "."           . "."         . "."          .
         80 2006 "130104" "3674"    16951 "."          . "."           . "."         . "."          .
         81 2007 "130104" "3674"    28419 "."          . "."           . "."         . "."          .
         82 2008 "130104" "3674"    24109 "."          . "."           . "."         . "."          .
         83 2009 "130104" "3674"    21241 "."          . "."           . "."         . "."          .
         84 2010 "130104" "3674"    32008 "."          . "."           . "."         . "."          .
         85 2011 "130104" "3674"    37026 "."          . "."           . "."         . "."          .
         86 2012 "130104" "3674"    50218 "."          . "."           . "."         . "."          .
         87 2013 "130104" "3674"    76041 "."          . "."           . "."         . "."          .
         88 2014 "130104" "3674"    69687 "."          . "."           . "."         . "."          .
         89 2015 "130104" "3674"    82631 "."          . "."           . "."         . "."          .
         90 2016 "130104" "3674"    99707 "."          . "."           . "."         . "."          .
         91 2017 "130104" "3674"   129289 "."          . "."           . "."         . "."          .
         92 2018 "130104" "3674"   129033 "."          . "."           . "."         . "."          .
         93 2009 "130169" "4911" 15545632 "9999"  745122 "9999" -1342485 "."         . "."          .
         94 2010 "130169" "4911" 10853118 "4911" 3514174 "4911"  1754589 "4911" 395871 "."          .
         95 2011 "130169" "4911" 10236478 "4911" 4170373 "4911"  1323010 "4911" 103453 "."          .
         96 2012 "130169" "4911" 10102328 "4911" 4243528 "4911"  2211232 "4911"  80075 "."          .
         97 2013 "130169" "4911"  7073588 "4911" 1906897 "4911"  1781636 "4911"  30422 "9999" -689461
         98 2015 "130169" "4911"  4610786 "4911" 2616538 "4911"  1432003 "9999"  87967 "9999" -510036
         99 2016 "130169" "4911" 10309969 "4911" 5441721 "4911"  3561492 "9999"  54508 "9999" -702645
        100 2017 "130169" "4911"  6003387 "4911" 3052763 "4911"  2893350 "9999"  65098 "9999" -596166
        end

        Comment


        • #5
          Check out functions like rowtotal() in

          Code:
          help egen
          and indeed

          Code:
          help collapse

          Comment


          • #6
            Hello Nick, thank you for your response!
            I tried egen and towtotal() together and tested if I can integrate the IF condition, here using only the first regionsADJseg01 to see if the new variable "Americas" would then only give me the sales from geosales1 if regionsADJseg01=="Americas".

            However, it seems like the way I wanted to integrate the IF condition doesn't work.
            The code simply gives me the rowtotal(geosales`j' (1/10)) without considering the IF condition.

            What am I doing wrong? Also, I need to combine the IF condition for all ten segments that I have, so I am unsure how to use the " & | " if they are appropriate here at all.
            Because " & " would imply that all of my segments have to be =="Americas" whereas " | " would imply that only one of the ten segments needs to be =="Americas", no?

            This is what I tried:
            Code:
            egen Americas2=rowtotal(geosales1 geosales2 geosales3 geosales4 geosales5 geosales6 geosales7 geosales8 geosales9 geosales10) if regionsADJseg01=="Americas"
            And the variable "Americas2" I created, which does not equal the values of geosales1 if regionsADJseg01=="Americas", specifically, but simply aggregates all sales and therefore equals total sales.

            How can I integrate the multiple if conditions when using egen newvar = rowtotal() ?

            Here
            Code:
            * Example generated by -dataex-. To install: ssc install dataex
            clear
            input float id int year str6 compID str18 regionsADJseg01 double geosales1 str18 regionsADJseg02 double geosales2 str18 regionsADJseg03 double geosales3 str18 regionsADJseg04 double(geosales4 ttsales) float Americas2
              1 2000 "130042" "Americas"   90176 ""              . ""             . ""        .   90176   90176
              2 2001 "130042" "Americas"  114723 ""              . ""             . ""        .  114723  114723
              3 2002 "130042" "Americas"  133096 ""              . ""             . ""        .  133096  133096
              4 2003 "130042" "Americas"  146569 ""              . ""             . ""        .  146569  146569
              5 2004 "130042" "Americas"  168341 ""              . ""             . ""        .  168341  168341
              6 2005 "130042" "Americas"  207170 ""              . ""             . ""        .  207170  207170
              7 2006 "130042" "Americas"  257485 "ROW"       15878 ""             . ""        .  273363  273363
              8 2007 "130042" "Americas"  290161 "ROW"       28141 ""             . ""        .  318302  318302
              9 2008 "130042" "Americas"  356303 "EMEA"      53126 "EMEA"      5975 "ROW"     0  415404  415404
             10 2009 "130042" "Americas"  316177 "EMEA"      55024 "EMEA"      3260 ""        .  374461  374461
             11 2010 "130042" "Americas"  273900 "EMEA"      51511 "Americas"  3168 "EMEA" 2178  330757  330757
             12 2011 "130042" "Americas"       0 "Americas" 302724 "EMEA"     61704 "EMEA"    0  364428  364428
             13 2012 "130042" "Americas"       0 "Americas" 314076 "EMEA"     67183 "ROW"     0  381259  381259
             14 2013 "130042" "Americas"       . "Americas" 331252 "EMEA"     75234 ""        .  406486  406486
             18 2017 "130042" "Americas"  533549 ""              . ""             . ""        .  533549  533549
             19 2018 "130042" "Americas"  593229 ""              . ""             . ""        .  593229  593229
             19 2018 "130042" "Americas"  593229 ""              . ""             . ""        .  593229  593229
             20 2000 "130062" "Americas" 1333000 ""              . ""             . ""        . 1333000 1333000
             21 2001 "130062" "Americas" 1505691 ""              . ""             . ""        . 1505691 1505691
             22 2002 "130062" "Americas" 1497101 ""              . ""             . ""        . 1497101 1497101
             23 2003 "130062" "Americas" 1711453 ""              . ""             . ""        . 1711453 1711453
             24 2004 "130062" "Americas" 1759613 ""              . ""             . ""        . 1759613 1759613
             25 2005 "130062" "Americas" 2004243 ""              . ""             . ""        . 2004243 2004243
             26 2006 "130062" "Americas" 2206401 ""              . ""             . ""        . 2206401 2206401
             27 2007 "130062" "Americas" 2207141 ""              . ""             . ""        . 2207141 2207141
             28 2008 "130062" "Americas" 2120081 ""              . ""             . ""        . 2120081 2120081
             29 2009 "130062" "Americas" 1702603 ""              . ""             . ""        . 1702603 1702603
             30 2010 "130062" "Americas" 1782857 ""              . ""             . ""        . 1782857 1782857
             31 2011 "130062" "Americas" 1672077 "ROW"       41746 ""             . ""        . 1713823 1713823
             32 2012 "130062" "Americas" 1996142 "ROW"       41525 ""             . ""        . 2037667 2037667
             33 2013 "130062" "Americas" 2113068 "ROW"       42483 ""             . ""        . 2155551 2155551
             34 2014 "130062" "Americas" 2912115 "ROW"       45836 ""             . ""        . 2957951 2957951
             35 2015 "130062" "Americas" 3493462 "ROW"       46108 ""             . ""        . 3539570 3539570
             36 2016 "130062" "Americas" 3761651 "ROW"       57098 ""             . ""        . 3818749 3818749
             37 2017 "130062" "Americas" 3901323 "ROW"       64271 ""             . ""        . 3965594 3965594
             38 2018 "130062" "Americas" 4166339 "ROW"       77926 ""             . ""        . 4244265 4244265
             39 2000 "130086" "Americas" 1063000 "APAC"     115000 "ROW"      49000 ""        . 1226878 1227000
             40 2001 "130086" "Americas" 1012000 "APAC"     113000 "ROW"      40000 ""        . 1164913 1165000
             41 2002 "130086" "Americas"  989000 "APAC"      87000 "ROW"      41000 ""        . 1117431 1117000
             42 2003 "130086" "Americas" 1002852 "APAC"      86000 "ROW"      12000 ""        . 1100852 1100852
             43 2004 "130086" "Americas" 1087000 "APAC"      94000 "ROW"      26000 ""        . 1206996 1207000
             44 2005 "130086" "Americas" 1122000 "APAC"      49000 "ROW"      10000 ""        . 1180700 1181000
             45 2006 "130086" "Americas" 1182000 "APAC"      32000 "ROW"      16000 ""        . 1229807 1230000
             46 2007 "130086" "Americas" 1172000 "APAC"      43000 "ROW"      10000 ""        . 1224654 1225000
             47 2008 "130086" "Americas" 1220000 "ROW"       12000 "APAC"         0 ""        . 1232100 1232000
             48 2009 "130086" "Americas" 1136000 "APAC"      33000 ""             . ""        . 1168567 1169000
             49 2010 "130086" "Americas" 1228000 "APAC"      87000 ""             . ""        . 1315233 1315000
             50 2011 "130086" "Americas" 1378000 "APAC"     111000 "ROW"          0 ""        . 1488642 1489000
             51 2012 "130086" "Americas" 1467000 "APAC"     104000 "ROW"          0 ""        . 1571000 1571000
             52 2013 "130086" "Americas" 1428000 "APAC"     280000 ""             . ""        . 1707822 1708000
             53 2014 "130086" "Americas"  318000 "APAC"     286000 ""             . ""        .  603521  604000
             54 2015 "130086" "Americas"  302074 "APAC"     242800 ""             . ""        .  544874  544874
             55 2016 "130086" "Americas"  507391 "APAC"     280887 ""             . ""        .  788278  788278
             56 2017 "130086" "Americas"  419403 "APAC"     400193 ""             . ""        .  819596  819596
             57 2018 "130086" "Americas"  390396 "APAC"     425742 ""             . ""        .  816138  816138
             58 2000 "130088" "Americas"  759037 ""              . ""             . ""        .  759037  759037
             59 2001 "130088" "Americas"  849799 ""              . ""             . ""        .  849799  849799
             60 2002 "130088" "Americas" 1209990 ""              . ""             . ""        . 1209990 1209990
             61 2003 "130088" "Americas" 1472885 ""              . ""             . ""        . 1472885 1472885
             62 2004 "130088" "Americas" 1738843 ""              . ""             . ""        . 1738843 1738843
             63 2005 "130088" "Americas" 2067979 ""              . ""             . ""        . 2067979 2067979
             64 2006 "130088" "Americas" 2369612 ""              . ""             . ""        . 2369612 2369612
             65 2007 "130088" "Americas" 2703212 ""              . ""             . ""        . 2703212 2703212
             66 2008 "130088" "Americas" 3007949 ""              . ""             . ""        . 3007949 3007949
             67 2009 "130088" "Americas" 3206937 ""              . ""             . ""        . 3206937 3206937
             68 2010 "130088" "Americas" 3638336 ""              . ""             . ""        . 3638336 3638336
             69 2011 "130088" "Americas" 4232743 ""              . ""             . ""        . 4232743 4232743
             70 2012 "130088" "Americas" 4664120 ""              . ""             . ""        . 4664120 4664120
             71 2013 "130088" "Americas" 5164784 ""              . ""             . ""        . 5164784 5164784
             72 2014 "130088" "Americas" 5711715 ""              . ""             . ""        . 5711715 5711715
             73 2015 "130088" "Americas" 6226507 ""              . ""             . ""        . 6226507 6226507
             74 2016 "130088" "Americas" 6779579 ""              . ""             . ""        . 6779579 6779579
             75 2017 "130088" "Americas" 7256382 ""              . ""             . ""        . 7256382 7256382
             76 2018 "130088" "Americas" 7911046 ""              . ""             . ""        . 7911046 7911046
             90 2013 "130104" "Americas"   57737 "EMEA"       9603 "APAC"      8700 ""        .   76041   76040
             91 2014 "130104" "Americas"   49496 "EMEA"      10525 "APAC"      9666 ""        .   69687   69687
             92 2015 "130104" "Americas"   54467 "EMEA"      16310 "APAC"     11855 ""        .   82631   82632
             93 2016 "130104" "Americas"   65172 "EMEA"      24054 "APAC"     10482 ""        .   99707   99708
             94 2017 "130104" "Americas"  103310 "EMEA"      20733 "APAC"      5246 ""        .  129289  129289
             95 2018 "130104" "Americas"  104212 "EMEA"      20296 "APAC"      4525 ""        .  129033  129033
             95 2018 "130104" "Americas"  104212 "EMEA"      20296 "APAC"      4525 ""        .  129033  129033
             95 2018 "130104" "Americas"  104212 "EMEA"      20296 "APAC"      4525 ""        .  129033  129033
             95 2018 "130104" "Americas"  104212 "EMEA"      20296 "APAC"      4525 ""        .  129033  129033
             95 2018 "130104" "Americas"  104212 "EMEA"      20296 "APAC"      4525 ""        .  129033  129033
             95 2018 "130104" "Americas"  104212 "EMEA"      20296 "APAC"      4525 ""        .  129033  129033
             95 2018 "130104" "Americas"  104212 "EMEA"      20296 "APAC"      4525 ""        .  129033  129033
             95 2018 "130104" "Americas"  104212 "EMEA"      20296 "APAC"      4525 ""        .  129033  129033
             95 2018 "130104" "Americas"  104212 "EMEA"      20296 "APAC"      4525 ""        .  129033  129033
             95 2018 "130104" "Americas"  104212 "EMEA"      20296 "APAC"      4525 ""        .  129033  129033
             95 2018 "130104" "Americas"  104212 "EMEA"      20296 "APAC"      4525 ""        .  129033  129033
             95 2018 "130104" "Americas"  104212 "EMEA"      20296 "APAC"      4525 ""        .  129033  129033
             95 2018 "130104" "Americas"  104212 "EMEA"      20296 "APAC"      4525 ""        .  129033  129033
             95 2018 "130104" "Americas"  104212 "EMEA"      20296 "APAC"      4525 ""        .  129033  129033
             96 2000 "130169" "Americas" 6267865 ""              . ""             . ""        . 6267865 6267865
             97 2001 "130169" "Americas" 7694952 ""              . ""             . ""        . 7694952 7694952
             98 2002 "130169" "Americas" 5462704 ""              . ""             . ""        . 5462704 5462704
             99 2003 "130169" "Americas" 6715197 ""              . ""             . ""        . 6715197 6715197
            100 2004 "130169" "Americas" 7497238 ""              . ""             . ""        . 7497238 7497238
            101 2005 "130169" "Americas" 8895994 ""              . ""             . ""        . 8895994 8895994
            102 2006 "130169" "Americas" 9197004 ""              . ""             . ""        . 9197004 9197004
            end

            Comment


            • #7
              Update: I have found a code provided by Nick a few years ago that does the trick for me to aggregate the variables, however, it doesn't work for missing values.

              (1)
              Code:
              gen Americas = 0
              forval j=1/10 { 
                     replace Americas = Americas + geosales`j' if regionsADJseg`j' == "Americas"
                 }
              rowtotal() would be helpful here, but I cannot seem to combine the codes so that I receive the same results as (1) but not treating some missing values as an overall missing value for "Americas", as can be seen in line 14.

              Should I I then replace the missing values with 0 for any given region (Americas, APAC, EMEA, ROW) if it is not already reported as 0?
              But there should be a way to account for the missing values . no?


              Code:
              * Example generated by -dataex-. To install: ssc install dataex
              clear
              input float id int year str6 compID str18 regionsADJseg1 double geosales1 str18 regionsADJseg2 double geosales2 str18 regionsADJseg3 double geosales3 str18 regionsADJseg4 double(geosales4 ttsales) float Americas
                1 2000 "130042" "Americas"   90176 ""              . ""             . ""        .   90176   90176
                2 2001 "130042" "Americas"  114723 ""              . ""             . ""        .  114723  114723
                3 2002 "130042" "Americas"  133096 ""              . ""             . ""        .  133096  133096
                4 2003 "130042" "Americas"  146569 ""              . ""             . ""        .  146569  146569
                5 2004 "130042" "Americas"  168341 ""              . ""             . ""        .  168341  168341
                6 2005 "130042" "Americas"  207170 ""              . ""             . ""        .  207170  207170
                7 2006 "130042" "Americas"  257485 "ROW"       15878 ""             . ""        .  273363  257485
                8 2007 "130042" "Americas"  290161 "ROW"       28141 ""             . ""        .  318302  290161
                9 2008 "130042" "Americas"  356303 "EMEA"      53126 "EMEA"      5975 "ROW"     0  415404  356303
               10 2009 "130042" "Americas"  316177 "EMEA"      55024 "EMEA"      3260 ""        .  374461  316177
               11 2010 "130042" "Americas"  273900 "EMEA"      51511 "Americas"  3168 "EMEA" 2178  330757  277068
               12 2011 "130042" "Americas"       0 "Americas" 302724 "EMEA"     61704 "EMEA"    0  364428  302724
               13 2012 "130042" "Americas"       0 "Americas" 314076 "EMEA"     67183 "ROW"     0  381259  314076
               14 2013 "130042" "Americas"       . "Americas" 331252 "EMEA"     75234 ""        .  406486       .
               18 2017 "130042" "Americas"  533549 ""              . ""             . ""        .  533549  533549
               19 2018 "130042" "Americas"  593229 ""              . ""             . ""        .  593229  593229
               19 2018 "130042" "Americas"  593229 ""              . ""             . ""        .  593229  593229
               20 2000 "130062" "Americas" 1333000 ""              . ""             . ""        . 1333000 1333000
               21 2001 "130062" "Americas" 1505691 ""              . ""             . ""        . 1505691 1505691
               22 2002 "130062" "Americas" 1497101 ""              . ""             . ""        . 1497101 1497101
               23 2003 "130062" "Americas" 1711453 ""              . ""             . ""        . 1711453 1711453
               24 2004 "130062" "Americas" 1759613 ""              . ""             . ""        . 1759613 1759613
               25 2005 "130062" "Americas" 2004243 ""              . ""             . ""        . 2004243 2004243
               26 2006 "130062" "Americas" 2206401 ""              . ""             . ""        . 2206401 2206401
               27 2007 "130062" "Americas" 2207141 ""              . ""             . ""        . 2207141 2207141
               28 2008 "130062" "Americas" 2120081 ""              . ""             . ""        . 2120081 2120081
               29 2009 "130062" "Americas" 1702603 ""              . ""             . ""        . 1702603 1702603
               30 2010 "130062" "Americas" 1782857 ""              . ""             . ""        . 1782857 1782857
              
              end
              Also, is there a way to do this with the collapse function when I need to aggregate the SIC segment sales, without specifically stating every single SIC code?

              Code:
               * Example generated by -dataex-. To install: ssc install dataex clear input float id int year str6 CompanyID str4 sicseg1 double sicsales1 str4 sicseg2 double sicsales2 str4 sicseg3 double sicsales3 str4 sicseg4 double sicsales4 str4 sicseg5 double sicsales5   1 2000 "130042" "7519"    76084 "7519"   13406 "7519"      686 "."         . "."          .   2 2001 "130042" "7519"    99684 "7519"   14519 "7519"      520 "."         . "."          .   3 2002 "130042" "7519"   116169 "7519"   16007 "7519"      920 "."         . "."          .   4 2003 "130042" "7519"   128482 "7519"   17249 "7519"      838 "."         . "."          .   5 2004 "130042" "7519"   149856 "7519"   17919 "7519"      566 "."         . "."          .   6 2005 "130042" "7519"    17499 "7519"    1093 "7519"   188578 "."         . "."          .   7 2006 "130042" "7519"   245105 "7519"   26824 "7519"     1434 "."         . "."          .   8 2007 "130042" "7519"   284638 "7519"   31644 "9999"     2020 "."         . "."          .   9 2008 "130042" "7519"   371560 "7519"   41267 "9999"     2577 "."         . "."          .  10 2009 "130042" "7519"   284638 "7519"   31644 "9999"     2020 "."         . "."          .  11 2010 "130042" "7519"   295034 "7519"   33156 "9999"     2567 "."         . "."          .  12 2011 "130042" "7519"   318863 "7519"   42842 "9999"     2723 "."         . "."          .  13 2012 "130042" "7519"   340797 "7519"   38281 "9999"     2181 "."         . "."          .  14 2013 "130042" "4225"   366286 "3499"   38050 "3443"     2150 "."         . "."          .  15 2014 "130042" "4225"   410362 "4225"   31585 "3443"     3527 "."         . "."          .  16 2015 "130042" "4225"   494715 "4225"   29953 "3443"     6109 "."         . "."          .  17 2016 "130042" "4225"   329603 "3443"   98061 "3499"    80958 "."         . "."          .  18 2017 "130042" "4225"   349694 "3443"   98095 "4225"    85760 "."         . "."          .  19 2018 "130042" "4225"   387025 "3443"  116205 "4225"    89999 "."         . "."          .  20 2000 "130062" "1423"  1202581 "3297"  130419 "."           . "."         . "."          .  21 2001 "130062" "1423"  1406179 "1422"   99512 "."           . "."         . "."          .  22 2002 "130062" "1423"  1422673 "3297"   74428 "9999"        0 "."         . "."          .  23 2003 "130062" "1423"  1617040 "3297"   94413 "9999"        0 "."         . "."          .  24 2004 "130062" "1423"  1641690 "3297"  117923 "9999"        . "."         . "."          .  25 2005 "130062" "1423"  1859685 "3297"  144558 "9999"        . "."         . "."          .  26 2006 "130062" "1423"  2039840 "3297"  166561 "9999"        0 "."         . "."          .  27 2007 "130062" "1423"  2035469 "3297"  171672 "9999"        0 "."         . "."          .  28 2008 "130062" "1423"  1933248 "3297"  186833 "9999"        0 "."         . "."          .  29 2009 "130062" "1423"  1542692 "3297"  159911 "9999"        0 "."         . "."          .  30 2010 "130062" "1423"  1589677 "3297"  193180 "9999"        0 "."         . "."          .  31 2011 "130062" "1423"  1494700 "3297"  219123 "9999"        0 "."         . "."          .  32 2012 "130062" "1423"  1816954 "3297"  220713 "9999"        0 "."         . "."          .  33 2013 "130062" "1422"   924691 "3297"  740703 "3297"   245340 "2899" 244817 "9999"       .  34 2014 "130062" "1422"  1356283 "3297"  848855 "3297"   274352 "2899" 256702 "3241"  221759  35 2015 "130062" "1422"  1675021 "3297"  926251 "3241"   387947 "3297" 304472 "2899"  245879  36 2016 "130062" "1422"  1970165 "3297" 1017098 "3241"   375814 "3297" 321078 "2899"  257058  37 2017 "130062" "1422"  2279723 "3297" 1053325 "3297"   362555 "2899" 269991 "1442"       .  38 2018 "130062" "1422"  2309924 "3297" 1223236 "3297"   423382 "2899" 287723 "9999"       .  39 2000 "130086" "2611"   576000 "0811"  401000 "0811"   280000 "."    -30000 "."          .  40 2001 "130086" "2823"   547000 "2499"  358000 "2411"   281000 "."    -21000 "."          .  41 2002 "130086" "2823"   526000 "2499"  247000 "2411"   137000 "9999"  -9000 "9999"  216000  42 2003 "130086" "2823"   534000 "2499"  259000 "2411"   138000 "9999"  -2000 "9999"  172000  43 2004 "130086" "2823"   581996 "2499"  282000 "2411"   170000 "9999"  -1000 "9999"  174000  44 2005 "130086" "2823"   628000 "2499"  294000 "2411"   137000 "9999"  -1000 "9999"  123000  45 2006 "130086" "2823"   672000 "2499"  207000 "2499"   112000 "2411" 111000 "9999"       0  46 2007 "130086" "2823"   722000 "2499"  222000 "2499"   116000 "2411"  88000 "9999"       0  47 2008 "130086" "2823"   798000 "2499"  190000 "2499"   127000 "2411"  86000 "9999"   31000  48 2009 "130086" "2823"   839000 "2499"  159000 "2499"   101000 "2411"  51000 "9999"       0  49 2010 "130086" "2823"   881000 "2499"  177000 "2499"    96000 "2411"  68000 "9999"       0  50 2011 "130086" "2823"  1020000 "2499"  215000 "2499"    71000 "2411"  68000 "9999"       0 end

              Comment


              • #8
                Your thread flips back and forth between different datasets without any explanation of why you are doing that. I am just going to show you some technique in the hope that it helps, but I am not confident that I understand what you want, reason enough for a generic suggestion in my previous.

                Note a serious problem in your last data example: duplicated observations!

                Code:
                * Example generated by -dataex-. To install: ssc install dataex
                clear
                input float id int year str6 compID str18 regionsADJseg01 double geosales1 str18 regionsADJseg02 double geosales2 str18 regionsADJseg03 double geosales3 str18 regionsADJseg04 double geosales4
                  1 2000 "130042" "Americas"   90176 ""              . ""             . ""        .
                  2 2001 "130042" "Americas"  114723 ""              . ""             . ""        .
                  3 2002 "130042" "Americas"  133096 ""              . ""             . ""        .
                  4 2003 "130042" "Americas"  146569 ""              . ""             . ""        .
                  5 2004 "130042" "Americas"  168341 ""              . ""             . ""        .
                  6 2005 "130042" "Americas"  207170 ""              . ""             . ""        .
                  7 2006 "130042" "Americas"  257485 "ROW"       15878 ""             . ""        .
                  8 2007 "130042" "Americas"  290161 "ROW"       28141 ""             . ""        .
                  9 2008 "130042" "Americas"  356303 "EMEA"      53126 "EMEA"      5975 "ROW"     0
                 10 2009 "130042" "Americas"  316177 "EMEA"      55024 "EMEA"      3260 ""        .
                 11 2010 "130042" "Americas"  273900 "EMEA"      51511 "Americas"  3168 "EMEA" 2178
                 12 2011 "130042" "Americas"       0 "Americas" 302724 "EMEA"     61704 "EMEA"    0
                 13 2012 "130042" "Americas"       0 "Americas" 314076 "EMEA"     67183 "ROW"     0
                 14 2013 "130042" "Americas"       . "Americas" 331252 "EMEA"     75234 ""        .
                 18 2017 "130042" "Americas"  533549 ""              . ""             . ""        .
                 19 2018 "130042" "Americas"  593229 ""              . ""             . ""        .
                 19 2018 "130042" "Americas"  593229 ""              . ""             . ""        .
                 20 2000 "130062" "Americas" 1333000 ""              . ""             . ""        .
                 21 2001 "130062" "Americas" 1505691 ""              . ""             . ""        .
                 22 2002 "130062" "Americas" 1497101 ""              . ""             . ""        .
                 23 2003 "130062" "Americas" 1711453 ""              . ""             . ""        .
                 24 2004 "130062" "Americas" 1759613 ""              . ""             . ""        .
                 25 2005 "130062" "Americas" 2004243 ""              . ""             . ""        .
                 26 2006 "130062" "Americas" 2206401 ""              . ""             . ""        .
                 27 2007 "130062" "Americas" 2207141 ""              . ""             . ""        .
                 28 2008 "130062" "Americas" 2120081 ""              . ""             . ""        .
                 29 2009 "130062" "Americas" 1702603 ""              . ""             . ""        .
                 30 2010 "130062" "Americas" 1782857 ""              . ""             . ""        .
                 31 2011 "130062" "Americas" 1672077 "ROW"       41746 ""             . ""        .
                 32 2012 "130062" "Americas" 1996142 "ROW"       41525 ""             . ""        .
                 33 2013 "130062" "Americas" 2113068 "ROW"       42483 ""             . ""        .
                 34 2014 "130062" "Americas" 2912115 "ROW"       45836 ""             . ""        .
                 35 2015 "130062" "Americas" 3493462 "ROW"       46108 ""             . ""        .
                 36 2016 "130062" "Americas" 3761651 "ROW"       57098 ""             . ""        .
                 37 2017 "130062" "Americas" 3901323 "ROW"       64271 ""             . ""        .
                 38 2018 "130062" "Americas" 4166339 "ROW"       77926 ""             . ""        .
                 39 2000 "130086" "Americas" 1063000 "APAC"     115000 "ROW"      49000 ""        .
                 40 2001 "130086" "Americas" 1012000 "APAC"     113000 "ROW"      40000 ""        .
                 41 2002 "130086" "Americas"  989000 "APAC"      87000 "ROW"      41000 ""        .
                 42 2003 "130086" "Americas" 1002852 "APAC"      86000 "ROW"      12000 ""        .
                 43 2004 "130086" "Americas" 1087000 "APAC"      94000 "ROW"      26000 ""        .
                 44 2005 "130086" "Americas" 1122000 "APAC"      49000 "ROW"      10000 ""        .
                 45 2006 "130086" "Americas" 1182000 "APAC"      32000 "ROW"      16000 ""        .
                 46 2007 "130086" "Americas" 1172000 "APAC"      43000 "ROW"      10000 ""        .
                 47 2008 "130086" "Americas" 1220000 "ROW"       12000 "APAC"         0 ""        .
                 48 2009 "130086" "Americas" 1136000 "APAC"      33000 ""             . ""        .
                 49 2010 "130086" "Americas" 1228000 "APAC"      87000 ""             . ""        .
                 50 2011 "130086" "Americas" 1378000 "APAC"     111000 "ROW"          0 ""        .
                 51 2012 "130086" "Americas" 1467000 "APAC"     104000 "ROW"          0 ""        .
                 52 2013 "130086" "Americas" 1428000 "APAC"     280000 ""             . ""        .
                 53 2014 "130086" "Americas"  318000 "APAC"     286000 ""             . ""        .
                 54 2015 "130086" "Americas"  302074 "APAC"     242800 ""             . ""        .
                 55 2016 "130086" "Americas"  507391 "APAC"     280887 ""             . ""        .
                 56 2017 "130086" "Americas"  419403 "APAC"     400193 ""             . ""        .
                 57 2018 "130086" "Americas"  390396 "APAC"     425742 ""             . ""        .
                 58 2000 "130088" "Americas"  759037 ""              . ""             . ""        .
                 59 2001 "130088" "Americas"  849799 ""              . ""             . ""        .
                 60 2002 "130088" "Americas" 1209990 ""              . ""             . ""        .
                 61 2003 "130088" "Americas" 1472885 ""              . ""             . ""        .
                 62 2004 "130088" "Americas" 1738843 ""              . ""             . ""        .
                 63 2005 "130088" "Americas" 2067979 ""              . ""             . ""        .
                 64 2006 "130088" "Americas" 2369612 ""              . ""             . ""        .
                 65 2007 "130088" "Americas" 2703212 ""              . ""             . ""        .
                 66 2008 "130088" "Americas" 3007949 ""              . ""             . ""        .
                 67 2009 "130088" "Americas" 3206937 ""              . ""             . ""        .
                 68 2010 "130088" "Americas" 3638336 ""              . ""             . ""        .
                 69 2011 "130088" "Americas" 4232743 ""              . ""             . ""        .
                 70 2012 "130088" "Americas" 4664120 ""              . ""             . ""        .
                 71 2013 "130088" "Americas" 5164784 ""              . ""             . ""        .
                 72 2014 "130088" "Americas" 5711715 ""              . ""             . ""        .
                 73 2015 "130088" "Americas" 6226507 ""              . ""             . ""        .
                 74 2016 "130088" "Americas" 6779579 ""              . ""             . ""        .
                 75 2017 "130088" "Americas" 7256382 ""              . ""             . ""        .
                 76 2018 "130088" "Americas" 7911046 ""              . ""             . ""        .
                 90 2013 "130104" "Americas"   57737 "EMEA"       9603 "APAC"      8700 ""        .
                 91 2014 "130104" "Americas"   49496 "EMEA"      10525 "APAC"      9666 ""        .
                 92 2015 "130104" "Americas"   54467 "EMEA"      16310 "APAC"     11855 ""        .
                 93 2016 "130104" "Americas"   65172 "EMEA"      24054 "APAC"     10482 ""        .
                 94 2017 "130104" "Americas"  103310 "EMEA"      20733 "APAC"      5246 ""        .
                 95 2018 "130104" "Americas"  104212 "EMEA"      20296 "APAC"      4525 ""        .
                 95 2018 "130104" "Americas"  104212 "EMEA"      20296 "APAC"      4525 ""        .
                 95 2018 "130104" "Americas"  104212 "EMEA"      20296 "APAC"      4525 ""        .
                 95 2018 "130104" "Americas"  104212 "EMEA"      20296 "APAC"      4525 ""        .
                 95 2018 "130104" "Americas"  104212 "EMEA"      20296 "APAC"      4525 ""        .
                 95 2018 "130104" "Americas"  104212 "EMEA"      20296 "APAC"      4525 ""        .
                 95 2018 "130104" "Americas"  104212 "EMEA"      20296 "APAC"      4525 ""        .
                 95 2018 "130104" "Americas"  104212 "EMEA"      20296 "APAC"      4525 ""        .
                 95 2018 "130104" "Americas"  104212 "EMEA"      20296 "APAC"      4525 ""        .
                 95 2018 "130104" "Americas"  104212 "EMEA"      20296 "APAC"      4525 ""        .
                 95 2018 "130104" "Americas"  104212 "EMEA"      20296 "APAC"      4525 ""        .
                 95 2018 "130104" "Americas"  104212 "EMEA"      20296 "APAC"      4525 ""        .
                 95 2018 "130104" "Americas"  104212 "EMEA"      20296 "APAC"      4525 ""        .
                 95 2018 "130104" "Americas"  104212 "EMEA"      20296 "APAC"      4525 ""        .
                 96 2000 "130169" "Americas" 6267865 ""              . ""             . ""        .
                 97 2001 "130169" "Americas" 7694952 ""              . ""             . ""        .
                 98 2002 "130169" "Americas" 5462704 ""              . ""             . ""        .
                 99 2003 "130169" "Americas" 6715197 ""              . ""             . ""        .
                100 2004 "130169" "Americas" 7497238 ""              . ""             . ""        .
                101 2005 "130169" "Americas" 8895994 ""              . ""             . ""        .
                102 2006 "130169" "Americas" 9197004 ""              . ""             . ""        .
                end
                
                . rename (regionsADJseg*) regionsADJseg#, addnumber 
                
                . duplicates drop *, force 
                
                Duplicates in terms of id year compID regionsADJseg1 geosales1 regionsADJseg2 geosales2 regionsADJseg3 geosales3 regionsADJseg4 geosales4
                
                (14 observations deleted)
                
                . reshape long geosales regionsADJseg, i(id year compID) j(which) 
                (note: j = 1 2 3 4)
                
                Data                               wide   ->   long
                -----------------------------------------------------------------------------
                Number of obs.                       86   ->     344
                Number of variables                  11   ->       6
                j variable (4 values)                     ->   which
                xij variables:
                      geosales1 geosales2 ... geosales4   ->   geosales
                regionsADJseg1 regionsADJseg2 ... regionsADJseg4->regionsADJseg
                -----------------------------------------------------------------------------
                
                . drop if missing(geosales) 
                (191 observations deleted)
                
                . collapse (sum) geosales , by(compID year regionsADJseg)  
                
                . list
                
                     +-------------------------------------+
                     | year   compID   region~g   geosales |
                     |-------------------------------------|
                  1. | 2000   130042   Americas      90176 |
                  2. | 2001   130042   Americas     114723 |
                  3. | 2002   130042   Americas     133096 |
                  4. | 2003   130042   Americas     146569 |
                  5. | 2004   130042   Americas     168341 |
                     |-------------------------------------|
                  6. | 2005   130042   Americas     207170 |
                  7. | 2006   130042   Americas     257485 |
                  8. | 2006   130042        ROW      15878 |
                  9. | 2007   130042   Americas     290161 |
                 10. | 2007   130042        ROW      28141 |
                     |-------------------------------------|
                 11. | 2008   130042   Americas     356303 |
                 12. | 2008   130042       EMEA      59101 |
                 13. | 2008   130042        ROW          0 |
                 14. | 2009   130042   Americas     316177 |
                 15. | 2009   130042       EMEA      58284 |
                     |-------------------------------------|
                 16. | 2010   130042   Americas     277068 |
                 17. | 2010   130042       EMEA      53689 |
                 18. | 2011   130042   Americas     302724 |
                 19. | 2011   130042       EMEA      61704 |
                 20. | 2012   130042   Americas     314076 |
                     |-------------------------------------|
                 21. | 2012   130042       EMEA      67183 |
                 22. | 2012   130042        ROW          0 |
                 23. | 2013   130042   Americas     331252 |
                 24. | 2013   130042       EMEA      75234 |
                 25. | 2017   130042   Americas     533549 |
                     |-------------------------------------|
                 26. | 2018   130042   Americas     593229 |
                 27. | 2000   130062   Americas    1333000 |
                 28. | 2001   130062   Americas    1505691 |
                 29. | 2002   130062   Americas    1497101 |
                 30. | 2003   130062   Americas    1711453 |
                     |-------------------------------------|
                 31. | 2004   130062   Americas    1759613 |
                 32. | 2005   130062   Americas    2004243 |
                 33. | 2006   130062   Americas    2206401 |
                 34. | 2007   130062   Americas    2207141 |
                 35. | 2008   130062   Americas    2120081 |
                     |-------------------------------------|
                 36. | 2009   130062   Americas    1702603 |
                 37. | 2010   130062   Americas    1782857 |
                 38. | 2011   130062   Americas    1672077 |
                 39. | 2011   130062        ROW      41746 |
                 40. | 2012   130062   Americas    1996142 |
                     |-------------------------------------|
                 41. | 2012   130062        ROW      41525 |
                 42. | 2013   130062   Americas    2113068 |
                 43. | 2013   130062        ROW      42483 |
                 44. | 2014   130062   Americas    2912115 |
                 45. | 2014   130062        ROW      45836 |
                     |-------------------------------------|
                 46. | 2015   130062   Americas    3493462 |
                 47. | 2015   130062        ROW      46108 |
                 48. | 2016   130062   Americas    3761651 |
                 49. | 2016   130062        ROW      57098 |
                 50. | 2017   130062   Americas    3901323 |
                     |-------------------------------------|
                 51. | 2017   130062        ROW      64271 |
                 52. | 2018   130062   Americas    4166339 |
                 53. | 2018   130062        ROW      77926 |
                 54. | 2000   130086       APAC     115000 |
                 55. | 2000   130086   Americas    1063000 |
                     |-------------------------------------|
                 56. | 2000   130086        ROW      49000 |
                 57. | 2001   130086       APAC     113000 |
                 58. | 2001   130086   Americas    1012000 |
                 59. | 2001   130086        ROW      40000 |
                 60. | 2002   130086       APAC      87000 |
                     |-------------------------------------|
                 61. | 2002   130086   Americas     989000 |
                 62. | 2002   130086        ROW      41000 |
                 63. | 2003   130086       APAC      86000 |
                 64. | 2003   130086   Americas    1002852 |
                 65. | 2003   130086        ROW      12000 |
                     |-------------------------------------|
                 66. | 2004   130086       APAC      94000 |
                 67. | 2004   130086   Americas    1087000 |
                 68. | 2004   130086        ROW      26000 |
                 69. | 2005   130086       APAC      49000 |
                 70. | 2005   130086   Americas    1122000 |
                     |-------------------------------------|
                 71. | 2005   130086        ROW      10000 |
                 72. | 2006   130086       APAC      32000 |
                 73. | 2006   130086   Americas    1182000 |
                 74. | 2006   130086        ROW      16000 |
                 75. | 2007   130086       APAC      43000 |
                     |-------------------------------------|
                 76. | 2007   130086   Americas    1172000 |
                 77. | 2007   130086        ROW      10000 |
                 78. | 2008   130086       APAC          0 |
                 79. | 2008   130086   Americas    1220000 |
                 80. | 2008   130086        ROW      12000 |
                     |-------------------------------------|
                 81. | 2009   130086       APAC      33000 |
                 82. | 2009   130086   Americas    1136000 |
                 83. | 2010   130086       APAC      87000 |
                 84. | 2010   130086   Americas    1228000 |
                 85. | 2011   130086       APAC     111000 |
                     |-------------------------------------|
                 86. | 2011   130086   Americas    1378000 |
                 87. | 2011   130086        ROW          0 |
                 88. | 2012   130086       APAC     104000 |
                 89. | 2012   130086   Americas    1467000 |
                 90. | 2012   130086        ROW          0 |
                     |-------------------------------------|
                 91. | 2013   130086       APAC     280000 |
                 92. | 2013   130086   Americas    1428000 |
                 93. | 2014   130086       APAC     286000 |
                 94. | 2014   130086   Americas     318000 |
                 95. | 2015   130086       APAC     242800 |
                     |-------------------------------------|
                 96. | 2015   130086   Americas     302074 |
                 97. | 2016   130086       APAC     280887 |
                 98. | 2016   130086   Americas     507391 |
                 99. | 2017   130086       APAC     400193 |
                100. | 2017   130086   Americas     419403 |
                     |-------------------------------------|
                101. | 2018   130086       APAC     425742 |
                102. | 2018   130086   Americas     390396 |
                103. | 2000   130088   Americas     759037 |
                104. | 2001   130088   Americas     849799 |
                105. | 2002   130088   Americas    1209990 |
                     |-------------------------------------|
                106. | 2003   130088   Americas    1472885 |
                107. | 2004   130088   Americas    1738843 |
                108. | 2005   130088   Americas    2067979 |
                109. | 2006   130088   Americas    2369612 |
                110. | 2007   130088   Americas    2703212 |
                     |-------------------------------------|
                111. | 2008   130088   Americas    3007949 |
                112. | 2009   130088   Americas    3206937 |
                113. | 2010   130088   Americas    3638336 |
                114. | 2011   130088   Americas    4232743 |
                115. | 2012   130088   Americas    4664120 |
                     |-------------------------------------|
                116. | 2013   130088   Americas    5164784 |
                117. | 2014   130088   Americas    5711715 |
                118. | 2015   130088   Americas    6226507 |
                119. | 2016   130088   Americas    6779579 |
                120. | 2017   130088   Americas    7256382 |
                     |-------------------------------------|
                121. | 2018   130088   Americas    7911046 |
                122. | 2013   130104       APAC       8700 |
                123. | 2013   130104   Americas      57737 |
                124. | 2013   130104       EMEA       9603 |
                125. | 2014   130104       APAC       9666 |
                     |-------------------------------------|
                126. | 2014   130104   Americas      49496 |
                127. | 2014   130104       EMEA      10525 |
                128. | 2015   130104       APAC      11855 |
                129. | 2015   130104   Americas      54467 |
                130. | 2015   130104       EMEA      16310 |
                     |-------------------------------------|
                131. | 2016   130104       APAC      10482 |
                132. | 2016   130104   Americas      65172 |
                133. | 2016   130104       EMEA      24054 |
                134. | 2017   130104       APAC       5246 |
                135. | 2017   130104   Americas     103310 |
                     |-------------------------------------|
                136. | 2017   130104       EMEA      20733 |
                137. | 2018   130104       APAC       4525 |
                138. | 2018   130104   Americas     104212 |
                139. | 2018   130104       EMEA      20296 |
                140. | 2000   130169   Americas    6267865 |
                     |-------------------------------------|
                141. | 2001   130169   Americas    7694952 |
                142. | 2002   130169   Americas    5462704 |
                143. | 2003   130169   Americas    6715197 |
                144. | 2004   130169   Americas    7497238 |
                145. | 2005   130169   Americas    8895994 |
                     |-------------------------------------|
                146. | 2006   130169   Americas    9197004 |
                     +-------------------------------------+
                .

                Naturally you can reshape wide if that is what you need.

                Comment


                • #9
                  Dear Nick,

                  thank you for the instructions and I apologise for the back and forth.
                  I did not want to create another post because my issues are quite similar, just that I am preparing two different datasets and need the same logic of aggregating values for segments that are mentioned more than once in the database.

                  I managed to aggregate the global regions for my geographic sales segments.
                  And indeed, I need the data in wide format because from there, I then calculate an entropy index of geographic dispersion.
                  Your instructions helped me to do it, thank you! The first issue is successfully solved.
                  Here is a short example of the output that I wanted to achieve:
                  With:
                  rq1=Americas
                  rg2=APAC
                  rg3=EMEA
                  rg4=ROW
                  rg5=Undetermined

                  Code:
                  * Example generated by -dataex-. To install: ssc install dataex
                  clear
                  input float id int year str6 compID double ttsales float(rg1 rg2 rg3 rg4 rg5 tot5regions) double(term entropy_geo)
                    1 2000 "130042"    90176    90176       0       0      0      0    90176                  0                  0
                    2 2001 "130042"   114723   114723       0       0      0      0   114723                  0                  0
                    3 2002 "130042"   133096   133096       0       0      0      0   133096                  0                  0
                    4 2003 "130042"   146569   146569       0       0      0      0   146569                  0                  0
                    5 2004 "130042"   168341   168341       0       0      0      0   168341                  0                  0
                    6 2005 "130042"   207170   207170       0       0      0      0   207170                  0                  0
                    7 2006 "130042"   273363   257485       0       0  15878      0   273363                  0  .2216625383250351
                    8 2007 "130042"   318302   290161       0       0  28141      0   318302                  0 .29884304176067056
                    9 2008 "130042"   415404   356303       0   59101      0      0   415404                  0  .4090692603449909
                   10 2009 "130042"   374461   316177       0   58284      0      0   374461                  0 .43238180778723223
                   11 2010 "130042"   330757   277068       0   53689      0      0   330757                  0 .44349962113901087
                   12 2011 "130042"   364428   302724       0   61704      0      0   364428                  0 .45480220467183474
                   13 2012 "130042"   381259   314076       0   67183      0      0   381259                  0 .46560306877015495
                   14 2013 "130042"   406486   331252       0   75234      0      0   406486                  0  .4790154736481598
                   15 2014 "130042"   445474   352344       0   86698      0   6432   445474 .01443855309176293  .5652243425594932
                   16 2015 "130042"   530777   335394       0   88121      0 107262   530777 .20208486803309111  .9113241784411525
                   18 2017 "130042"   533549   533549       0       0      0      0   533549                  0                  0
                   19 2018 "130042"   593229   593229       0       0      0      0   593229                  0                  0
                   20 2000 "130062"  1333000  1333000       0       0      0      0  1333000                  0                  0
                  end

                  Now, I will have to follow a different logic for my second data set on industry dispersion, meaning, how many different industries based on 4-digit SIC classification the companies are active in.
                  For that, I cannot follow the exact same procedure as in my first request regarding geographic regions, because I had determined the five segments and thus it was easier to write the code for each segment, as I only had to create five new variables. Based on these five regions, I was able to calculate an entropy index of global diversification.

                  However, it's different for the industry diversification. I followed your instructions up until the collapse command, and it works fine when the data are in long format.
                  To calculate the entropy, I would need the data in wide format again.

                  So, I believe, I would need two steps:
                  (1) generate a new variable for "sicsegments" that hows me if one company had reported sales in more than one industry based on the 4-digit SIC codes, I can see that based on the number of duplicates in "id". (please note, I have manually created the variable "sicseg" in my data example to show the idea that I want to achieve with my code.

                  (2) reshape wide again. Ideally, with the number of segments equalling the number of maximum duplicates. From here, to calculate an entropy index of industry diversification.

                  So my question is:
                  - how does a code have to look like that gives me the results of my manually created variable "sicseg"?
                  - if my thinking is wrong, is there another way to collapse the sales that are reported within the same industry, and work in wide format so that I can calculate the entropy index afterwards?

                  I will attach a data example from my current dataset.
                  Code:
                  * Example generated by -dataex-. To install: ssc install dataex
                  clear
                  input float(id sicseg) int year str6 CompanyID double ttsales str4 siccode double sicsales
                   1 1 2000 "130042"   90176 "7519"   90176
                   2 1 2001 "130042"  114723 "7519"  114723
                   3 1 2002 "130042"  133096 "7519"  133096
                   4 1 2003 "130042"  146569 "7519"  146569
                   5 1 2004 "130042"  168341 "7519"  168341
                   6 1 2005 "130042"  207170 "7519"  207170
                   7 1 2006 "130042"  273363 "7519"  273363
                   8 1 2007 "130042"  318302 "7519"  316282
                   8 2 2007 "130042"  318302 "9999"    2020
                   9 1 2008 "130042"  415404 "7519"  412827
                   9 2 2008 "130042"  415404 "9999"    2577
                  10 1 2009 "130042"  374461 "7519"  316282
                  10 2 2009 "130042"  374461 "9999"    2020
                  11 1 2010 "130042"  330757 "7519"  328190
                  11 2 2010 "130042"  330757 "9999"    2567
                  12 3 2011 "130042"  364428 "7519"  361705
                  12 4 2011 "130042"  364428 "9999"    2723
                  13 1 2012 "130042"  381259 "7519"  379078
                  13 2 2012 "130042"  381259 "9999"    2181
                  14 1 2013 "130042"  406486 "3443"    2150
                  14 2 2013 "130042"  406486 "3499"   38050
                  14 3 2013 "130042"  406486 "4225"  366286
                  15 1 2014 "130042"  445474 "3443"    3527
                  15 2 2014 "130042"  445474 "4225"  441947
                  16 1 2015 "130042"  530777 "3443"    6109
                  16 2 2015 "130042"  530777 "4225"  524668
                  17 1 2016 "130042"  508622 "3443"   98061
                  17 2 2016 "130042"  508622 "3499"   80958
                  17 3 2016 "130042"  508622 "4225"  329603
                  18 1 2017 "130042"  533549 "3443"   98095
                  18 2 2017 "130042"  533549 "4225"  435454
                  19 1 2018 "130042"  593229 "3443"  116205
                  19 2 2018 "130042"  593229 "4225"  477024
                  20 1 2000 "130062" 1333000 "1423" 1202581
                  20 2 2000 "130062" 1333000 "3297"  130419
                  21 1 2001 "130062" 1505691 "1422"   99512
                  21 2 2001 "130062" 1505691 "1423" 1406179
                  22 1 2002 "130062" 1497101 "1423" 1422673
                  22 2 2002 "130062" 1497101 "3297"   74428
                  22 3 2002 "130062" 1497101 "9999"       0
                  23 1 2003 "130062" 1711453 "1423" 1617040
                  23 2 2003 "130062" 1711453 "3297"   94413
                  23 3 2003 "130062" 1711453 "9999"       0
                  24 1 2004 "130062" 1759613 "1423" 1641690
                  24 2 2004 "130062" 1759613 "3297"  117923
                  25 1 2005 "130062" 2004243 "1423" 1859685
                  25 2 2005 "130062" 2004243 "3297"  144558
                  26 1 2006 "130062" 2206401 "1423" 2039840
                  26 2 2006 "130062" 2206401 "3297"  166561
                  26 3 2006 "130062" 2206401 "9999"       0
                  27 1 2007 "130062" 2207141 "1423" 2035469
                  27 2 2007 "130062" 2207141 "3297"  171672
                  27 3 2007 "130062" 2207141 "9999"       0
                  28 1 2008 "130062" 2120081 "1423" 1933248
                  28 2 2008 "130062" 2120081 "3297"  186833
                  28 3 2008 "130062" 2120081 "9999"       0
                  29 1 2009 "130062" 1702603 "1423" 1542692
                  29 2 2009 "130062" 1702603 "3297"  159911
                  29 3 2009 "130062" 1702603 "9999"       0
                  30 1 2010 "130062" 1782857 "1423" 1589677
                  30 2 2010 "130062" 1782857 "3297"  193180
                  30 3 2010 "130062" 1782857 "9999"       0
                  31 1 2011 "130062" 1713823 "1423" 1494700
                  31 2 2011 "130062" 1713823 "3297"  219123
                  31 3 2011 "130062" 1713823 "9999"       0
                  32 1 2012 "130062" 2037667 "1423" 1816954
                  32 2 2012 "130062" 2037667 "3297"  220713
                  32 3 2012 "130062" 2037667 "9999"       0
                  33 1 2013 "130062" 2155551 "1422"  924691
                  33 2 2013 "130062" 2155551 "2899"  244817
                  33 3 2013 "130062" 2155551 "3297"  986043
                  34 1 2014 "130062" 2957951 "1422" 1356283
                  34 2 2014 "130062" 2957951 "2899"  256702
                  34 3 2014 "130062" 2957951 "3241"  221759
                  34 4 2014 "130062" 2957951 "3297" 1123207
                  35 1 2015 "130062" 3539570 "1422" 1675021
                  35 2 2015 "130062" 3539570 "2899"  245879
                  35 3 2015 "130062" 3539570 "3241"  387947
                  35 4 2015 "130062" 3539570 "3297" 1230723
                  36 1 2016 "130062" 3818749 "1422" 1970165
                  36 2 2016 "130062" 3818749 "2899"  257058
                  36 3 2016 "130062" 3818749 "3241"  375814
                  36 4 2016 "130062" 3818749 "3297" 1338176
                  36 5 2016 "130062" 3818749 "9999" -122464
                  37 1 2017 "130062" 3965594 "1422" 2279723
                  37 2 2017 "130062" 3965594 "2899"  269991
                  37 3 2017 "130062" 3965594 "3297" 1415880
                  38 1 2018 "130062" 4244265 "1422" 2309924
                  38 2 2018 "130062" 4244265 "2899"  287723
                  38 3 2018 "130062" 4244265 "3297" 1646618
                  39 1 2000 "130086" 1226878 "."     -30000
                  39 2 2000 "130086" 1226878 "0811"  681000
                  39 3 2000 "130086" 1226878 "2611"  576000
                  40 1 2001 "130086" 1164913 "."     -21000
                  40 2 2001 "130086" 1164913 "2411"  281000
                  40 3 2001 "130086" 1164913 "2499"  358000
                  40 4 2001 "130086" 1164913 "2823"  547000
                  41 1 2002 "130086" 1117431 "2411"  137000
                  41 2 2002 "130086" 1117431 "2499"  247000
                  41 3 2002 "130086" 1117431 "2823"  526000
                  end

                  Thank you for all efforts in supporting me, I highly appreciate it and I hope that I could describe my issues more precisely.

                  Kind regards

                  Comment


                  • #10
                    !Update!

                    I have found my mistakes and have successfully solved the issues! Thank you for the valuable instructions and points.

                    I hope to not take away any more of your time and efforts on this topic.

                    I found the most basic instructions and after solving my issues, it finally worked!
                    (I kept getting error messages that my data were already in wide format, although I only applied reshape long until that point.)

                    Code:
                    duplicates drop *, force 
                    reshape long sicsales sicseg, i(id year CompanyID) j(which) 
                    drop if missing(sicsales) 
                    collapse (sum) sicsales , by(id CompanyID year sicseg ttsales) 
                    sort id
                    quietly by id: gen dup= cond(_N==1,0,_n)
                    reshape wide sicsales sicseg, i(id year CompanyID) j(dup)
                    Thank you again for your patience and support, always highly appreciated!!

                    Kind regards

                    Comment


                    • #11
                      Dear Statalisters,

                      I have two follow-up questions on checking the statistics for an entropy measure of industry diversification based on Nick Cox's code for computing an entropy measure for diversification.
                      I tried a view times to adjust, but was not successful and would thus appreciate quick pointers to where I have done wrong.

                      Thank you in advance!

                      The Code I used for calculating the entropy index is the following:
                      Code:
                      * Calculate total sales across all reported industry segments
                      egen ttseg2=rowtotal(sicsales1 sicsales2 sicsales3 sicsales4 sicsales5 sicsales6 sicsales7 sicsales8 sicsales9 sicsales10)
                      
                      * Calculate entropy index of industry dispersion 
                      display 0 * ln(1/0)
                      gen double indentropy =  0
                      gen double term = .
                      quietly forval j = 1/10 {
                          replace term = sicsales`j'/ttseg2
                          replace indentropy = indentropy + term * ln(1/term) if !inlist(term, 0, .)
                      }
                      order indentropy, last
                      With this, I calculated the "indentropy" variable.

                      (1) Now when running the summarize command, I was wondering about the negative Min value. In general, Min would be 0 for companies that operate in only one industry. But I was wondering about how I should interpret the negative value. There might be an issue within my dataset, I guess?
                      (Data example for the first 4 segments and entropy index attached below)
                      Code:
                      . summarize indentropy
                      
                          Variable |        Obs        Mean    Std. Dev.       Min        Max
                      -------------+---------------------------------------------------------
                        indentropy |     64,992    .2881284    .4012719  -.0100302   2.244585

                      (2) I would like to calculate the average number of industry segments based in 4-digic SIC code that a company operates in. I tried applying the following code, which I had troubles with because of the missing values "." as the code also counts missing values. Is there a way I could circumvent this without replacing all missing values for sicsales with "0"?
                      Code:
                      gen siccount = 0
                      forval j = 1/10 {
                              replace siccount = siccount + ( sicseg`j' > 0 )
                      }

                      Code:
                      * Example generated by -dataex-. To install: ssc install dataex
                      clear
                      input float id int year str6 CompanyID str4 sicseg1 double sicsales1 str4 sicseg2 double sicsales2 str4 sicseg3 double sicsales3 str4 sicseg4 double sicsales4 float ttseg double(indterm indentropy)
                      48470 2008 "69264X" "2821"  31121925 "2911"  16431241 "7389"   410545 ""           .  47963708 .    .6884047262449249
                      96748 2013 "999677" "2869"    665625 "5722"       466 ""            . ""           .    666091 .  .005781980335290376
                      96356 2014 "998066" "6091"      5754 "6282"    312757 "6726"    57979 ""           .    376490 .    .5011795508040067
                       1217 2012 "133241" "7948"    144413 "7993"   3010139 ""            . ""           .   3154552 .    .1858947334691696
                      24214 2000 "317701" "."         2522 "4810"   2275804 "4822" 11083748 ""           .  13362074 .   .45816306120414174
                      75511 2008 "906292" "3728"    476625 "5085"    776970 ""            . ""           .   1253595 .    .6641651405025242
                      54970 2018 "771064" "0119"   2451326 "1542"     42995 "3069"  1900245 "6719"   32682   4427248 .    .7728718157683021
                      10574 2016 "26576H" "6794"    141153 "8742"    184572 ""            . ""           .    325725 .    .6849130481625297
                      68068 2003 "891826" "5399"    197268 "5411"    382933 ""            . ""           .    580201 .    .6410357078755067
                      43863 2018 "675277" "1099"       226 "3341"    187219 "5051"   869879 ""           .   1057324 .    .4718806429456819
                      18184 2016 "294221" "3589"    356951 "5064"     66162 "7359"  1544370 ""           .   1967483 .    .6138221849728341
                      61142 2009 "872726" "3315"   5266655 "5052"     98442 ""            . ""           .   5365097 .   .08399728088023264
                      85004 2002 "929345" "1021"    969583 "1031"    800641 "1499"   371464 "3339"  805887   3327433 .   1.5378562083455793
                       5561 2013 "14812M" "1311"  18602366 "1321"  15332299 "2911"  2076239 "2999" 2830943  86448928 .   1.1672660802455836
                      40429 2016 "540313" "2281"    334752 "2282"    251668 "2297"    17317 "6512"       0    603737 .    .7936179925531478
                      94431 2015 "982371" "7371"   3168864 "7389"    144588 ""            . ""           .   3313452 .   .17933375466620682
                      82179 2006 "921983" "1311"   3001246 "1321"    161146 "1382"  4911861 ""           .   8074253 .    .7495956746095971
                      32762 2006 "502187" "1541"     77570 "1623"   1000389 "5032"   305544 "6552"   44941   1557934 .    1.126352935805622
                      56715 2005 "775672" "3535"    366391 "3563"   1285729 ""            . ""           .   1652120 .     .529139344041083
                      86624 2007 "930682" "2051"   6413210 "5411"    495457 ""            . ""           .   6908667 .   .25943629077550506
                      76869 2005 "910264" "5049"     59571 "5063"     84121 "5085"    47382 ""           .    191074 .   1.0703345361657512
                      85674 2016 "929814" "2038"   7225100 "2099"   4417800 ""            . ""           .  11642900 .    .6637900754258719
                      50483 2003 "702574" "3721"    177118 "3728"     69996 ""            . ""           .    247114 .    .5960002872591195
                      81627 2000 "921552" "3663"     27111 "3699"     11905 "3825"   250548 ""           .    289564 .    .4781854286005693
                      80356 2011 "916741" "6553"    742542 "7261"   1573498 ""            . ""           .   2316040 .    .6273272414121063
                      24606 2018 "319802" "2013"    139714 "2099"    183016 ""            . ""           .    322730 .    .6841186346029393
                        277 2004 "130556" "3089"     51724 "3562"    252365 ""            . ""           .    304089 .    .4560367194512467
                      96680 2018 "999617" "3993"    327764 "3999"    846087 ""            . ""           .   1173851 .    .5922161970621084
                      96678 2016 "999617" "3993"    343748 "3999"    776877 ""            . ""           .   1120625 .    .6164728328057861
                      69629 2010 "898357" "7361"     31470 "8211"    191105 ""            . ""           .    222575 .    .4065104551783263
                      31655 2011 "41340M" "7373"    640267 "7374"    394385 "7375"   106591 ""           .   1141243 .    .9128928717123239
                      73359 2015 "905143" "2812"   1713400 "2821"    429600 "3589"   711400 ""           .   2854400 .    .9376594715541526
                      95858 2005 "993249" "2329"   3879400 "3149"   9035400 "3949"   824900 ""           .  13739700 .     .801566201755191
                      90145 2003 "936977" "1021"     91985 "1041"    149813 "1061"    56479 "1099"  435206    733483 .   1.0919438624553868
                      22004 2011 "31338H" "4512"  15411254 "4581"     22351 ""            . ""           .  15433605 . .0016651050208335485
                      15810 2017 "286738" "6311"  51023000 "6371"  11367000 ""            . ""           .  62390000 .    .4740945070565121
                      66526 2003 "887968" "4499"    270546 "5159"   4139065 ""            . ""           .   4409611 .   .23067617402856005
                      19560 2001 "299540" "4812"    783315 "4899"     66262 "7379"     4432 ""           .    854009 .   .30490091854108897
                      20526 2018 "307852" "4512"  11175002 "4581"    199980 ""            . ""           .  11374982 .   .08846806670379655
                      19707 2009 "300680" "1311"   2985201 "2911"  12723626 "4923"  1253891 ""           .  16962718 .    .7167061306999771
                      79583 2017 "916091" "7371"   4173000 "7376"   3434000 ""            . ""           .   7607000 .    .6884209248312263
                      51774 2017 "72968J" "3569"    115246 "3593"    353232 ""            . ""           .    468478 .     .557898860693073
                      73767 2004 "905320" "2759"     88090 "3421"    169789 ""            . ""           .    257879 .    .6420873175302584
                      87745 2017 "930970" "3351"   1104243 "3643"   3802023 "4911"  2285068 "8999"  295055   7486389 .    1.116064573535043
                      16071 2003 "287489" "3674"   5788286 "3679"   1931110 ""            . ""           .   7719396 .    .5647766559198281
                      83294 2007 "923139" "3661"   3580428 "4812"  13171121 "4833"  8580454 "8999" 6330442  31662446 .   1.2870152635951455
                      76066 2010 "906591" "2899"    542304 "2992"      1759 ""            . ""           .    544063 .  .021767371760907324
                      59328 2009 "867185" "5047"   6365128 "7389"    173208 ""            . ""           .   6538336 .   .12232499447188006
                      29171 2006 "357904" "1389"    519263 "4213"    344087 ""            . ""           .    863350 .    .6719607513850969
                      58324 2000 "865112" "4813"    535179 "4822"     83639 "4899"   350246 ""           .    969064 .    .9071595081062493
                      88913 2007 "932279" "4225"   1203186 "4491"    649877 ""            . ""           .   1853063 .    .6478814845538848
                      27036 2001 "326739" "3613"    125807 "3679"    146342 ""            . ""           .    272149 .    .6902977453654142
                      94080 2009 "981810" "6159"    130648 "6211"    -14902 "6552"   157040 "6719"    2433    275219 .                    .
                      12982 2011 "277164" "5945"     94914 "7311"     49028 "7812"   339979 ""           .    483921 .    .7994832421224852
                      75110 2014 "906088" "2086"   1560943 "2752"    265870 ""            . ""           .   1826813 .    .4149616144299118
                       1992 2011 "135588" "5045"   2426062 "5047"    419677 "5149"   327361 "5984" 7957314  11268591 .    .8556384559035811
                      63540 2013 "87926P" "2911"  19151455 "8742"         0 ""            . ""           .  19151456 .                    0
                      77915 2018 "912160" "1021"    376000 "1041"   6877000 ""            . ""           .   7253000 .    .2038995750287766
                      96586 2000 "998676" "."        45797 "2678"   1084728 "7996"    88284 ""           .   1218809 .   .41716907292591165
                      96576 2009 "998511" "3312"      3482 "3999"   2460515 "4911"    93370 "6552"  212405   5974155 .    .8876428171043772
                      96575 2008 "998511" "3999"   2936427 "4911"     94707 "6552"   101545 "6719" 2836301   5968980 .    .8381426335753285
                      71358 2004 "902179" "3661"   1568460 "3663"  18290774 ""            . ""           .  19859234 .    .2762683239085235
                       8109 2006 "256716" "4613"     42588 "4925"    297238 ""            . ""           .    339826 .   .37739813476337736
                      96572 2005 "998511" "3999"   1568314 "4899"    157218 "4911"   102147 "5013" 1416642   3426044 .    1.124766483989982
                      56003 2014 "772475" "1611"    921875 "1623"    475157 ""            . ""           .   1397032 .    .6379329523891328
                      72578 2018 "904793" "1531"     51535 "1542"      1346 "1799"     1279 ""           .     54160 .   .23298449581332986
                      18145 2018 "294051" "2671"    131143 "5149"    297134 "6221"    67870 "6512"    8893    505040 .    1.003054812419792
                      55756 2016 "772057" "2296"    275336 "2824"    378130 "3825"   263470 "5541"  319494   1236430 .   1.3759206287814525
                      53725 2012 "755695" "4725"    211000 "7999"  15277000 ""            . ""           .  15488000 .    .0656377508070137
                      82246 2010 "921999" "4911"   4316600 "4939"    625600 ""            . ""           .   4942200 .   .37422437280176946
                      94008 2015 "981722" "3841"   3895000 "3842"   6051000 ""            . ""           .   9946000 .    .6694648934583705
                      96551 2003 "998500" "2732"  10020081 "3721"   5689145 ""            . ""           .  15709226 .    .6546469634492531
                      77561 2008 "912049" "1031"   1573580 "1442"   1446665 "1479"  1774447 "6719"       0   4794692 .   1.0950682600978716
                      84074 2007 "923816" "2082"    161742 "2091"    281505 "2834"   234404 "5149" 1713975   2391626 .    .8995514958441745
                      33297 2013 "503113" "1531"    131123 "7011"     12667 "7521"   105166 ""           .    248956 .    .8532407740592687
                      40316 2003 "539860" "3211"    318811 "3221"    302592 "3229"   437598 "3295"  353120   1287240 .                    .
                      39908 2015 "53497W" "1531"     33603 "1542"      7082 "1629"        0 "6513"     977     41662 .    .5626195366778862
                      53096 2012 "749263" "2821"    236878 "3089"    357854 ""            . ""           .    594732 .    .6727360707405245
                      88999 2013 "932300" "4412"  12728771 "4424"    516484 "7363"  1043233 ""           .  14288488 .    .4169295611484579
                      41410 2014 "543995" "1389"    610896 "8711"    652193 ""            . ""           .   1263089 .    .6926125958817004
                      94715 2014 "982696" "3253"    123063 "4813"     22278 "7993"    16995 ""           .    162336 .    .7213730950753721
                       2361 2004 "13849W" "4911"    173275 "4924"    438596 ""            . ""           .    611871 .    .5959398337394831
                      68005 2011 "8916Y6" "5122"    215012 "5169"    901929 ""            . ""           .   1116941 .   .48982885319253866
                      80087 2012 "916548" "5399" 191624000 "5411" 274490000 ""            . ""           . 466113984 .     .679335497684458
                      81832 2018 "921725" "3621"   1816913 "3823"   3028959 ""            . ""           .   4845872 .    .6615327657527879
                      96510 2018 "998306" "2844"   1928549 "8742"    154657 ""            . ""           .   2083206 .   .26447032351513494
                      25247 2012 "32167V" "3531"   1250546 "6159"     16804 ""            . ""           .   1267350 .   .07049037729425887
                      96507 2015 "998306" "2844"   1764535 "8742"    100207 ""            . ""           .   1864742 .   .20937704578798727
                      96505 2013 "998306" "2844"   2157114 "8742"    108839 ""            . ""           .   2265953 .   .19268064068467097
                      96504 2012 "998306" "2844"   2143335 "8742"    112070 ""            . ""           .   2255405 .   .19760040364514558
                      77236 2013 "9111AM" "6221"      7100 "7372"   1206800 "7379"   625300 ""           .   1839200 .    .6647225505875022
                      92805 2015 "951501" "7374"    875057 "7375"    663344 ""            . ""           .   1538401 .    .6814944960904958
                      37748 2013 "51373D" "1221"   1770949 "1222"    105649 ""            . ""           .   1876598 .   .21665743963680173
                      96496 2004 "998306" "2676"   1417447 "2679"    358717 ""            . ""           .   1776164 .    .5031098426124845
                      96494 2002 "998306" "2631"    101491 "2676"   1116877 "2679"   319858 ""           .   1538226 .    .7383441642737287
                      67711 2005 "890928" "1611"     20230 "2833"      7265 "5052"   411041 "7011"    2395    440931 .    .3031199546334257
                      75732 2004 "906453" "3663"    163606 "4812"     59784 ""            . ""           .    223390 .    .5808789016222112
                      54490 2003 "756787" "7389"     22418 "8711"    590977 ""            . ""           .    613395 .    .1568105758155463
                      90887 2004 "938928" "3827"    388660 "3841"    101516 ""            . ""           .    490176 .    .5100904614396276
                      93850 2016 "9813U0" "3732"    221600 "7993"         0 ""            . ""           .    221600 .                    0
                      end

                      Comment

                      Working...
                      X